Inference for Categorical Data 293

a. 5 × 4 = 20
b. 5 × 3 = 15
c. 4 × 4 = 16
d. 5 + 4 - 2 = 7
e. 4 × 3 = 12

8. Which of the following statements is (are) correct?
I. A condition for using a χ² test is that most expected values must be at least 5 and that all must be at least 1.
II. A χ² test for goodness of fit tests the degree to which a categorical variable has a specific distribution.
III. Expected cell counts are computed in the same way for goodness-of-fit tests and tests of independence.
a. I only
b. II only
c. I and II only
d. II and III only
e. I, II, and III

Free Response
1. An AP Statistics student noted that the probability distribution for a binomial random variable with n = 4 and p = 0.3 is approximately given by:

Outcome   0      1      2      3      4
p         0.240  0.412  0.265  0.076  0.008

(Note: Σp = 1.001 rather than 1 due to rounding.)
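
As a quick check of the table, the probabilities can be recomputed from the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The following is a minimal Python sketch; Python is not part of the original exercise, and the function name binomial_pmf is only illustrative. It uses the standard library alone.

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X = 0), ..., P(X = n)] for a binomial(n, p) random variable."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Values from the exercise: n = 4 trials, success probability p = 0.3.
exact = binomial_pmf(4, 0.3)
rounded = [round(prob, 3) for prob in exact]

print(rounded)                 # probabilities rounded to three decimals, as in the table
print(round(sum(rounded), 3))  # sum of the rounded values
print(round(sum(exact), 3))    # sum of the unrounded values
```

Rounding each probability separately is what pushes the tabulated total slightly above 1; the unrounded probabilities sum to 1.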

The student decides to test the randBin function on her TI-83/84 by putting 500 values into a list using this function (randBin(4,0.3,500)→L1) and counting the number of each outcome. (Can you think of an efficient way to count each outcome?) She obtained

n          0    1    2    3    4
Observed   110  190  160  36   4

Do these data provide evidence that the randBin function on the calculator is correctly generating values from this distribution?

Step 4. Review the Knowledge You Need to Score High
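
One way to approach question 1 is a chi-square goodness-of-fit computation. The sketch below is a Python illustration, not the calculator procedure the exercise has in mind, and the helper names are hypothetical. The P-value uses a closed form for the chi-square upper tail that holds only when the degrees of freedom are even (here df = 5 - 1 = 4). Expected counts come from the rounded table probabilities, so they total 500.5 rather than 500.

```python
from math import exp, factorial

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    return exp(-half) * sum(half**i / factorial(i) for i in range(df // 2))

# Data from the exercise.
probabilities = [0.240, 0.412, 0.265, 0.076, 0.008]  # rounded table values
observed = [110, 190, 160, 36, 4]
n_values = sum(observed)  # 500 simulated values

expected = [n_values * p for p in probabilities]
df = len(observed) - 1

# Common rule of thumb: most expected counts at least 5, all at least 1.
share_at_least_5 = sum(e >= 5 for e in expected) / len(expected)
all_at_least_1 = all(e >= 1 for e in expected)

x2 = chi_square_statistic(observed, expected)
p_value = chi_square_upper_tail_even_df(x2, df)

print("expected counts:", [round(e, 1) for e in expected])
print("conditions:", share_at_least_5, all_at_least_1)
print("X^2 =", round(x2, 2), "df =", df, "P-value =", round(p_value, 3))
```

A small P-value would count as evidence against the calculator's generator, so compare the printed P-value with the significance level you have chosen.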
Calculator Tip: It's a bit of a digression, but if you actually wanted to do the experiment in question 1, you would need to have an efficient way of counting the number of each outcome. You certainly don't want to simply scroll through all 500 entries and tally each one. Even sorting them first and then counting would be tedious (more so if n were bigger than 4). The easiest way is to draw a histogram of the data and then TRACE to get the totals. Once you have your 500 values from randBin in L1, go to STAT PLOTS and set up a histogram for L1. Choose a WINDOW something like [-0.5, 4.5, 1, -1, 300, 1, 1]. Be sure that Xscl is set to 1. You may need to adjust the Ymax from 300 to get a nice picture on your screen. Then simply TRACE across the bars of the histogram and read the value of n for each outcome off of the screen. The reason for having x go from -0.5 to 4.5 is so that the (integer) outcomes will be in the middle of each bar of the histogram.

2. A chi-square test for the homogeneity of proportions is conducted on three populations and one categorical variable that has four values. Computation of the chi-square statistic yields X² = 17.2. Is this finding significant at the 0.01 level of significance?

3. Which of the following best describes the difference between a test for independence and a test for homogeneity of proportions? Discuss the correctness of each answer.
a. There is no difference because they both produce the same value of the chi-square test statistic.
b. A test for independence has one population and two categorical variables, whereas a test for homogeneity of proportions has more than one population and only one categorical variable.
c. A test for homogeneity of proportions has one population and two categorical variables, whereas a test for independence has more than one population and only one categorical variable.
d. A test for independence uses count data when calculating chi-square and a test for homogeneity uses percentages or proportions when calculating chi-square.

4. Compute the expected value for the cell that contains the frog. You are given the marginal distribution.

