Unit - 8 Inference for Categorical Data: Chi-Square

## Chi-Square Test for Goodness-of-Fit

A perfect fit cannot be expected, and so we must look at discrepancies and make judgments as to the

*Goodness-of-fit*.There is the null hypothesis of a good fit, that is, the hypothesis that a given theoretical distribution correctly describes the situation, problem, or activity under consideration.

The sum of these weighted differences or discrepancies is called the ** chi-square statistic** and is denoted as

*χ*2 (

*χ*is the lowercase Greek letter chi):

The smaller the resulting

*χ*2-value, the better the fit.The

**-value** is the probability of obtaining a*P**χ*2-value as extreme as (or as more extreme than) the one obtained if the null hypothesis is**assumed true**.If the

*χ*2-value is large enough, that is, if the*P*-value is small enough, we say there is sufficient evidence to reject the null hypothesis and to claim that the fit is poor.

To decide how large a calculated *χ*2-value must be to be significant, that is, to choose a critical value, we must understand how *χ*2-values are distributed.

A

*χ*2-distribution has only nonnegative values, is not symmetric, and is always skewed to the right.There are distinct

*χ*2-distributions, each with an associated number of degrees of freedom (*df*).The larger the

*df*value, the less pronounced is the skew, and the closer the*χ*2-distribution is to a normal distribution.

For inference about the distribution of a single categorical variable, such as a goodness-of-fit test, we will use a chi-square distribution with degrees of freedom,

.df = number of categories- 1

#### ➥ **Example 8.1**

A large city is divided into four distinct socioeconomic regions, one where the upper class lives, one for the middle class, one for the lower class, and one mixed-class region. Area percentages of the regions are 12%, 38%, 32%, and 18%, respectively. In a random sample of 55 liquor stores in the city, the numbers from each region are 4, 16, 26, and 9, respectively. The following shows how to determine if there is statistical evidence that region makes a difference with regard to numbers of liquor stores.

If liquor stores in the city were distributed among the four regions in the same proportions as the areas of those regions, what number (out of 55) would be expected in each region?

Are the sample numbers 4, 16, 26, and 9 significantly different from the expected values 6.6, 20.9, 17.6, and 9.9 to indicate that region makes a difference with regard to numbers of liquor stores?

**Solution:**

(0.12)(55) = 6.6, (0.38)(55) = 20.9, (0.32)(55) = 17.6, and (0.18)(55) = 9.9, and we have:

First, state the hypotheses.

*H*0:Liquor stores are distributed over the four city regions in the same proportions as the areas of those regions, that is, in the percentages 12, 38, 32, and 18, respectively.*H*a:Liquor stores are not distributed over the four city regions in the same proportions as the areas of those regions (at least one proportion is not as specified in the null hypothesis).

Second, name the procedure and check the conditions:

*Procedure:* A chi-square test for Goodness-of-fit.

*Checks:*

*Randomization*: We are given that the sample is random.We note that the expected values (6.6, 20.9, 17.6, 9.9) are all > 5.

We assume the sample size, 55, is less than 10% of the number of all liquor stores in the large city.

Third, calculate the **Χ²** statistic:

The *P*-value is *P* = *P*(*χ*2 > 6.264) = 0.099. [If *n* is the number of classes, *df* = *n* − 1 = 3, and *χ*2cdf(6.264, 1000, 3) = 0.099.] We also note that putting the observed and expected numbers in Lists, calculator software (such as *χ*2GOF-Test on the TI-84 or on the Casio Prizm) quickly gives *χ*2 = 6.262 and *P* = 0.099.

Fourth, give a conclusion in context with linkage to the *P*-value:

With this large a *P*-value, 0.099 > 0.05, there is not sufficient evidence to reject *H*0. That is, there is not convincing evidence that liquor stores in this city are distributed over the four city regions (upper, middle, lower, and mixed class) in different proportions to the areas of those regions.

## Chi-Square Test for Independence

Test of Independence - There is a class of real-world problems in which we want to determine if there is a significant association between two categorical variables.

We classify our single sample observations in two ways and then ask whether the two ways are independent of each other.

For example, we might consider several age groups and within each group ask how many employees show various levels of job satisfaction. The **null hypothesis** is that __age and job satisfaction are__ *independen**t*, that is, that the proportion of employees expressing a given level of job satisfaction is the same no matter which age group is considered.

When testing for independence,

where *df* is the number of degrees of freedom, *r* is the number of rows, and *c* is the number of columns.

A

**point worth noting**is that even if there is sufficient evidence to reject the null hypothesis of independence, we cannot necessarily claim any__direct__*causal*__relationship__.In other words, although we can make a statement about some link or relationship between two variables,

__we are__*not*__justified in claiming that one causes the other__.

#### ➥ **Example 8.2**

A growing number of states have legalized marijuana for medical or recreational purposes. In a nationwide telephone poll of 1000 randomly selected adults representing Democrats, Republicans, and Independents, respondents were asked two questions: their party affiliation and if they supported the legalization of marijuana. The answers, cross-classified by party affiliation, are given in the following two-way table (also called a *contingency* *table*).

Test the null hypothesis that support for legalizing marijuana is independent of party affiliation. Use a 5% significance level.

**Solution:**

*Hypothesis H*0:Party affiliation and support for legalizing marijuana are independent.*H*a:Party affiliation and support for legalizing marijuana are not independent.

*Mechanics:* Putting the observed data into a "Matrix," calculator software (such as *χ*2-Test on the TI-84, Casio Prizm, or HP Prime) gives *χ*2 = 94.5 and *P* = 0.000, and stores the expected values in a second matrix:

*Check conditions:* We are given a random sample, *n* = 1000 is less than 10% of all adults, and we note that all expected cells are > 5.

*Conclusion with linkage to the P-value:* With this small of a *P*-value, 0.000 < 0.05, there is sufficient evidence to reject *H*0; that is, among all adults there is sufficient evidence of a relationship between party affiliation and support for legalizing marijuana.

## Chi-Square Test for Homogeneity

Chi-square procedures can also be used with a single variable to compare samples from two or more populations.

Conditions to check include that the samples be

*simple random samples*, that they be taken*independently*of each other, that the original populations be large compared to the sample sizes, and that the expected values for all cells be at least 5. The contingency table used has a row for each sample. The resulting procedure is called a.*chi-square test for homogeneity*

#### ➥ **Example 8.3**

In a large city, a group of AP Statistics students work together on a project to determine which group of school employees has the greatest proportion who are satisfied with their jobs. In independent simple random samples of 100 teachers, 60 administrators, 45 custodians, and 55 secretaries, the numbers satisfied with their jobs were found to be 82, 38, 34, and 36, respectively. Is there evidence that the proportion of employees satisfied with their jobs is different in different school system job categories?

**Solution:**

*Hypotheses:*

*H*0:The proportion of employees satisfied with their jobs is the same across the various school system job categories.*H*a:At least two of the job categories differ in the proportion of employees satisfied with their jobs.

*Procedure:* A chi-square test for homogeneity.

*Checks:*

We are given independent simple random samples (a stratified random sample).

The expected cells (calculated below) are all greater than 5.

We assume that the four samples are each less than 10% of their respective populations in the large city.

*Mechanics:* using Matrix and *χ*2-Test on a calculator, we find the expected cells, the test statistic *χ*2, and the *P*-value.

The observed counts are as follows:

Putting the observed data into a Matrix, calculator software gives *χ*2 = 8.707 and *P* = 0.0335 and stores the expected values in a second matrix:

Conclusion in context with linkage to the *P*-value:

With this small of a *P*-value, 0.0335 < 0.05, there is sufficient evidence to reject *H*0; that is, there is convincing evidence that the true proportion of employees satisfied with their jobs is not the same across all the school system job categories.

# Unit - 8 Inference for Categorical Data: Chi-Square

## Chi-Square Test for Goodness-of-Fit

A perfect fit cannot be expected, and so we must look at discrepancies and make judgments as to the

*Goodness-of-fit*.There is the null hypothesis of a good fit, that is, the hypothesis that a given theoretical distribution correctly describes the situation, problem, or activity under consideration.

The sum of these weighted differences or discrepancies is called the ** chi-square statistic** and is denoted as

*χ*2 (

*χ*is the lowercase Greek letter chi):

The smaller the resulting

*χ*2-value, the better the fit.The

**-value** is the probability of obtaining a*P**χ*2-value as extreme as (or as more extreme than) the one obtained if the null hypothesis is**assumed true**.If the

*χ*2-value is large enough, that is, if the*P*-value is small enough, we say there is sufficient evidence to reject the null hypothesis and to claim that the fit is poor.

To decide how large a calculated *χ*2-value must be to be significant, that is, to choose a critical value, we must understand how *χ*2-values are distributed.

A

*χ*2-distribution has only nonnegative values, is not symmetric, and is always skewed to the right.There are distinct

*χ*2-distributions, each with an associated number of degrees of freedom (*df*).The larger the

*df*value, the less pronounced is the skew, and the closer the*χ*2-distribution is to a normal distribution.

For inference about the distribution of a single categorical variable, such as a goodness-of-fit test, we will use a chi-square distribution with degrees of freedom,

.df = number of categories- 1

#### ➥ **Example 8.1**

A large city is divided into four distinct socioeconomic regions, one where the upper class lives, one for the middle class, one for the lower class, and one mixed-class region. Area percentages of the regions are 12%, 38%, 32%, and 18%, respectively. In a random sample of 55 liquor stores in the city, the numbers from each region are 4, 16, 26, and 9, respectively. The following shows how to determine if there is statistical evidence that region makes a difference with regard to numbers of liquor stores.

If liquor stores in the city were distributed among the four regions in the same proportions as the areas of those regions, what number (out of 55) would be expected in each region?

Are the sample numbers 4, 16, 26, and 9 significantly different from the expected values 6.6, 20.9, 17.6, and 9.9 to indicate that region makes a difference with regard to numbers of liquor stores?

**Solution:**

(0.12)(55) = 6.6, (0.38)(55) = 20.9, (0.32)(55) = 17.6, and (0.18)(55) = 9.9, and we have:

First, state the hypotheses.

*H*0:Liquor stores are distributed over the four city regions in the same proportions as the areas of those regions, that is, in the percentages 12, 38, 32, and 18, respectively.*H*a:Liquor stores are not distributed over the four city regions in the same proportions as the areas of those regions (at least one proportion is not as specified in the null hypothesis).

Second, name the procedure and check the conditions:

*Procedure:* A chi-square test for Goodness-of-fit.

*Checks:*

*Randomization*: We are given that the sample is random.We note that the expected values (6.6, 20.9, 17.6, 9.9) are all > 5.

We assume the sample size, 55, is less than 10% of the number of all liquor stores in the large city.

Third, calculate the **Χ²** statistic:

The *P*-value is *P* = *P*(*χ*2 > 6.264) = 0.099. [If *n* is the number of classes, *df* = *n* − 1 = 3, and *χ*2cdf(6.264, 1000, 3) = 0.099.] We also note that putting the observed and expected numbers in Lists, calculator software (such as *χ*2GOF-Test on the TI-84 or on the Casio Prizm) quickly gives *χ*2 = 6.262 and *P* = 0.099.

Fourth, give a conclusion in context with linkage to the *P*-value:

With this large a *P*-value, 0.099 > 0.05, there is not sufficient evidence to reject *H*0. That is, there is not convincing evidence that liquor stores in this city are distributed over the four city regions (upper, middle, lower, and mixed class) in different proportions to the areas of those regions.

## Chi-Square Test for Independence

Test of Independence - There is a class of real-world problems in which we want to determine if there is a significant association between two categorical variables.

We classify our single sample observations in two ways and then ask whether the two ways are independent of each other.

For example, we might consider several age groups and within each group ask how many employees show various levels of job satisfaction. The **null hypothesis** is that __age and job satisfaction are__ *independen**t*, that is, that the proportion of employees expressing a given level of job satisfaction is the same no matter which age group is considered.

When testing for independence,

where *df* is the number of degrees of freedom, *r* is the number of rows, and *c* is the number of columns.

A

**point worth noting**is that even if there is sufficient evidence to reject the null hypothesis of independence, we cannot necessarily claim any__direct__*causal*__relationship__.In other words, although we can make a statement about some link or relationship between two variables,

__we are__*not*__justified in claiming that one causes the other__.

#### ➥ **Example 8.2**

A growing number of states have legalized marijuana for medical or recreational purposes. In a nationwide telephone poll of 1000 randomly selected adults representing Democrats, Republicans, and Independents, respondents were asked two questions: their party affiliation and if they supported the legalization of marijuana. The answers, cross-classified by party affiliation, are given in the following two-way table (also called a *contingency* *table*).

Test the null hypothesis that support for legalizing marijuana is independent of party affiliation. Use a 5% significance level.

**Solution:**

*Hypothesis H*0:Party affiliation and support for legalizing marijuana are independent.*H*a:Party affiliation and support for legalizing marijuana are not independent.

*Mechanics:* Putting the observed data into a "Matrix," calculator software (such as *χ*2-Test on the TI-84, Casio Prizm, or HP Prime) gives *χ*2 = 94.5 and *P* = 0.000, and stores the expected values in a second matrix:

*Check conditions:* We are given a random sample, *n* = 1000 is less than 10% of all adults, and we note that all expected cells are > 5.

*Conclusion with linkage to the P-value:* With this small of a *P*-value, 0.000 < 0.05, there is sufficient evidence to reject *H*0; that is, among all adults there is sufficient evidence of a relationship between party affiliation and support for legalizing marijuana.

## Chi-Square Test for Homogeneity

Chi-square procedures can also be used with a single variable to compare samples from two or more populations.

Conditions to check include that the samples be

*simple random samples*, that they be taken*independently*of each other, that the original populations be large compared to the sample sizes, and that the expected values for all cells be at least 5. The contingency table used has a row for each sample. The resulting procedure is called a.*chi-square test for homogeneity*

#### ➥ **Example 8.3**

In a large city, a group of AP Statistics students work together on a project to determine which group of school employees has the greatest proportion who are satisfied with their jobs. In independent simple random samples of 100 teachers, 60 administrators, 45 custodians, and 55 secretaries, the numbers satisfied with their jobs were found to be 82, 38, 34, and 36, respectively. Is there evidence that the proportion of employees satisfied with their jobs is different in different school system job categories?

**Solution:**

*Hypotheses:*

*H*0:The proportion of employees satisfied with their jobs is the same across the various school system job categories.*H*a:At least two of the job categories differ in the proportion of employees satisfied with their jobs.

*Procedure:* A chi-square test for homogeneity.

*Checks:*

We are given independent simple random samples (a stratified random sample).

The expected cells (calculated below) are all greater than 5.

We assume that the four samples are each less than 10% of their respective populations in the large city.

*Mechanics:* using Matrix and *χ*2-Test on a calculator, we find the expected cells, the test statistic *χ*2, and the *P*-value.

The observed counts are as follows:

Putting the observed data into a Matrix, calculator software gives *χ*2 = 8.707 and *P* = 0.0335 and stores the expected values in a second matrix:

Conclusion in context with linkage to the *P*-value:

With this small of a *P*-value, 0.0335 < 0.05, there is sufficient evidence to reject *H*0; that is, there is convincing evidence that the true proportion of employees satisfied with their jobs is not the same across all the school system job categories.