Chi-Square Test: Contingency Tables
- Used to determine whether there is an association between two categorical variables.
- Example: Personality (Introvert, Extrovert) and Colour Preference (Red, Yellow, Green, Blue).
Application
- The Chi-Square test may be used to investigate the association between personality and colour preference.
- Note: The Chi-Square test may be applied to ordinal data, but it treats the categories as unordered, so the information in the ordering is ignored. In R, the test can be modified using the linear-by-linear option so that the order is taken into account.
Hypotheses
- Null Hypothesis (H_0): There is no association between the variables.
- Alternative Hypothesis (H_1): There is an association between the variables.
- The method compares the observed frequencies with the frequencies we would expect if H_0 were true, that is, if the two variables were independent.
Test Statistic
For a table with r rows and c columns, the Chi-Square statistic is calculated as:
x^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
where:
- O_{ij} represents the observed frequency.
- E_{ij} represents the expected frequency.
Under H_0, x^2 approximately follows a \chi^2 distribution with (r-1)(c-1) degrees of freedom.
Expected Frequencies
- The expected frequency E_{ij} is calculated as:
E_{ij} = \frac{Y_{i.} \times Y_{.j}}{n}
where:
- Y_{i.} gives the row totals.
- Y_{.j} gives the column totals.
- n is the total number of observations.
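The two formulas above can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names and the counts are made up for the example (rows are Introvert and Extrovert, columns are Red, Yellow, Green and Blue) and are not data from these notes.

```python
def expected_frequencies(observed):
    """Return E_ij = (row total i * column total j) / n for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_statistic(observed):
    """Return (x^2, degrees of freedom) for an r x c table of counts."""
    expected = expected_frequencies(observed)
    x2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return x2, df


# Hypothetical counts, invented for illustration only.
observed = [
    [10, 20, 30, 40],  # Introvert
    [40, 30, 20, 10],  # Extrovert
]
expected = expected_frequencies(observed)
x2, df = chi_square_statistic(observed)
print(expected)   # every cell is 25.0 because all margins are balanced
print(x2, df)     # 40.0 3
```

Every expected count here is 25, so each cell contributes (O - 25)^2 / 25 and the eight contributions sum to 40 on (2-1)(4-1) = 3 degrees of freedom.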
Evaluation
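- We evaluate x^2 by comparing it with the critical value from tables of the \chi^2 distribution with (r-1)(c-1) degrees of freedom. H_0 is rejected if x^2 exceeds the critical value at the chosen significance level.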
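Instead of looking up tables, the upper-tail probability of the \chi^2 distribution can be computed directly. The sketch below uses the closed-form series that hold for whole-number degrees of freedom; the function name `chi2_sf` is our own choice, and in practice a statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi^2_df > x) for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series weighted by the
    # standard normal density evaluated at sqrt(x).
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + 2.0 * density * total


# 3.841 is the 5% critical value for df = 1 used later in these notes.
p_at_critical = chi2_sf(3.841, 1)
print(round(p_at_critical, 3))  # 0.05
```

Rejecting H_0 when x^2 exceeds the tabulated critical value is equivalent to rejecting when this tail probability (the p-value) is below the significance level.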
Yates' Continuity Correction
- For 2x2 frequency tables (where df = (r-1)(c-1) = 1), the Chi-Square test tends to produce values of x^2 that are too large, so H_0 is rejected too often when it is in fact true (an inflated Type I error rate).
- In such cases, we apply Yates' Continuity Correction to the test statistic:
x^2_{\text{corrected}} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(|O_{ij} - E_{ij}| - 0.5)^2}{E_{ij}}
- Yates' continuity correction is also applied to the x^2 goodness-of-fit test when df = K-1 = 1.
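To see the effect of the correction, the sketch below computes both the uncorrected and the corrected statistic for one 2x2 table. The function name and the counts are hypothetical, chosen only so that the two versions fall on opposite sides of the 5% critical value 3.841. The code follows the formula exactly as written above; some software additionally prevents the adjustment from exceeding |O - E|, which is not done here.

```python
def chi_square_2x2(observed, yates=False):
    """x^2 for a 2x2 table, optionally with Yates' continuity correction."""
    if len(observed) != 2 or any(len(row) != 2 for row in observed):
        raise ValueError("this sketch handles 2x2 tables only")
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    adjustment = 0.5 if yates else 0.0
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            x2 += (abs(o - e) - adjustment) ** 2 / e
    return x2


# Hypothetical counts, invented for illustration only.
observed = [[13, 7], [6, 14]]
x2_plain = chi_square_2x2(observed)
x2_yates = chi_square_2x2(observed, yates=True)
print(round(x2_plain, 3), round(x2_yates, 3))
```

With these counts the uncorrected statistic is above 3.841 while the corrected one is below it, so the correction changes the decision at the 5% level.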
Effect Size: Strength of Association
Chi-Square tests do not tell us how strong an association is; therefore, consider effect size measures.
Phi Coefficient (\phi):
- Used for 2x2 tables only.
\phi = \sqrt{\frac{x^2}{n}}
- Guidelines:
- Small: 0.1
- Medium: 0.3
- Large: 0.5
Cramer's V:
- Can be used with 2 categorical variables when each variable has 2 or more categories.
V = \sqrt{\frac{x^2}{n \times dfv}} where dfv = \min(c-1, r-1)
0 \leq V \leq 1. When V = 0, there is no association between the variables. V = 1 indicates a perfect association, where one variable is completely determined by the other.
- Guidelines:
| dfv | Small | Medium | Large |
|---|---|---|---|
| 1 (2x2) | 0.1 | 0.3 | 0.5 |
| 2 | 0.07 | 0.21 | 0.35 |
| 3 | 0.06 | 0.17 | 0.29 |
| 4 | 0.05 | 0.15 | 0.25 |
| 5 | 0.05 | 0.13 | 0.22 |
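As a rough illustration of the effect size formulas, the sketch below computes Cramer's V from a table of counts. For a 2x2 table dfv = 1, so the same function also returns the Phi coefficient. The function name and the counts are hypothetical and not taken from these notes.

```python
import math


def cramers_v(observed):
    """Cramer's V = sqrt(x^2 / (n * dfv)), with dfv = min(r - 1, c - 1)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            x2 += (o - e) ** 2 / e
    df_v = min(len(observed) - 1, len(observed[0]) - 1)
    return math.sqrt(x2 / (n * df_v))


# Hypothetical 2x4 table (x^2 = 40, n = 200, dfv = 1).
v = cramers_v([[10, 20, 30, 40], [40, 30, 20, 10]])
print(round(v, 3))  # 0.447

# Perfect association in a 2x2 table gives V = 1.
v_perfect = cramers_v([[5, 0], [0, 5]])
```

Read against the guideline table, a value of about 0.45 with dfv = 1 lies between the medium and large benchmarks.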
Odds Ratio:
Consider the following table:
| | Outcome A | Outcome B | Totals |
|---|---|---|---|
| Group 1 | A_1 | B_1 | N_1 |
| Group 2 | A_2 | B_2 | N_2 |
| Totals | N_A | N_B | |
The odds ratio (OR) is given by:
OR = \frac{A_1 / B_1}{A_2 / B_2} = \frac{A_1 B_2}{A_2 B_1}
Evaluation of Odds Ratio
- OR = 1: Belonging to Group 1 has not affected the odds of Outcome A.
- OR > 1: Belonging to Group 1 has increased the odds of Outcome A.
- OR < 1: Belonging to Group 1 has decreased the odds of Outcome A.
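A minimal sketch of the calculation, assuming the usual definition of the odds ratio as the odds of Outcome A in Group 1 divided by the odds of Outcome A in Group 2, which is the reading consistent with the interpretation above. The function name and the counts are hypothetical.

```python
def odds_ratio(a1, b1, a2, b2):
    """Odds of Outcome A in Group 1 divided by the odds in Group 2."""
    if b1 == 0 or a2 == 0 or b2 == 0:
        raise ValueError("odds ratio is undefined when B_1, A_2 or B_2 is zero")
    odds_group_1 = a1 / b1
    odds_group_2 = a2 / b2
    return odds_group_1 / odds_group_2


# Hypothetical counts: A_1 = 13, B_1 = 7, A_2 = 6, B_2 = 14.
ratio = odds_ratio(13, 7, 6, 14)
print(round(ratio, 2))  # 4.33
```

Here the odds of Outcome A are roughly 4.3 times higher in Group 1 than in Group 2, so this falls under the OR > 1 case.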
Post Hoc Tests
- If x^2 is significant, there is evidence of an association between the variables, but the test does not indicate which categories are responsible for it.
- In R, we can use post hoc tests to investigate further. We will use the standardised residuals approach.
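The notes do not state which form of standardised residual is intended. The sketch below assumes the adjusted form, (O - E) / \sqrt{E (1 - Y_{i.}/n)(1 - Y_{.j}/n)}, which is what R's `chisq.test` reports as `stdres`. The function name, the counts, and the cut-off of 1.96 (a common rule of thumb that ignores multiple comparisons) are all assumptions made for illustration.

```python
import math


def standardised_residuals(observed):
    """Adjusted standardised residual for every cell of an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            # Estimated variance of (O - E) under independence.
            variance = e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out_row.append((o - e) / math.sqrt(variance))
        residuals.append(out_row)
    return residuals


# Hypothetical counts, invented for illustration only.
observed = [[10, 20, 30, 40], [40, 30, 20, 10]]
residuals = standardised_residuals(observed)

# Cells whose residual is large in absolute value drive the association.
flagged = [
    (i, j)
    for i, row in enumerate(residuals)
    for j, value in enumerate(row)
    if abs(value) > 1.96
]
print(flagged)
```

In this made-up table only the first and last columns are flagged, which suggests that those two categories account for most of the association.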
The Likelihood Ratio
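- An alternative to the x^2 test uses a model based on maximum-likelihood theory.
L_{x^2} = 2 \left\{ \sum_{i=1}^{r} \sum_{j=1}^{c} y_{ij} \ln \left( \frac{y_{ij}}{E_{ij}} \right) \right\}
where y_{ij} is the observed frequency in cell (i, j) and E_{ij} is the expected frequency.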
Evaluation of Likelihood Ratio
- Evaluate L_{x^2} in the same way as x^2.
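The likelihood ratio statistic can be sketched in the same way as x^2. The function name and the counts below are hypothetical; a cell with a zero count is taken to contribute nothing to the sum, which is the usual convention because y \ln y tends to 0 as y tends to 0.

```python
import math


def likelihood_ratio_statistic(observed):
    """L_{x^2} = 2 * sum of y_ij * ln(y_ij / E_ij) over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, y in enumerate(row):
            if y > 0:  # zero counts contribute 0 by convention
                e = row_totals[i] * col_totals[j] / n
                total += y * math.log(y / e)
    return 2.0 * total


# Hypothetical counts, invented for illustration only.
lr = likelihood_ratio_statistic([[13, 7], [6, 14]])
print(round(lr, 3))

# A table whose rows are exactly proportional gives a statistic of 0.
lr_independent = likelihood_ratio_statistic([[10, 20], [20, 40]])
```

The result is compared with the \chi^2 critical value on (r-1)(c-1) degrees of freedom, just as for x^2.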
- Example: Following Example 3.6, we find L_{x^2} = 0.42.
- Again, \chi^2_{0.05, 1} = 3.841 > 0.42 = L_{x^2}. Do not reject H_0 ⇒ there does not appear to be an association between Education Levels and Department.