module 3 stat section B2
Overview of Chi Squared Test of Independence
Purpose: Analyze associations between two categorical variables.
Statistical Test: Chi Squared Test of Independence.
Compared to: Chi Squared Goodness of Fit Test (used for a single categorical variable).
Example Study Design
Research Question: Is it better to give up smoking by going cold turkey or joining a smoking support group?
Sample Size: 320 people total.
200 participants in smoking support group.
120 participants going cold turkey.
Design Type: Quasi-experimental (non-random allocation to groups).
Potential Biases
Concern: Age as a confounding factor.
Younger individuals may be more likely to achieve cessation. If more young people choose one method over the other, the results may be skewed in favour of that method.
Hypothesis Formulation
Null Hypothesis (H0): Smoking support groups do not increase the proportion of people giving up smoking.
Alternative Hypothesis (H1): Smoking support groups increase the proportion of people that give up smoking.
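Type: One-sided hypothesis (only interested in an increase). The chi squared statistic itself does not indicate direction, so the direction is read from the observed proportions.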
Results Summary
Example data shows higher quit rate in the support group compared to the cold turkey group.
Notable Observations:
More than 50% of support group participants quit smoking (64%).
Fewer than 50% of cold turkey participants quit smoking (47%).
A high number (48 of 120 participants) in the cold turkey group showed no change in smoking habits.
Calculation of Expected Frequencies
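Under the null hypothesis (no association between method and outcome), calculate the expected count for each cell: (row total × column total) / grand total.
Example Calculation:
Total who gave up smoking: 184 (out of 320).
Expected in smoking support group (n=200): 184 × 200 / 320 = 115.
Expected in cold turkey group (n=120): 184 × 120 / 320 = 69.
Repeat for the other categories: smoking less and no change.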
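The expected-count arithmetic above can be sketched in Python. This is a minimal illustration rather than part of the original notes: the helper name expected_count is hypothetical, and only the totals stated above (184 quitters, groups of 200 and 120, 320 participants overall) are used.

```python
def expected_count(row_total, col_total, grand_total):
    """Expected cell count under independence: row total x column total / grand total."""
    return row_total * col_total / grand_total


# Totals taken from the notes.
grand_total = 320
quit_total = 184

support_expected = expected_count(200, quit_total, grand_total)
cold_turkey_expected = expected_count(120, quit_total, grand_total)

print(support_expected, cold_turkey_expected)  # prints 115.0 69.0
```

The same helper applies to the "smoking less" and "no change" columns once their column totals are known.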
Test Statistic
Calculation method:
For each cell: (Observed - Expected)^2 / Expected
Sum all results for total chi squared value (χ²).
Example Result: Overall chi squared value = 34.3.
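The cell-by-cell calculation can be sketched as below. The notes do not give the full observed table, so the table in this sketch is partly an assumption: 128 and 56 follow from the 64% and 47% quit rates, 16 follows by subtraction from the 48 "no change" participants in the cold turkey group, and the 48 / 24 split for the support group is chosen only because it reproduces the reported value of about 34.3. The function name chi_squared is hypothetical.

```python
def chi_squared(observed):
    """Return the chi squared statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


# Columns: quit, smoking less, no change.
# ASSUMPTION: the support-group split into 48 "smoking less" and 24 "no change"
# is NOT stated in the notes; it is inferred so that the result matches 34.3.
observed = [
    [128, 48, 24],  # smoking support group (n=200)
    [56, 16, 48],   # cold turkey (n=120)
]

statistic = chi_squared(observed)
print(round(statistic, 1))  # prints 34.3
```

The largest single contribution comes from the cold turkey "no change" cell (48 observed against 27 expected), which matches the observation in the results summary.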
Interpretation of Test Statistic
A high chi squared value indicates a large discrepancy between observed and expected counts, suggesting that a relationship exists.
Check significance using the degrees of freedom (df).
df for a 2x3 table: (rows - 1)(columns - 1) = (2 - 1)(3 - 1) = 2.
Critical value for df=2 at the 5% significance level: 5.99; our chi squared of 34.3 is far above it, which implies a significant association.
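The df and threshold can be checked with a short sketch, using the conventional 5% significance level. It relies on a shortcut that holds for df = 2 only: the upper-tail probability of the chi squared distribution is exp(-x / 2), so the critical value is -2 ln(alpha). For other df a statistics library or table is needed. The function names are hypothetical.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df for a test of independence on an n_rows x n_cols table."""
    return (n_rows - 1) * (n_cols - 1)


def critical_value_df2(alpha):
    """Critical value for df = 2 ONLY, where P(X > x) = exp(-x / 2)."""
    return -2.0 * math.log(alpha)


df = degrees_of_freedom(2, 3)
critical = critical_value_df2(0.05)  # ASSUMPTION: 5% significance level

print(df, round(critical, 2))  # prints 2 5.99
```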
P-value Calculation
Determine the p-value for χ² = 34.3 with df=2.
Result: p-value < 0.001; highly significant.
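Conclusion: Reject the null hypothesis; the data suggest that smoking support groups are associated with higher cessation rates, although the non-random allocation means confounders such as age cannot be ruled out.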
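The p-value can be sketched without a statistics library, because for df = 2 the upper-tail probability of the chi squared distribution has the closed form exp(-x / 2). This shortcut does not apply to other df, and the function name p_value_df2 is hypothetical.

```python
import math


def p_value_df2(chi_squared_value):
    """Upper-tail p-value for a chi squared statistic with df = 2 ONLY."""
    return math.exp(-chi_squared_value / 2.0)


p_value = p_value_df2(34.3)

# The value is far below the 0.001 reported in the notes.
print(f"{p_value:.1e}", p_value < 0.001)
```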
Assumptions of the Chi Squared Test
Observational units (participants) are independent.
Expected counts in each cell should be ≥ 5.
If counts fall below this threshold, consider combining categories.
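A minimal sketch of the expected-count check follows. The helper cells_below_minimum is hypothetical, and the second table is made up purely to show what a flagged cell looks like; it is not study data.

```python
def cells_below_minimum(expected, minimum=5):
    """Return (row, column, value) for every expected count below the minimum."""
    return [
        (i, j, value)
        for i, row in enumerate(expected)
        for j, value in enumerate(row)
        if value < minimum
    ]


# Expected counts for the "quit" column from the notes: nothing is flagged.
print(cells_below_minimum([[115.0], [69.0]]))  # prints []

# HYPOTHETICAL table, invented only to illustrate a violation.
hypothetical_expected = [[12.0, 3.5], [20.0, 6.5]]
print(cells_below_minimum(hypothetical_expected))  # prints [(0, 1, 3.5)]
```

Any flagged cell is a candidate for merging with a neighbouring category before the test is rerun.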
Visual Representation
Percentages can effectively display results.
Smoking Support Group: 64% quit.
Cold Turkey: 47% quit.
Graphs (e.g. side-by-side bar chart) visualize differences in quit rates effectively.
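As a rough stand-in for a side-by-side bar chart, the two quit rates can be drawn as text bars. This is only a sketch: the helper text_bar and the bar width of 50 characters are arbitrary choices, and the percentages are the ones stated above.

```python
def text_bar(label, percent, width=50):
    """Render one horizontal bar; the full width represents 100%."""
    filled = percent * width // 100  # integer arithmetic avoids rounding surprises
    return f"{label:<22}{'#' * filled} {percent}%"


# Quit rates as stated in the notes.
quit_rates = {"Smoking support group": 64, "Cold turkey": 47}

for group, percent in quit_rates.items():
    print(text_bar(group, percent))
```

A plotting library would give a proper chart, but the text version already makes the gap between the groups visible.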
Future Topics
Continuing analysis of association measures between categorical variables.
Explore more techniques for summarizing differences in groups.