module 3 stat section B2

Overview of Chi Squared Test of Independence

  • Purpose: Analyze associations between two categorical variables.

  • Statistical Test: Chi Squared Test of Independence.

  • Compared to: Chi Squared Goodness of Fit Test (used for a single categorical variable).

Example Study Design

  • Research Question: Is it better to give up smoking by going cold turkey or joining a smoking support group?

  • Sample Size: 320 people total.

    • 200 participants in smoking support group.

    • 120 participants going cold turkey.

  • Design Type: Quasi-experimental (non-random allocation to groups).

Potential Biases

  • Concern: Age as a confounding factor.

    • Younger individuals may be more likely to achieve cessation. If one method attracts more young people than the other, the results may be skewed in favour of that method.

Hypothesis Formulation

  • Null Hypothesis (H0): Smoking support groups do not increase the proportion of people giving up smoking.

  • Alternative Hypothesis (H1): Smoking support groups increase the proportion of people that give up smoking.

    • Type: One-sided hypothesis (only interested in increase).

Results Summary

  • Example data shows higher quit rate in the support group compared to the cold turkey group.

  • Notable Observations:

    • More than 50% of support group participants quit smoking (64%).

    • Fewer than 50% of cold turkey participants quit smoking (47%).

    • A large number of cold turkey participants (48 of 120) showed no change in smoking habits.

Calculation of Expected Frequencies

  • Under the null hypothesis, the expected count for each cell is (row total × column total) / grand total.

  • Example Calculation:

    • Total who gave up smoking: 184 (out of 320 participants).

    • Expected in smoking support group (n=200): 184 × 200 / 320 = 115.

    • Expected in cold turkey group (n=120): 184 × 120 / 320 = 69.

  • Repeat for the other outcome categories: smoking less and no change.
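
The expected-count arithmetic above can be sketched in a few lines of Python. This is a minimal illustration using only the totals quoted in these notes; the function and variable names are made up for the example.

```python
# Expected count under independence: row total * column total / grand total.
def expected_count(row_total, column_total, grand_total):
    return row_total * column_total / grand_total

group_sizes = {"support group": 200, "cold turkey": 120}  # group sizes from the notes
total_quit = 184                                           # total who gave up smoking
grand_total = sum(group_sizes.values())                    # 320 participants

expected_quit = {
    group: expected_count(n, total_quit, grand_total)
    for group, n in group_sizes.items()
}
print(expected_quit)  # {'support group': 115.0, 'cold turkey': 69.0}
```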

Test Statistic

  • Calculation method:

    • For each cell: (Observed - Expected)^2 / Expected

    • Sum the results over all cells to obtain the chi squared statistic (χ²).

  • Example Result: Overall chi squared value = 34.3.
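
As a rough sketch, the cell-by-cell calculation can be written as a short Python function. The notes do not give the full table, so the table below is partly an assumption: 128 and 56 follow from the stated quit percentages and the total of 184, 48 "no change" in the cold turkey group is stated, and 16 follows by subtraction. The support group's split between "smoking less" (48) and "no change" (24) is NOT stated in the notes. It is an illustrative reconstruction chosen to sum to 200 and to be consistent with the reported value of 34.3.

```python
def chi_squared(observed):
    """Sum of (Observed - Expected)^2 / Expected over every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of independence.
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    return statistic

# Columns: gave up, smoking less, no change.
observed = [
    [128, 48, 24],  # support group (n=200); 48 and 24 are ASSUMED, not from the notes
    [56, 16, 48],   # cold turkey (n=120); 16 is derived as 120 - 56 - 48
]
print(round(chi_squared(observed), 1))  # 34.3
```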

Interpretation of Test Statistic

  • A large chi squared value indicates a large discrepancy between the observed counts and the counts expected under the null hypothesis, which is evidence that the two variables are associated.

  • Check significance using the degrees of freedom (df).

    • df for a 2x3 table: (rows-1)(columns-1) = (2-1)(3-1) = 2.

  • Critical value for df=2 at the 5% significance level: 5.99. Our chi squared value of 34.3 exceeds it, which implies a significant association.
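
The threshold comparison can be checked with a small Python sketch. It relies on a shortcut that holds only for df = 2, where the upper-tail probability of the chi squared distribution is exp(-x/2), so the critical value is -2 ln(α). For other degrees of freedom a table or a statistics library would be needed. The function names are illustrative.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    return (n_rows - 1) * (n_cols - 1)

def critical_value_df2(alpha):
    # Valid ONLY for df = 2, where P(X > x) = exp(-x / 2).
    return -2.0 * math.log(alpha)

df = degrees_of_freedom(2, 3)        # 2x3 table from the notes
threshold = critical_value_df2(0.05)  # 5% significance level
print(df, round(threshold, 2))        # 2 5.99
print(34.3 > threshold)               # True
```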

P-value Calculation

  • Determine the p-value for χ² = 34.3 with df=2.

  • Result: p-value < 0.001; highly significant.

  • Conclusion: Reject the null hypothesis; the data suggest that smoking support groups improve cessation rates.
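
For df = 2 the p-value has a closed form, exp(-χ²/2), so the result above can be checked without statistical tables. The sketch below assumes that shortcut and does not apply to other degrees of freedom; the function name is illustrative.

```python
import math

def p_value_df2(chi2):
    # Upper-tail probability of a chi squared distribution with df = 2 only.
    return math.exp(-chi2 / 2.0)

p = p_value_df2(34.3)
print(f"{p:.1e}")   # roughly 3.6e-08
print(p < 0.001)    # True
```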

Assumptions of the Chi Squared Test

  1. Observational units (participants) are independent.

  2. Expected counts in each cell should be ≥ 5.

    • If counts fall below this threshold, consider combining categories.
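
One way to check the second assumption is to compute every expected count and flag the cells that fall below 5. The sketch below uses a small made-up table, not the study data, purely to show a case where the check fails; the function name and threshold parameter are illustrative.

```python
def small_expected_cells(observed, minimum=5):
    """Return (row, column, expected) for each cell whose expected count is below minimum."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    flagged = []
    for i in range(len(row_totals)):
        for j in range(len(col_totals)):
            exp = row_totals[i] * col_totals[j] / grand_total
            if exp < minimum:
                flagged.append((i, j, exp))
    return flagged

# HYPOTHETICAL 2x2 table with a sparse second column (not from the notes).
toy_table = [
    [12, 3],
    [9, 1],
]
for row, col, exp in small_expected_cells(toy_table):
    print(f"cell ({row}, {col}) has expected count {exp:.1f}")
```

Both cells in the second column are flagged here, which is the situation where combining categories would be considered.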

Visual Representation

  • Percentages can effectively display results.

    • Smoking Support Group: 64% quit.

    • Cold Turkey: 47% quit.

  • Graphs (e.g. side-by-side bar chart) visualize differences in quit rates effectively.
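
The percentages above can be reproduced, and roughly visualized as a text bar chart, with a short Python sketch. The quit counts of 128 and 56 are not stated directly in the notes; they are implied by the percentages, the group sizes, and the total of 184 who gave up. A plotting library would normally be used for a real side-by-side bar chart.

```python
def quit_percentage(n_quit, n_total):
    return 100.0 * n_quit / n_total

# (number who quit, group size); quit counts are implied by the notes, not quoted.
groups = {
    "Smoking support group": (128, 200),
    "Cold turkey": (56, 120),
}

for name, (n_quit, n_total) in groups.items():
    pct = quit_percentage(n_quit, n_total)
    bar = "#" * round(pct / 2)  # one '#' per two percentage points
    print(f"{name:<22} {bar} {pct:.0f}%")
```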

Future Topics

  • Continuing analysis of association measures between categorical variables.

  • Explore more techniques for summarizing differences in groups.