
Chi Square Hypothesis Testing

Chapter 10: Hypothesis Testing IV (Chi Square)

Chapter Outline

  • Introduction
  • Bivariate Table
  • The Logic of Chi Square (\chi^2) test
  • Chi Square Test for Independence
    • The Five-Step Model
    • Computation of Chi Square (\chi^2)

Bivariate Table

  • Columns represent values of the independent variable.
  • Rows represent values of the dependent variable.
  • Cells represent the intersections of columns and rows.
  • Each cell reports the number of times each combination of values occurred.

Bivariate Table Structure

                  Column 1             Column 2
Row 1             cell a               cell b               Row Marginal 1
Row 2             cell c               cell d               Row Marginal 2
                  Column Marginal 1    Column Marginal 2    N (Total)

Basic Logic of \chi^2 Test

  • Chi-Square is a test of significance based on a bivariate table.
    • Most often, both variables are categorical (i.e., nominal or ordinal).
  • We look for significant differences between the observed frequencies (f_o) and the frequencies we would expect (f_e) if the two variables were independent.

Example of \chi^2 Test

  • Is there any statistically significant relationship between gender and party identification?
  • Data is based on the 1991 General Social Survey.

Example Data: Party Identification by Gender

Party Identification     Gender
                         Females     Males     Total
Democrat                 279         165       444
Independent              73          47        120
Republican               225         191       416
Total                    577         403       N = 980
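The marginals in the table can be recomputed from the six cell counts alone. The following is a minimal sketch; the variable names (`observed`, `row_marginals`, `column_marginals`) are illustrative and not part of the original slides.

```python
# Observed counts from the table above: rows are party identification,
# columns are gender in the order (Females, Males).
observed = {
    "Democrat": [279, 165],
    "Independent": [73, 47],
    "Republican": [225, 191],
}

# Row marginals: total for each party across both genders.
row_marginals = {party: sum(counts) for party, counts in observed.items()}

# Column marginals: total for each gender across all parties.
column_marginals = [sum(col) for col in zip(*observed.values())]

# N is the grand total; both sets of marginals must add up to it.
n = sum(column_marginals)

print(row_marginals)     # {'Democrat': 444, 'Independent': 120, 'Republican': 416}
print(column_marginals)  # [577, 403]
print(n)                 # 980
```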

Step 1: Assumptions and Test Requirements

  • An independent random sample.
  • Both variables are categorical (as is typical for the chi-square test).
  • No assumption is made about the shape of the population distribution (the chi-square test is a non-parametric test).

Step 2: State the Null and Research Hypotheses (H_0 and H_1)

  • H_0: Party identification and gender are independent. (Gender has no effect on party identification).
  • H_1: Party identification and gender are dependent. (Gender has some effect on party identification).

Step 3: Select Sampling Distribution and Establish the Critical Region

  • Sampling Distribution = \chi^2
  • Use the table in Appendix C (“Distribution of Chi Square”) to find \chi^2 (critical):
    • df = (r-1)(c-1) = (3-1)(2-1) = 2
    • If we set \alpha = 0.05, \chi^2 (critical) = 5.991
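As a cross-check on the table lookup, the degrees of freedom and the critical value can be computed directly. This sketch relies on a shortcut that holds only for df = 2, where the upper-tail area of the chi-square distribution is exp(-x/2), so the critical value is -2 ln(alpha). For other df values a statistical table or library is needed. The function names are illustrative.

```python
import math


def degrees_of_freedom(rows, columns):
    """df = (r - 1)(c - 1) for a table with r rows and c columns."""
    return (rows - 1) * (columns - 1)


def chi2_critical_df2(alpha):
    """Critical value for df = 2 ONLY (assumption: upper tail = exp(-x / 2))."""
    return -2.0 * math.log(alpha)


df = degrees_of_freedom(rows=3, columns=2)
critical = chi2_critical_df2(alpha=0.05)

print(df)                  # 2
print(round(critical, 3))  # 5.991
```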

Step 4: Compute the Test Statistic (\chi^2)

\chi^2 (obtained) = \sum \frac{(f_o - f_e)^2}{f_e}

  • Where:

    • f_o represents the observed frequencies.
    • f_e represents the expected frequencies.
    • The expected frequencies (f_e) are computed as follows:

    f_e = \frac{(\text{Row Marginal})(\text{Column Marginal})}{N}

Step 4: Compute the Test Statistic (\chi^2) - Expected Frequencies

  • Compute expected frequencies (f_e):
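    • Democrat, Females: (444*577)/980 = 261.4
    • Democrat, Males: (444*403)/980 = 182.6
    • Independent, Females: (120*577)/980 = 70.7
    • Independent, Males: (120*403)/980 = 49.3
    • Republican, Females: (416*577)/980 = 244.9
    • Republican, Males: (416*403)/980 = 171.1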
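The six computations above all apply the same rule, (row marginal)(column marginal)/N, so they can be sketched as one small function. The name `expected_frequencies` and the list-of-rows layout are assumptions made for illustration.

```python
def expected_frequencies(observed):
    """Return f_e for every cell: (row marginal * column marginal) / N."""
    row_marginals = [sum(row) for row in observed]
    column_marginals = [sum(col) for col in zip(*observed)]
    n = sum(row_marginals)
    return [[r * c / n for c in column_marginals] for r in row_marginals]


# Rows: Democrat, Independent, Republican. Columns: Females, Males.
observed = [[279, 165], [73, 47], [225, 191]]
expected = expected_frequencies(observed)

for row in expected:
    print([round(fe, 1) for fe in row])
# [261.4, 182.6]
# [70.7, 49.3]
# [244.9, 171.1]
```

Note that the expected frequencies in each row and column add up to the same marginals as the observed frequencies.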

Step 4: Compute the Test Statistic (\chi^2): Party Identification by Gender (with Expected Frequencies)

Party Identification     Gender
                         Females         Males           Total
Democrat                 279 (261.4)     165 (182.6)     444
Independent              73 (70.7)       47 (49.3)       120
Republican               225 (244.9)     191 (171.1)     416
Total                    577             403             N = 980
(Expected frequencies are in parentheses.)

Step 4: Compute the Test Statistic (\chi^2) - Calculation Table

f_o     f_e       f_o - f_e     (f_o - f_e)^2     (f_o - f_e)^2 / f_e
279     261.4      17.6         309.8             1.19
165     182.6     -17.6         309.8             1.70
73      70.7        2.3         5.29              0.07
47      49.3       -2.3         5.29              0.11
225     244.9     -19.9         396.0             1.62
191     171.1      19.9         396.0             2.31
980     980         0                             \chi^2 (obtained) = 7.0
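The calculation table can be reproduced in a few lines. This is a sketch that carries the expected frequencies at full precision instead of rounding them to one decimal, so the unrounded result (about 7.01) differs slightly from the hand calculation before rounding to 7.0. The function name `chi_square` is illustrative.

```python
def chi_square(observed):
    """Sum of (f_o - f_e)^2 / f_e over every cell of the table."""
    row_marginals = [sum(row) for row in observed]
    column_marginals = [sum(col) for col in zip(*observed)]
    n = sum(row_marginals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_marginals[i] * column_marginals[j] / n
            total += (fo - fe) ** 2 / fe
    return total


# Rows: Democrat, Independent, Republican. Columns: Females, Males.
observed = [[279, 165], [73, 47], [225, 191]]
chi2_obtained = chi_square(observed)

print(round(chi2_obtained, 1))  # 7.0
```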

Step 5: Interpret Results and Make a Decision

  • \chi^2 (critical) = 5.991
  • \chi^2 (obtained) = 7.0
  • The test statistic falls in the critical region (7.0 > 5.991). Therefore, we reject H_0.
  • There is a significant relationship between gender and party identification (or gender and party identification are dependent).
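The decision rule is a single comparison. The sketch below also reports a p-value, using a shortcut that is valid only for df = 2 (upper-tail area = exp(-x/2)). The p-value is an addition for illustration and does not appear in the slides.

```python
import math

chi2_obtained = 7.0    # from Step 4
chi2_critical = 5.991  # from Step 3 (df = 2, alpha = 0.05)
alpha = 0.05

# Reject H_0 when the obtained value falls in the critical region.
reject_null = chi2_obtained > chi2_critical

# Assumption: this closed form holds for df = 2 only.
p_value = math.exp(-chi2_obtained / 2)

print(reject_null)        # True
print(round(p_value, 3))  # 0.03
```

Both views agree: the obtained value exceeds the critical value, and the p-value is below alpha.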

Interpreting Chi-Square Test

  • The chi-square test tells us only whether the variables are independent.
  • It does not tell us the pattern or the nature of the relationship.
  • To investigate the pattern, use observed frequencies to compute percentages within each column and compare across the columns.

Interpreting Chi-Square Test: Party Identification by Gender (Column Percentages)

Party Identification     Gender
                         Females         Males           Total
Democrat                 279 (48.4%)     165 (40.9%)     444
Independent              73 (12.7%)      47 (11.7%)      120
Republican               225 (39.0%)     191 (47.4%)     416
Total                    577             403             N = 980
(Computed from observed frequencies; column percentages are in parentheses.)
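Column percentages divide each observed count by its column total, which makes the two genders comparable despite their different sample sizes. A minimal sketch, with illustrative variable names:

```python
# Observed counts: rows are party identification,
# columns are gender in the order (Females, Males).
observed = {
    "Democrat": [279, 165],
    "Independent": [73, 47],
    "Republican": [225, 191],
}

# Column totals: 577 females and 403 males.
column_totals = [sum(col) for col in zip(*observed.values())]

# Percentage of each gender that falls in each party category.
column_percentages = {
    party: [round(100 * count / total, 1)
            for count, total in zip(counts, column_totals)]
    for party, counts in observed.items()
}

for party, percentages in column_percentages.items():
    print(party, percentages)
# Democrat [48.4, 40.9]
# Independent [12.7, 11.7]
# Republican [39.0, 47.4]
```

Reading across the columns, females in this sample are more likely than males to identify as Democrats, and males are more likely than females to identify as Republicans.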

Other Limitations of Chi-Square Test

  • Like other tests of significance, chi-square is sensitive to sample size.
    • For a fixed pattern of cell percentages, the obtained chi-square increases in proportion to N.
    • With large samples, even trivial relationships may be statistically significant.
  • Remember: significance is not the same thing as importance.
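The sensitivity to sample size can be seen by doubling every cell of the party identification table. The percentages, and therefore the strength of the relationship, stay exactly the same, yet the obtained chi-square doubles. The doubled table is a hypothetical constructed for illustration, not real survey data.

```python
def chi_square(observed):
    """Sum of (f_o - f_e)^2 / f_e over every cell of the table."""
    row_marginals = [sum(row) for row in observed]
    column_marginals = [sum(col) for col in zip(*observed)]
    n = sum(row_marginals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_marginals[i] * column_marginals[j] / n
            total += (fo - fe) ** 2 / fe
    return total


# Rows: Democrat, Independent, Republican. Columns: Females, Males.
observed = [[279, 165], [73, 47], [225, 191]]

# Hypothetical sample twice as large with identical cell percentages.
doubled = [[2 * fo for fo in row] for row in observed]

chi2_original = chi_square(observed)
chi2_doubled = chi_square(doubled)

print(round(chi2_original, 1))  # 7.0
print(round(chi2_doubled, 1))   # 14.0
```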

Lab Exercise & Homework

  • Lab Exercise 7 (ungraded)
    • 10.11 (p. 285)
    • Conduct the chi-square test and also compute the column percentages
  • HW8 (graded)
    • 10.12 (p. 286)
      • Conduct the chi-square test and also compute the column percentages
      • Formula 10.4 in “The Limitations of the Chi Square Test” (p. 280) is not covered.