Hypothesis Testing and Chi-Square Tests

Chapter 1: Introduction

  • When comparing a categorical variable across groups, we test whether its proportions are the same in two or more groups.
  • The null hypothesis (H_0) states that the proportions are the same across all groups.
  • The alternative hypothesis (H_a) states that at least one of the proportions is different.
  • Rejecting H_0 means concluding that at least one of the proportions differs from the others.
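
To make the null hypothesis concrete, the following is a minimal Python sketch of what "the same proportions across all groups" implies for the counts in a two-way table. The counts in `observed` and the function name `expected_counts` are illustrative assumptions, not data from these notes; each row is a group and each column is a category.

```python
def expected_counts(observed):
    """Expected counts under H_0: every group shares the pooled category proportions."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count for a cell = row total * column total / grand total.
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts: two groups (rows), three categories (columns).
observed = [[30, 50, 20],
            [20, 40, 40]]
expected = expected_counts(observed)
print(expected)
```

Because both hypothetical groups have the same size, H_0 gives them identical expected rows; the further the observed counts sit from these values, the stronger the evidence against H_0.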

Chapter 2: Significant Proof

  • If you "fail to reject" the null hypothesis, it means you don't have enough statistical evidence to say that any of the proportions are different.
  • Test of Independence:
    • The null hypothesis (H_0) is that two variables are independent.
    • The alternative hypothesis (H_a) is that the two variables are not independent (they are associated).
    • When conducting a Chi-square test, you calculate a p-value.
    • Example: If the p-value is 0.1 and the significance level is 0.05, the p-value is larger than the significance level,
      so you fail to reject the null hypothesis (H_0).
    • Failing to reject H_0 means there is not enough statistical evidence to conclude that the variables are associated; it does not prove that they are independent.
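
The decision rule in the example above can be sketched in a few lines of Python. The function name `decide` and the default significance level of 0.05 are assumptions made for illustration; the rule used here rejects H_0 when the p-value is at or below the significance level.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level and return the decision."""
    if p_value <= alpha:
        return "reject H_0"
    return "fail to reject H_0"


# The example from the notes: p-value 0.1, significance level 0.05.
print(decide(0.1, alpha=0.05))  # fail to reject H_0
```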

Chapter 3: Degrees Of Freedom

  • To calculate the Chi-square statistic (\chi^2), you need the observed counts and the counts expected under H_0.
  • Degrees of freedom depend on the specific test being conducted.
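
The standard Chi-square statistic sums (observed - expected)^2 / expected over all categories. A minimal Python sketch follows; the function name `chi_square_statistic` and the counts are hypothetical, chosen only to show the arithmetic for a five-category variable with equal expected counts.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for a five-category variable, 100 observations in total.
observed = [18, 22, 20, 25, 15]
expected = [20, 20, 20, 20, 20]  # equal proportions assumed under H_0
stat = chi_square_statistic(observed, expected)
print(round(stat, 3))  # 2.9
```

The statistic alone does not give a decision; it is compared against the Chi-square distribution with the appropriate degrees of freedom to obtain a p-value.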

Chapter 4: Conclusion

  • For the Goodness of Fit test:
    • Degrees of freedom = (number of categories - 1).
    • Example: If a variable (e.g., color) has five categories, the degrees of freedom is 4.
  • For Homogeneity and Test of Independence (two-way table):
    • Degrees of freedom = (number of categories in variable 1 - 1) * (number of categories in variable 2 - 1).
    • Example: If one variable has two categories (yes/no) and another variable has three categories, the degrees of freedom = (2-1) * (3-1) = 1 * 2 = 2.
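  • For multiple groups and one variable (Homogeneity):
    • Degrees of freedom = (number of groups - 1) * (number of categories - 1).
    • With exactly two groups this reduces to (number of categories - 1); with three groups it is 2 * (number of categories - 1).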
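
The degrees-of-freedom rules for the Goodness of Fit test and for two-way tables can be checked with a short Python sketch. The function names `goodness_of_fit_df` and `two_way_df` are assumptions made for illustration; the inputs reproduce the two worked examples from the notes.

```python
def goodness_of_fit_df(num_categories):
    """Degrees of freedom for a Goodness of Fit test."""
    return num_categories - 1


def two_way_df(num_rows, num_cols):
    """Degrees of freedom for a two-way table (Homogeneity or Independence)."""
    return (num_rows - 1) * (num_cols - 1)


print(goodness_of_fit_df(5))  # 4, the five-category color example
print(two_way_df(2, 3))       # 2, the yes/no by three-category example
```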