Examining relationships between two categorical variables:
Do Canadian men and women differ in educational attainment?
Are gender and educational attainment independent?
Do immigrants differ from Canadian-born adults in social media use?
Are immigrant status and social media use independent?
Do Gen Z and older Canadians differ in the time they spend following the news?
Are age/generation and time spent consuming the news independent?
Week 10 Key Points
Conduct chi-squared hypothesis tests for pairs of categorical variables:
Perform tests by hand and using Stata.
Interpret results meaningfully.
Review descriptive statistics (crosstabs) for pairs of categorical variables, essential for Assignment 2.
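A crosstab with row percentages can be sketched in a few lines of Python. This is only an illustration, not the course's Stata workflow: the records below are made up, and the helper names `crosstab` and `row_percentages` are hypothetical.

```python
from collections import Counter

# Hypothetical respondent-level records, made up for illustration only:
# each tuple is (immigrant status, uses social media daily).
records = (
    [("Immigrant", "Yes")] * 30
    + [("Immigrant", "No")] * 20
    + [("Canadian-born", "Yes")] * 45
    + [("Canadian-born", "No")] * 55
)

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in counts})
    cols = sorted({c for _, c in counts})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

rows, cols, table = crosstab(records)
percents = row_percentages(table)

# Print one line per row category: count and row percentage for each column.
for r, count_row, pct_row in zip(rows, table, percents):
    cells = ", ".join(
        f"{c}: {n} ({p:.1f}%)" for c, n, p in zip(cols, count_row, pct_row)
    )
    print(f"{r}: {cells}")
```

Row percentages are used here because they let you compare the distribution of the column variable across the groups defined by the row variable, which is the comparison the chi-squared test formalizes.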
Steps for All Hypothesis Tests
Check if assumptions for the test are met.
Choose the significance level (𝛼), typically set at 0.05.
State the hypotheses:
Null hypothesis (H0): distributions of the variables across populations are the same (variables are independent).
Alternative hypothesis (Ha): distributions are not the same (variables are not independent).
Compute the chi-squared statistic using the formula:
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
where O is the observed frequency and E is the expected frequency in a cell, and the sum runs over all cells of the table.
Find the associated p-value and compare it to 𝛼:
If p < 𝛼, reject H0.
If p ≥ 𝛼, do not reject H0.
Interpret results in plain English.
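The steps above can be sketched by hand in Python for a 2x2 table. This is a minimal illustration under stated assumptions: the observed counts are made up, the function names are hypothetical, each expected count is taken as (row total × column total) / n, and the p-value shortcut is valid only for 1 degree of freedom (a 2x2 table).

```python
import math

ALPHA = 0.05  # significance level chosen before looking at the data

# Hypothetical observed counts, made up for illustration only:
# rows = gender (men, women), columns = university degree (no, yes).
observed = [[60, 40],
            [45, 55]]

def expected_counts(table):
    """Expected count per cell under H0: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_squared(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

def p_value_df1(stat):
    """Upper-tail probability of a chi-squared variable with 1 d.f. ONLY.

    With 1 d.f. the statistic is a squared standard normal, so
    P(X > stat) = erfc(sqrt(stat / 2)). Larger tables need a
    chi-squared table or statistical software instead.
    """
    return math.erfc(math.sqrt(stat / 2))

stat = chi_squared(observed)
p = p_value_df1(stat)
decision = "reject H0" if p < ALPHA else "do not reject H0"

print(f"chi-squared = {stat:.3f}, p = {p:.4f}, decision: {decision}")
```

The final step, interpreting the result in plain English, is not something code can do: if H0 is rejected for these hypothetical counts, the conclusion would be that gender and degree attainment appear to be related in the population.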
Assumptions for Chi-Squared Test
The following must be satisfied:
Simple Random Sample (SRS).
Expected count in each cell must be at least 5 (E ≥ 5).
Under these conditions, the sampling distribution of the test statistic under the null hypothesis is approximately chi-squared with degrees of freedom (d.f.) = (r-1)(c-1), where r is the number of rows and c is the number of columns in the crosstab.
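The expected-count condition and the degrees of freedom can be checked with a short Python sketch. The two tables are made up for illustration and the function names are hypothetical. The simple random sample assumption depends on how the data were collected and cannot be verified from the table itself.

```python
def expected_counts(table):
    """Expected count per cell under H0: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def degrees_of_freedom(table):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(table) - 1) * (len(table[0]) - 1)

def expected_counts_ok(table, minimum=5):
    """True if every expected cell count is at least `minimum`."""
    return all(e >= minimum for row in expected_counts(table) for e in row)

# Hypothetical 2x3 tables, made up for illustration only.
well_filled = [[12, 30, 18],
               [3, 25, 22]]
sparse = [[2, 30, 18],
          [3, 25, 22]]

for name, table in [("well_filled", well_filled), ("sparse", sparse)]:
    print(
        f"{name}: d.f. = {degrees_of_freedom(table)}, "
        f"all expected counts >= 5: {expected_counts_ok(table)}"
    )
```

Note that the check applies to expected counts, not observed ones: the first table contains an observed count of 3 but still satisfies the condition, while the second fails because its first column has too few cases overall.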
Understanding Hypotheses
H0: The distributions across the populations being compared are the same (the variables are independent).
Ha: The distributions are not the same (the variables are not independent).