9. Inference about 2 Population Means
Types of Questions Addressed
- Comparative Analysis:
- Incomes: Are incomes in Ontario comparable to those in Quebec?
- Loneliness: Who is lonelier, Gen Z or older Canadians?
- Political Orientation: Does political orientation differ by gender in Canada?
- Statistical Testing: 2-sample tests for differences between populations.
- Core question: Are two population means equal or not?
- Manual calculation: Only the t-statistic is calculated by hand; all other calculations are done in Stata.
- Set up tests and interpret results meaningfully.
Steps in Hypothesis Testing
- Assumption Check: Ensure the assumptions for the test are met.
- Choose Significance Level (α): Commonly set to 0.05.
- State Hypotheses:
- Null Hypothesis (H0) vs. Alternative Hypothesis (H1).
- Compute Test Statistic: Calculate the t-statistic using the formula:
[ t = \frac{\bar{x}_1 - \bar{x}_2}{s.e.} ]
- Find the p-value: Determine the p-value associated with the test statistic.
- Formal Conclusion: Compare the p-value with α:
- If ( p < α ): Reject the null hypothesis.
- If ( p \geq α ): Do not reject the null hypothesis.
- Interpret Results: Provide a plain English interpretation of the conclusion.
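The steps above can be sketched in Python as a cross-check on Stata output. This is a minimal illustration, not the course's method: the function name `two_sample_test` and the income figures are hypothetical, the standard error is the unpooled one, and the p-value uses the large-sample normal approximation (for samples this small, Stata's t-distribution p-value is the one to report).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def two_sample_test(group1, group2, alpha=0.05):
    """Two-sided test of H0: mu1 = mu2 with unpooled variances.

    Assumption: the p-value comes from the standard normal distribution,
    which is only a large-sample approximation to the t-distribution.
    """
    n1, n2 = len(group1), len(group2)
    diff = mean(group1) - mean(group2)
    # Unpooled standard error of the difference in sample means
    se = sqrt(stdev(group1) ** 2 / n1 + stdev(group2) ** 2 / n2)
    t = diff / se
    # Two-sided p-value: area in both tails beyond |t|
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p, p < alpha


# Hypothetical incomes (in $1,000s), invented for illustration only
ontario = [52, 48, 55, 60, 47, 53, 58, 50]
quebec = [45, 49, 44, 51, 43, 47, 50, 46]

t, p, reject = two_sample_test(ontario, quebec)
print(f"t = {t:.3f}, p = {p:.4f}, reject H0: {reject}")
```

With these made-up numbers the test rejects H0 at α = 0.05, so the plain English conclusion would be that mean incomes differ between the two groups.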
2-Sample T-Test Assumptions
- Samples: Both samples must be Simple Random Samples (SRS) and sampled independently.
- Not suitable for matched pairs (e.g., husband-wife).
- Variable Type: The compared variable must be continuous, and the grouping variable should be dichotomous (categorical).
2-Sample T-Test
Hypotheses
- Null Hypothesis (H0): ( μ1 = μ2 ) or ( μ1 - μ2 = 0 ) (the two population means are equal).
- Alternative Hypothesis (H1): ( μ1 ≠ μ2 ) (the means are not equal).
Test Statistic Calculation
- Formula: [ t = \frac{\bar{x}_1 - \bar{x}_2}{s.e.} ]
- Distribution: The test statistic follows the t-distribution. In large samples, the critical t ≈ ±1.96 for α = 0.05.
- Decision Rule: If |t| > 1.96, then p-value < α (at α = 0.05) → reject the null hypothesis.
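As a worked illustration of the formula, suppose (hypothetical numbers) group 1 has n1 = 100, mean 55, and standard deviation 12, and group 2 has n2 = 100, mean 50, and standard deviation 16. The notes do not spell out the standard error; the sketch below assumes the usual unpooled form, consistent with the instruction not to pool variances.

```latex
\[
s.e. = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
     = \sqrt{\frac{12^2}{100} + \frac{16^2}{100}}
     = \sqrt{1.44 + 2.56} = 2
\]
\[
t = \frac{\bar{x}_1 - \bar{x}_2}{s.e.} = \frac{55 - 50}{2} = 2.5
\]
```

Here |t| = 2.5 exceeds the large-sample 95% cutpoint of 1.96, so H0 would be rejected at α = 0.05.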
Conclusion and Interpretation
- If ( p < α ): Reject H0.
- Interpretation: The two populations do differ significantly.
- If ( p \geq α ): Do not reject H0.
- Interpretation: There is no significant difference between the groups.
P-Value Calculation and Degrees of Freedom
- Degrees of Freedom: Calculated using Satterthwaite's formula (complex, often requires software).
- In practice, run tests in Stata, because the correct cutpoint depends on sample size:
- Use normal cutpoints for large samples:
- ±1.64 for 90% CI (α = 0.10)
- ±1.96 for 95% CI (α = 0.05)
- ±2.58 for 99% CI (α = 0.01)
- Do not pool variances; use the 'unequal' option in Stata for valid results.
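For readers curious what the software is doing, here is a small Python sketch of Satterthwaite's degrees of freedom and of where the normal cutpoints come from. The function names and the summary statistics (n = 100 per group, standard deviations 12 and 16) are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def satterthwaite_df(n1, s1, n2, s2):
    """Approximate degrees of freedom for the unequal-variance t-test."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # each group's variance of the mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def normal_cutpoint(alpha):
    """Two-sided standard normal critical value for significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Hypothetical summary statistics
df = satterthwaite_df(100, 12, 100, 16)

# Cutpoints for the three common significance levels
cutpoints = {alpha: round(normal_cutpoint(alpha), 3) for alpha in (0.10, 0.05, 0.01)}

print(f"Satterthwaite df = {df:.1f}")
print(cutpoints)
```

The degrees of freedom are generally not a whole number and fall below n1 + n2 - 2 when the sample variances differ, which is one reason the calculation is left to software.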
Interpretation of Results
- Rejecting the Null: Indicates the groups differ significantly; a difference this large is unlikely to be due to chance alone.
- Failing to Reject the Null: No significant difference observed; a difference of this size is expected from sampling variability under H0.
Relationship Between Confidence Level and α
- The confidence interval (CI) level is the proportion of intervals, across repeated samples, that contain the population parameter.
- α (alpha) is the proportion of such intervals that do not contain the parameter: ( α = 1 - (\text{CI level}) ).
- These values are complementary.
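The link between the confidence level and α can be made concrete with a short Python sketch that builds a large-sample confidence interval for the difference in means. The function name `diff_ci` and the summary statistics are hypothetical, and the normal cutpoint is used as a large-sample assumption in place of the t cutpoint.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(n1, mean1, sd1, n2, mean2, sd2, level=0.95):
    """Large-sample CI for mu1 - mu2 using unpooled variances."""
    alpha = 1 - level  # confidence level and alpha are complementary
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided normal cutpoint
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return diff - z * se, diff + z * se


# Hypothetical summary statistics: n, mean, sd for each group
low, high = diff_ci(100, 55, 12, 100, 50, 16)
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")
```

With these numbers the 95% interval does not contain 0, which corresponds to rejecting H0 at α = 0.05 in the two-sided test.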
Sampling Distributions and Hypothesis Tests
- For hypothesis tests, under the null, assume the samples come from populations with the same mean (H0 states μ1 = μ2).
- The test assesses whether the observed mean differences are unusual compared to this assumption.
- Matched pairs refer to observations that are naturally paired (e.g., pre-test and post-test).
- For matched pairs, calculate the difference within each pair and apply a one-sample t-test to these differences:
- Null Hypothesis: μ = 0 (the mean difference is zero).
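The matched-pairs approach can be sketched in Python as follows. The pre-test and post-test scores are invented for illustration, and the function name `paired_t` is hypothetical. The sketch stops at the t-statistic and degrees of freedom; the p-value would come from Stata or a t table.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """One-sample t-statistic on the within-pair differences (H0: mean difference = 0)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1  # t-statistic and degrees of freedom


# Hypothetical scores for the same five people, measured twice
pre = [10, 12, 9, 14, 11]
post = [12, 15, 10, 15, 13]

t, df = paired_t(pre, post)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

Because each difference comes from one pair, the pairing is used directly and the two columns are never treated as independent samples.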
Stata Commands for T-tests
- To perform a 2-sample t-test:
ttest variable, by(group_var) unequal
- Where "variable" is continuous and "group_var" identifies group categories (e.g., gender).
- You can also calculate using summary statistics:
ttesti n1 x̄1 s1 n2 x̄2 s2, unequal
- Where n, x̄, and s are each group's sample size, mean, and standard deviation.
- Both commands will automatically display descriptive statistics and a confidence interval for the difference.