9. Inference about 2 Population Means

Types of Questions Addressed

Comparative Analysis:
- Incomes: Are incomes in Ontario comparable to those in Quebec?
- Loneliness: Who is lonelier, Gen Z or older Canadians?
- Political Orientation: Does political orientation differ by gender in Canada?
Statistical Testing: 2-sample tests for differences between populations.
- Core question: Are two population means equal or not?
- Manual calculation: Only the t-statistic is calculated by hand; all other calculations are done in Stata.
- Set up tests and interpret results meaningfully.

Assumption Check: Ensure the assumptions for the test are met.
Choose Significance Level (𝛼): Commonly set to 0.05.
State Hypotheses:
- Null Hypothesis (H0) vs. Alternative Hypothesis (H1).
Compute Test Statistic: Calculate the t-statistic using the formula:
[ t = \frac{\bar{x}_1 - \bar{x}_2}{\text{s.e.}} ]
Find the p-value: Determine the p-value associated with the test statistic.
Formal Conclusion: Compare p-value with α:
- If ( p < 𝛼 ): Reject the null hypothesis.
- If ( p \geq 𝛼 ): Do not reject the null hypothesis.
Interpret Results: Provide a plain English interpretation of the conclusion.
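A minimal sketch of the steps above in Python, assuming large samples so that the standard normal can stand in for the t-distribution when finding the p-value. The function name and the input numbers are made up for illustration; in the course, everything except the t-statistic is done in Stata.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_test(mean1, sd1, n1, mean2, sd2, n2, alpha=0.05):
    """Large-sample, two-sided test of H0: mu1 = mu2 from summary statistics."""
    # Standard error of the difference in means (variances are not pooled)
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Test statistic: difference in sample means divided by its standard error
    t = (mean1 - mean2) / se
    # Two-sided p-value from the standard normal (large-sample approximation)
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    # Formal conclusion: reject H0 when p < alpha
    reject = p < alpha
    return t, p, reject


# Made-up illustrative numbers, not real survey results
t, p, reject = two_sample_test(52.0, 10.0, 400, 50.0, 12.0, 400)
print(f"t = {t:.2f}, p = {p:.4f}, reject H0: {reject}")
```

The plain English interpretation still has to be written by hand: the code only delivers the formal conclusion.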

Samples: Both samples must be Simple Random Samples (SRS) and sampled independently.
- Not suitable for matched pairs (e.g., husband-wife).
Variable Type: The compared variable must be continuous, and the grouping variable should be dichotomous (categorical).

Null Hypothesis (H0): ( 𝝁1 = 𝝁2 ) or ( 𝝁1 - 𝝁2 = 0 ) (the two population means are equal).
Alternative Hypothesis (H1): ( 𝝁1 ≠ 𝝁2 ) (the means are not equal).

Formula: [ t = \frac{\bar{x}_1 - \bar{x}_2}{\text{s.e.}} ]
Distribution: The test statistic follows a t-distribution. In large samples, the critical value at 𝛼 = 0.05 is t ≈ ±1.96.
Decision Rule: If |t| > 1.96, then p-value < 0.05 → reject the null hypothesis.
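The notes do not spell out s.e. Assuming the unpooled form that goes with Stata's 'unequal' option (each group's variance divided by its own sample size, with s and n the sample standard deviation and sample size of each group), the hand calculation can be written as:

```latex
\text{s.e.} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```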

If ( p < 𝛼 ): Reject H0.
- Interpretation: The two population means differ significantly.
If ( p \geq 𝛼 ): Do not reject H0.
- Interpretation: There is no significant difference between the groups.

Degrees of Freedom: Calculated using Satterthwaite's formula (complex, often requires software).
In practice, run the test in Stata, which computes the degrees of freedom and the p-value for any sample size.
- For large samples, use the normal cutpoints:
  - ±1.645 for 90% CI (𝛼 = 0.10)
  - ±1.96 for 95% CI (𝛼 = 0.05)
  - ±2.58 for 99% CI (𝛼 = 0.01)
Do not pool variances; use the 'unequal' option in Stata for valid results.
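For reference, Satterthwaite's degrees of freedom can be sketched in a few lines of Python. The function name and the example inputs are made up for illustration; Stata reports this value itself when the 'unequal' option is used.

```python
def satterthwaite_df(sd1, n1, sd2, n2):
    """Approximate degrees of freedom for the unequal-variance t-test."""
    # Each group's contribution to the variance of the difference in means
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    # Satterthwaite's approximation
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Made-up example: the result falls between min(n1, n2) - 1 and n1 + n2 - 2
df = satterthwaite_df(10.0, 25, 20.0, 40)
print(f"df = {df:.1f}")
```

The result is usually not a whole number, which is one reason the calculation is left to software.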

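Rejecting the Null: The group means are significantly different; a difference this large would be unlikely to arise by chance alone if H0 were true.
Failing to Reject the Null: No significant difference observed; the observed difference is within what sampling variation would produce under H0.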

The confidence level of a confidence interval (CI) is the share of intervals, across repeated samples, that would contain the population parameter.
𝛼 (alpha) is the share of intervals that would not contain the parameter: ( 𝛼 = 1 - (\text{CI level}) ).
These values are complementary: a 95% CI corresponds to 𝛼 = 0.05.
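The link between the CI level, 𝛼, and the large-sample cutpoint can be checked with Python's standard library. This is a sketch under the normal approximation; the function name is made up for illustration.

```python
from statistics import NormalDist


def normal_cutpoint(ci_level):
    """Two-sided standard normal cutpoint for a given CI level."""
    alpha = 1 - ci_level
    # Split alpha evenly between the two tails
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    cut = normal_cutpoint(level)
    print(f"{level:.0%} CI: alpha = {1 - level:.2f}, cutpoint = {cut:.3f}")
```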

For hypothesis tests, under the null, assume the samples come from populations with the same mean (H0 states 𝝁1 = 𝝁2).
The test assesses whether the observed mean differences are unusual compared to this assumption.

Matched pairs refer to observations that are naturally paired (e.g., pre-test and post-test).
The approach here involves calculating differences for each pair and applying a one-sample t-test on these differences:
- Null Hypothesis: 𝝁d = 0 (the mean difference is zero).
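A minimal sketch of the matched-pairs approach in Python: take the difference within each pair, then compute a one-sample t-statistic on those differences. The function name and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """One-sample t-statistic on within-pair differences (H0: mean difference = 0)."""
    # One difference per pair; the two lists must be in the same pair order
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Standard error of the mean difference
    se = stdev(diffs) / sqrt(n)
    t = mean(diffs) / se
    return t, n - 1  # t-statistic and degrees of freedom


# Made-up pre-test and post-test scores for five people
pre = [10, 12, 9, 14, 11]
post = [12, 15, 10, 15, 14]
t_paired, df_paired = paired_t(pre, post)
print(f"t = {t_paired:.2f}, df = {df_paired}")
```

Here the degrees of freedom are simply the number of pairs minus one, so no Satterthwaite correction is involved.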

To perform a 2-sample t-test:
ttest variable, by(group_var) unequal
Where "variable" is continuous and "group_var" identifies group categories (e.g., gender).
You can also run the test from summary statistics (sample size, mean, and standard deviation for each group):
ttesti n1 mean1 sd1 n2 mean2 sd2, unequal
Both commands will automatically display descriptive statistics and confidence intervals for the differences.
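To see where the confidence interval for the difference comes from, here is a rough Python sketch that takes summary statistics in the same order as ttesti. It assumes large samples and uses the normal cutpoint, so in small samples it will differ slightly from Stata, which uses the t-distribution with Satterthwaite's degrees of freedom. The function name and the numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(n1, mean1, sd1, n2, mean2, sd2, level=0.95):
    """Large-sample CI for mu1 - mu2 from summary statistics."""
    # Unpooled standard error of the difference in means
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Normal cutpoint for the requested confidence level
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = mean1 - mean2
    return diff - z * se, diff + z * se


# Made-up illustrative numbers, not real survey results
low, high = diff_ci(400, 52.0, 10.0, 400, 50.0, 12.0)
print(f"95% CI for the difference: ({low:.2f}, {high:.2f})")
```

If the interval excludes 0, the two-sided test at the matching 𝛼 rejects H0.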