Notes on Hypothesis Tests for 2 Proportions
Comparing Two Proportions
- Estimating Difference Between Two Proportions:
- Example: Comparing admission rates at ages 17 and 18.
- Statistic used: Difference in sample proportions
- Population proportions:
- Population 1: proportion p1, sample size n1, sample proportion \bar{p}_1
- Population 2: proportion p2, sample size n2, sample proportion \bar{p}_2
Sampling Distribution of Proportions
- For independent samples of sizes n1 and n2 from populations with parameters p1 and p2:
- Mean:
- \text{Mean of the sampling distribution of } \bar{p} = p
- \text{Mean of } (\bar{p}_1 - \bar{p}_2) = p_1 - p_2
- Standard Deviation:
- \text{Std. deviation of } \bar{p} = \sqrt{\frac{p(1 - p)}{n}}
- \text{Std. deviation of } (\bar{p}_1 - \bar{p}_2) = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}
- Normality Conditions:
- n p \geq 10 \text{ and } n(1 - p) \geq 10 \text{ for both samples}
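
The mean and standard deviation formulas above can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the function names and the values p1 = 0.6, n1 = 150, p2 = 0.4, n2 = 250 are illustrative assumptions. It compares the theoretical values with a small simulation of repeated independent samples.

```python
import math
import random
import statistics

def diff_mean_sd(p1, n1, p2, n2):
    """Theoretical mean and standard deviation of (p1_hat - p2_hat)."""
    mean = p1 - p2
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd

def simulate_diffs(p1, n1, p2, n2, reps=2000, seed=0):
    """Draw `reps` pairs of independent samples and return the differences
    in sample proportions. The seed is fixed only for reproducibility."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        p1_hat = sum(rng.random() < p1 for _ in range(n1)) / n1
        p2_hat = sum(rng.random() < p2 for _ in range(n2)) / n2
        diffs.append(p1_hat - p2_hat)
    return diffs

# Illustrative (assumed) population values, not taken from the notes.
theory_mean, theory_sd = diff_mean_sd(0.6, 150, 0.4, 250)
diffs = simulate_diffs(0.6, 150, 0.4, 250)

print(f"theory:     mean={theory_mean:.4f}, sd={theory_sd:.4f}")
print(f"simulation: mean={statistics.mean(diffs):.4f}, "
      f"sd={statistics.stdev(diffs):.4f}")
```

With these assumed values the theoretical mean is 0.2 and the standard deviation is about 0.0506; the simulated values should land close to both.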
Assumptions and Conditions
- Independent Observations:
- Randomization Condition:
- Data drawn independently and randomly from a homogeneous population.
- 10% Condition:
- Sample should not exceed 10% of population when sampled without replacement.
- Independent Groups:
- Two groups must be independent.
- Sample Size Condition:
- Each group must be sufficiently large.
- Success/Failure Condition:
- n_1 p_1 \geq 10 \text{ and } n_1(1 - p_1) \geq 10
- n_2 p_2 \geq 10 \text{ and } n_2(1 - p_2) \geq 10
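
The conditions above can be turned into simple checks. This is a hedged sketch rather than a prescribed procedure: the helper names are hypothetical, and because p1 and p2 are unknown in practice, it assumes the observed success and failure counts stand in for n p and n(1 - p). The counts come from the wrinkle example in these notes; the population size of 10,000 is an invented placeholder.

```python
def success_failure_ok(successes, n, threshold=10):
    """True if both the success count and the failure count reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

def ten_percent_ok(n, population_size):
    """True if the sample is no more than 10% of its population."""
    return n <= 0.10 * population_size

# Counts from the wrinkle example: 95 of 150 smokers, 105 of 250 nonsmokers.
groups = {"smokers": (95, 150), "nonsmokers": (105, 250)}

for name, (successes, n) in groups.items():
    sf = success_failure_ok(successes, n)
    # The population size of 10_000 is an assumed placeholder value.
    tp = ten_percent_ok(n, population_size=10_000)
    print(f"{name}: success/failure ok={sf}, 10% condition ok={tp}")
```

Independence and randomization cannot be verified from the counts alone; they depend on how the data were collected.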
Confidence Interval for 2 Population Proportions
- Formula:
- (\bar{p}_1 - \bar{p}_2) \pm z^* \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}
- Example:
- Smokers (n1 = 150): 95 with prominent wrinkles, \bar{p}_1 = 95/150 \approx 0.63
- Nonsmokers (n2 = 250): 105 with prominent wrinkles, \bar{p}_2 = 105/250 = 0.42
- 95% CI for smokers:
- 0.63 ± 1.96 × 0.0394 = (0.55, 0.71)
- 95% CI for nonsmokers:
- 0.42 ± 1.96 × 0.0312 = (0.36, 0.48)
- Check whether the intervals overlap: here they do not, which suggests that the two population proportions differ.
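
The example above reports a separate interval for each group. As a complement, here is a minimal Python sketch that applies the two-proportion formula from this section directly to the difference. The function name `two_prop_ci` is hypothetical, and the critical value z* is taken from the standard library's `statistics.NormalDist` instead of a table.

```python
import math
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from success counts x1, x2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Unpooled standard error, as in the formula above.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Wrinkle example: 95 of 150 smokers, 105 of 250 nonsmokers.
low, high = two_prop_ci(95, 150, 105, 250)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

For these counts the interval is roughly (0.11, 0.31). It does not contain 0, which agrees with the non-overlapping individual intervals.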
Two-Proportion z Test
- Hypotheses:
- Two-tailed: H0: p1 - p2 = p0 vs Ha: p1 - p2 \neq p0
- Upper-tailed: H0: p1 - p2 = p0 vs Ha: p1 - p2 > p0
- Lower-tailed: H0: p1 - p2 = p0 vs Ha: p1 - p2 < p0
- Assumption & Conditions:
- Random samples, independent observations, large sizes
- Normality Conditions: n_1 p_1, n_1(1 - p_1), n_2 p_2, n_2(1 - p_2) \text{ all } \geq 10
- Test Statistic:
- z_0 = \frac{(\bar{p}_1 - \bar{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}
- If the null hypothesis is p1 = p2 (that is, p0 = 0), pool the two samples to estimate the common proportion:
- \bar{p}_{pooled} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}
- Modified test statistic:
- z_0 = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}_{pooled}(1 - \bar{p}_{pooled})(\frac{1}{n_1} + \frac{1}{n_2})}}
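
The pooled statistic can be computed in a few lines. The sketch below is one possible Python rendering, assuming the null hypothesis p1 = p2; the function name `pooled_z` is hypothetical. The standard error is the square root of the pooled variance term. Since n1 times the sample proportion is just the success count, the pooled proportion reduces to total successes over total sample size.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Two-proportion z statistic under H0: p1 = p2, from success counts."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: (n1*p1_hat + n2*p2_hat) / (n1 + n2) = (x1 + x2) / (n1 + n2).
    p_pooled = (x1 + x2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis.
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Wrinkle data from the confidence interval example: 95/150 vs 105/250.
z0 = pooled_z(95, 150, 105, 250)
print(f"z0 = {z0:.2f}")
```

Swapping the two groups only flips the sign of the statistic, so the choice of which group is "1" matters for one-tailed tests but not for the two-tailed test.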
Decision Making
- p-value Criteria:
- For two-tailed: p-value = 2 × P(Z > |z_0|)
- For upper-tail: p-value = P(Z > z_0)
- For lower-tail: p-value = P(Z < z_0)
- Decision rule:
- If p-value ≤ \alpha, then reject H_0
- If p-value > \alpha, do not reject H_0
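
The p-value rules and the decision rule above can be sketched in Python using the standard normal distribution from the standard library. The function names and the `tail` labels ("two", "upper", "lower") are illustrative assumptions, not notation from the notes.

```python
from statistics import NormalDist

def p_value(z0, tail="two"):
    """p-value for a z statistic; tail is 'two', 'upper', or 'lower'."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        # Absolute value so the rule also works when z0 is negative.
        return 2 * (1 - std_normal.cdf(abs(z0)))
    if tail == "upper":
        return 1 - std_normal.cdf(z0)
    if tail == "lower":
        return std_normal.cdf(z0)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

def decide(p, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    return "reject H0" if p <= alpha else "do not reject H0"

for z0, tail in [(1.96, "two"), (1.645, "upper"), (-0.5, "lower")]:
    p = p_value(z0, tail)
    print(f"z0={z0:+.3f}, {tail}-tailed: p-value={p:.4f} -> {decide(p)}")
```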
Example Continuation
- Compare the proportions of smokers and nonsmokers with prominent wrinkles at \alpha = 0.05:
- Test statistic:
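- z_0 = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}_{pooled}(1 - \bar{p}_{pooled})(\frac{1}{n_1} + \frac{1}{n_2})}} = \frac{0.63 - 0.42}{\sqrt{0.5 \times 0.5 \times (\frac{1}{150} + \frac{1}{250})}} \approx 4.1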
- Pooled proportion:
- \bar{p}_{pooled} = \frac{95 + 105}{150 + 250} = 0.5
- p-value calculation:
- p-value \approx 0 (less than 0.0001)
- Conclusion:
- Reject H_0, indicating a significant difference in proportions of smokers and non-smokers with wrinkles.