Notes on Hypothesis Tests for 2 Proportions
Comparing Two Proportions
- Estimating Difference Between Two Proportions:
- Example: Comparing admission rates at ages 17 and 18.
- Statistic used: Difference in sample proportions
- Population proportions:
- Population 1: proportion $p_1$, sample size $n_1$, sample proportion $\bar{p}_1$
- Population 2: proportion $p_2$, sample size $n_2$, sample proportion $\bar{p}_2$
Sampling Distribution of Proportions
- For independent samples of sizes $n_1$ and $n_2$ from populations with parameters $p_1$ and $p_2$:
- Mean:
- $\text{Mean of the sampling distribution of } \bar{p} = p$
- $\text{Mean of } (\bar{p}_1 - \bar{p}_2) = p_1 - p_2$
- Standard Deviation:
- $\text{Std. deviation of } \bar{p} = \sqrt{\frac{p(1 - p)}{n}}$
- $\text{Std. deviation of } (\bar{p}_1 - \bar{p}_2) = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
- Normality Conditions:
- $np \text{ and } n(1 - p) \text{ should be} \ge 10$ for both samples
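The mean and standard deviation formulas above can be sketched in a few lines of Python. The function name and the input values here are hypothetical, chosen only to illustrate the arithmetic; in practice the population proportions are unknown.

```python
import math

def diff_sampling_distribution(p1, n1, p2, n2):
    """Return (mean, sd) of the sampling distribution of p1_bar - p2_bar."""
    mean = p1 - p2
    # Variances of independent samples add, so take the root of the sum.
    sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd

# Hypothetical population proportions and sample sizes, for illustration only.
mean, sd = diff_sampling_distribution(p1=0.60, n1=150, p2=0.40, n2=250)
print(f"mean = {mean:.3f}, sd = {sd:.4f}")
```

The variances are added even though the proportions are subtracted, which is why the independence of the two samples matters.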
Assumptions and Conditions
- Independent Observations:
- Randomization Condition:
- Data drawn independently and randomly from a homogeneous population.
- 10% Condition:
- Sample should not exceed 10% of population when sampled without replacement.
- Independent Groups:
- Two groups must be independent.
- Sample Size Condition:
- Each group must be sufficiently large.
- Success/Failure Condition:
- $n_1 p_1 \text{ and } n_1(1 - p_1) \ge 10$
- $n_2 p_2 \text{ and } n_2(1 - p_2) \ge 10$
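A minimal sketch of the Success/Failure check follows. Since the true $p_1$ and $p_2$ are unknown, this sketch assumes the observed counts of successes and failures stand in for $np$ and $n(1 - p)$; the helper name is hypothetical, and the counts are the smoker and nonsmoker figures used in the example in these notes.

```python
def success_failure_ok(successes, n, threshold=10):
    """True when both the success count and the failure count reach the threshold."""
    failures = n - successes
    return successes >= threshold and failures >= threshold

# Observed counts: 95 of 150 smokers and 105 of 250 nonsmokers with wrinkles.
print(success_failure_ok(95, 150))
print(success_failure_ok(105, 250))
```

Both groups must pass; if either fails, the normal approximation behind the interval and the z test is questionable.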
Confidence Interval for 2 Population Proportions
- Formula:
- $\bar{p}_1 - \bar{p}_2 \pm z^* \sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$
- Example:
- Smokers ($n_1 = 150$): 95 with prominent wrinkles, $\bar{p}_1 = 0.63$
- Nonsmokers ($n_2 = 250$): 105 with prominent wrinkles, $\bar{p}_2 = 0.42$
- 95% CI for smokers:
- $0.63 \pm 1.96 \times 0.0394 = (0.55, 0.71)$
- 95% CI for nonsmokers:
- $0.42 \pm 1.96 \times 0.0312 = (0.36, 0.48)$
- Check whether the intervals overlap: here they do not, which suggests the two proportions differ.
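The interval formula for the difference can also be applied directly to the smoker and nonsmoker counts. The sketch below is one way to do that with the standard library; the function name is hypothetical, and it uses unrounded sample proportions, so the result may differ slightly from hand calculations with 0.63 and 0.42.

```python
import math
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    # Critical value z* for the requested confidence level (about 1.96 at 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1_bar * (1 - p1_bar) / n1 + p2_bar * (1 - p2_bar) / n2)
    diff = p1_bar - p2_bar
    return diff - z_star * se, diff + z_star * se

low, high = two_prop_ci(95, 150, 105, 250)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

An interval for the difference that lies entirely above zero points to a higher proportion in the first group.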
Two-Proportion z Test
- Hypotheses:
- Two-tailed: $H_0: p_1 - p_2 = p_0$ vs $H_a: p_1 - p_2 \ne p_0$
- Upper-tailed: $H_0: p_1 - p_2 = p_0$ vs $H_a: p_1 - p_2 > p_0$
- Lower-tailed: $H_0: p_1 - p_2 = p_0$ vs $H_a: p_1 - p_2 < p_0$
- Assumption & Conditions:
- Random samples, independent observations, large sizes
- Normality Conditions: $n_1 p_1,\ n_1(1 - p_1),\ n_2 p_2,\ n_2(1 - p_2) \ge 10$
- Test Statistic:
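- $z_0 = \frac{\bar{p}_1 - \bar{p}_2 - (p_1 - p_2)}{\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}}$
- If the null hypothesis is $p_1 - p_2 = 0$, both samples estimate the same proportion, so pool them:
- $\bar{p}_{\text{pooled}} = \frac{n_1 \bar{p}_1 + n_2 \bar{p}_2}{n_1 + n_2}$
- Modified test statistic:
- $z_0 = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}_{\text{pooled}}(1 - \bar{p}_{\text{pooled}})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$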
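As a rough illustration of the pooled statistic, the sketch below computes it from raw success counts. The function name is hypothetical. Because $n_1 \bar{p}_1$ and $n_2 \bar{p}_2$ are just the success counts, the pooled proportion reduces to total successes over total sample size.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for H0: p1 - p2 = 0."""
    p1_bar, p2_bar = x1 / n1, x2 / n2
    # n1*p1_bar + n2*p2_bar equals x1 + x2, the total number of successes.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1_bar - p2_bar) / se

# Smoker and nonsmoker counts from the example in these notes.
z0 = pooled_z(95, 150, 105, 250)
print(f"z0 = {z0:.2f}")
```

Small differences from hand-rounded values are expected, since the code carries full precision through each step.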
Decision Making
- p-value Criteria:
- For two-tailed: p-value = $2 \times P(Z > |z_0|)$
- For upper-tail: p-value = $P(Z > z_0)$
- For lower-tail: p-value = $P(Z < z_0)$
- Decision rule:
- If p-value $\le \alpha$, reject $H_0$
- If p-value $> \alpha$, do not reject $H_0$
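The three p-value rules and the decision rule might be coded as below. This is a sketch using the standard library's normal distribution; the function names and the value of `z0` are hypothetical. The two-tailed case takes the absolute value of the statistic so that a negative $z_0$ is handled correctly.

```python
from statistics import NormalDist

def p_value(z0, tail="two"):
    """p-value of a z statistic for a two-, upper-, or lower-tailed test."""
    std_normal = NormalDist()
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z0)))
    if tail == "upper":
        return 1 - std_normal.cdf(z0)
    if tail == "lower":
        return std_normal.cdf(z0)
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

def decide(p, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha."""
    return "reject H0" if p <= alpha else "do not reject H0"

z0 = 1.50  # hypothetical test statistic, for illustration only
for tail in ("two", "upper", "lower"):
    p = p_value(z0, tail)
    print(f"{tail:>5}-tailed: p-value = {p:.4f} -> {decide(p)}")
```

The same statistic can lead to different p-values depending on the alternative hypothesis, so the tail should be fixed before looking at the data.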
Example Continuation
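- Proportions of smokers and nonsmokers with wrinkles, tested at $\alpha = 0.05$:
- Pooled proportion:
- $\bar{p}_{\text{pooled}} = \frac{95 + 105}{150 + 250} = 0.5$
- Test statistic:
- $z_0 = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}_{\text{pooled}}(1 - \bar{p}_{\text{pooled}})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \approx 4.13$ (4.2 if the standard error is rounded to 0.05)
- p-value calculation:
- $p\text{-value} \approx 0$ (below 0.001)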
- Conclusion:
- Reject H0, indicating a significant difference in proportions of smokers and non-smokers with wrinkles.
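For reference, the arithmetic behind the example can be written out step by step. This working uses the counts given earlier (95 of 150 smokers, 105 of 250 nonsmokers) and keeps unrounded intermediate values, so coarser rounding will shift the final figure slightly.

```latex
\begin{align*}
\bar{p}_{\text{pooled}} &= \frac{95 + 105}{150 + 250} = \frac{200}{400} = 0.5 \\
\text{SE} &= \sqrt{0.5\,(1 - 0.5)\left(\frac{1}{150} + \frac{1}{250}\right)} \approx 0.0516 \\
z_0 &= \frac{95/150 - 105/250}{\text{SE}} \approx \frac{0.2133}{0.0516} \approx 4.13
\end{align*}
```

A statistic this far into the tail of the standard normal distribution gives a p-value well below 0.05 for either a two-tailed or an upper-tailed alternative.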