Chapter 22 Notes: Hypothesis Testing and Confidence Intervals for Two Proportions

Comparisons between two percentages or proportions are more common in statistical analysis than questions about a single percentage.
Comparing two proportions is also often more interesting and informative than analyzing one proportion in isolation.
Typical research questions involving two proportions include:
* Determining how two distinct groups differ from one another.
* Evaluating whether a specific treatment is more effective than a placebo control.
* Assessing whether results from the current year show improvement compared to results from the previous year.

A Gallup Poll was conducted to investigate potential differences in opinions between genders regarding intelligence.
Sample Specifications:
* The poll selected a random sample of $520$ women.
* The poll selected a random sample of $506$ men.
The Research Question: The study aimed to determine if there is a "gender gap" in opinions concerning which sex is smarter.
Observed Results:
* $28\%$ of the men surveyed believed that men were generally more intelligent.
* Only $14\%$ of the women surveyed agreed that men were more intelligent.

To examine the difference between two sample proportions ( $\hat{p}_1 - \hat{p}_2$ ), a specific "ruler" or metric is required.
This metric is the standard deviation of the sampling distribution model for the difference between those two proportions.
Key Statistical Rule for Variance:
* It is critical to recall that standard deviations cannot be added directly.
* However, variances are additive.
* The variance of the sum or the difference of two independent random quantities is equal to the sum of their individual variances.
Independence Requirement:
* Proportions observed in independent random samples are considered independent themselves.
* Because they are independent, it is mathematically valid to add their variances to find the variance of the difference.
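The variance rule above can be checked numerically. The sketch below is only an illustration: the sample sizes and proportions ($n_1 = 5$, $p_1 = 0.3$, $n_2 = 8$, $p_2 = 0.6$) are made-up values, not figures from the Gallup poll, and the helper names `binom_pmf` and `exact_variance_of_difference` are hypothetical. It enumerates every pair of outcomes for two independent samples, computes the exact variance of the difference in sample proportions, and compares it with the sum of the individual variances.

```python
from math import comb, sqrt, isclose


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_variance_of_difference(n1, p1, n2, p2):
    """Variance of (p1_hat - p2_hat), found by enumerating all outcome pairs."""
    mean = 0.0
    mean_sq = 0.0
    for k1 in range(n1 + 1):
        for k2 in range(n2 + 1):
            # Independence: the joint probability is the product of the two.
            prob = binom_pmf(k1, n1, p1) * binom_pmf(k2, n2, p2)
            diff = k1 / n1 - k2 / n2
            mean += prob * diff
            mean_sq += prob * diff**2
    return mean_sq - mean**2


# Made-up illustrative values (assumptions, not data from the notes).
n1, p1, n2, p2 = 5, 0.3, 8, 0.6

var_diff = exact_variance_of_difference(n1, p1, n2, p2)
var_1 = p1 * (1 - p1) / n1  # variance of the first sample proportion
var_2 = p2 * (1 - p2) / n2  # variance of the second sample proportion

print("variances add:", isclose(var_diff, var_1 + var_2, rel_tol=1e-9))
print("SD of difference:", sqrt(var_diff))
print("sum of the two SDs:", sqrt(var_1) + sqrt(var_2))
```

The last two printed values differ, which is the point of the rule: the standard deviation of the difference has to be obtained by adding variances first and taking the square root afterwards.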

Standard Deviation for the Difference ( $SD$ ):
* This formula is typically used for hypothesis testing, or whenever the population parameters are conjectured.
* The standard deviation of the difference between two sample proportions is defined as:
  $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$
* Here $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$.
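The standard deviation formula above follows from the variance rule in a few steps. The derivation below is a sketch that assumes the two samples are independent and uses the standard single-proportion result $Var(\hat{p}) = \frac{p(1-p)}{n}$:

$$
\begin{aligned}
Var(\hat{p}_1 - \hat{p}_2) &= Var(\hat{p}_1) + Var(\hat{p}_2) \\
&= \frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2} = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2} \\
SD(\hat{p}_1 - \hat{p}_2) &= \sqrt{Var(\hat{p}_1 - \hat{p}_2)} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}
\end{aligned}
$$

Note that the variances are added even though the quantity of interest is a difference.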
Standard Error for the Difference ( $SE$ ):
* This formula is used when constructing Confidence Intervals ( $CI$ ), where the population proportions are unknown and sample estimates must be used in their place.
* The standard error of the difference between two sample proportions is defined as:
  $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
* Here $\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$.
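As a rough illustration, the standard error formula can be applied to the Gallup figures quoted earlier ($28\%$ of $506$ men, $14\%$ of $520$ women). The function name `se_diff_proportions` and the choice of men as group 1 are assumptions made for this sketch, not part of the original notes.

```python
from math import sqrt


def se_diff_proportions(p_hat_1, n_1, p_hat_2, n_2):
    """Standard error of (p_hat_1 - p_hat_2) for two independent samples."""
    q_hat_1 = 1 - p_hat_1
    q_hat_2 = 1 - p_hat_2
    # Add the two estimated variances, then take the square root.
    return sqrt(p_hat_1 * q_hat_1 / n_1 + p_hat_2 * q_hat_2 / n_2)


# Gallup poll figures from the notes; group 1 = men, group 2 = women (assumed order).
p_hat_men, n_men = 0.28, 506
p_hat_women, n_women = 0.14, 520

observed_diff = p_hat_men - p_hat_women
se = se_diff_proportions(p_hat_men, n_men, p_hat_women, n_women)

print(f"observed difference = {observed_diff:.2f}")
print(f"standard error      = {se:.4f}")
```

With these inputs the standard error comes out at roughly $0.025$, or about $2.5$ percentage points, which is the "ruler" against which the observed difference of $0.14$ is measured.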