Study Notes on Difference Between Two Proportions

1. Introduction to Comparing Two Proportions

  • Focus on comparing two populations through their proportions:
      - p_1 = proportion in group 1
      - p_2 = proportion in group 2

  • Goals:
      - Estimate: Formulate a confidence interval
      - Test: Conduct a hypothesis test

2. Sample Statistics

  • The statistics used for estimation (x = number of successes, n = sample size in each group):
      - \tilde{p}_1 = \frac{x_1}{n_1}
      - \tilde{p}_2 = \frac{x_2}{n_2}

  • Point Estimate: The difference between the two proportions:
      - \tilde{p}_1 - \tilde{p}_2

3. Sampling Distribution of the Difference

  • Mean: The mean of the sampling distribution is p_1 - p_2

  • Standard Deviation: Calculated as
      - If p_1 and p_2 are known:
        SD(\tilde{p}_1 - \tilde{p}_2) = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}
      - Since p_1 and p_2 are usually unknown, use the Standard Error (SE):
        SE = \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2}}
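A minimal Python sketch of the standard error calculation, taking the square root of the two summed variance terms. The function name and the counts (60 of 100 and 45 of 100) are hypothetical and chosen only for illustration.

```python
import math

def standard_error(x1, n1, x2, n2):
    """Unpooled standard error of the difference in sample proportions."""
    p1 = x1 / n1  # sample proportion, group 1
    p2 = x2 / n2  # sample proportion, group 2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Hypothetical data: 60 successes out of 100 in group 1, 45 out of 100 in group 2.
se = standard_error(60, 100, 45, 100)
print(round(se, 4))  # about 0.0698
```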

4. Conditions to Check Before Analysis

  • Randomness: Must utilize random samples or a randomized experiment.

  • Large Counts Requirement for both groups:
      - n_1 \tilde{p}_1, n_1(1 - \tilde{p}_1), n_2 \tilde{p}_2, and n_2(1 - \tilde{p}_2) must all be at least 10

  • This condition ensures that the sampling distribution of the difference is approximately normal.
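The Large Counts check can be sketched in a few lines of Python. Because the sample proportion is x/n, the product of n and the sample proportion is simply the number of successes, and the other product is the number of failures. The threshold of 10 is an assumption (the usual textbook convention), and the function name and data are hypothetical.

```python
def large_counts_ok(x1, n1, x2, n2, threshold=10):
    """Return True if successes and failures in BOTH groups meet the threshold."""
    # threshold=10 is an assumed convention, not a universal rule.
    counts = [x1, n1 - x1, x2, n2 - x2]  # successes and failures per group
    return all(c >= threshold for c in counts)

print(large_counts_ok(60, 100, 45, 100))  # True
print(large_counts_ok(4, 50, 20, 50))     # False: only 4 successes in group 1
```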

5. Confidence Interval Formula

  • The formula for calculating the confidence interval is:
      - Confidence Interval = (\tilde{p}_1 - \tilde{p}_2) ± z^*(SE)
      - Where: SE = \sqrt{\frac{\tilde{p}_1(1 - \tilde{p}_1)}{n_1} + \frac{\tilde{p}_2(1 - \tilde{p}_2)}{n_2}}

6. Critical Values (z*)

  • Here are the critical values for common confidence levels:
      - 90%: z^* = 1.645
      - 95%: z^* = 1.960
      - 99%: z^* = 2.576
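These critical values do not have to be memorized blindly; they come from the standard normal distribution. The sketch below recovers them with Python's standard-library `statistics.NormalDist`. The helper name `critical_value` is hypothetical.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* that leaves (1 - confidence)/2 of the area in each tail of N(0, 1)."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
```

The loop prints 1.645, 1.960, and 2.576 for the three levels, matching the list above.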

7. Step-by-Step Process to Calculate CI

  • Step 1: Identify the proportions and sample sizes:
      - \tilde{p}_1, \tilde{p}_2, n_1, n_2

  • Step 2: Verify conditions:
      - Check for randomness in samples
      - Ensure large counts in both groups

  • Step 3: Calculate necessary values:
      - Difference: Calculate \tilde{p}_1 - \tilde{p}_2
      - Standard Error
      - Margin of Error: Compute using the formula z^* \times SE

  • Step 4: Write the confidence interval:
      - (\tilde{p}_1 - \tilde{p}_2) ± Margin of Error

  • Step 5: Interpret the result:
      - Formulate the sentence: “We are __% confident that the true difference in proportions (p1 − p2) is between __ and __.”
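The calculation steps above can be sketched end to end in Python. This is an illustration under stated assumptions, not a replacement for checking the conditions in Step 2: the function name and the counts (60 of 100 versus 45 of 100) are hypothetical.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 from success counts and sample sizes."""
    p1, p2 = x1 / n1, x2 / n2                                 # Step 1: sample proportions
    diff = p1 - p2                                            # Step 3: point estimate
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)   # Step 3: standard error
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # critical value
    margin = z_star * se                                      # Step 3: margin of error
    return diff - margin, diff + margin                       # Step 4: the interval

# Hypothetical data, 95% confidence.
low, high = two_prop_z_interval(60, 100, 45, 100, confidence=0.95)
print(f"We are 95% confident that p1 - p2 is between {low:.3f} and {high:.3f}.")
```

With these made-up counts the interval is roughly (0.013, 0.287). A real answer would replace "p1 - p2" with the context of the problem.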

8. Interpreting Results

  • If the confidence interval includes 0:
      - There is no convincing evidence of a difference between the proportions.

  • If the confidence interval does NOT include 0:
      - There is convincing evidence of a difference between the proportions.

  • Direction of the difference:
      - If the interval is entirely positive → p_1 > p_2
      - If the interval is entirely negative → p_1 < p_2

9. Examples and Logic

  • An example of interval: (0.137, 0.223)
      - The interval is entirely positive, so it gives convincing evidence that the proportion in group 1 is greater than the proportion in group 2 (p_1 > p_2).
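The decision logic for reading an interval can be written as a small Python function. The function name and the second interval are hypothetical; the first interval is the example quoted above.

```python
def interpret_interval(low, high):
    """Describe what a confidence interval for p1 - p2 suggests."""
    if low > 0:
        return "convincing evidence that p1 > p2"   # interval entirely positive
    if high < 0:
        return "convincing evidence that p1 < p2"   # interval entirely negative
    return "no convincing evidence of a difference"  # interval contains 0

print(interpret_interval(0.137, 0.223))   # interval from the example above
print(interpret_interval(-0.05, 0.08))    # hypothetical interval that contains 0
```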

10. Order of Proportions

  • Order matters in the calculation of differences:
      - Switching between (p1 − p2) and (p2 − p1):
        - Changes the sign
        - Changes the meaning of the results
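A short Python sketch of what switching the order does: the interval keeps its width, but the endpoints swap and change sign. The helper name and the sample proportions (0.60 and 0.45, each with n = 100) are hypothetical, and z* = 1.960 is the 95% critical value.

```python
import math

def diff_interval(pa, na, pb, nb, z_star=1.960):
    """Interval for (pa - pb); the order of the arguments sets the direction."""
    diff = pa - pb
    se = math.sqrt(pa * (1 - pa) / na + pb * (1 - pb) / nb)
    return diff - z_star * se, diff + z_star * se

one_minus_two = diff_interval(0.60, 100, 0.45, 100)  # p1 - p2
two_minus_one = diff_interval(0.45, 100, 0.60, 100)  # p2 - p1
print(one_minus_two)
print(two_minus_one)  # same numbers with flipped signs and swapped endpoints
```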

11. Common Mistakes to Avoid

  • DO NOT:
      - Forget to check both groups for conditions.
      - Use the incorrect order of subtraction (e.g., using p_2 - p_1 when it should be p_1 - p_2).
      - Neglect to include context in the final answer.
      - Describe proportions as means.
      - Misinterpret what 0 means in the context of the confidence interval.

12. Using Technology (TI-84)

  • To calculate two proportion confidence intervals:
      - Navigate: STAT → TESTS → 2-PropZInt
      - Input required values:
        - x_1, n_1
        - x_2, n_2
        - Confidence level

  • Output:
      - The resultant interval will be displayed.

13. Big Concepts

  • The analysis involves:
      - Comparing two groups to determine if there is a significant difference in proportions.
      - Differentiating actual differences from sample noise.

14. Connection to Hypothesis Tests

  • The relationship between confidence intervals and hypothesis tests:
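      - If the CI excludes 0 → Reject the null hypothesis (H_0: p_1 = p_2) in a two-sided test at the matching significance level (e.g., a 95% CI pairs with α = 0.05).
      - If the CI includes 0 → Fail to reject the null hypothesis (H_0).
      - The two methods usually agree, but not always: the test typically uses a pooled standard error while the interval uses the unpooled one, so borderline cases can differ.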
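To see the connection numerically, here is a hedged Python sketch of a two-sided two-proportion z test. It assumes the pooled standard error that is commonly taught for this test; that formula is not stated in these notes, and the function name and counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sided z test of H0: p1 = p2, using a pooled proportion (assumed convention)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined proportion under H0
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails
    return z, p_value

# Hypothetical data: 60 of 100 versus 45 of 100.
z, p_value = two_prop_z_test(60, 100, 45, 100)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

For these made-up counts the p-value falls below 0.05, which matches a 95% interval for the same data that excludes 0.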

15. Cheat Sheet for Quick Reference

  • Memorize the following for quick application:
      - Difference: \tilde{p}_1 - \tilde{p}_2
      - Conditions: Check BOTH groups.

  • Use the respective critical value z^*.

  • Remember:
      - CI includes 0 = no convincing evidence of a difference
      - CI does not include 0 = evidence of a difference

16. Questions to Expect from Instructor

  • Key concepts your teacher may quiz you on:
      - Identification of \tilde{p}_1 and \tilde{p}_2.
      - Verification of conditions for analysis.
      - Calculation of the confidence interval.
      - Interpretation of the confidence interval results.
      - Explanation of whether a significant difference exists between the proportions.