Hypothesis Testing for Two Samples Study Notes

Overview of Hypothesis Testing for Two Samples

  • In the previous chapter, hypothesis testing was focused on one sample. The current chapter shifts to hypothesis testing of two samples, expanding the procedure to two populations.

  • Key sections include:

    • 10.1: Two population means when population standard deviation is unknown

    • 10.2: Two population means when population standard deviation is known (not covered)

    • 10.3: Comparing two independent population proportions

    • 10.4: Matched or paired samples

Hypotheses in Two Sample Tests

  • The null hypothesis ($H_0$) states there is no difference between groups (e.g. $\mu_1 = \mu_2$).

  • The alternative hypothesis ($H_1$) posits there is a difference (e.g. $\mu_1 \neq \mu_2$).

  • The test compares two groups and asks whether an observed difference reflects a real difference between the populations or only chance variability.

Independent vs. Matched Pairs

  • Independent Groups: Two samples where the selection of one does not affect the other.

  • Matched Pairs: Two samples that depend on each other. Here, the selection for one group influences the selection for the other.

Example of Hypotheses Setup

  • Scenario: Comparing mean ages of nursing students between those at a community college and a university:

    • First group (Community College): Mean = $\mu_1$

    • Second group (University): Mean = $\mu_2$

    • Null Hypothesis: $\mu_1 = \mu_2$

    • Alternative Hypothesis: $\mu_1 \neq \mu_2$ (or $\mu_1 - \mu_2 \neq 0$ for matched pairs)

Section 10.1 - Two Population Means with Unknown Standard Deviations

  • Assumptions for testing population means when standard deviations are unknown:

    • Samples must be independent and random from distinct populations.

    • Each sample size must be ≥ 30, or the populations must be approximately normally distributed.

  • Test Used: Two-sample t-test for means. The version that does not assume equal variances is often referred to as the Welch t-test.

  • Formula for Standard Error:
    $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

  • Formula for T-Score:
    $t = \frac{\bar{x}_1 - \bar{x}_2}{SE}$

  • Degrees of Freedom Calculation: More complex than the usual $n - 1$ used for a single sample. Software/tools (like Excel) are recommended for accurate calculation.
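
The standard error and t-score formulas above can be sketched in a few lines of Python. The degrees-of-freedom line uses the Welch-Satterthwaite approximation, which these notes do not state explicitly; it is included here as an assumption about what the software computes. The function name `welch_t` and all summary statistics are made up for illustration and do not come from the chapter's examples.

```python
import math

def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Return (SE, t, df) for a two-sample t-test from summary statistics."""
    v1 = s1 ** 2 / n1  # s_1^2 / n_1
    v2 = s2 ** 2 / n2  # s_2^2 / n_2
    se = math.sqrt(v1 + v2)        # standard error of the difference in means
    t = (mean1 - mean2) / se       # t-score
    # Welch-Satterthwaite approximation (assumed, not given in the notes)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t, df

# Hypothetical summary statistics, chosen only to keep the arithmetic simple
se, t, df = welch_t(mean1=52.0, s1=6.0, n1=36, mean2=48.0, s2=8.0, n2=64)
print(f"SE = {se:.4f}, t = {t:.4f}, df = {df:.1f}")
```

Note that the resulting degrees of freedom are generally not a whole number, which is one reason the calculation is usually left to software.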

Example of Two Population Means Test

  • Scenario: Real estate comparison between two municipalities:

    • Null Hypothesis: $\mu_1 = \mu_2$

    • Alternative Hypothesis: $\mu_1 \neq \mu_2$

    • Results: T-value = -3.82, P-value = 0.0003.

    • Decision: Reject the null hypothesis since the P-value (0.0003) is less than the significance level of 0.1. There is a significant difference in average home prices between the two municipalities.

Example of Salary Comparison

  • Elementary vs. Secondary Teacher Salaries:

    • Null Hypothesis: $\mu_1 = \mu_2$

    • Alternative Hypothesis: $\mu_1 > \mu_2$

    • Results: T-score = 1.92, P-value = 0.0308.

    • Decision: Fail to reject $H_0$ since the P-value (0.0308) is greater than the significance level of 0.01. There is insufficient evidence to support the claim of higher salaries for elementary teachers.
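
The decision step in the two examples above follows the same rule: compare the P-value with the significance level. A minimal sketch of that rule, assuming the thresholds quoted in each decision (0.1 and 0.01) are the significance levels for those tests; the helper name `decide` is hypothetical.

```python
def decide(p_value, alpha):
    """Reject H0 when the P-value is below the significance level alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"

# P-values and thresholds as reported in the two examples above
home_prices = decide(p_value=0.0003, alpha=0.1)
teacher_salaries = decide(p_value=0.0308, alpha=0.01)

print("Home prices:", home_prices)
print("Teacher salaries:", teacher_salaries)
```

The same P-value can lead to different decisions at different significance levels, so the level should be fixed before looking at the results.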

Section 10.3 - Comparing Two Independent Population Proportions

  • Assumptions for Proportion Testing:

    • Two independent random samples from distinct populations.

    • Number of successes and failures in each sample must both be ≥ 5.

  • Test Used: Two-sample Z-test for proportions.

  • Formula for Pooled Proportion:
    $P = \frac{x_1 + x_2}{n_1 + n_2}$

  • Z-Score Formula: Will be computed using software or tools.
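
For reference, here is a rough sketch of what such a tool might compute. The notes give only the pooled proportion; the standard error and Z-score below use the usual pooled two-proportion formula, which is an assumption here rather than something stated in the notes. The function name `two_proportion_z` and the counts are invented for illustration.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, z, right-tailed P-value) for two samples."""
    p1_hat = x1 / n1                      # sample proportion, group 1
    p2_hat = x2 / n2                      # sample proportion, group 2
    pooled = (x1 + x2) / (n1 + n2)        # pooled proportion from the notes
    # Pooled standard error (assumed standard formula, not given in the notes)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_right = 1 - NormalDist().cdf(z)     # P-value for H1: P1 > P2
    return pooled, z, p_right

# Hypothetical counts: 60 of 100 successes vs. 40 of 100 successes
pooled, z, p_right = two_proportion_z(x1=60, n1=100, x2=40, n2=100)
print(f"pooled = {pooled:.3f}, z = {z:.3f}, P-value = {p_right:.4f}")
```

For a two-sided alternative, the P-value would instead be twice the tail area beyond the absolute value of the Z-score.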

Example of Proportion Claim

  • Scenario: Coupon clipping habits of women vs. men:

    • Null Hypothesis: $P_1 = P_2$

    • Alternative Hypothesis: $P_1 > P_2$

    • Results: Z-score = 2.96, P-value = 0.0015.

    • Decision: Reject the null hypothesis since P-value < 0.01, supporting the claim that women clip coupons more than men.
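
As a quick check, the reported P-value can be recovered from the reported Z-score using the standard normal distribution. This sketch assumes a right-tailed test, which matches the "greater than" alternative hypothesis in this example; the variable names are illustrative.

```python
from statistics import NormalDist

z_reported = 2.96  # Z-score from the coupon example

# Right-tailed P-value: area under the standard normal curve above z
p_right_tail = 1 - NormalDist().cdf(z_reported)
print(f"P-value = {p_right_tail:.4f}")
```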

Conclusion on Practical vs. Statistical Significance

  • Always contextualize results in conclusions – who, what, when, where.

  • Both statistical significance (tests show a difference) and practical significance (importance of difference) should be analyzed.

  • Assumptions must be checked before any hypothesis testing is conducted, especially sampling methods and distributions.

Ethical Considerations

  • Ensure randomization in studies to uphold integrity in results.

  • Be cautious in drawing conclusions, especially when applying statistically significant findings in practice: statistical significance does not always imply real-world importance.