Module 9 - Inferences for Two Population Means - Annotated
Instructor: Mathieu Chalifour, MacEwan University
Comparing Two Population Means
Introduction to inferential comparisons based on sample data.
Hypothesis tests and confidence intervals will be used involving the T-distribution.
Questions to consider:
Are the means different? (μ1 − μ2 ≠ 0)
Is the mean of population 1 larger than population 2? (μ1 − μ2 > 0)
Is the mean of population 1 smaller than population 2? (μ1 − μ2 < 0)
Independent Samples
Definition: Two samples are independent when the sample drawn from population 1 has no influence on, and is not matched with, the sample drawn from population 2.
The difference of the sample means, X̄1 − X̄2, is the estimator of μ1 − μ2.
Notation:
Sample means: X̄1, X̄2
Sample standard deviations: s1, s2
Sample sizes: n1, n2
Parameters of the Sampling Distribution
Mean: μ_(X̄1−X̄2) = μ1 − μ2, so X̄1 − X̄2 is an unbiased estimator of μ1 − μ2.
Variance: σ²_(X̄1−X̄2) = σ²_X̄1 + σ²_X̄2 = σ²1/n1 + σ²2/n2 (the variances add because the samples are independent).
Standard deviation: σ_(X̄1−X̄2) = √(σ²1/n1 + σ²2/n2).
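As a small illustration of the standard deviation formula above, the sketch below evaluates √(σ²1/n1 + σ²2/n2) directly. The function name se_diff and the numeric inputs are hypothetical and are not taken from the course material.

```python
import math

def se_diff(sigma1, n1, sigma2, n2):
    """Standard deviation of X̄1 − X̄2 for two independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Hypothetical population standard deviations and sample sizes.
print(se_diff(2.0, 36, 3.0, 36))
```

Note that the variances are added, not the standard deviations, so the result is smaller than σ1/√n1 + σ2/√n2.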
Sampling Distribution Shape
Determined by: The sampling distributions of X̄1 and X̄2, which depend on the population shapes and the sample sizes.
If both populations are normally distributed, X̄1 and X̄2 are normally distributed for any sample sizes.
If the populations are not normally distributed, large samples (n1 ≥ 30 and n2 ≥ 30) are needed so that the Central Limit Theorem (CLT) applies.
In either case the sampling distributions are (approximately) normal:
X̄1 ∼ N(μ_X̄1, σ_X̄1) and X̄2 ∼ N(μ_X̄2, σ_X̄2).
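The large-sample behaviour can be checked with a short simulation. The sketch below is only an illustration under assumed settings: it draws from two skewed (exponential) populations with hypothetical means 2 and 3, uses n1 = n2 = 30, and compares the simulated mean and standard deviation of X̄1 − X̄2 with μ1 − μ2 and √(σ²1/n1 + σ²2/n2). For an exponential population the standard deviation equals the mean.

```python
import math
import random
import statistics

def simulate_diff_of_means(n1, n2, mean1, mean2, reps=2000, seed=0):
    """Simulate X̄1 − X̄2 for samples from two exponential populations."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        x1 = [rng.expovariate(1 / mean1) for _ in range(n1)]
        x2 = [rng.expovariate(1 / mean2) for _ in range(n2)]
        diffs.append(statistics.fmean(x1) - statistics.fmean(x2))
    return diffs

# Hypothetical populations: exponential with means (and SDs) 2 and 3.
diffs = simulate_diff_of_means(30, 30, mean1=2.0, mean2=3.0)
print(statistics.fmean(diffs))            # should be close to 2 - 3 = -1
print(statistics.stdev(diffs))            # should be close to the value below
print(math.sqrt(2.0**2 / 30 + 3.0**2 / 30))
```

A histogram of diffs would show a roughly bell-shaped curve even though both populations are strongly skewed.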
T-distribution
Under the conditions above, the difference of sample means X̄1 − X̄2 follows (approximately) a normal distribution.
Standardizing with the sample standard deviations s1 and s2 in place of σ1 and σ2 leads to a T-distribution variable:
T = [(X̄1 − X̄2) − (μ1 − μ2)] / √(s²1/n1 + s²2/n2).
Degrees of freedom (df): Calculated for T-distribution using:
df = (s²1/n1 + s²2/n2)² / [ (s²1/n1)²/(n1 − 1) + (s²2/n2)²/(n2 − 1) ].
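The test statistic and degrees of freedom can be computed together, since both use the terms s²1/n1 and s²2/n2. The following is a minimal sketch; the function name welch_t, the parameter delta0 (the hypothesized value of μ1 − μ2), and the sample summaries are hypothetical.

```python
import math

def welch_t(xbar1, s1, n1, xbar2, s2, n2, delta0=0.0):
    """Return (t, df) for two independent samples, variances not pooled."""
    v1 = s1**2 / n1  # s²1/n1
    v2 = s2**2 / n2  # s²2/n2
    t = ((xbar1 - xbar2) - delta0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Hypothetical sample summaries.
t0, df = welch_t(xbar1=10.0, s1=3.0, n1=9, xbar2=12.0, s2=4.0, n2=16)
print(t0, df)
```

The df value is generally not an integer. When s1 = s2 and n1 = n2 = n, the formula reduces to df = 2(n − 1).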
Necessary Assumptions for Two Population Means Inferences
Both samples should be simple random samples (SRS).
The two samples must be independent of each other.
Each population must be normally distributed, or the sample sizes must be large enough for the CLT to apply:
n1 ≥ 30 and n2 ≥ 30.
Hypothesis Testing for Two Independent Samples
Types of tests:
Two-tailed test: H0 : μ1 − μ2 = 0, HA : μ1 − μ2 ≠ 0.
Right-tailed test: H0 : μ1 − μ2 = 0, HA : μ1 − μ2 > 0.
Left-tailed test: H0 : μ1 − μ2 = 0, HA : μ1 − μ2 < 0.
Example Case Study: Vitamin Effect on Recovery Time
A drug company claims its vitamin reduces recovery time from the common cold.
Study structure:
70 participants: 35 given a placebo, 35 given a vitamin supplement.
Measure recovery time in days.
Testing Claim of the Vitamin Supplement
Check Assumptions: SRS, independent populations, normality or n1 & n2 ≥ 30.
Hypothesis formation:
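H0 : μ1 − μ2 = 0 vs HA : μ1 − μ2 < 0, where population 1 is the vitamin group and population 2 is the placebo group (a left-tailed test, since the claim is a shorter mean recovery time with the vitamin).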
Test statistic calculation:
Calculate t0 and degrees of freedom.
P-value Approach
Compare the p-value with the significance level (α = 0.025).
Decision rule: reject H0 if p-value ≤ α; otherwise do not reject H0.
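The p-value approach can be sketched in code for the three types of test. The example below uses only the Python standard library, so the T-distribution CDF is approximated by Simpson's rule; this numerical shortcut, the function names, and the example values of t0 and df are assumptions made for illustration, not part of the course material. In practice a t-table or statistical software would be used.

```python
import math

def t_pdf(x, df):
    """Density of the T-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|] plus symmetry (steps even)."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t > 0 else 0.5 - area

def p_value(t0, df, tail):
    """p-value for a 'left', 'right', or 'two' tailed test."""
    if tail == "left":
        return t_cdf(t0, df)
    if tail == "right":
        return 1 - t_cdf(t0, df)
    if tail == "two":
        return 2 * (1 - t_cdf(abs(t0), df))
    raise ValueError("tail must be 'left', 'right', or 'two'")

def reject_h0(p, alpha=0.025):
    """Decision rule: reject H0 when the p-value is at most alpha."""
    return p <= alpha

# Hypothetical test statistic and degrees of freedom for a left-tailed test.
p = p_value(-2.3, 68, "left")
print(p, reject_h0(p, alpha=0.025))
```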
Conclusion
At a 2.5% significance level:
The data supports the claim that the vitamin reduces recovery time from a common cold.
Implications for further studies.
Two-sample T-Confidence Intervals for μ1 − μ2
Construction of a 95% confidence interval for mean difference in recovery times.
Confidence interval format:
(X̄1 − X̄2) ± t(α/2) × √(s²1/n1 + s²2/n2).
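The interval formula translates directly into a few lines of code. In the sketch below the critical value t(α/2) is passed in as t_crit, as if read from a t-table; the function name, the sample summaries, and the choice to round df down to 20 before the table lookup are all assumptions for illustration.

```python
import math

def two_sample_t_ci(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Confidence interval for μ1 − μ2 given a critical value t_crit."""
    diff = xbar1 - xbar2
    margin = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin

# Hypothetical sample summaries; for these values df is about 20.87,
# and 2.086 is the table value of t(0.025) for df = 20.
lower, upper = two_sample_t_ci(10.0, 3.0, 9, 12.0, 4.0, 16, t_crit=2.086)
print(lower, upper)
```

If 0 lies inside the interval, the data do not show a difference between the means at the corresponding significance level.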
Confidence Interval Example Calculation
For vitamin vs placebo:
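Mean recovery times: 5.8 days (vitamin) vs 6.9 days (placebo).
The resulting 95% confidence interval for μ1 − μ2 is (−2.17, −0.03) days. Because the interval lies entirely below 0, it suggests that the vitamin shortens mean recovery time, by between 0.03 and 2.17 days.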
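The reported interval can be sanity-checked against the reported means: a two-sample T-interval is centered on X̄1 − X̄2, and half its width is the margin of error t(α/2) × √(s²1/n1 + s²2/n2). The sketch below uses only the numbers quoted in these notes; treating 5.8 as the vitamin group mean is an assumption based on the order "vitamin vs placebo".

```python
import math

# Values reported in the notes; assigning 5.8 to the vitamin group is an
# assumption based on the order "vitamin vs placebo".
xbar_vitamin, xbar_placebo = 5.8, 6.9
lower, upper = -2.17, -0.03

point_estimate = xbar_vitamin - xbar_placebo  # X̄1 − X̄2
center = (lower + upper) / 2                  # midpoint of the interval
margin = (upper - lower) / 2                  # implied margin of error

centered_on_estimate = math.isclose(center, point_estimate, abs_tol=1e-9)
excludes_zero = upper < 0 or lower > 0

print(round(point_estimate, 2), round(center, 2), round(margin, 2))
print(centered_on_estimate, excludes_zero)
```

The sample standard deviations are not given in the notes, so the margin of error can only be recovered from the interval itself, not recomputed from the formula.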
Paired Samples
Definition: Samples are paired when each observation in one sample is matched with exactly one observation in the other sample.
Example study on sleep times among children vs adults.
Analyzing Paired Samples
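The paired differences (one difference per pair) show whether a treatment has an effect, that is, whether the mean difference is significantly different from 0.
Apply T-distribution procedures to the paired differences:
Hypothesis tests and confidence intervals follow the same steps as one-sample procedures, applied to the sample of differences.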
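Because a paired analysis reduces to a one-sample procedure on the differences, it can be sketched as follows. The statistic used is the usual one-sample form t = (d̄ − μd) / (s_d/√n) with df = n − 1, where d̄ and s_d are the mean and standard deviation of the differences; the function name and the data are hypothetical.

```python
import math
import statistics

def paired_t(x1, x2, mu_d0=0.0):
    """Return (t, df) for paired samples using the differences x1 - x2."""
    if len(x1) != len(x2):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x1, x2)]  # one difference per pair
    n = len(d)
    d_bar = statistics.fmean(d)          # mean of the differences
    s_d = statistics.stdev(d)            # sample SD of the differences
    t0 = (d_bar - mu_d0) / (s_d / math.sqrt(n))
    return t0, n - 1

# Hypothetical paired measurements (same five subjects measured twice).
first = [7, 8, 6, 9, 10]
second = [6, 6, 5, 7, 9]
t0, df = paired_t(first, second)
print(t0, df)
```

Only the n differences enter the calculation, which is why the degrees of freedom are n − 1 rather than the two-sample formula.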
Decision Process in Hypothesis Testing
Assumptions check and hypothesis definition.
Statistical calculation to find test statistic and p-value.
Decision based on the significance level, with justification: reject H0 if p-value ≤ α, otherwise do not reject H0.
Conclusion stated in context: whether the data support the alternative hypothesis, with implications for future practice.