Lecture Note 09 Inference for Two Population Means Part I
Statistical Methods Overview
Course: STAT 3005
Instructor: Hamdy F. F. Mahmoud, PhD
Institution: Virginia Tech (VT)
Focus: Inference for two population means, 𝜇1 and 𝜇2.
Types of Samples
Simple Random Sample: Every member of the population has an equal chance of being selected.
Stratified Random Sample: Population is divided into subgroups (strata) and samples are taken from each.
Cluster Random Sample: Entire groups (clusters) are chosen randomly instead of individuals.
Multistage Random Sample: Combines several sampling methods, usually involves clusters followed by individual sampling.
Convenience Sample: Population members are selected based on accessibility.
Volunteer Sample: Members self-select to participate in a study.
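The first three sampling schemes above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names, the `seed` parameter, and the dictionary layout for strata and clusters are assumptions made for the example, not part of the course material.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; each member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, seed=None):
    """strata maps a stratum label to its members (hypothetical layout).
    A simple random sample is taken inside every stratum."""
    rng = random.Random(seed)
    return {label: rng.sample(members, n_per_stratum)
            for label, members in strata.items()}

def cluster_random_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster label to its members (hypothetical layout).
    Whole clusters are chosen at random and every member is kept."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {label: list(clusters[label]) for label in chosen}
```

The difference between the last two is the key point: stratified sampling takes some members from every group, while cluster sampling takes every member from some groups.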
Types of Studies
Observational Study: The researcher observes but does not interfere with the population.
Experimental Study: The researcher actively manipulates variables to observe the effects.
Steps for Research Questions
Understanding: Population proportion, P, and its inference methods.
Types of inference:
Parametric Inference: Assuming data follows certain distributions.
Nonparametric Inference: No assumption about the distribution of the data.
Inference for Two Population Means
Focus areas:
Parametric Inference
Confidence Interval
Hypothesis Testing
Nonparametric Inference
Bootstrap Confidence Interval
Permutation Test of Hypotheses
Confidence Interval
We begin with parametric inference.
Motivating Example
Two different exam versions were given to determine if there is a significant difference in difficulty.
Randomization was used to distribute the exam versions to students.
Goal: Evaluate if Version B is statistically harder than Version A based on student performance.
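As a rough sketch of how such a randomization could be carried out, the snippet below shuffles a class roster and splits it in half. The function name `randomize_versions` and the roster are hypothetical; the notes do not say how the actual assignment was done.

```python
import random

def randomize_versions(students, seed=None):
    """Randomly split a roster into two groups, one per exam version."""
    rng = random.Random(seed)
    shuffled = list(students)      # copy so the caller's roster is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}

# Hypothetical roster of student IDs
assignment = randomize_versions(range(1, 11), seed=2024)
```

Because chance alone decides who gets which version, systematic differences between the two groups of students are unlikely, so a difference in scores can reasonably be attributed to the exam version.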
Numerical Summary & Conclusions
Point Estimate of the difference:
𝜇1 − 𝜇2 = 75.1 − 72 = 3.1
It is acknowledged that the point estimate may not reflect the true mean difference.
Conditions for Parametric Inference
Independence: Samples must be randomly selected, and the two groups must be independent of each other.
Normality: Sample groups are assumed to come from normal distributions.
This assumption can be relaxed when both groups are reasonably large and show no extreme outliers.
If assumptions are met, use:
Point estimate of the difference: x̄1 − x̄2, which is approximately normally distributed.
Confidence Interval formula:
(x̄1 − x̄2) ± z* √(σ1²/n1 + σ2²/n2)
The population standard deviations (σ1, σ2) are usually unknown, so in practice they are replaced by the sample standard deviations (s1, s2).
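For the idealized case where both population standard deviations are known, the z-based interval could be computed as in the sketch below. It uses only the Python standard library; the function name `z_interval`, the standard deviations, and the sample sizes in the usage line are made-up values for illustration, since the notes report only the two sample means.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """(x̄1 − x̄2) ± z* · sqrt(σ1²/n1 + σ2²/n2), with σ1 and σ2 known."""
    # z* leaves (1 - confidence)/2 in each tail of the standard normal
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z_star * se, diff + z_star * se

# Means from the exam example; sigma and n are hypothetical
lower, upper = z_interval(75.1, 72, sigma1=10, sigma2=10, n1=50, n2=50)
```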
Confidence Interval Calculation
The exact degrees of freedom formula is complex and is usually computed by software; for manual calculations, use the smaller of n1 − 1 and n2 − 1.
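The two options for the degrees of freedom can be compared directly. The sketch below assumes that the "complex" formula is the Welch-Satterthwaite approximation, which is what most statistical software uses for this interval; the notes do not name the formula, so treat that as an assumption. The function names are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed formula)."""
    v1 = s1**2 / n1   # estimated variance of x̄1
    v2 = s2**2 / n2   # estimated variance of x̄2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """Hand-calculation shortcut: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)
```

The shortcut never exceeds the Welch value, so it yields a t* that is at least as large and an interval that is at least as wide. That is why it is considered the conservative choice.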
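Use the t-distribution for confidence intervals when the population standard deviations are unknown.
Formulae include:
t* value from the t-table
Margin of error: ME = t* × SE
Standard error of the point estimate: SE = √(s1²/n1 + s2²/n2)
Confidence interval: (x̄1 − x̄2) ± ME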
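Putting the pieces together, a minimal sketch of the t-based interval might look like the following. The critical value t* is passed in by hand, as it would be when read from a t-table. The function name `t_interval` is hypothetical, and the sample standard deviations and sample sizes in the usage lines are invented for illustration; only the two means come from the exam example.

```python
from math import sqrt

def t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """(x̄1 − x̄2) ± t* · SE, where SE = sqrt(s1²/n1 + s2²/n2)."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of x̄1 − x̄2
    me = t_star * se                     # margin of error
    diff = xbar1 - xbar2                 # point estimate
    return {"estimate": diff, "se": se, "me": me,
            "lower": diff - me, "upper": diff + me}

# Hypothetical s and n; t* = 2.756 is the 99% critical value for df = 29
result = t_interval(75.1, 13.0, 30, 72.0, 14.0, 30, t_star=2.756)
```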
Example Evaluation (Version B Exam)
Question: Can we apply parametric inference for comparing population means?
Actions:
Compute 99% confidence interval for the difference between means, 𝜇1 − 𝜇2.
Interpret this interval in practical context.
T-Distribution Critical Values Table
Provides critical t-values for different degrees of freedom (df) and confidence levels (e.g., 80%, 90%, 95%), which are essential for inference calculations.
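A table lookup can be mimicked with a small dictionary. The excerpt below holds a few rows of two-sided critical values from a standard t-table; it is not the full table from the lecture, and the helper name `t_star_from_table` is hypothetical. When the requested df falls between rows, the sketch uses the nearest smaller tabulated df, which gives a slightly larger t* and therefore a conservative interval.

```python
# Excerpt of a t-table: df -> {confidence level in % -> t*}
T_TABLE = {
    5:  {80: 1.476, 90: 2.015, 95: 2.571, 99: 4.032},
    10: {80: 1.372, 90: 1.812, 95: 2.228, 99: 3.169},
    20: {80: 1.325, 90: 1.725, 95: 2.086, 99: 2.845},
    30: {80: 1.310, 90: 1.697, 95: 2.042, 99: 2.750},
    60: {80: 1.296, 90: 1.671, 95: 2.000, 99: 2.660},
}

def t_star_from_table(df, confidence_pct):
    """Return t* from the largest tabulated df that does not exceed df."""
    usable = [row for row in T_TABLE if row <= df]
    if not usable:
        raise ValueError("df is below the smallest row in this excerpt")
    return T_TABLE[max(usable)][confidence_pct]
```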
Evaluating Results
Question: Does the calculated 99% confidence interval indicate that Version B was indeed harder?
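One common way to answer this kind of question is to check whether the interval for 𝜇1 − 𝜇2 contains zero. The helper below is a sketch of that rule; its name and the wording of its messages are assumptions for illustration. It makes no claim about the actual exam data, since the notes do not report the interval.

```python
def interpret_interval(lower, upper):
    """Read a confidence interval for mu1 - mu2 by checking where 0 falls."""
    if lower > 0:
        return "mu1 appears larger than mu2"
    if upper < 0:
        return "mu1 appears smaller than mu2"
    # 0 lies inside the interval, so 'no difference' remains plausible
    return "no clear difference between mu1 and mu2"
```

If group 1 is Version A and group 2 is Version B, an interval entirely above zero would support the claim that Version B was harder, while an interval that includes zero would not.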
Additional Example: Chicken Farming Study
An experiment compared the effects of different feed supplements on the growth rates of chickens.
Chickens were randomly allocated to groups, each receiving a different feed supplement.
Decision: Can parametric inference be used?
Compute and interpret the 95% confidence interval for the difference between the population means.
Closing
Encourage questions to clarify understanding.
Upcoming Topics
Next focus: Testing hypotheses using parametric inference.