Lecture Note 09: Inference for Two Population Means, Part I

Statistical Methods Overview

  • Course: STAT 3005

  • Instructor: Hamdy F. F. Mahmoud, PhD

  • Institution: Virginia Tech (VT)

  • Focus: Inference for two population means, 𝜇1 and 𝜇2.

Types of Samples

  • Simple Random Sample: Every member of the population has an equal chance of being selected.

  • Stratified Random Sample: Population is divided into subgroups (strata) and samples are taken from each.

  • Cluster Random Sample: Entire groups (clusters) are chosen randomly instead of individuals.

  • Multistage Random Sample: Combines several sampling methods, usually involves clusters followed by individual sampling.

  • Convenience Sample: Population members are selected because they are easy to reach.

  • Volunteer Sample: Members self-select to participate in a study.
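
The first two sampling schemes above can be sketched in a few lines. This is only an illustration: the population of ID numbers, the two strata, and the function names are hypothetical and do not come from the lecture.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n members without replacement; every member is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, seed=0):
    """Draw a simple random sample of the same size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: ID numbers 1..100, split into two made-up strata.
population = list(range(1, 101))
strata = {"first_half": population[:50], "second_half": population[50:]}

srs = simple_random_sample(population, 10)
stratified = stratified_random_sample(strata, 5)
print(srs)
print(stratified)
```

A fixed seed is used only so the sketch is reproducible; in practice the draw would not be seeded by hand.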

Types of Studies

  • Observational Study: The researcher observes but does not interfere with the population.

  • Experimental Study: The researcher actively manipulates variables to observe the effects.

Steps for Research Questions

  • Understanding: Population proportion, P, and its inference methods.

  • Types of inference:

    • Parametric Inference: Assumes the data follow a specified distribution (for example, the normal distribution).

    • Nonparametric Inference: Makes no assumption about the distribution of the data.

Inference for Two Population Means

  • Focus areas:

    • Parametric Inference

      • Confidence Interval

      • Hypothesis Testing

    • Nonparametric Inference

      • Bootstrap Confidence Interval

      • Permutation Test of Hypotheses

Confidence Interval

  • Parametric inference begins with the confidence interval for 𝜇1 − 𝜇2.

Motivating Example

  • Two different exam versions were given to determine if there is a significant difference in difficulty.

  • The exam versions were distributed to students at random.

  • Goal: Evaluate if Version B is statistically harder than Version A based on student performance.

Numerical Summary & Conclusions

  • Point Estimate of the difference:

    • x̄1 − x̄2 = 75.1 − 72 = 3.1

  • The point estimate varies from sample to sample, so it may not equal the true mean difference 𝜇1 − 𝜇2.

Conditions for Parametric Inference

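  1. Independence: Observations are independent within and between groups, for example because the samples were randomly selected or the subjects randomly assigned.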

  2. Normality: Sample groups are assumed to come from normal distributions.

    • This assumption can be relaxed if both groups are large enough and have no extreme outliers.

  • If assumptions are met, use:

    • Point estimate of the difference: x̄1 − x̄2, which follows a normal distribution.

    • Confidence Interval formula:

      • (x̄1 − x̄2) ± z* √(σ1²/n1 + σ2²/n2)

  • The population standard deviations (σ1, σ2) are usually unknown, so they are estimated by the sample standard deviations (s1, s2).
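
As a minimal sketch of the z-based formula above, the following computes the interval when σ1 and σ2 are treated as known. The means 75.1 and 72 are the ones from the motivating example; the standard deviations (10 and 10) and sample sizes (50 and 50) are hypothetical values chosen only for illustration, and `z_interval` is a made-up helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, conf=0.95):
    """Confidence interval for mu1 - mu2 with known population SDs."""
    # z* leaves (1 - conf)/2 probability in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z_star * se, diff + z_star * se

# Means from the lecture; sigma and n are hypothetical.
lower, upper = z_interval(75.1, 72, sigma1=10, sigma2=10, n1=50, n2=50)
print(round(lower, 2), round(upper, 2))
```

In practice σ1 and σ2 are rarely known, which is why the t-based interval is normally used instead.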

Confidence Interval Calculation

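  • The degrees of freedom come from a complex formula (the Welch-Satterthwaite approximation), so they are usually computed with software; for manual calculations, use the smaller of n1 − 1 and n2 − 1, which gives a conservative interval.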

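  • Use the t-distribution when the sample standard deviations s1 and s2 replace σ1 and σ2:

    • Components of the interval:

      • Critical value: t* from the t-table, at the chosen confidence level and df

      • Standard error of the point estimate: SE = √(s1²/n1 + s2²/n2)

      • Margin of error: ME = t* × SE

      • Confidence interval: (x̄1 − x̄2) ± ME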
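
The pieces listed above can be put together as in the sketch below. It takes t* as an input, as if read from a t-table, and shows both the conservative and the software-style (Welch-Satterthwaite) degrees of freedom. The function names are illustrative, and the sample standard deviations and sample sizes in the example are hypothetical; only the means 75.1 and 72 come from the lecture.

```python
from math import sqrt

def standard_error(s1, s2, n1, n2):
    """Standard error of xbar1 - xbar2 using sample standard deviations."""
    return sqrt(s1**2 / n1 + s2**2 / n2)

def conservative_df(n1, n2):
    """Manual rule: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, s2, n1, n2):
    """Welch-Satterthwaite approximation, the value software typically reports."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def t_interval(xbar1, xbar2, s1, s2, n1, n2, t_star):
    """(xbar1 - xbar2) +/- t* x SE, with t* supplied by the caller."""
    margin = t_star * standard_error(s1, s2, n1, n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# Hypothetical s and n; t* = 2.750 is the table value for 99% confidence, df = 30.
print(conservative_df(31, 31))
print(welch_df(10, 10, 31, 31))
print(t_interval(75.1, 72, 10, 10, 31, 31, t_star=2.750))
```

The conservative df is never larger than the Welch df, so the manual interval is at least as wide as the one software would report.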

Example Evaluation (Version B Exam)

  • Question: Can we apply parametric inference for comparing population means?

  • Actions:

    • Compute 99% confidence interval for the difference between means, 𝜇1 − 𝜇2.

    • Interpret this interval in practical context.

T-Distribution Critical Values Table

  • Provides the critical values t* needed for inference calculations, listed by degrees of freedom (df) and confidence level (e.g., 80%, 90%, 95%).
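
A table lookup of this kind can be sketched as follows. Only a few rows of a standard t-table are included, and the rule of rounding df down to the nearest tabulated row is a common conservative convention assumed here, not something stated in the lecture. The names `T_TABLE` and `t_star` are illustrative.

```python
# A few rows of a standard two-sided t-table: df -> {confidence % -> t*}.
T_TABLE = {
    5:  {80: 1.476, 90: 2.015, 95: 2.571, 99: 4.032},
    10: {80: 1.372, 90: 1.812, 95: 2.228, 99: 3.169},
    20: {80: 1.325, 90: 1.725, 95: 2.086, 99: 2.845},
    30: {80: 1.310, 90: 1.697, 95: 2.042, 99: 2.750},
    60: {80: 1.296, 90: 1.671, 95: 2.000, 99: 2.660},
}

def t_star(df, conf_percent):
    """Look up t*, rounding df down to the nearest tabulated row."""
    rows = [d for d in T_TABLE if d <= df]
    if not rows:
        raise ValueError("df is smaller than the smallest row in this table")
    return T_TABLE[max(rows)][conf_percent]

print(t_star(30, 99))  # exact row
print(t_star(25, 95))  # falls back to the df = 20 row
```

Rounding df down gives a slightly larger t*, so the resulting interval errs on the wide side.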

Evaluating Results

  • Question: Does the calculated 99% confidence interval indicate that Version B was indeed harder?
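
One way to read such an interval is to check where it sits relative to zero. The sketch below assumes group 1 is Version A and group 2 is Version B (suggested by the point estimate 75.1 − 72 but not stated explicitly), and the endpoints passed in are hypothetical, not the lecture's answer.

```python
def interpret_interval(lower, upper):
    """Describe what an interval for mu1 - mu2 says about the two means."""
    if lower > 0:
        return "mu1 > mu2: group 1 scores higher on average"
    if upper < 0:
        return "mu1 < mu2: group 2 scores higher on average"
    return "interval contains 0: no clear difference between the means"

# Hypothetical endpoints for illustration only.
print(interpret_interval(-3.9, 10.1))
print(interpret_interval(0.4, 5.8))
```

Under the assumption above, only an interval lying entirely above zero would support the claim that Version B is harder at the stated confidence level.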

Additional Example: Chicken Farming Study

  • An experiment compares the effect of feed supplements on the growth rates of chickens.

  • Chickens are randomly allocated to groups, each receiving a different feed supplement.

  • Decision: Can parametric inference be used?

  • Compute and interpret the 95% confidence interval for the difference between the population means.

Closing

  • Encourage questions to clarify understanding.

Upcoming Topics

  • Next focus: Testing hypotheses using parametric inference.