Study Notes on Chapter 5: Inferences on Population Means
Chapter 5: Inferences On One Or Two Population Means
5.1 Confidence Intervals for Means
- Central Limit Theorem: For a large sample size (n ≥ 25-30), the sample mean (X̄) is approximately normally distributed with mean μ and standard deviation σ/√n.
- Confidence Intervals:
- About 95.45% of the time, the sample mean will lie within the interval: μ ± 2σ/√n
- About 99.7% of the time, it will lie within μ ± 3σ/√n
- Confidence Level:
- Denoted by (1 - α), where α is the significance level (e.g., 0.05 for a 95% confidence level).
- As (1 - α) approaches 1, confidence in the interval containing the mean increases.
- The critical value zα/2 corresponds to the (1 - α/2) × 100th percentile of the standard normal distribution.
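The coverage figures and the critical value above can be checked numerically. What follows is a minimal Python sketch that assumes only the standard library's `statistics.NormalDist`; the helper names `coverage_within` and `critical_value` are illustrative and do not come from the notes.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def coverage_within(k: float) -> float:
    """P(-k < Z < k): chance the sample mean falls within k standard errors of mu."""
    return STD_NORMAL.cdf(k) - STD_NORMAL.cdf(-k)


def critical_value(alpha: float) -> float:
    """z_{alpha/2}: the (1 - alpha/2) * 100th percentile of the standard normal."""
    return STD_NORMAL.inv_cdf(1 - alpha / 2)


print(round(coverage_within(2), 4))    # 0.9545
print(round(coverage_within(3), 4))    # 0.9973
print(round(critical_value(0.05), 5))  # 1.95996
```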
Example 53: Simulation of Human Gestation Periods
- Population mean: μ = 266 days, standard deviation: σ = 16 days.
- Simulation conducted using R for 10,000 confidence intervals (n = 40) at α = 0.05.
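The notes say this simulation was run in R, and the original code is not reproduced here. A rough Python equivalent might look like the sketch below. It assumes that individual gestation periods are normally distributed and that a fixed random seed is used; both are assumptions of this sketch, and the function name `simulate_coverage` is hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean


def simulate_coverage(mu=266.0, sigma=16.0, n=40, alpha=0.05,
                      reps=10_000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Assumption: each gestation period is drawn from N(mu, sigma).
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


print(simulate_coverage())  # should land close to 1 - alpha = 0.95
```

The printed proportion estimates how often an interval built this way captures μ; it will vary slightly with the seed.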
Example 54: Help Doc Brown
- Sample mean (X̄): 0.716; standard deviation: 0.07944; sample size (n): 142.
- 95% CI Calculation:
- 95% critical value: Zα/2 = 1.95996, resulting CI: (0.7029, 0.7291)
- 98% CI Calculation:
- Zα/2 = 2.3263, margin of error ≈ 0.0155 (recomputed from the sample values above), resulting CI: 0.716 ± 0.0155 = (0.7005, 0.7315)
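The interval arithmetic in this example can be reproduced with a few lines of Python. This is a sketch under the assumption that the interval has the form X̄ ± z·s/√n, which is the form the 95% figures above are consistent with; `z_interval` is an illustrative name, not something from the notes.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sd, n, conf=0.95):
    """Return (lower, upper) for a z-based confidence interval for the mean."""
    alpha = 1 - conf
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    margin = z * sd / math.sqrt(n)           # margin of error
    return xbar - margin, xbar + margin


low, high = z_interval(0.716, 0.07944, 142, conf=0.95)
print(round(low, 4), round(high, 4))  # 0.7029 0.7291
```

Passing `conf=0.98` instead gives the 98% interval from the same sample values.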
### Margin of Error (E)
\[ E = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
- Minimum required sample size for given margin of error E:
\[ n = \left( \frac{z_{\alpha/2} \, \sigma}{E} \right)^2 \]
Example 55: Required Sample Size
- Margin of error E = 9 days, standard deviation σ = 16 days, for a 95% CI.
- Required sample size: n = (1.96 × 16 / 9)² ≈ 12.14, which rounds up to n = 13. A smaller margin of error at the same confidence level requires a larger sample.
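As a check on Example 55, the sample-size formula can be evaluated directly. The sketch below assumes the usual convention of rounding up to the next whole observation; the function name `required_sample_size` is illustrative.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, sigma, conf=0.95):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / margin) ** 2)  # always round up


print(required_sample_size(margin=9, sigma=16))  # 13
```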
Relationships
- As the margin of error increases, the width of the confidence interval increases.
- As the confidence level decreases, the width of the interval decreases.
- As sample size increases, the width of the interval decreases.
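These relationships can be seen by tabulating the interval width for a few settings. The sketch below reuses σ = 16 from the gestation example purely for illustration; the confidence levels and sample sizes are arbitrary choices, and `interval_width` is a hypothetical helper.

```python
import math
from statistics import NormalDist


def interval_width(sigma, n, conf):
    """Width of a z-interval: twice the margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return 2 * z * sigma / math.sqrt(n)


sigma = 16  # standard deviation borrowed from the gestation example
for conf in (0.90, 0.95, 0.99):
    for n in (40, 160):
        width = interval_width(sigma, n, conf)
        print(f"conf={conf:.2f}  n={n:3d}  width={width:.2f}")
```

Because the width is proportional to 1/√n, quadrupling the sample size (40 to 160) halves the width at any fixed confidence level.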
5.2 Intro to Hypothesis Testing
- Two hypotheses are stated:
- Null Hypothesis (H₀): Status quo.
- Alternative Hypothesis (H₁): New claim to be tested.
- Steps for Hypothesis Testing:
- Assume H₀ is true, take a sample.
- Compute a test statistic.
- If the test statistic falls in the rejection region (RR), reject H₀; otherwise, fail to reject H₀.
- Example 56: Your professor claims an IQ of at least 140. Sample statistics are used to show which observed IQs lead to rejecting H₀ and which do not.
5.3 Hypothesis Tests for Population Means (σ Known)
- When population standard deviation (σ) is known, conduct right-tailed, left-tailed, or two-tailed tests based on sample mean (X̄).
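- Test Statistic Calculation (the same statistic is used for all three tests):
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]
- Right-tailed test:
- H₀: μ ≤ μ₀, H₁: μ > μ₀
- Left-tailed test:
- H₀: μ ≥ μ₀, H₁: μ < μ₀
- Two-tailed test:
- H₀: μ = μ₀, H₁: μ ≠ μ₀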
- Compute the p-value: the probability, assuming H₀ is true, of obtaining a test statistic at least as extreme as the one observed. Reject H₀ when the p-value is less than α.
- Examples 57 & 58 test claims about soybean yields and cholesterol levels in immigrants, illustrating how the test statistic and p-value are computed and how they lead to a conclusion statement.
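The σ-known test can be sketched in Python as follows. It assumes the standard statistic Z = (X̄ − μ₀)/(σ/√n) and computes the p-value for each tail direction. The input numbers are hypothetical and are not the data from Examples 57 or 58; `z_test` and its `tail` argument are illustrative names.

```python
import math
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, tail="right"):
    """Return (z, p_value) for a one-sample z-test with known sigma."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if tail == "right":      # H1: mu > mu0
        p = 1 - std_normal.cdf(z)
    elif tail == "left":     # H1: mu < mu0
        p = std_normal.cdf(z)
    elif tail == "two":      # H1: mu != mu0
        p = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z, p


# Hypothetical numbers, not the data from Examples 57 or 58.
z, p = z_test(xbar=271.0, mu0=266.0, sigma=16.0, n=40, tail="right")
alpha = 0.05
print(f"z = {z:.3f}, p-value = {p:.4f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

Comparing the p-value with α gives the same decision as checking whether Z falls in the rejection region.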
5.4 Choosing Sample Size
- The goal is to keep both α (the probability of a Type I error) and β (the probability of a Type II error) small. For a fixed α, using the minimum sample size that achieves the desired β at a specific alternative mean μ₁ saves time and resources.
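- The formula depends on α and on the desired β (equivalently, the power 1 − β). For a one-tailed test (use z_{α/2} in place of z_α for a two-tailed test):
\[ n = \left( \frac{(z_{\alpha} + z_{\beta}) \, \sigma}{\mu_1 - \mu_0} \right)^2 \]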
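A short numerical illustration may help here. The sketch below assumes the standard one-tailed form n = ((z_α + z_β)·σ/(μ₁ − μ₀))², rounded up to a whole number. The values of μ₀, μ₁, σ, α, and β are hypothetical, and `sample_size_for_power` is an illustrative name.

```python
import math
from statistics import NormalDist


def sample_size_for_power(mu0, mu1, sigma, alpha=0.05, beta=0.10):
    """Minimum n for a one-tailed z-test with Type I rate alpha
    and Type II rate beta when the true mean is mu1."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(1 - beta)
    n = ((z_alpha + z_beta) * sigma / (mu1 - mu0)) ** 2
    return math.ceil(n)  # round up to a whole observation


# Hypothetical values chosen only for illustration.
print(sample_size_for_power(mu0=266, mu1=271, sigma=16))  # 88
```

Lowering β (asking for more power) or shrinking the gap between μ₀ and μ₁ both push the required n upward.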
5.5 Inferences About μ When σ is Unknown
- When σ is unknown, use the t-distribution for hypothesis testing and confidence intervals:
- The behavior of the sample mean approaches normality as n increases thanks to the Central Limit Theorem.
- Key Formula:
\[ T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \]
where S is the sample standard deviation and T has n − 1 degrees of freedom.
Example 63: Salamander Lengths
- Illustrates the use of the t-distribution and computes a 95% confidence interval for the true mean salamander length from sample data.
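The T statistic and the matching interval can be sketched with the standard library alone. Python's `statistics` module has no t-distribution, so this sketch assumes the critical value t_{α/2, n−1} is looked up in a t-table and passed in. The sample values are hypothetical and are not the salamander data from Example 63; `t_statistic` and `t_interval` are illustrative names.

```python
import math
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """T = (xbar - mu0) / (S / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / math.sqrt(n))


def t_interval(sample, t_crit):
    """CI for the mean; t_crit is t_{alpha/2, n-1} taken from a t-table."""
    n = len(sample)
    margin = t_crit * stdev(sample) / math.sqrt(n)
    return mean(sample) - margin, mean(sample) + margin


# Hypothetical lengths, not the data from Example 63.
lengths = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7, 4.0, 4.1]
print(t_statistic(lengths, mu0=4.0))
# 2.262 is the tabulated t_{0.025, 9} for a 95% interval with n = 10.
print(t_interval(lengths, t_crit=2.262))
```

Note that `stdev` divides by n − 1, which is the sample standard deviation S that the formula calls for.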
Important Notes
- The t-distribution is used when σ is unknown: for small samples drawn from a normal population, and for large samples regardless of the population's shape.
- As degrees of freedom increase, the t-distribution approaches a normal distribution.
- Conclusions drawn from hypothesis testing depend heavily on p-values, which signify the strength of the evidence against H₀.