STAT1170 4: Sample Means + Confidence Intervals

Sampling Distributions for Proportions

Proportions are averages of dichotomous data (0 or 1).
In repeated sampling, the distribution of sample proportions is approximately Normal if the sample size n is sufficiently large.
Population proportion: p; Sample proportion: \hat{p}.

Conditions for Normal Distribution

The Normal approximation given by the Central Limit Theorem (CLT) is considered adequate when:
- n p \geq 5
- n (1 - p) \geq 5
If both conditions are met, the sampling distribution of \hat{p} is approximately Normal, centered at p.
Standard error of sample proportions: \sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}.
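The two conditions and the standard error formula above can be checked with a few lines of Python. This is a minimal sketch: the function names `normal_approx_ok` and `standard_error` are hypothetical helpers, and the values p = 0.4, n = 25 are illustrative inputs only.

```python
import math

def normal_approx_ok(p, n, threshold=5):
    """Rule-of-thumb check: both n*p and n*(1-p) must be at least `threshold`."""
    return n * p >= threshold and n * (1 - p) >= threshold

def standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

# Illustrative values (assumed for this sketch)
p, n = 0.4, 25
print(normal_approx_ok(p, n))          # n*p = 10 and n*(1-p) = 15, both at least 5
print(round(standard_error(p, n), 4))  # standard error of p-hat for these inputs
```

A small p with the same n (for example p = 0.1, n = 25, so n p = 2.5) fails the check, which signals that the Normal approximation should not be relied on.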

Example Calculations

Population proportion of white cars: p = 0.4
In a sample of n = 25 cars, 12 are white, so the sample proportion is:
- \hat{p} = \frac{12}{25} = 0.48
To find probabilities:
- Calculate the z-score for \hat{p}:
  z = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}} = \frac{0.48 - 0.4}{\sqrt{\frac{0.4 \times 0.6}{25}}} \approx 0.8165.
- Probability of a sample proportion of at least 0.48: P(\hat{p} \geq 0.48) = P(Z \geq 0.8165) = 0.2071.
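The white-car calculation can be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library for the standard Normal CDF; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

p, n = 0.4, 25       # population proportion and sample size from the example
p_hat = 12 / 25      # observed sample proportion

se = math.sqrt(p * (1 - p) / n)   # standard error of p-hat
z = (p_hat - p) / se              # z-score of the observed proportion
tail = 1 - NormalDist().cdf(z)    # upper-tail probability P(Z >= z)

print(round(z, 4), round(tail, 4))
```

With these inputs the tail probability rounds to 0.2071, matching the value quoted in the example.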

Confidence Intervals for Population Proportions

95% Confidence Interval (CI):
- When the CLT applies, the CI for the population proportion p is:
  \hat{p} \pm 1.96 \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.
Example using Sydney teenagers:
- \hat{p} = \frac{216}{995} = 0.2171,
- CI: (0.191, 0.243).
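The Sydney teenagers interval can be recomputed as a check. This is a sketch under the assumption that the Normal approximation applies; `proportion_ci` is a hypothetical helper name, and the critical value is obtained from `statistics.NormalDist.inv_cdf` rather than hard-coded as 1.96.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation CI for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

low, high = proportion_ci(216, 995)
print(round(low, 3), round(high, 3))
```

For 216 out of 995 the limits round to (0.191, 0.243), in agreement with the interval stated above.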

Confidence Intervals for Population Mean

When \sigma is known:
- \bar{y} \pm 1.96 \times \frac{\sigma}{\sqrt{n}}.
When \sigma is unknown:
- Use the sample standard deviation s and the t-distribution with n - 1 degrees of freedom:
  \bar{y} \pm t_{\alpha/2, n-1} \times \frac{s}{\sqrt{n}}.
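Both intervals for the mean can be sketched in Python. The data below are hypothetical and not from the notes, as are the function names and the assumed known \sigma = 0.2. Because the standard library has no t-distribution, the t critical value is passed in by the caller; 2.262 is the tabulated value for 95% confidence with 9 degrees of freedom.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci_known_sigma(data, sigma, confidence=0.95):
    """CI for the mean when the population sigma is known (z-based)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    centre = mean(data)
    margin = z * sigma / math.sqrt(len(data))
    return centre - margin, centre + margin

def mean_ci_unknown_sigma(data, t_crit):
    """CI for the mean when sigma is estimated by s (t-based).

    t_crit must be looked up for len(data) - 1 degrees of freedom.
    """
    centre = mean(data)
    s = stdev(data)  # sample standard deviation, n - 1 denominator
    margin = t_crit * s / math.sqrt(len(data))
    return centre - margin, centre + margin

# Hypothetical sample of n = 10 measurements (assumed for illustration)
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.1, 4.9]

print(mean_ci_known_sigma(sample, sigma=0.2))
print(mean_ci_unknown_sigma(sample, t_crit=2.262))  # t for 95%, df = 9
```

Passing the critical value explicitly keeps the sketch dependency-free; a statistics library with a t quantile function could supply it instead.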

Notable Points

Student’s t-distribution is used when \sigma is estimated by s.
It has heavier tails than the Normal distribution, most noticeably for small samples, and approaches the Normal as the degrees of freedom (n - 1) increase.
Ensure independence of observations for valid CI estimates.
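The effect of the heavier tails can be seen in a small simulation. The sketch below assumes independent Normal observations with n = 5, builds intervals of the form \bar{y} \pm c \times s / \sqrt{n}, and counts how often they contain the true mean. The function name `coverage`, the seed, and the number of trials are arbitrary choices for illustration; 2.776 is the tabulated t critical value for 95% confidence with 4 degrees of freedom.

```python
import math
import random

def coverage(critical_value, n=5, trials=20000, mu=0.0, sigma=1.0, seed=1):
    """Fraction of simulated intervals ybar +/- c * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        ybar = sum(sample) / n
        # Sample standard deviation with n - 1 in the denominator
        s = math.sqrt(sum((y - ybar) ** 2 for y in sample) / (n - 1))
        margin = critical_value * s / math.sqrt(n)
        if abs(ybar - mu) <= margin:
            hits += 1
    return hits / trials

print(coverage(1.96))    # Normal critical value used with estimated s
print(coverage(2.776))   # t critical value for df = 4
```

Under these assumptions the interval using 1.96 should cover the true mean noticeably less often than 95% of the time, while the interval using the t critical value should come out close to 95%.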