Theory-Based Inference for a Population Proportion (p)
Introduction & Course Logistics
- Welcome to PSTAT 5LS – Slide Set 5: Theory-Based Inference for p
- Transitioning from simulation-based inference to theory-based (normal-model) inference.
- Administrative reminders (Summer session examples*)
- HW 2 due Tue July 8 • 11:59 PM
- HW 3 due Mon July 14 • 11:59 PM
- Office hours (instructor example): T & R 2–3 PM, via Zoom
From Simulation to Theory
- Earlier slide set: simulated many samples under H_0 and plotted the resulting sample proportions.
- Dolphin–communication example and community-recycling example both showed bell-shaped histograms.
- Observation → Insight
- Repeated-sample behavior of \hat p resembles a normal curve → motivates normal approximation.
- A theoretical model lets us replace thousands of shuffles with a single formula-based calculation.
Sampling Distribution Fundamentals
- Sampling distribution = distribution of a statistic across all possible samples of fixed size n.
- Describes shape, center, spread attributable to random sampling ("chance alone").
- Key takeaway: If we know this distribution, we can quantify how unusual a single sample’s statistic is when H_0 is true.
Distribution of the Sample Proportion $\hat p$
- Center (mean): E[\hat p]=p (the true population proportion).
- Spread (standard error SE):
SE_{\hat p}=\sqrt{\dfrac{p(1-p)}{n}}
- Think of SE as a new ruler for measuring how far an observed \hat p sits from the hypothesised p.
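The SE formula above can be evaluated directly; a minimal sketch in R, using the hypothetical values p = 0.5 and n = 16 for illustration:

```r
# Standard error of the sample proportion: SE = sqrt(p * (1 - p) / n)
p <- 0.5   # assumed population proportion (illustrative)
n <- 16    # sample size (illustrative)
se <- sqrt(p * (1 - p) / n)
se  # 0.125
```

With p = 0.5 and n = 16 the "ruler" is 0.125, so a sample proportion of 0.625 sits exactly one SE above p.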
Central Limit Theorem (CLT) for Proportions
- CLT assures that, under certain conditions, the sampling distribution of \hat p is approximately normal:
  \hat p \sim N\Big(p,\; \dfrac{p(1-p)}{n}\Big)
- Implication: We can use z-scores & normal probabilities for inference instead of re-simulating.
Conditions for Normal Approximation
- Independence
- Individual observations must not influence one another.
- Usually guaranteed by simple random sampling (SRS) or a well-designed experiment.
- Success–Failure Condition
- Under the model being checked (often H0), require
n p_0 \ge 10 \quad\text{and}\quad n(1-p_0)\ge 10
- Ensures tails of the binomial are well-captured by the normal curve.
- "10" is empirical but widely accepted; larger thresholds tighten approximation.
Using the Normal Model in Hypothesis Testing
- In practice we don’t know p, so for the SE we plug in the null value p_0:
  SE_{H_0}=\sqrt{\dfrac{p_0(1-p_0)}{n}}
- Interpretation: We build the sampling distribution assuming H_0 is true; deviations are judged relative to that benchmark.
Independence: Deeper Considerations
- Random sampling → independence usually reasonable.
- Non-random scenarios (e.g. cluster sampling, time series, social networks) → must justify or adjust (e.g. finite-population correction, bootstrapping, or other models).
Success–Failure Revisited for Tests vs CIs
- Hypothesis test: check n p_0,\; n(1-p_0) (expected counts) ≥ 10.
- Confidence interval: replace p_0 with observed \hat p.
Six Classic Steps in a One-Proportion z-Test
- State hypotheses H_0: p = p_0 and choose H_A (left-, right-, or two-tailed).
- Check conditions (independence, success-failure).
- Compute test statistic
  z=\dfrac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}
- Find the p-value from the standard normal distribution.
- Decision: compare p-value to \alpha.
- Contextual conclusion in plain language.
- p_0 comes from the research question, not the data (e.g. historical rate, theoretical expectation, regulatory threshold).
- Possible alternatives:
- Left-tailed: H_A: p < p_0 (looking for a decrease).
- Right-tailed: H_A: p > p_0 (looking for an increase).
- Two-tailed: H_A: p \neq p_0 (any change).
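The six steps can be collected into a small R helper. A minimal sketch: `prop_z_test` is a hypothetical name used here for illustration (base R ships `prop.test()`, which instead runs a chi-squared test with a continuity correction):

```r
# One-proportion z-test: x successes out of n trials, null value p0
# (prop_z_test is a hypothetical helper name, not a built-in function)
prop_z_test <- function(x, n, p0, alternative = "two.sided") {
  phat <- x / n
  se <- sqrt(p0 * (1 - p0) / n)  # SE computed under H0
  z <- (phat - p0) / se
  p_value <- switch(alternative,
    less      = pnorm(z),                            # left-tailed
    greater   = pnorm(z, lower.tail = FALSE),        # right-tailed
    two.sided = 2 * pnorm(abs(z), lower.tail = FALSE))  # two-tailed
  list(z = z, p_value = p_value)
}

# Example: Buzz the dolphin, 15/16 correct against H0: p = 0.50
prop_z_test(15, 16, 0.50, alternative = "greater")
```

The conditions check (step 2) and the contextual conclusion (step 6) still have to be done by hand; the function only automates the arithmetic.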
Standardization & the 68-95-99.7 Rule Refresher
- Converting to a z-score (test statistic) expresses deviation in SE units.
- |z|=1 \Rightarrow about 68\% of null sample proportions lie closer to p_0 than yours.
- |z|=2 \Rightarrow about 95\% lie closer.
- |z|=3 \Rightarrow about 99.7\% lie closer.
- For non-integer |z| (e.g. 1.75), software is needed for exact areas.
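The 68-95-99.7 figures can be recovered (and non-integer cases handled) with normal-curve areas in R:

```r
# Empirical rule as normal-curve areas
pnorm(1) - pnorm(-1)   # ~0.683: within 1 SE
pnorm(2) - pnorm(-2)   # ~0.954: within 2 SE
pnorm(3) - pnorm(-3)   # ~0.997: within 3 SE

# Non-integer z needs software, e.g. the upper-tail area beyond z = 1.75
pnorm(1.75, lower.tail = FALSE)
```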
Computing p-Values with R’s pnorm()
- Generic syntax:
  pnorm(q, mean = 0, sd = 1, lower.tail = TRUE)
- q = z-score (quantile).
- lower.tail=TRUE → P(Z \le q); FALSE → P(Z \ge q).
- Tail selection depends on H_A:
- Left-tailed (<): p-value = pnorm(z).
- Right-tailed (>): p-value = pnorm(z, lower.tail=FALSE).
- Two-tailed (≠): p-value = 2*pnorm(abs(z), lower.tail=FALSE).
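All three tail choices side by side, using a hypothetical test statistic z = -1.8 for illustration:

```r
z <- -1.8  # hypothetical test statistic

pnorm(z)                               # left-tailed p-value
pnorm(z, lower.tail = FALSE)           # right-tailed p-value
2 * pnorm(abs(z), lower.tail = FALSE)  # two-tailed p-value
```

Note that the left- and right-tailed areas always sum to 1, and the two-tailed p-value is twice the smaller of the two.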
Worked Example 1 – Dolphin Communication
- Data: Buzz guessed 15/16 correct → \hat p=0.9375.
- Hypotheses: H_0:p=0.50 \quad vs \quad H_A:p>0.50 (right-tailed).
- Test statistic:
  z=\dfrac{0.9375-0.50}{\sqrt{0.50(1-0.50)/16}}=3.50
- p-value (R):
  pnorm(3.50, lower.tail = FALSE) # 0.0002326
- Conclusion (e.g. with \alpha=0.05): Reject H_0 → convincing evidence dolphins communicate.
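The full dolphin calculation, reproduced end to end in R:

```r
# Dolphin study: 15 of 16 correct, H0: p = 0.50 vs HA: p > 0.50
phat <- 15 / 16                              # 0.9375
se_h0 <- sqrt(0.50 * (1 - 0.50) / 16)        # SE under H0: 0.125
z <- (phat - 0.50) / se_h0                   # 3.5
pnorm(z, lower.tail = FALSE)                 # right-tailed p-value ~0.000233
```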
Worked Example 2 – Community Recycling
- Data: 530/800 households recycle → \hat p=0.6625.
- Hypotheses: H_0:p=0.70 \quad vs \quad H_A:p\neq0.70 (two-tailed).
- Test statistic:
  z=\dfrac{0.6625-0.70}{\sqrt{0.70(1-0.70)/800}}=-2.315
- p-value (R):
  2*pnorm(-2.315) # 0.0206
- With \alpha=0.05: Reject H_0 → convincing evidence the recycling rate differs from 70%.
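The recycling calculation in R, this time with the two-tailed doubling:

```r
# Recycling study: 530 of 800 households, H0: p = 0.70 vs HA: p != 0.70
phat <- 530 / 800                            # 0.6625
se_h0 <- sqrt(0.70 * (1 - 0.70) / 800)       # SE under H0
z <- (phat - 0.70) / se_h0                   # ~ -2.315
2 * pnorm(z)                                 # two-tailed p-value ~0.0206
```

Because z is negative here, `2 * pnorm(z)` and `2 * pnorm(abs(z), lower.tail = FALSE)` give the same answer.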
Tail Direction & p-Value Recap
- p-value = probability of results as or more extreme in the direction of H_A.
- Make sure to multiply by 2 only for two-tailed tests.
Decision Rules & Statistical Significance
- If p\le\alpha → Reject H_0 (statistically significant).
- If p>\alpha → Fail to reject H_0 (not enough evidence).
- Common \alpha levels: 0.10, 0.05, 0.01 (context-driven; e.g. medical trials often demand 0.01).
Practical & Ethical Considerations
- Adequate sample size → meets success-failure & yields useful power.
- Independence violations (e.g. sampling classmates, social media followers) bias the SE and inflate the Type-I error rate.
- Overreliance on p-values: always accompany them with an effect size and a confidence interval.
- Ethical reporting → state assumptions, limitations, and potential multiple-testing issues.
Connections & Next Steps
- Link to simulation: Theory approximates what we previously estimated via randomization.
- Upcoming lectures: theory-based confidence intervals for p, two-sample proportion tests, and extensions when conditions fail (e.g. exact binomial tests, bootstrap CIs).