Theory-Based Inference for a Population Proportion (p)
Introduction & Course Logistics
- Welcome to PSTAT 5LS – Slide Set 5: Theory-Based Inference for p
- Transitioning from simulation-based inference to theory-based (normal-model) inference.
- Administrative reminders (Summer session examples*)
- HW 2 due Tue July 8 • 11:59 PM
- HW 3 due Mon July 14 • 11:59 PM
- Office hours (instructor example): T & R 2–3 PM, via Zoom
From Simulation to Theory
- Earlier slide set: simulated many samples under H_0 and plotted the resulting sample proportions.
- Dolphin–communication example and community-recycling example both showed bell-shaped histograms.
- Observation → Insight
- Repeated-sample behavior of \hat p resembles a normal curve → motivates normal approximation.
- A theoretical model lets us replace thousands of shuffles with a single formula-based calculation.
Sampling Distribution Fundamentals
- Sampling distribution = distribution of a statistic across all possible samples of fixed size n.
- Describes shape, center, spread attributable to random sampling ("chance alone").
- Key takeaway: If we know this distribution, we can quantify how unusual a single sample’s statistic is when H_0 is true.
Distribution of the Sample Proportion $\hat p$
- Center (mean): E[\hat p]=p (the true population proportion).
- Spread (standard error SE):
SE_{\hat p}=\sqrt{\dfrac{p(1-p)}{n}}
- Think of SE as a new ruler for measuring how far an observed \hat p sits from the hypothesised p.
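The SE formula above can be evaluated directly; a minimal sketch in R, using the hypothetical values p = 0.5 and n = 16 for illustration:

```r
# Standard error of the sample proportion: SE = sqrt(p * (1 - p) / n)
p <- 0.5   # assumed population proportion (illustrative)
n <- 16    # sample size (illustrative)
se <- sqrt(p * (1 - p) / n)
se  # 0.125
```

With p = 0.5 and n = 16 the "ruler" is 0.125, so a sample proportion of 0.625 sits exactly one SE above p.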
Central Limit Theorem (CLT) for Proportions
- CLT assures that, under certain conditions, the sampling distribution of \hat p is approximately normal:
  \hat p \sim N\Big(p,\; \dfrac{p(1-p)}{n}\Big)
- Implication: We can use z-scores & normal probabilities for inference instead of re-simulating.
Conditions for Normal Approximation
- Independence
- Individual observations must not influence one another.
- Usually guaranteed by simple random sampling (SRS) or a well-designed experiment.
- Success–Failure Condition
- Under the model being checked (often H0), require
n p_0 \ge 10 \quad\text{and}\quad n(1-p_0)\ge 10
- Ensures tails of the binomial are well-captured by the normal curve.
- "10" is empirical but widely accepted; larger thresholds tighten approximation.
Using the Normal Model in Hypothesis Testing
- In practice we don’t know p, so for the SE we plug in the null value p_0:
  SE_{H_0}=\sqrt{\dfrac{p_0(1-p_0)}{n}}
- Interpretation: We build the sampling distribution assuming H_0 is true; deviations are judged relative to that benchmark.
Independence: Deeper Considerations
- Random sampling → independence usually reasonable.
- Non-random scenarios (e.g. cluster sampling, time series, social networks) → must justify or adjust (e.g. finite-population correction, bootstrapping, or other models).
Success–Failure Revisited for Tests vs CIs
- Hypothesis test: check n p_0,\; n(1-p_0) (expected counts) ≥ 10.
- Confidence interval: replace p_0 with observed \hat p.
Six Classic Steps in a One-Proportion z-Test
- State hypotheses H_0: p = p_0 and choose H_A (left-, right-, or two-tailed).
- Check conditions (independence, success-failure).
- Compute test statistic
  z=\dfrac{\hat p - p_0}{\sqrt{p_0(1-p_0)/n}}
- Find the p-value from the standard normal distribution.
- Decision: compare p-value to \alpha.
- Contextual conclusion in plain language.
- p_0 comes from the research question, not the data (e.g. historical rate, theoretical expectation, regulatory threshold).
- Possible alternatives:
- Left-tailed: H_A: p < p_0 (looking for a decrease).
- Right-tailed: H_A: p > p_0 (looking for an increase).
- Two-tailed: H_A: p \neq p_0 (any change).
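The six steps can be collected into a small R helper. A minimal sketch: `prop_z_test` is a hypothetical name used here for illustration (base R ships `prop.test()`, which instead runs a chi-squared test with a continuity correction):

```r
# One-proportion z-test: x successes out of n trials, null value p0
# (prop_z_test is a hypothetical helper name, not a built-in function)
prop_z_test <- function(x, n, p0, alternative = "two.sided") {
  phat <- x / n
  se <- sqrt(p0 * (1 - p0) / n)  # SE computed under H0
  z <- (phat - p0) / se
  p_value <- switch(alternative,
    less      = pnorm(z),                            # left-tailed
    greater   = pnorm(z, lower.tail = FALSE),        # right-tailed
    two.sided = 2 * pnorm(abs(z), lower.tail = FALSE))  # two-tailed
  list(z = z, p_value = p_value)
}

# Example: Buzz the dolphin, 15/16 correct against H0: p = 0.50
prop_z_test(15, 16, 0.50, alternative = "greater")
```

The conditions check (step 2) and the contextual conclusion (step 6) still have to be done by hand; the function only automates the arithmetic.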
Standardization & the 68-95-99.7 Rule Refresher
- Converting to a z-score (test statistic) expresses deviation in SE units.
- |z|=1 \Rightarrow about 68\% of null sample proportions lie closer to p_0 than yours.
- |z|=2 \Rightarrow about 95\% lie closer.
- |z|=3 \Rightarrow about 99.7\% lie closer.
- For non-integer |z| (e.g. 1.75), software is needed for exact areas.
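The 68-95-99.7 figures can be recovered (and non-integer cases handled) with normal-curve areas in R:

```r
# Empirical rule as normal-curve areas
pnorm(1) - pnorm(-1)   # ~0.683: within 1 SE
pnorm(2) - pnorm(-2)   # ~0.954: within 2 SE
pnorm(3) - pnorm(-3)   # ~0.997: within 3 SE

# Non-integer z needs software, e.g. the upper-tail area beyond z = 1.75
pnorm(1.75, lower.tail = FALSE)
```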
Computing p-Values with R’s pnorm()
- Generic syntax:
  pnorm(q, mean = 0, sd = 1, lower.tail = TRUE)
- q = z-score (quantile).
- lower.tail=TRUE → P(Z \le q); FALSE → P(Z \ge q).
- Tail selection depends on H_A:
- Left-tailed (<): p-value = pnorm(z).
- Right-tailed (>): p-value = pnorm(z, lower.tail=FALSE).
- Two-tailed (≠): p-value = 2*pnorm(abs(z), lower.tail=FALSE).
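All three tail choices side by side, using a hypothetical test statistic z = -1.8 for illustration:

```r
z <- -1.8  # hypothetical test statistic

pnorm(z)                               # left-tailed p-value
pnorm(z, lower.tail = FALSE)           # right-tailed p-value
2 * pnorm(abs(z), lower.tail = FALSE)  # two-tailed p-value
```

Note that the left- and right-tailed areas always sum to 1, and the two-tailed p-value is twice the smaller of the two.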
Worked Example 1 – Dolphin Communication
- Data: Buzz guessed 15/16 correct → \hat p=0.9375.
- Hypotheses: H_0:p=0.50 \quad vs \quad H_A:p>0.50 (right-tailed).
- Test statistic:
  z=\dfrac{0.9375-0.50}{\sqrt{0.50(1-0.50)/16}}=3.50
- p-value (R):
  pnorm(3.50, lower.tail = FALSE) # 0.0002326
- Conclusion (e.g. with \alpha=0.05): Reject H_0 → convincing evidence dolphins communicate.
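The full dolphin calculation, reproduced end to end in R:

```r
# Dolphin study: 15 of 16 correct, H0: p = 0.50 vs HA: p > 0.50
phat <- 15 / 16                              # 0.9375
se_h0 <- sqrt(0.50 * (1 - 0.50) / 16)        # SE under H0: 0.125
z <- (phat - 0.50) / se_h0                   # 3.5
pnorm(z, lower.tail = FALSE)                 # right-tailed p-value ~0.000233
```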
Worked Example 2 – Community Recycling
- Data: 530/800 households recycle → \hat p=0.6625.
- Hypotheses: H_0:p=0.70 \quad vs \quad H_A:p\neq0.70 (two-tailed).
- Test statistic:
  z=\dfrac{0.6625-0.70}{\sqrt{0.70(1-0.70)/800}}=-2.315
- p-value (R):
  2*pnorm(-2.315) # 0.0206
- With \alpha=0.05: Reject H_0 → convincing evidence the recycling rate differs from 70%.
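The recycling calculation in R, this time with the two-tailed doubling:

```r
# Recycling study: 530 of 800 households, H0: p = 0.70 vs HA: p != 0.70
phat <- 530 / 800                            # 0.6625
se_h0 <- sqrt(0.70 * (1 - 0.70) / 800)       # SE under H0
z <- (phat - 0.70) / se_h0                   # ~ -2.315
2 * pnorm(z)                                 # two-tailed p-value ~0.0206
```

Because z is negative here, `2 * pnorm(z)` and `2 * pnorm(abs(z), lower.tail = FALSE)` give the same answer.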
Tail Direction & p-Value Recap
- p-value = probability of results as or more extreme in the direction of H_A.
- Make sure to multiply by 2 only for two-tailed tests.
Decision Rules & Statistical Significance
- If p\le\alpha → Reject H_0 (statistically significant).
- If p>\alpha → Fail to reject H_0 (not enough evidence).
- Common \alpha levels: 0.10, 0.05, 0.01 (context-driven; e.g. medical trials often demand 0.01).
Practical & Ethical Considerations
- Adequate sample size → meets success-failure & yields useful power.
- Independence violations (e.g. sampling classmates, social media followers) bias the SE and inflate the Type-I error rate.
- Overreliance on p-values: always accompany them with an effect size and a confidence interval.
- Ethical reporting → state assumptions, limitations, and potential multiple-testing issues.
Connections & Next Steps
- Link to simulation: Theory approximates what we previously estimated via randomization.
- Upcoming lectures: theory-based confidence intervals for p, two-sample proportion tests, and extensions when conditions fail (e.g. exact binomial tests, bootstrap CIs).