Simulation-Based Inference for p – Comprehensive Study Notes

Announcements & Course Logistics

  • Today’s topic: Simulation-Based Inference for a Population Proportion $p$ (starting at slide 13 of Set 3)
  • Next class: Normal Distributions (Slide Set 4)
  • Upcoming homework deadlines
    • HW 2: Tuesday, July 8, 11:59 PM
    • HW 3: Friday, July 11, 11:59 PM
  • Instructor office hours
    • Tuesdays & Thursdays, 2–3 PM via Zoom
    • Encouraged to visit for questions on material, coding, or assignments

Big Picture of Simulation-Based Inference for $p$

  • Goal: Decide whether observed sample evidence suggests the population proportion differs from a hypothesized value $p_0$
  • Strategy
    1. Model the null hypothesis $H_0: p = p_0$ with a chance process
    2. Simulate many random samples under that model
    3. Compare the observed statistic $\hat p$ to the simulated distribution
    4. Quantify extremeness via the p-value (proportion of simulations at least as extreme as observed)

Modeling the Situation (Community Recycling Example)

  • Research question: Does the proportion of adults who recycle equal $0.70$?
  • Null hypothesis: $H_0: p = 0.70$ (70 % of all adults recycle)
  • Alternative: $H_a: p \neq 0.70$ (two-sided because no prior directional claim)
  • Physical model
    • Use colored poker chips to represent “recycle” vs “not recycle”
    • Blue chip = recycler, yellow chip = non-recycler
    • Bag composition reflects $p_0$: 70 % blue, 30 % yellow
    • Ten-chip miniature model: 7 blue, 3 yellow (maintains 7∶3 ratio)
  • Sampling correspondence
    • One draw ↔ one survey respondent
    • One repetition (800 draws) ↔ one entire sample of 800 adults
    • Drawing with replacement keeps the chance of drawing blue at $0.70$ on every draw, so even a small bag can supply all 800 draws
  • Note on earlier example discrepancy
    • A slide referenced 1 207 draws (previous button-pressing context); current recycling study uses 800 draws
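
The chip-bag model above can be mimicked in a few lines of base R. This is only a sketch of the physical model, not the course's helper function; the object names (`bag`, `draws`, `p_hat_sim`) and the seed are assumptions made for illustration.

```r
# Ten-chip bag matching the 7:3 ratio described in the notes.
bag <- c(rep("blue", 7), rep("yellow", 3))

set.seed(1)  # arbitrary seed, used only so the sketch is reproducible

# One repetition: 800 draws with replacement stand in for one sample of 800 adults.
draws <- sample(bag, size = 800, replace = TRUE)

# Simulated sample proportion of "recyclers" (blue chips) in this repetition.
p_hat_sim <- mean(draws == "blue")
p_hat_sim
```

Because each draw is returned to the bag, every draw has the same 0.70 chance of blue, which is what lets a ten-chip bag represent a large population.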

Observed Data

  • Survey of 800 U.S. adults
    • 530 said they recycle
    • Sample proportion: $\hat p = \frac{530}{800} = 0.6625$ (66.25 %)

Computer Simulation Procedure

  • R helper function: simulate_chance_model(chanceSuccess = 0.70, numDraws = 800, numRepetitions = 5000)
    • chanceSuccess: $p_0 = 0.70$
    • numDraws: sample size $n = 800$ per repetition
    • numRepetitions: 5 000 synthetic samples
  • Output stored in object sim1 containing the 5 000 simulated sample proportions
  • Visualization: Histogram of the 5 000 $\hat p$ values shows a roughly bell-shaped distribution centered at $0.70$, spanning roughly $0.64$ to $0.76$
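
The helper `simulate_chance_model` is specific to the course, so its internals are not shown in these notes. A comparable null distribution can be sketched with base R's `rbinom()`, assuming each repetition is 800 independent draws with success chance 0.70. The names `sim_counts` and `sim_props` and the seed are hypothetical, and the exact values will differ from the course's `sim1`.

```r
p0   <- 0.70   # hypothesized proportion (chanceSuccess)
n    <- 800    # draws per repetition (numDraws)
reps <- 5000   # number of repetitions (numRepetitions)

set.seed(2024)  # arbitrary seed for reproducibility

# rbinom() returns the number of successes in each repetition;
# dividing by n turns the counts into simulated sample proportions.
sim_counts <- rbinom(reps, size = n, prob = p0)
sim_props  <- sim_counts / n

mean(sim_props)   # should land close to 0.70
range(sim_props)  # smallest and largest simulated proportions

# Histogram of the simulated null distribution.
hist(sim_props, breaks = 30,
     main = "Simulated sample proportions under H0",
     xlab = "Simulated sample proportion")
```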

Evaluating the Results (Calculating the p-value)

  • For a two-sided test, “as extreme” means $|\hat p_{sim} - 0.70| \ge |0.6625 - 0.70| = 0.0375$
    • Left-tail cutoff: $0.70 - 0.0375 = 0.6625$
    • Right-tail cutoff: $0.70 + 0.0375 = 0.7375$
  • R code snippets presented
    • sum(sim1 <= 530/800) returned 62 simulated samples in left tail
    • sum(sim1 >= 0.70 + (0.70 - 530/800)) returned 46 simulated samples in right tail
    • Total extreme counts: $62 + 46 = 108$
  • Estimated p-value: $\text{p-value} = \frac{108}{5000} = 0.0216$
    • Interpretation: an estimated 2.16 % chance of obtaining $\hat p$ at least $0.0375$ away from $0.70$ under $H_0$
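
The tail counting above can be reproduced in a self-contained sketch. It regenerates a null distribution with `rbinom()` rather than using the course's `sim1`, so the tail counts will not match 62 and 46 exactly. It also counts successes instead of comparing proportions: the cutoffs 0.6625 and 0.7375 correspond to 530 and 590 successes out of 800, and whole-number comparisons avoid any possibility of missing exact ties through floating-point rounding. All object names and the seed are assumptions.

```r
p0   <- 0.70
n    <- 800
reps <- 5000
observed_count <- 530               # 530 of 800 respondents recycle

set.seed(2024)                      # arbitrary seed for reproducibility
sim_counts <- rbinom(reps, size = n, prob = p0)

expected_count <- round(p0 * n)     # 560 successes expected under H0
distance <- abs(observed_count - expected_count)   # 30 successes, i.e. 0.0375

# Tail counts, matching the left and right cutoffs in the notes.
left_tail  <- sum(sim_counts <= expected_count - distance)  # counts <= 530
right_tail <- sum(sim_counts >= expected_count + distance)  # counts >= 590

# Two-sided estimate: share of repetitions at least as far from 560 as 530 is.
extreme <- abs(sim_counts - expected_count) >= distance
p_value <- mean(extreme)

c(left = left_tail, right = right_tail, p_value = p_value)
```

With 5 000 repetitions the estimate should fall near the 0.0216 reported in the notes, though it changes with the seed.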

Interpreting the p-value

  • Small p-value (≈ 0.02) ⇒ observed sample proportion is unusual if $p = 0.70$
  • Conclusion at common significance level $\alpha = 0.05$:
    • Since 0.0216 < 0.05, reject $H_0$
    • Evidence suggests the true recycling proportion differs from 70 %
  • Practical wording: “We have strong evidence that the population proportion of recyclers is not 70 %.”

Vocabulary Check: Three Different “p”s

  • $p$: population proportion (parameter)
  • $\hat p$: sample proportion (statistic)
  • p-value: probability of observing a statistic at least as extreme as the one obtained, assuming $H_0$ is correct

Role of the Alternative Hypothesis in Tail Choice

  • Direction determines which simulated values count as “extreme”
    • $H_a: p < p_0$ ⇒ left tail only
    • $H_a: p > p_0$ ⇒ right tail only
    • $H_a: p \neq p_0$ ⇒ both tails (simulated values at least as far from $p_0$ as the observed $\hat p$, in either direction)
  • Always decide one- vs two-sided before seeing data; post-hoc switching inflates Type I error
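
One way to make the tail rule concrete is a small function that takes the direction of the alternative as an argument. This is an illustrative sketch, not part of the course materials: the function name `sim_p_value`, its arguments, and the tolerance `tol` (added so that exact ties are not lost to floating-point rounding) are all assumptions.

```r
# Estimate a p-value from simulated proportions for a chosen alternative.
sim_p_value <- function(sim_props, p_hat, p0,
                        alternative = c("two.sided", "less", "greater"),
                        tol = 1e-9) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = mean(sim_props <= p_hat + tol),                        # left tail
    greater   = mean(sim_props >= p_hat - tol),                        # right tail
    two.sided = mean(abs(sim_props - p0) >= abs(p_hat - p0) - tol)     # both tails
  )
}

# Tiny hand-checkable example with six made-up simulated proportions.
toy <- c(0.60, 0.65, 0.70, 0.70, 0.75, 0.80)

p_less  <- sim_p_value(toy, p_hat = 0.65, p0 = 0.70, alternative = "less")       # 2 of 6
p_great <- sim_p_value(toy, p_hat = 0.65, p0 = 0.70, alternative = "greater")    # 5 of 6
p_two   <- sim_p_value(toy, p_hat = 0.65, p0 = 0.70, alternative = "two.sided")  # 4 of 6
c(less = p_less, greater = p_great, two.sided = p_two)
```

The same data give three different p-values depending on the alternative, which is why the direction has to be fixed before the data are seen.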

Decision-Making with Significance Level $\alpha$

  • Typical thresholds: $\alpha = 0.10, 0.05, 0.01$
  • Decision rules
    • $\text{p-value} \le \alpha$ ⇒ Reject $H_0$ ⇒ results called “statistically significant”
    • $\text{p-value} > \alpha$ ⇒ Fail to reject $H_0$ ⇒ insufficient evidence
  • Preferred instructor framing: “strength of evidence” rather than rigid significant/not significant labels
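
Applying the decision rule to the recycling p-value at each of the typical thresholds shows how the verdict depends on the chosen level. The snippet below is a minimal sketch; the object names are assumptions.

```r
p_value <- 0.0216               # estimated p-value from the recycling example
alphas  <- c(0.10, 0.05, 0.01)  # typical thresholds listed in the notes

# Reject H0 when the p-value is at or below alpha; otherwise fail to reject.
decision <- ifelse(p_value <= alphas, "reject H0", "fail to reject H0")

data.frame(alpha = alphas, decision = decision)
```

The p-value leads to rejection at 0.10 and 0.05 but not at 0.01, which illustrates why a “strength of evidence” reading can be more informative than a single significant/not significant label.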

Consequences of Failing to Reject $H_0$

  • Does not prove $H_0$ is true, merely that the data were plausible under $H_0$
  • Next steps when $H_0$ isn’t rejected
    • Examine sample size / power
    • Consider whether smaller effects still matter practically
    • Plan a follow-up study with larger $n$ or improved design
    • Report findings transparently, noting limitations
  • Example cited: Jenner & Jenner (2007) after-school program study found no significant test score gains but acknowledged potential other benefits
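
To see why sample size is worth examining, the sketch below compares the spread of the simulated null distribution at a few sample sizes. The sizes 200 and 3200 are hypothetical values chosen for illustration (800 is the recycling study's size), and the object names and seed are assumptions.

```r
p0 <- 0.70
sample_sizes <- c(200, 800, 3200)  # hypothetical sizes; 800 matches the recycling study
reps <- 5000

set.seed(7)  # arbitrary seed for reproducibility

# Standard deviation of the simulated sample proportions at each sample size.
null_sd <- sapply(sample_sizes, function(n) {
  sd(rbinom(reps, size = n, prob = p0) / n)
})

data.frame(n = sample_sizes, sd_of_sim_props = round(null_sd, 4))
```

Quadrupling the sample size roughly halves the spread, so a gap between $\hat p$ and $p_0$ that looks plausible under $H_0$ in a small sample can look unusual in a larger one.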

Worked Examples Recap

  • Buzz’s Button-Pressing Game (earlier lecture context)
    • Estimated p-value ≈ 0 ⇒ reject $H_0$; strong evidence Buzz wasn’t guessing
  • Recycling Proportion
    • p-value $= 0.0216$ ⇒ reject $H_0$; strong evidence true proportion ≠ 70 %

Summary of Simulation-Based Significance Testing Procedure

  1. Observe the sample statistic $\hat p$
  2. Simulate many samples under $H_0$ to build the null distribution
  3. Assess extremeness of $\hat p$ within that distribution via the p-value
  4. Decide: small p-value ⇒ evidence against $H_0$; large p-value ⇒ data consistent with $H_0$

Final Remarks & Course Outlook

  • Core ideas learned here (model, simulate, compare, decide) will extend to:
    • Means, differences, regression, multiple proportions, etc.
  • Mastery of these basics ensures preparedness for upcoming topics (e.g., Normal distribution theory next lecture)
  • Always pre-specify hypotheses and significance levels; avoid p-hacking