PSTAT 5LS – Theory-Based Inference for a Population Proportion

Course Logistics & Upcoming Deadlines

  • Welcome to PSTAT 5LS
    • Current topic: Theory-Based Inference for p (beginning Slide 31)
    • Next topic: Inference for One Mean
  • Deadlines
    • HW 3: Monday July 14 @ 11:59 PM
    • HW 4: Friday July 18 @ 11:59 PM
    • Exam 1: Wednesday July 16 (during lecture)
  • Office Hours
    • Second OH this week: Monday @ 12 PM on Zoom

Exam 1 Information & Tips

  • Coverage
    • Slide Sets 1 – 5
    • HW 1 – 3
  • Format
    • 15–18 multiple-choice + 2–4 free-response
    • Write directly on the exam; formula sheet provided
  • What to bring
    • Pencil/pen, calculator, photo ID (UCSB or other)
  • Tips for success
    • Read each question carefully
    • Include context and units in explanations
    • Show complete work (except on MC questions)
    • Double-check that every part is answered

Transition from Hypothesis Testing to Estimation

  • Hypothesis testing: evaluates evidence against claims about population parameters
  • Estimation: uses sample statistics to approximate population parameters
    • Point estimate = single number → best guess of the parameter
    • For one proportion: \hat p estimates p
    • Natural sampling variability → need a margin of error (MOE) to create a range of plausible values

Confidence Intervals – Core Idea

  • Provide a range of plausible parameter values
  • Generic structure
    • \text{Point Estimate} \; \pm \; \text{Margin of Error}
  • Margin of Error
    • \text{MOE} = (\text{multiplier}) \times (\text{standard error})
    • Multiplier = critical value from a probability distribution (usually normal)
    • Standard error = estimated SD of the sampling distribution

Standard Error vs. Margin of Error (One Proportion)

  • Standard Error (SE)
    • Measures typical variability in \hat p
    • SE = \sqrt{\dfrac{p(1-p)}{n}}; because p is unknown, \hat p is substituted for p when the SE is computed from data
  • Margin of Error (MOE)
    • Adjusts SE for the desired confidence
    • MOE = z^* \times SE
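
A minimal R sketch of the two quantities above, using made-up numbers (240 successes in a sample of 400 are hypothetical values, not course data) and plugging \hat p into the SE formula:

```r
# Hypothetical sample: 240 successes out of n = 400 (illustration only)
n     <- 400
p_hat <- 240 / n                       # sample proportion, 0.6

se  <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of p-hat
z   <- 1.960                           # multiplier for 95% confidence
moe <- z * se                          # margin of error = multiplier x SE

round(c(SE = se, MOE = moe), 4)        # SE is about 0.0245, MOE about 0.0480
```

The MOE is simply the SE stretched by the multiplier, so anything that changes the SE (such as the sample size) changes the MOE in the same proportion.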

Final CI Formula for One Proportion

  • Because p is unknown, plug in \hat p for both point estimate and SE
  • Confidence interval:
    \hat p \; \pm \; z^* \times \sqrt{\dfrac{\hat p(1-\hat p)}{n}}
  • Higher confidence ⇒ larger z^* ⇒ wider interval
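
The formula above can be sketched as a small R function. The name `wald_ci` and the data (240 successes in 400 trials) are hypothetical and chosen only for illustration; `qnorm()` is base R's standard normal quantile function.

```r
# Plug-in confidence interval for one proportion.
# `wald_ci` is a hypothetical helper name, not a function from the course materials.
wald_ci <- function(x, n, conf_level = 0.95) {
  p_hat  <- x / n                               # point estimate
  z_star <- qnorm(1 - (1 - conf_level) / 2)     # two-sided critical value
  se     <- sqrt(p_hat * (1 - p_hat) / n)       # SE with p-hat plugged in
  c(lower = p_hat - z_star * se, upper = p_hat + z_star * se)
}

# Hypothetical data: 240 successes in 400 trials
ci_90 <- wald_ci(240, 400, conf_level = 0.90)
ci_99 <- wald_ci(240, 400, conf_level = 0.99)

# Interval widths: the 99% interval is wider than the 90% interval
unname(diff(ci_90))
unname(diff(ci_99))
```

Both intervals are centered at the same \hat p; only the multiplier, and therefore the width, changes with the confidence level.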

Visualizing Multiple CIs

  • Simulations show:
    • Each sample (black dot) has its own \hat p
    • Blue lines = corresponding CIs; red line = true p
    • Approximately the chosen % of intervals contain the red line
  • Changing confidence level
    • 90 % → narrow
    • 95 % → medium
    • 99 % → wide
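
A simulation in the spirit of the picture described above can be sketched in base R. The true proportion, sample size, number of repetitions, and seed are all assumed values chosen for illustration; the slides' own simulation settings are not given in these notes.

```r
set.seed(5)          # arbitrary seed so the run is repeatable
p_true <- 0.30       # hypothetical true proportion (the "red line")
n      <- 200        # hypothetical sample size
reps   <- 10000      # number of simulated samples

# Fraction of simulated intervals that contain p_true at a given confidence level
coverage <- function(conf_level) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)
  p_hat  <- rbinom(reps, size = n, prob = p_true) / n   # one p-hat per sample
  se     <- sqrt(p_hat * (1 - p_hat) / n)
  mean(p_hat - z_star * se <= p_true & p_true <= p_hat + z_star * se)
}

cov <- sapply(c(0.90, 0.95, 0.99), coverage)
cov                  # each entry should land near its nominal level
```

The observed coverage will not equal the nominal level exactly, because the normal approximation is only approximate and the simulation itself has sampling noise.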

Choosing the Critical Value z^*

  • Standard normal cut-offs (two-sided)
    • 90 % ⇒ z^*=1.645
    • 95 % ⇒ z^*=1.960
    • 98 % ⇒ z^*=2.326
    • 99 % ⇒ z^*=2.576
  • Larger z^* gives a bigger MOE to ensure higher confidence
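
The cut-offs in this table can be recovered with base R's `qnorm()`. For a two-sided interval with confidence level C, the critical value is the 1 - (1 - C)/2 quantile of the standard normal distribution:

```r
conf_levels <- c(0.90, 0.95, 0.98, 0.99)

# Two-sided critical values: leave (1 - C)/2 in each tail
z_star <- qnorm(1 - (1 - conf_levels) / 2)

round(z_star, 3)   # 1.645 1.960 2.326 2.576
```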

What Does “Confidence” Mean?

  • The confidence level applies to the method, not to a particular interval
  • Example: 95 % confidence ⇒ if we drew many samples the same way and built an interval from each, ~95 % of those intervals would include p
  • Each specific CI either contains p or it doesn’t (probability 0 or 1 after data are collected)

Conditions for Constructing a CI for p

  1. Independence
    • Random sampling/assignment; when sampling without replacement, the sample should also be <10\% of the population
  2. Success–Failure (S–F) Condition
    • Need at least 10 expected successes and 10 expected failures
    • For CIs use \hat p: n\hat p \ge 10 and n(1-\hat p) \ge 10

Three-Step CI Procedure

  1. Check conditions
  2. Calculate CI with correct z^* and SE
  3. Interpret in context (mention population, parameter, and confidence level)

Worked Example 1 – Gallup Math Feelings

  • Survey: n = 5136 U.S. adults; \hat p = 0.47 reported only positive feelings about math
  • Conditions
    • Independence: random sample ⇒ satisfied
    • S–F: n\hat p = 5136(0.47) = 2413.92 ≥ 10, and n(1-\hat p)=2722.08 ≥ 10 ⇒ satisfied
  • 95 % CI
    • SE = \sqrt{\dfrac{0.47(0.53)}{5136}} = 0.00696425
    • MOE = 1.960 \times 0.00696425 = 0.0136
    • Interval: 0.47 \pm 0.0136 \; \Rightarrow \; (0.4564,\;0.4836)
  • Interpretation (proper wording)
    • “We are 95 % confident that between 45.64 % and 48.36 % of all U.S. adults have only positive feelings about math.”
    • Cannot state probability that p is in this specific interval — it either is or isn’t.
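
The arithmetic in this example can be checked step by step in base R. Only n and \hat p from the survey summary above are used; the variable names are arbitrary.

```r
n     <- 5136
p_hat <- 0.47

# Success-failure condition: both counts must be at least 10
c(n * p_hat, n * (1 - p_hat)) >= 10

se  <- sqrt(p_hat * (1 - p_hat) / n)   # about 0.00696
moe <- 1.960 * se                      # about 0.0136
ci  <- c(lower = p_hat - moe, upper = p_hat + moe)

round(ci, 4)                           # about (0.4564, 0.4836)
```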

Worked Example 2 – Gen Z & Water Pollution

  • Parameter: p= proportion of all Gen Zers (ages 12–27) who say protecting waters from pollution is very important
  • Data: x=2096,\;n=2832,\;\hat p = 0.740113
  • Conditions
    • Independence: random sample ⇒ satisfied
    • S–F: n\hat p = 2832(0.740113)=2096\,(\ge10) and n(1-\hat p)=736\,(\ge10) ⇒ satisfied
  • 98 % CI (requires z^*=2.326)
    • SE = \sqrt{\dfrac{0.740113(1-0.740113)}{2832}} \approx 0.00824
    • MOE = 2.326 \times 0.00824 \approx 0.0192
    • CI: 0.7401 \pm 0.0192 \Rightarrow (0.7209,\;0.7593)
  • Decision Questions
    • 95 % CI would be narrower (smaller z^*)
    • Is more than 75 % supported? No: the interval (0.7209, 0.7593) contains 0.75 and values below it, so the data do not support the claim that p > 0.75 at 98 % confidence.
  • Multiple-choice interpretation (Slide 47)
    • Correct statements: b, c, e
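
A base R sketch of this example, covering the 98 % interval and both decision questions. The helper name `ci_for` is hypothetical; the counts come from the data line above.

```r
x     <- 2096
n     <- 2832
p_hat <- x / n                          # about 0.740113

se <- sqrt(p_hat * (1 - p_hat) / n)     # about 0.00824

# Hypothetical helper: interval at a given confidence level for these data
ci_for <- function(conf_level) {
  z_star <- qnorm(1 - (1 - conf_level) / 2)
  p_hat + c(-1, 1) * z_star * se
}

ci_98 <- ci_for(0.98)
ci_95 <- ci_for(0.95)

round(ci_98, 4)             # about (0.7209, 0.7593)
diff(ci_95) < diff(ci_98)   # TRUE: the 95% interval is narrower
ci_98[1] > 0.75             # FALSE: values at or below 0.75 remain plausible
```

A claim that p exceeds 0.75 would be supported only if the whole interval sat above 0.75, which is why the last line tests the lower bound.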

Using R – prop_test()

prop_test(x = 2096, n = 2832, conf.level = 0.98)
## 1-sample proportions test without continuity correction
## Z = 25.556, p-value < 2.2e-16
## 98 percent confidence interval:
##  0.7209409 0.7592851
## sample estimates:
##  p 
## 0.740113
  • Function automatically calculates CI, Z statistic, and p-value (default null p_0=0.5)
  • Useful as a quick, reproducible check on hand calculations: the reported interval agrees with the 98 % CI (0.7209, 0.7593) from Worked Example 2
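
The Z statistic in the output can be reproduced by hand in base R. This is a sketch, not a description of how `prop_test()` works internally: it assumes the null value p_0 = 0.5 stated above, no continuity correction (as the output header says), and a two-sided alternative, which the notes do not state explicitly.

```r
x  <- 2096
n  <- 2832
p0 <- 0.5                              # null value, per the default noted above

p_hat   <- x / n
se_null <- sqrt(p0 * (1 - p0) / n)     # standard error computed under the null
z       <- (p_hat - p0) / se_null      # about 25.556
p_value <- 2 * pnorm(-abs(z))          # two-sided p-value (assumed alternative)

round(z, 3)
p_value < 2.2e-16                      # TRUE
```

Note that the test statistic uses p_0 in its standard error, while the confidence interval uses \hat p.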

Conceptual Take-Aways

  • MOE grows with both z^* and SE
  • CIs quantify uncertainty; they do not guarantee the parameter lies inside
  • Always verify assumptions before trusting a CI

Looking Forward

  • Having formalized theory-based CIs & tests for one proportion, the course will extend these concepts to means (Slide 51 & future lectures).