Stat 211: Elementary Inferential Statistics - Unit 6 Study Notes
Unit 6: One-Sample Inference
Confidence Intervals for Proportions
Example: 2024 Births
In 2024, 3,622,673 babies were born in the US.
The proportion of premature births in 2024 was 10.4%.
Sampling Distribution
Definition: The sampling distribution for proportions specifies how sample proportions vary.
Each individual sample proportion $\hat{p}$ is computed from one random sample of 100 babies.
Distribution Characteristics:
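Normally distributed with mean: $\mu_{\hat{p}} = p = 0.104$
Standard deviation determined as: $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.104(0.896)}{100}} \approx 0.031$
Specific normal distribution: $\hat{p} \sim N(0.104, 0.031)$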
Figure: histogram of sample proportions (vertical axis: # of Samples).
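As a rough illustration of how sample proportions vary, the sketch below simulates many samples of 100 births, assuming each birth is independently premature with probability 0.104. The function name `simulate_sample_proportions`, the seed, and the choice of 5,000 samples are illustrative assumptions, not part of the course material.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions

# Assumed setup: p = 0.104 (2024 premature birth rate), samples of n = 100 babies.
p_hats = simulate_sample_proportions(p=0.104, n=100, num_samples=5000)

print("mean of sample proportions:", round(statistics.mean(p_hats), 4))
print("SD of sample proportions:  ", round(statistics.stdev(p_hats), 4))
print("sqrt(p(1-p)/n):            ", round((0.104 * 0.896 / 100) ** 0.5, 4))
```

The simulated mean and standard deviation should land close to $p$ and $\sqrt{p(1-p)/n}$, though the exact values depend on the seed.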
Recall: Empirical Rule
This rule allows for probability estimations from a normally distributed dataset:
About 68% of values fall within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
Area by band on each side of the mean: 34% (0 to 1 SD), 13.5% (1 to 2 SDs), 2.35% (2 to 3 SDs), 0.15% (beyond 3 SDs).
Application: The Empirical Rule extends to building rough confidence intervals.
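The Empirical Rule percentages can be checked against the standard normal distribution. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `prob_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """Probability that a normal value falls within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The printed values are slightly more precise than the rounded 68%, 95%, and 99.7% figures, which is why a more exact multiplier than 2 is used for a 95% interval.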
Our Goal
Objective: Estimate population parameters from a sample
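A single value (a point estimate) carries no measure of precision; for example, if a sample of 100 babies has 14 preemies, $\hat{p} = 0.14$.
The true standard deviation of the sampling distribution depends on the unknown $p$, so it has to be estimated from the sample.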
Standard Error (SE):
Definition: Estimated standard deviation of a sampling distribution
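Significance: Estimates variability of sample statistics ($\hat{p}$ or $\bar{x}$).
Formula for SE of Sample Proportions: $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Applying the SE Calculation:
For $\hat{p} = 0.14$ and $n = 100$: $SE(\hat{p}) = \sqrt{\frac{0.14(0.86)}{100}} \approx 0.035$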
Rationale for Confidence Intervals
We utilize sample statistics and account for uncertainty using the standard error.
Estimated sampling distribution is:
From sample: $N(0.14, 0.035)$
True distribution: $N(0.104, 0.031)$
Confidence Interval Interpretation:
If about 95% of similar samples lie between 7% and 21% (that is, $0.14 \pm 2(0.035)$), then:
$0.07 < p < 0.21$, or (7%, 21%), indicates we are 95% confident the true population proportion is within this interval.
Rough Confidence Interval
A rough approximation for a 95% CI is: $\hat{p} \pm 2 \cdot SE(\hat{p})$
Critical z-Values ($z^*$): the number of standard errors that corresponds to a given confidence level, calculated from the standard normal distribution:
Computation Examples:
Confidence Level 90% → Critical z-Value ≈ 1.645
Confidence Level 95% → Critical z-Value ≈ 1.960
Confidence Level 99% → Critical z-Value ≈ 2.576
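The critical values listed above can be reproduced from the standard normal distribution. The sketch below uses the standard-library `statistics.NormalDist`; the function name `critical_z` is an illustrative assumption.

```python
from statistics import NormalDist

def critical_z(confidence_level):
    """Return z* such that the central confidence_level area lies within +/- z*."""
    tail_area = (1 - confidence_level) / 2  # area left in each tail
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z* = {critical_z(level):.3f}")
```

Splitting the leftover area evenly between the two tails is what makes the interval symmetric around the point estimate.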
More Precise Confidence Interval
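Apply more precise definition: $\hat{p} \pm z^* \cdot SE(\hat{p})$
Example for $\hat{p} = 0.14$, $n = 100$, 95% confidence: $0.14 \pm 1.960(0.0347) \approx 0.14 \pm 0.068$, or $(0.072, 0.208)$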
This is known as the one-proportion z-interval.
Misinterpretations in Confidence Intervals
Commonly misphrased interpretations include:
"14% of all babies born in 2024 were born prematurely."
"It is probably true that 14% of all babies born in 2024 were born prematurely."
Stating with certainty that the true proportion is within a computed interval is also misleading, because any single interval may miss the true value.
What to Say Instead:
Statements like "We are 95% confident the true population proportion lies within this interval" are more accurate.
One-Proportion z-interval
Conditions required to find a confidence interval include:
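Previous formulas apply with $SE(\hat{p})$ defined as: $SE(\hat{p}) = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Where $z^*$ specifies the number of SEs on either side of the estimate needed to capture C% of random samples.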
Why Confidence?
Quantifying uncertainty is crucial; confidence intervals provide clarity beyond simple point estimates.
“Confidence” describes the long-run success rate of the method across many samples.
The confidence level tells us how often the procedure captures the true parameter, not whether any single interval is correct.
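The long-run meaning of "confidence" can be illustrated by simulation: draw many samples, build an interval from each, and count how often the interval contains the true proportion. This is a sketch under stated assumptions (true $p = 0.104$, $n = 100$, 2,000 repetitions, a fixed seed); the function names are hypothetical.

```python
import random
from statistics import NormalDist

def wald_interval(successes, n, confidence_level=0.95):
    """One-proportion z-interval: p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    se = (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - z_star * se, p_hat + z_star * se

def estimate_coverage(true_p, n, num_samples, confidence_level=0.95, seed=1):
    """Fraction of simulated intervals that contain true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < true_p)
        low, high = wald_interval(successes, n, confidence_level)
        if low <= true_p <= high:
            hits += 1
    return hits / num_samples

coverage = estimate_coverage(true_p=0.104, n=100, num_samples=2000)
print(f"share of intervals containing the true p: {coverage:.3f}")
```

The share should come out near 95%. It can land a little under the nominal level because the normal model is only an approximation at this sample size.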
Example: Pineapple on Pizza Survey
In a 2025 survey of 1,000 adults:
591 people preferred pineapple on pizza.
Calculate the 95% confidence interval for the population proportion:
Using the confidence interval applet, enter the following parameters:
Number of Successes: 591
Total Sample Size: 1000
Confidence Level: 95%
Output from Calculator:
Confidence Interval: (0.5605, 0.6215)
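The applet output can be checked by hand with the one-proportion z-interval formula. The sketch below assumes the applet uses $\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$; the function name is illustrative and is not the applet's own code.

```python
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence_level):
    """Return (low, high) for p_hat +/- z* * SE(p_hat)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    standard_error = (p_hat * (1 - p_hat) / n) ** 0.5
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

low, high = one_proportion_z_interval(successes=591, n=1000, confidence_level=0.95)
print(f"({low:.4f}, {high:.4f})")
```

Rounded to four decimal places, this agrees with the interval reported by the calculator.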
Example: Boba Tea
Aim: Compute a 90% confidence interval for the proportion of college students who tried boba tea.
The interval is determined as:
Point Estimate ± z* · Standard Error, with z* ≈ 1.645 for 90% confidence.
Confidence vs. Precision
Margin of Error Explanation: the margin of error is the half-width of the interval, $ME = z^* \cdot SE$.
Tradeoff between confidence level and interval width:
More confidence → Wider interval.
Less confidence → Narrower interval.
Example intervals showing confidence vs. precision:
50% CI: (0.09, 0.11) has a small margin of error but captures the true value only about half the time.
100% CI: [0, 1] gives complete confidence but no useful information.
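To see the tradeoff numerically, the sketch below computes the margin of error at several confidence levels for one fixed sample. It reuses the preemie sample from earlier in these notes ($\hat{p} = 0.14$, $n = 100$); the function name `margin_of_error` and the list of levels are illustrative choices.

```python
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence_level):
    """Half-width of the one-proportion z-interval: z* * sqrt(p_hat(1 - p_hat)/n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z_star * (p_hat * (1 - p_hat) / n) ** 0.5

levels = (0.50, 0.90, 0.95, 0.99)
margins = [margin_of_error(0.14, 100, level) for level in levels]

for level, me in zip(levels, margins):
    print(f"{level:.0%} confidence: 0.14 +/- {me:.3f}")
```

With the sample held fixed, the only thing that changes is $z^*$, so the margin of error grows with the confidence level.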
Election Polling and Confidence Intervals
Polling samples report proportions ± margin of error:
E.g., Candidate A at 52% with ±3% ME gives CI: [49%, 55%].
Sample size calculations determine how many respondents are needed to achieve a desired confidence level and margin of error.
Choosing a Sample Size
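The formula depends on whether a prior estimate of $p$ is available:
When a prior estimate of $p$ is available: $n = \left(\frac{z^*}{ME}\right)^2 p(1-p)$
When $p$ is not known (use the conservative $p = 0.5$): $n = \left(\frac{z^*}{ME}\right)^2 (0.25)$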
Always round up in sample size calculations.
Example Calculations:
For 3% margin of error with 95% confidence, determine appropriate sample size.
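One way to work this example is with the standard sample-size formula $n = (z^*/ME)^2 \, p(1-p)$, rounded up. The sketch below assumes that formula and defaults to the conservative $p = 0.5$; the function name `required_sample_size` is hypothetical, and the second call uses the 10.4% premature birth rate purely as an example of a prior estimate.

```python
import math
from statistics import NormalDist

def required_sample_size(margin_of_error, confidence_level, p_estimate=0.5):
    """Smallest n with z* * sqrt(p(1 - p)/n) <= margin_of_error."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z_star / margin_of_error) ** 2 * p_estimate * (1 - p_estimate)
    return math.ceil(n)  # always round up

# Conservative case: no prior estimate of p.
print(required_sample_size(0.03, 0.95))
# With a prior estimate (here 0.104, used only as an illustration).
print(required_sample_size(0.03, 0.95, p_estimate=0.104))
```

The conservative choice gives the largest required sample because $p(1-p)$ is maximized at $p = 0.5$.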
Confidence Intervals for Means
Example: 2024 Birthweights
In 2024, average birthweight was 3,318.9 grams.
Sampling Distribution:
Central Limit Theorem applies; the distribution of sample means can be modeled as normal:
$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$, where $\mu$ is the population mean and $\sigma$ is the population SD
Mean calculated: $\mu_{\bar{x}} = \mu = 3318.9$ grams
Confidence Interval for Mean (σ Known)
Formula for computing confidence interval:
$\text{CI} = \bar{x} \pm z^* \frac{\sigma}{\sqrt{n}}$
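The z-interval for a mean can be sketched the same way as the interval for a proportion. The inputs below (a sample mean of 3300 g, $\sigma = 500$ g, $n = 100$) are hypothetical values chosen only to show the arithmetic; they are not taken from the 2024 birthweight data, and the function name is illustrative.

```python
from statistics import NormalDist

def z_interval_for_mean(sample_mean, sigma, n, confidence_level):
    """Return (low, high) for x_bar +/- z* * sigma / sqrt(n), with sigma known."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin_of_error = z_star * sigma / n ** 0.5
    return sample_mean - margin_of_error, sample_mean + margin_of_error

# Hypothetical inputs for illustration only.
low, high = z_interval_for_mean(sample_mean=3300.0, sigma=500.0, n=100, confidence_level=0.95)
print(f"({low:.1f}, {high:.1f}) grams")
```

This version applies only when $\sigma$ is treated as known.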
Example: Birthweights for Confidence Interval
In practice, the population standard deviation $\sigma$ is usually unknown, so we use the sample standard deviation $s$ instead.
t-Distribution Introduction:
Derived by William S. Gosset under pseudonym