AP Statistics Unit 7: Confidence Intervals for Quantitative Means (One-Sample and Two-Sample)
Introduction to t-Distributions
Why you need something other than the normal distribution
When you build a confidence interval for a population mean, you’re trying to estimate an unknown parameter: the true mean of a population, written as \mu. In the “best case,” you would know the population standard deviation \sigma, which tells you how spread out individual observations are around \mu. If \sigma were known, the standardized statistic
z = \frac{\bar{x}-\mu}{\sigma/\sqrt{n}}
would follow a standard normal distribution (under the usual random sampling conditions), and you could build a confidence interval using a normal critical value.
In real life (and on AP Statistics), \sigma is almost never known. Instead, you estimate it with the sample standard deviation s. That seemingly small substitution creates extra variability: different random samples produce different s values, and that uncertainty must be reflected in your model for the standardized statistic.
That is exactly why the t-distribution exists.
What a t-distribution is
A t-distribution is a family of distributions used to model the standardized sample mean when \sigma is unknown and you use s instead. The corresponding standardized statistic is
t = \frac{\bar{x}-\mu}{s/\sqrt{n}}
This statistic follows a t-distribution (assuming the conditions for inference are met). Unlike the standard normal distribution, the t-distribution depends on degrees of freedom.
Degrees of freedom (df): what they mean here
For the one-sample mean setting, the degrees of freedom are
df = n - 1
Conceptually, degrees of freedom measure how much “independent information” is available to estimate variability. Because s is computed using \bar{x} (the sample mean), you lose one degree of freedom—hence n-1.
How t-distributions compare to the standard normal
All t-distributions are:
- Centered at 0 and symmetric like the standard normal.
- More spread out than the standard normal, especially in the tails.
Those heavier tails matter: they make confidence intervals wider (more cautious) to account for the extra uncertainty from estimating \sigma with s.
As n increases, s becomes a better estimate of \sigma, and the t-distribution approaches the standard normal distribution. In practice, for large df, t critical values are close to z critical values.
Critical values and notation
For confidence intervals, you use a critical value from a t-distribution. The notation
t^*
means “the t critical value that captures the central area C of the distribution,” where C is your confidence level (like 0.95 for a 95% interval), using the correct degrees of freedom.
For example, for a 95% confidence interval, t^* is chosen so that 95% of the t-distribution lies between -t^* and t^*.
Notation reference (common symbols you must read fluently)
| Quantity | Meaning | Typical notation |
|---|---|---|
| Population mean | Parameter you want | \mu |
| Sample mean | Statistic from data | \bar{x} |
| Population standard deviation | Usually unknown | \sigma |
| Sample standard deviation | Used when \sigma unknown | s |
| Sample size | Number of observations | n |
| Degrees of freedom (one-sample) | Determines t-shape | df = n-1 |
| Standard error of \bar{x} | Estimated SD of \bar{x} | SE = s/\sqrt{n} |
| t statistic | Standardized mean using s | t = (\bar{x}-\mu)/(s/\sqrt{n}) |
Example: seeing how df changes the critical value
Suppose you want a 95% confidence interval.
- With df = 9, t^* is noticeably larger than 1.96 (the z critical value).
- With df = 99, t^* is only slightly larger than 1.96.
The key takeaway is not memorizing specific numbers—it’s understanding the direction: smaller samples mean larger t^*, which means wider intervals.
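You can verify this direction yourself with technology. A minimal sketch in Python using scipy (assuming it is available); the df values are the ones from the example above:

```python
from scipy import stats

# 95% confidence: find t* so that 95% of the t-distribution lies between -t* and t*.
# That leaves 2.5% in each tail, so we ask for the 97.5th percentile.
t_star_df9 = stats.t.ppf(0.975, df=9)    # small sample (n = 10)
t_star_df99 = stats.t.ppf(0.975, df=99)  # larger sample (n = 100)
z_star = stats.norm.ppf(0.975)           # normal critical value, about 1.96

print(round(t_star_df9, 3))   # noticeably larger than 1.96
print(round(t_star_df99, 3))  # only slightly larger than 1.96
```

Running this confirms the pattern: the df = 9 critical value is well above 1.96, while the df = 99 value is barely above it.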
Exam Focus
- Typical question patterns:
- Identify whether to use z or t (AP almost always expects t for means because \sigma is unknown).
- Determine degrees of freedom and select/interpret t^*.
- Compare the shape/spread of t-distributions as df changes.
- Common mistakes:
- Using a normal critical value (like 1.96) when \sigma is not given.
- Using df = n instead of df = n-1 for one-sample mean procedures.
- Thinking the t-distribution is skewed; it is symmetric. What distinguishes it from the normal is its heavier tails.
Constructing a Confidence Interval for a Population Mean
What a confidence interval for a mean is (and what it is not)
A confidence interval for a population mean is a range of plausible values for \mu, based on sample data. The interval is built from:
- a point estimate (usually \bar{x}), and
- a margin of error that accounts for sampling variability.
A 95% confidence interval does not mean there is a 95% probability that \mu is in your computed interval. After you compute it, the interval either contains \mu or it doesn’t. The “95%” refers to the long-run success rate of the method: if you repeated the sampling process many times, about 95% of those intervals would capture \mu.
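This long-run interpretation can be checked by simulation: draw many random samples, build a 95% t interval from each, and count how often the interval captures \mu. A minimal sketch (the population parameters here are made up purely for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
mu, sigma, n = 50.0, 8.0, 20   # hypothetical population mean, SD, and sample size
reps = 10_000
t_star = stats.t.ppf(0.975, df=n - 1)

captured = 0
for _ in range(reps):
    sample = rng.normal(mu, sigma, size=n)
    xbar = sample.mean()
    s = sample.std(ddof=1)          # ddof=1 gives the sample standard deviation
    me = t_star * s / np.sqrt(n)
    if xbar - me <= mu <= xbar + me:
        captured += 1

print(captured / reps)  # close to 0.95 in the long run
```

Each individual interval either captures \mu or it does not; the proportion of successes across many repetitions is what the "95%" describes.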
Why the t-interval works
Because you don’t know \sigma, you use s to estimate the spread of the sampling distribution of \bar{x}. The estimated standard deviation of \bar{x} is called the standard error:
SE = \frac{s}{\sqrt{n}}
Then you take your point estimate \bar{x} and go out t^* standard errors in both directions.
The one-sample t confidence interval formula
A one-sample t interval for the population mean \mu is
\bar{x} \pm t^*\left(\frac{s}{\sqrt{n}}\right)
Equivalently, you can write it as
\left(\bar{x} - t^*\frac{s}{\sqrt{n}},\ \bar{x} + t^*\frac{s}{\sqrt{n}}\right)
Where:
- \bar{x} is the sample mean.
- s is the sample standard deviation.
- n is the sample size.
- t^* comes from the t-distribution with df = n-1 at the chosen confidence level.
Conditions for using a one-sample t interval (what AP wants you to check)
You’re expected to justify inference with conditions. A common AP-friendly structure is:
- Random: Data come from a random sample or a randomized experiment.
- Normal (or approximately normal sampling distribution of \bar{x}):
- If the population is approximately normal, you’re fine.
- If n is large, the Central Limit Theorem supports that \bar{x} is approximately normal.
- If n is small, you should check that the distribution of the sample data looks roughly symmetric with no strong outliers.
- Independence: Observations are independent. If sampling without replacement, a common check is the 10% condition:
n \le 0.10N
where N is the population size.
These conditions matter because the t-interval is derived assuming the t statistic behaves like a t-distribution. Strong skewness with a small sample or extreme outliers can break that approximation.
How confidence level, sample size, and variability affect the interval
It helps to predict what happens before calculating.
- Increasing the confidence level (like 90% to 95%) increases t^*, so the interval gets wider.
- Increasing sample size n decreases s/\sqrt{n}, so the interval gets narrower.
- Increasing variability s increases the standard error, so the interval gets wider.
This is the logic behind margin of error:
ME = t^*\left(\frac{s}{\sqrt{n}}\right)
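These directional effects can be verified directly from the margin-of-error formula. A short sketch (the values s = 10 and n = 25 are hypothetical):

```python
from scipy import stats

def margin_of_error(conf, s, n):
    """ME = t* * s / sqrt(n), with df = n - 1."""
    t_star = stats.t.ppf((1 + conf) / 2, df=n - 1)
    return t_star * s / n ** 0.5

s, n = 10.0, 25  # hypothetical sample SD and sample size

me_90 = margin_of_error(0.90, s, n)
me_95 = margin_of_error(0.95, s, n)
me_95_big_n = margin_of_error(0.95, s, 100)

print(me_90 < me_95)        # higher confidence -> larger ME -> wider interval
print(me_95_big_n < me_95)  # larger n -> smaller ME -> narrower interval
```

Note that quadrupling n only halves s/\sqrt{n}, so shrinking the margin of error by sample size alone gets expensive quickly.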
Interpreting the interval in context (the wording AP expects)
A correct interpretation:
- Names the confidence level.
- Refers to the parameter \mu (not \bar{x}).
- Uses the context (what the mean represents).
Template you can adapt:
“We are C% confident that the true mean (context) for (population) is between (lower) and (upper).”
Worked example: one-sample t interval
A nutrition researcher takes a random sample of n = 25 energy bars of a certain brand and measures calories per bar. The sample mean is \bar{x} = 214.0 calories and the sample standard deviation is s = 10.0 calories. Construct a 95% confidence interval for the true mean calories per bar \mu.
Step 1: Identify the procedure and check conditions
- We want \mu and \sigma is not given, so we use a one-sample t interval.
- Random: stated random sample.
- Independence: reasonable if the sample is less than 10% of all bars produced (assume yes).
- Normal: n = 25 is moderately sized; if no strong skew/outliers are indicated, t procedures are typically considered reasonable.
Step 2: Degrees of freedom
df = n - 1 = 24
Step 3: Find the critical value
For 95% confidence with df = 24, use t^* from a t table or technology. (You do not need to memorize it; you must select it appropriately.)
Step 4: Compute the standard error
SE = \frac{s}{\sqrt{n}} = \frac{10.0}{\sqrt{25}} = 2.0
Step 5: Compute the interval
\bar{x} \pm t^*SE = 214.0 \pm t^*(2.0)
If t^* is approximately 2.064 for df = 24, then
ME = 2.064(2.0) = 4.128
So the interval is
\left(214.0 - 4.128,\ 214.0 + 4.128\right) = \left(209.872,\ 218.128\right)
Interpretation (in context)
“We are 95% confident that the true mean calories per bar for this brand is between about 209.9 and 218.1 calories.”
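The arithmetic in this worked example can be reproduced with technology. A minimal sketch using scipy, with the summary statistics given above:

```python
from scipy import stats
import math

xbar, s, n = 214.0, 10.0, 25
df = n - 1                           # 24
t_star = stats.t.ppf(0.975, df=df)   # about 2.064
se = s / math.sqrt(n)                # 10.0 / 5 = 2.0
me = t_star * se

lower, upper = xbar - me, xbar + me
print(round(lower, 3), round(upper, 3))  # about (209.872, 218.128)
```

This matches the hand computation; small differences from a table value of t^* come only from rounding.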
What can go wrong (common conceptual traps)
- Confusing standard deviation and standard error: s describes spread of individual data; s/\sqrt{n} describes spread of \bar{x} across samples.
- Ignoring outliers: A single extreme value can inflate s and distort \bar{x}, making the interval less meaningful.
- Thinking higher confidence means ‘more accurate’: Higher confidence increases reliability but widens the interval. You gain certainty by giving a wider range.
Exam Focus
- Typical question patterns:
- “Construct and interpret a 95% confidence interval for \mu” given \bar{x}, s, and n.
- “Check conditions” using a graph (dotplot/histogram/boxplot) and sampling description.
- Compute or interpret the margin of error and explain how to make it smaller.
- Common mistakes:
- Interpreting the interval as a probability statement about \mu.
- Using \sigma formulas or z critical values when only s is available.
- Forgetting to include context and the population/parameter in the interpretation.
Confidence Interval for a Difference of Two Means
The goal: comparing two population means
Often you’re not just estimating one mean—you’re comparing two groups. For example:
- Do students who sleep at least 8 hours have a higher mean test score than those who sleep less?
- Is the mean recovery time different for two medical treatments?
Here, you want the difference between two population means:
\mu_1 - \mu_2
A confidence interval for a difference of two means gives a plausible range of values for that parameter.
Two-sample setting: what data structure you need
To use a two-sample t interval (for independent groups), you need:
- Two independent samples (or two independent randomized groups).
- A quantitative variable measured for both groups.
Each group has its own sample statistics:
- Group 1: \bar{x}_1, s_1, n_1
- Group 2: \bar{x}_2, s_2, n_2
Your point estimate of \mu_1 - \mu_2 is
\bar{x}_1 - \bar{x}_2
Why the standard error is different for two means
A sample mean varies from sample to sample. When you subtract two sample means, the variability adds (in a variance sense), so the standard error for the difference is
SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
This captures two sources of sampling variability—one from each group.
The two-sample t confidence interval (independent samples)
A two-sample t interval for \mu_1 - \mu_2 is
\left(\bar{x}_1 - \bar{x}_2\right) \pm t^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
The main new practical issue is degrees of freedom. Many technologies compute an approximate df automatically (often associated with Welch’s method). On AP Statistics, using technology for t^* and df is acceptable; if you must approximate by hand, a common conservative choice sometimes taught is
df = \min(n_1 - 1,\ n_2 - 1)
If your course emphasizes calculator-based inference, you’ll typically report the interval produced by the two-sample t-interval procedure.
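For reference, the approximate df that technology reports usually comes from the Welch–Satterthwaite formula. A sketch of that computation (the summary statistics plugged in are hypothetical):

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom for two independent samples."""
    v1 = s1 ** 2 / n1  # estimated variance of sample mean 1
    v2 = s2 ** 2 / n2  # estimated variance of sample mean 2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

# Hypothetical summary statistics for two independent groups.
df = welch_df(3.6, 40, 4.1, 35)
print(df)  # falls between min(n1-1, n2-1) = 34 and n1 + n2 - 2 = 73
```

The Welch df always lands between the conservative choice \min(n_1-1, n_2-1) and n_1+n_2-2, which is why the conservative choice produces a slightly wider (safer) interval.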
Conditions (what you must justify)
You typically justify conditions for each group plus independence between groups.
- Random: Each sample is random, or treatments were randomly assigned.
- Independence within each group: If sampling without replacement, check the 10% condition separately:
n_1 \le 0.10N_1
n_2 \le 0.10N_2
- Independent groups: The two samples/groups do not influence each other (no pairing, no repeated measures on the same individuals).
- Normal (within each group): Each group’s data are approximately normal, or each sample size is large enough for the sampling distribution of each mean to be approximately normal.
A major warning sign is outliers or strong skew in either group when sample sizes are small.
Paired data is a different procedure (important distinction)
Students often confuse “two groups” with “two-sample.” If measurements are naturally matched (same subjects before/after, twins, matched pairs), you do not use the two-sample t interval above. Instead, you compute differences for each pair and run a one-sample t interval on the differences. In this section, the focus is the independent two-sample interval, but you should always ask: “Are the observations paired?”
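To make the distinction concrete, a paired analysis reduces to a one-sample t interval on the per-pair differences. A minimal sketch with hypothetical before/after measurements on the same five subjects:

```python
from scipy import stats
import math

# Hypothetical paired data: the same five subjects measured before and after.
before = [72.0, 80.0, 68.0, 75.0, 71.0]
after = [70.0, 76.0, 67.0, 73.0, 70.0]

diffs = [b - a for b, a in zip(before, after)]  # one difference per pair
n = len(diffs)
dbar = sum(diffs) / n                           # mean of the differences
s_d = math.sqrt(sum((d - dbar) ** 2 for d in diffs) / (n - 1))  # SD of differences

t_star = stats.t.ppf(0.975, df=n - 1)
me = t_star * s_d / math.sqrt(n)
print(dbar - me, dbar + me)  # one-sample t interval on the differences
```

Once the differences are computed, everything is exactly the one-sample procedure from the previous section, with df = n - 1 where n is the number of pairs.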
Interpreting a two-mean interval (including the ‘zero check’)
A correct interpretation names \mu_1 - \mu_2 and uses context:
“We are C% confident that the true difference in mean (context) between (population 1) and (population 2) is between (lower) and (upper), where the difference is defined as \mu_1 - \mu_2.”
A powerful insight comes from checking whether the interval contains 0:
- If 0 is inside the interval, then a difference of 0 is plausible, so there is not clear evidence of a difference (at a level consistent with that confidence).
- If 0 is not inside the interval, then 0 is not plausible, suggesting a real difference in means.
Be careful: this connects to hypothesis testing ideas, but the interpretation itself is still a statement about plausible values for \mu_1 - \mu_2.
Worked example: two-sample t interval for \mu_1 - \mu_2
A school compares mean weekly study time for students in two programs.
- Program 1: n_1 = 40, \bar{x}_1 = 12.4 hours, s_1 = 3.6 hours
- Program 2: n_2 = 35, \bar{x}_2 = 10.8 hours, s_2 = 4.1 hours
Construct a 95% confidence interval for \mu_1 - \mu_2, where \mu_1 is the mean study time for Program 1 and \mu_2 is the mean study time for Program 2.
Step 1: Choose procedure and check conditions
- We are estimating \mu_1 - \mu_2 with \sigma_1 and \sigma_2 unknown, so use a two-sample t interval.
- Random: assume each group is a random sample from its program (or students were randomly sampled).
- Independence within groups: reasonable if each sample is less than 10% of its program population.
- Independent groups: two different sets of students.
- Normal: both sample sizes are fairly large (35 and 40), so inference is typically reasonable.
Step 2: Compute the point estimate
\bar{x}_1 - \bar{x}_2 = 12.4 - 10.8 = 1.6
Step 3: Compute the standard error
SE = \sqrt{\frac{3.6^2}{40} + \frac{4.1^2}{35}}
Compute pieces:
\frac{3.6^2}{40} = \frac{12.96}{40} = 0.324
\frac{4.1^2}{35} = \frac{16.81}{35} \approx 0.4803
So
SE = \sqrt{0.324 + 0.4803} = \sqrt{0.8043} \approx 0.8968
Step 4: Find t^*
Technology will report an approximate df automatically. By hand, the conservative approximation gives df = \min(n_1-1,\ n_2-1) = \min(39,\ 34) = 34. For 95% confidence with df = 34, t^* is a bit above 2.
Step 5: Build the interval
\left(\bar{x}_1 - \bar{x}_2\right) \pm t^*SE = 1.6 \pm t^*(0.8968)
If t^* is approximately 2.03, then
ME = 2.03(0.8968) \approx 1.82
So the interval is
\left(1.6 - 1.82,\ 1.6 + 1.82\right) = \left(-0.22,\ 3.42\right)
Interpretation
“We are 95% confident that the true difference in mean weekly study time (Program 1 minus Program 2) is between about -0.22 and 3.42 hours.”
Because 0 is inside the interval, a true difference of 0 hours is plausible based on these data.
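The numbers in this example can be checked with a short computation, here using the conservative df = 34 (technology's Welch df would give a slightly narrower interval):

```python
from scipy import stats
import math

x1, s1, n1 = 12.4, 3.6, 40   # Program 1 summary statistics
x2, s2, n2 = 10.8, 4.1, 35   # Program 2 summary statistics

point = x1 - x2                          # 1.6
se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # about 0.8968
df = min(n1 - 1, n2 - 1)                 # conservative choice: 34
t_star = stats.t.ppf(0.975, df=df)       # a bit above 2
me = t_star * se

lower, upper = point - me, point + me
print(round(lower, 2), round(upper, 2))  # interval straddles 0
```

Because the lower endpoint is negative and the upper endpoint is positive, 0 lies inside the interval, matching the conclusion above.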
What goes wrong most often in two-mean intervals
- Mixing up the order of subtraction: You must define the parameter as \mu_1 - \mu_2 and then compute \bar{x}_1 - \bar{x}_2 in the same order. If you swap, your interval changes sign.
- Treating paired data as independent: If the same individuals are measured twice, independence is violated; use a one-sample interval on differences instead.
- Forgetting the square root or squaring incorrectly in the standard error:
SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
Students sometimes compute \frac{s_1}{n_1} instead of \frac{s_1^2}{n_1}, which can drastically shrink the interval incorrectly.
Exam Focus
- Typical question patterns:
- “Construct and interpret a confidence interval for \mu_1 - \mu_2” and state what it suggests about a difference.
- Determine whether the situation is independent two-sample or paired, and justify the correct method.
- Explain how changing n_1 or n_2 affects the margin of error.
- Common mistakes:
- Reporting an interval but interpreting it as “most sample means will fall here” instead of interpreting it as plausible values for \mu_1 - \mu_2.
- Using df = n_1 + n_2 - 2 automatically (that df corresponds to a pooled approach under equal-variance assumptions, which is not the default in many AP Stats courses).
- Failing to address conditions for both groups (especially checking for skew/outliers separately).