Unit 5: Sampling Distributions

From Statistics to Sampling Distributions: the “Repeated Sampling” Idea

In AP Statistics, you’re almost always trying to learn about a population (every individual you care about) using data from a sample (the individuals you actually observe). The big problem is that different random samples from the same population won’t look exactly the same. As a result, any number you compute from a sample will naturally vary from sample to sample.

That variability is not something to ignore; it is the reason inference (confidence intervals and significance tests) works. Sampling distributions are how we model and quantify that sample-to-sample variability.

Parameters vs. statistics (and why the difference matters)

A parameter is a numerical value that describes a population. It is fixed (though usually unknown). Examples include:

  • Population proportion: the true fraction of the population with some characteristic, written as p.
  • Population mean: the true average of the population, written as \mu.
  • Population standard deviation: the true spread of the population, written as \sigma.

A statistic is a numerical value computed from a sample. It changes from sample to sample. Examples include:

  • Sample proportion: the fraction of the sample with the characteristic, written as \hat{p}.
  • Sample mean: the average of the sample, written as \bar{x}.

Inference is basically: use a statistic to estimate a parameter, while accounting for how much the statistic tends to vary.

What is a sampling distribution?

A sampling distribution is the distribution of values of a statistic across all possible random samples of a fixed size n from the same population.

It is crucial to keep straight what this is and is not:

  • It is not the distribution of the population.
  • It is not the distribution of the data in one sample.
  • It is the distribution of the statistic (like \hat{p} or \bar{x}) across repeated samples.

A helpful mental movie is:

  1. Imagine the population is fixed.
  2. You repeatedly take many random samples of the same size n.
  3. Each time, compute the statistic.
  4. Make a histogram of those statistic values.

That histogram approximates the sampling distribution.
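The four-step "mental movie" can be sketched as a short simulation. Everything below (the skewed population model, the seed, the sample size, and the repetition count) is an arbitrary illustrative choice:

```python
import random
import statistics

# A hypothetical population: 100,000 right-skewed (non-Normal) values.
random.seed(1)
population = [random.expovariate(1 / 50) for _ in range(100_000)]

n = 30        # fixed sample size for every repeated sample
reps = 2_000  # number of repeated samples ("the mental movie")

sample_means = []
for _ in range(reps):
    sample = random.sample(population, n)         # one random sample of size n
    sample_means.append(statistics.mean(sample))  # compute the statistic

# The list `sample_means` approximates the sampling distribution of x-bar:
# centered near the population mean, and much less spread out than the
# individual population values.
print(round(statistics.mean(population), 1))
print(round(statistics.mean(sample_means), 1))
print(round(statistics.stdev(sample_means), 1))
```

A histogram of `sample_means` is the histogram described in step 4.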

Why sampling distributions matter: the bridge to probability

Once you understand the sampling distribution of a statistic, you can answer probability questions like:

  • “If the true population proportion is p = 0.60, how likely is it that a sample of size n = 100 produces \hat{p} \ge 0.70?”
  • “If the population mean is \mu = 50 with \sigma = 10, how likely is \bar{x} < 48 for n = 25?”

Those are probability questions about random samples. Sampling distributions give you the probability model needed to compute those probabilities.
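As a sketch of how those two bulleted questions can be answered: the proportion question can be computed exactly as a binomial tail sum (since \hat{p} \ge 0.70 means at least 70 successes), and the mean question uses the Normal model for \bar{x} developed later in this unit:

```python
from math import comb
from statistics import NormalDist

# Question 1: exact binomial calculation of P(p-hat >= 0.70) when
# p = 0.60 and n = 100, i.e. P(X >= 70) for X ~ Binomial(100, 0.60).
n, p = 100, 0.60
p_q1 = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(70, n + 1))

# Question 2: the sampling distribution of x-bar when mu = 50, sigma = 10,
# n = 25 is N(50, 10/sqrt(25)) = N(50, 2), so P(x-bar < 48) is a Normal area.
p_q2 = NormalDist(mu=50, sigma=10 / 25**0.5).cdf(48)

print(round(p_q1, 4), round(p_q2, 4))
```

Both answers are small, which is exactly the kind of "how unusual is this sample?" judgment that inference is built on.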

Bias vs. variability (two different problems)

When you use a statistic to estimate a parameter, two things can go wrong:

  • Bias: the sampling distribution is centered away from the true parameter (systematically too high or too low).
  • Variability: the sampling distribution is too spread out (estimates jump around from sample to sample).

A key idea in this unit is that common statistics like \hat{p} and \bar{x} are unbiased under random sampling, meaning their sampling distributions are centered at the true parameter. In AP Statistics, the sampling distributions of proportions, means, and slopes are all treated as unbiased: the set of all sample proportions is centered at the population proportion, the set of all sample means at the population mean, and the set of all sample slopes at the population slope.

Not all statistics are unbiased estimators. For example, the sample maximum is typically a biased estimator of the population maximum: for a given sample size, the set of all sample maxima tends to fall below the population maximum.

Notation reference (what you’re describing)

  Quantity             | Population (parameter) | Sample (statistic) | Sampling distribution (of statistic)
  Proportion           | p                      | \hat{p}            | Mean and SD of \hat{p} across samples
  Mean                 | \mu                    | \bar{x}            | Mean and SD of \bar{x} across samples
  Standard deviation   | \sigma                 | s                  | Unit 5 focuses on \sigma for modeling \bar{x}

One more term you’ll see constantly:

  • Standard error: the standard deviation of a statistic’s sampling distribution (for example, the SD of \hat{p} across samples).

Exam Focus
  • Typical question patterns:
    • Distinguish clearly between a population distribution, a sample distribution, and a sampling distribution.
    • Interpret what a sampling distribution represents in context (repeated samples, same n).
    • Identify parameters vs. statistics from a scenario.
    • Explain what it means for an estimator to be biased or unbiased.
  • Common mistakes:
    • Treating the sampling distribution like it’s the distribution of individual data values.
    • Calling \hat{p} or \bar{x} a parameter (they are statistics).
    • Confusing “bias” (center) with “variability” (spread).
    • Forgetting that the sampling distribution depends on sample size n.

Normal Distribution Calculations and Standardization (z-scores)

The Normal distribution provides a valuable model for how many sample statistics vary under repeated random sampling from a population, especially when conditions justify an approximately Normal sampling distribution. Normal calculations are usually done through z-scores, which measure how many standard deviations a value is from the mean.

Standardization (the big idea)

If a statistic T is Normal (or approximately Normal) with mean \mu_T and standard deviation \sigma_T, then the standardized value

Z = \frac{T - \mu_T}{\sigma_T}

follows the standard Normal distribution.

In this unit, T is most commonly \hat{p}, \bar{x}, \hat{p}_1 - \hat{p}_2, or \bar{x}_1 - \bar{x}_2.

TI-84 tools for Normal calculations

On the TI-84:

  • \text{normalcdf(lower, upper)} gives the area (probability) between two z-scores.
  • \text{invNorm(area)} gives the z-score with the given area (probability) to the left.

You can also work directly with raw scores (not z-scores) by providing the mean and standard deviation:

  • \text{normalcdf(lower, upper, mean, sd)}
  • \text{invNorm(area, mean, sd)}
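If you want to check TI-84 answers elsewhere, Python's statistics.NormalDist can mimic both commands. The function names normalcdf and invnorm below are just illustrative wrappers, not standard library names:

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under N(mean, sd) between lower and upper (like the TI-84 command)."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(area, mean=0.0, sd=1.0):
    """Value with the given area to its LEFT under N(mean, sd) (like invNorm)."""
    return NormalDist(mean, sd).inv_cdf(area)

print(round(normalcdf(-1, 1), 4))  # area within 1 SD of the mean, about 0.6827
print(round(invnorm(0.975), 2))    # z with 97.5% of area to the left: 1.96
```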

Example 5.1 (Normal distribution with raw scores)

The life expectancy of a particular brand of lightbulb is roughly Normally distributed with mean 1500 hours and standard deviation 75 hours.

1) Probability a bulb lasts less than 1410 hours.

Compute the z-score:

z = \frac{1410 - 1500}{75} = -1.2

Then:

P(X < 1410) = P(Z < -1.2) \approx 0.1151

On the TI-84, you could use either:

  • \text{normalcdf(0, 1410, 1500, 75)} \approx 0.1151
  • \text{normalcdf(-10, -1.2)} \approx 0.1151

2) Probability a bulb lasts between 1563 and 1648 hours.

Compute the z-scores:

z_{1563} = \frac{1563 - 1500}{75} = 0.84

z_{1648} = \frac{1648 - 1500}{75} \approx 1.97

Then:

P(1563 < X < 1648) \approx 0.1762

(For example, \text{normalcdf(1563, 1648, 1500, 75)} \approx 0.1762 or \text{normalcdf(0.84, 1.97)} \approx 0.1760 depending on rounding.)

3) Probability a bulb lasts between 1416 and 1677 hours.

Compute the z-scores:

z_{1416} = \frac{1416 - 1500}{75} = -1.12

z_{1677} = \frac{1677 - 1500}{75} = 2.36

Then:

P(1416 < X < 1677) \approx 0.8595
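All three parts of Example 5.1 come from the same Normal model, so they can be checked together (a quick verification sketch using Python's statistics.NormalDist):

```python
from statistics import NormalDist

bulbs = NormalDist(mu=1500, sigma=75)  # lifetime model from Example 5.1

p1 = bulbs.cdf(1410)                    # P(X < 1410)
p2 = bulbs.cdf(1648) - bulbs.cdf(1563)  # P(1563 < X < 1648)
p3 = bulbs.cdf(1677) - bulbs.cdf(1416)  # P(1416 < X < 1677)

print(round(p1, 4), round(p2, 4), round(p3, 4))  # 0.1151 0.1762 0.8595
```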

What to show for full credit on AP-style Normal probability work

To receive full credit for probability calculations using probability distributions, show:

  1. Name of the distribution (for example, “Normal”)
  2. Parameters (for example, “\mu = 1500, \sigma = 75”)
  3. Boundary (for example, “1410”)
  4. Values of interest (for example, “P(X < 1410) \approx 0.1151”)

Standardizing common sampling-distribution statistics

If conditions justify a Normal model:

  • For a sample proportion:

Z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}

  • For a sample mean:

Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}

Worked example: “unusual” sample proportion (z-score in context)

A manufacturer claims that only p = 0.03 of its light bulbs are defective. A store samples n = 400 bulbs and finds 20 defective bulbs.

Compute the observed sample proportion:

\hat{p} = \frac{20}{400} = 0.05

Check Large Counts using the claimed p:

np = 400(0.03) = 12

n(1-p) = 400(0.97) = 388

Compute standard error under the claim:

\sigma_{\hat{p}} = \sqrt{\frac{0.03(0.97)}{400}} \approx 0.00853

Standardize:

z = \frac{0.05 - 0.03}{0.00853} \approx 2.34

A z-score around 2.34 corresponds to a small upper-tail probability (about 0.009). That suggests \hat{p} = 0.05 would be unusual if p = 0.03 were true.
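A quick recomputation of this example (the variable names below are illustrative):

```python
from math import sqrt
from statistics import NormalDist

p_claim, n = 0.03, 400   # manufacturer's claimed defect rate and sample size
p_hat = 20 / 400         # observed sample proportion

se = sqrt(p_claim * (1 - p_claim) / n)  # SD of p-hat under the claim
z = (p_hat - p_claim) / se              # standardized distance from the claim
tail = 1 - NormalDist().cdf(z)          # upper-tail probability

print(round(se, 5), round(z, 2), round(tail, 4))
```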

Common misconception: using \hat{p} inside the standard error for probability questions

For Unit 5 probability modeling, use the parameter in the standard error when the parameter is given.

  • If the question says “assume p = 0.40,” use p in \sqrt{p(1-p)/n}.
  • If the question says “assume \mu = 50 and \sigma = 10,” use \sigma/\sqrt{n}.

Later, in inference settings, parameters may be unknown and you may substitute estimates, but that is a different goal.

Exam Focus
  • Typical question patterns:
    • Convert raw scores to z-scores and compute Normal probabilities.
    • Use \text{normalcdf} and \text{invNorm} appropriately (sometimes directly with raw scores and the given mean and SD).
    • Compute a z-score for an observed \hat{p} or \bar{x} under a given p or \mu, then find a tail probability.
    • Use a sampling distribution model to set a cutoff for the “most extreme 5%” of outcomes.
  • Common mistakes:
    • Reporting a z-score as the final answer when the question asks for a probability.
    • Forgetting to state the distribution name and parameters.
    • Putting \hat{p} where p belongs (or \bar{x} where \mu belongs) when the problem states an assumed parameter.

The Sampling Distribution of the Sample Proportion \hat{p}

The sample proportion \hat{p} is one of the most common statistics in AP Statistics. It estimates the population proportion p and is used for success/failure (categorical) settings where interest is in the presence or absence of some attribute.

What is \hat{p}?

Suppose you take a random sample of size n from a population. Let X be the number of “successes” in the sample. Then:

\hat{p} = \frac{X}{n}

Even if p is fixed, X changes from sample to sample because the sample is random, so \hat{p} changes too.

Connecting to binomial facts (a useful derivation idea)

From binomial work, for the count of successes X in n trials:

\mu_X = np

\sigma_X = \sqrt{np(1-p)}

To move from counts to proportions, you divide by n. When every value in a distribution is divided by a constant, the mean and standard deviation are divided by the same constant. That leads to the proportion results:

\mu_{\hat{p}} = p

\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}
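These two formulas can be checked by simulation; the values of p, n, the seed, and the repetition count below are arbitrary choices:

```python
import random
import statistics
from math import sqrt

# Simulate many sample proportions and compare their mean and SD
# to mu_p-hat = p and sigma_p-hat = sqrt(p(1-p)/n).
random.seed(5)
p, n, reps = 0.30, 100, 20_000

p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

print(round(statistics.mean(p_hats), 3))   # should sit near p = 0.30
print(round(statistics.stdev(p_hats), 4))  # empirical SD of p-hat
print(round(sqrt(p * (1 - p) / n), 4))     # formula value, about 0.0458
```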

Center: the mean of \hat{p}

For random sampling, the sampling distribution of \hat{p} is centered at the true proportion:

\mu_{\hat{p}} = p

This is the formal statement that \hat{p} is an unbiased estimator of p.

Spread: the standard deviation (standard error) of \hat{p}

\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}

Larger n makes the standard error smaller. Also, p(1-p) is largest near p = 0.5, meaning proportions near 50% have the most sampling variability for a fixed n.

Shape: when is \hat{p} approximately Normal?

For probability calculations, you often want an approximate model for the sampling distribution’s shape.

A core AP condition is the Large Counts condition:

  • np \ge 10 and n(1-p) \ge 10

If that is met (and the sample is random with independence justified), then:

\hat{p} \approx N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)

Independence and the 10% condition

These results assume observations are (approximately) independent.

  • Sampling with replacement supports independence.
  • Sampling without replacement from a population of size N is close enough to independent if the 10% condition holds: n \le 0.10N

Worked example 1: probability using the Normal model for \hat{p}

A school district estimates that p = 0.40 of students walk to school. Suppose you take a random sample of n = 200 students.

Check conditions

  • Random sample: assumed.
  • 10% condition: if the district has at least N = 2000 students, then 200 \le 0.10N.
  • Large Counts:

np = 200(0.40) = 80

n(1-p) = 200(0.60) = 120

Mean and standard deviation

\mu_{\hat{p}} = 0.40

\sigma_{\hat{p}} = \sqrt{\frac{0.40(0.60)}{200}} \approx 0.03464

Probability

Find P(\hat{p} \ge 0.45).

z = \frac{0.45 - 0.40}{0.03464} \approx 1.443

So:

P(\hat{p} \ge 0.45) \approx P(Z \ge 1.443) \approx 0.074

Interpretation: If the true proportion is 0.40, only about 7% of random samples of 200 would produce a sample proportion of 0.45 or higher.
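Because \hat{p} comes from a binomial count, the Normal-model answer can be compared against the exact binomial probability. Expect a modest discrepancy here (the exact value tends to come out a bit larger) because the Normal curve ignores the discreteness of counts:

```python
from math import comb

# Exact binomial version of the same question: p-hat >= 0.45 with n = 200
# means X >= 90 successes for X ~ Binomial(200, 0.40).
n, p = 200, 0.40
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(90, n + 1))

print(round(exact, 4))
```

Either way, the conclusion is the same: a sample proportion of 0.45 or higher is fairly unusual when p = 0.40.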

Worked example 2: when Normal is not appropriate

Suppose p = 0.02 and n = 50.

np = 50(0.02) = 1

This fails the Large Counts condition badly. The sampling distribution of \hat{p} will be strongly right-skewed because most samples will have 0 or 1 success. A Normal model can give misleading probabilities (including impossible negative proportions far into the left tail).

Example 5.4 (AP-style probability with \hat{p})

It is estimated that 80% of people with high math anxiety experience brain activity similar to that experienced under physical pain when anticipating doing a math problem. In a simple random sample of 110 people with high math anxiety, what is the probability that less than 75% experience the physical-pain brain activity?

Conditions:

  • Random sample: given.
  • Large Counts:

np = 110(0.80) = 88

n(1-p) = 110(0.20) = 22

  • 10% condition: the sample is less than 10% of all people with math anxiety (stated from context).

Model and calculation:

\mu_{\hat{p}} = 0.80

\sigma_{\hat{p}} = \sqrt{\frac{0.80(0.20)}{110}} \approx 0.0381

z = \frac{0.75 - 0.80}{0.0381} \approx -1.312

Then:

P(\hat{p} < 0.75) \approx 0.0948

(For example, \text{normalcdf(-1000, -1.312)} \approx 0.0948 or \text{normalcdf(-1000, 0.75, 0.80, 0.0381)} \approx 0.0947.)

Exam Focus
  • Typical question patterns:
    • Compute \mu_{\hat{p}} and \sigma_{\hat{p}} for given p and n, then find a probability about \hat{p}.
    • Check and state conditions (randomness, 10% condition, Large Counts) before using a Normal model.
    • Interpret probabilities in context (what it means about repeated samples).
  • Common mistakes:
    • Using \sqrt{\hat{p}(1-\hat{p})/n} when the problem gives a true p for probability modeling.
    • Forgetting the Large Counts condition or computing it using the wrong value.
    • Mixing up X (count of successes) with \hat{p} (proportion).

The Sampling Distribution of Differences in Sample Proportions

Sometimes the statistic of interest is the difference between two sample proportions, \hat{p}_1 - \hat{p}_2. Conceptually, one observed difference is a single value from the set of all possible differences obtained by taking a sample proportion from each population and subtracting.

To judge the significance of one particular difference, you first determine how the differences vary among themselves. Two key facts guide the formulas:

  • The mean of a set of differences is the difference of the means.
  • The variance of a set of differences is the sum of the variances of the individual sets (when the samples are independent).

Center and spread

If population proportions are p_1 and p_2, with independent samples of sizes n_1 and n_2:

\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2

\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}

Shape and conditions (when Normal is reasonable)

You typically model \hat{p}_1 - \hat{p}_2 as approximately Normal when:

  • The two samples are independent random samples (or from random assignment to independent groups).
  • Each sample satisfies the 10% condition relative to its population.
  • Large Counts are satisfied in each group:

n_1p_1 \ge 10

n_1(1-p_1) \ge 10

n_2p_2 \ge 10

n_2(1-p_2) \ge 10

Example 5.5 (difference in proportions probability)

In a study of how environment affects eating habits, scientists revamped one of two nearby fast-food restaurants. At the revamped restaurant, 25% left at least 100 calories of food on their plates. At the unrevamped restaurant, 19% left at least 100 calories. In a random sample of 110 customers at the revamped restaurant and an independent random sample of 120 customers at the unrevamped restaurant, what is the probability that the difference in percentages (revamped minus unrevamped) is more than 10%?

Check conditions:

  • Independent random samples: given.
  • 10% condition: each sample is less than 10% of all fast-food customers (from context).
  • Large Counts:

n_1p_1 = 110(0.25) = 27.5

n_1(1-p_1) = 110(0.75) = 82.5

n_2p_2 = 120(0.19) = 22.8

n_2(1-p_2) = 120(0.81) = 97.2

Model parameters:

\mu_{\hat{p}_1 - \hat{p}_2} = 0.25 - 0.19 = 0.06

\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{0.25(0.75)}{110} + \frac{0.19(0.81)}{120}} \approx 0.0547

Compute:

z = \frac{0.10 - 0.06}{0.0547} \approx 0.731

So:

P(\hat{p}_1 - \hat{p}_2 > 0.10) \approx P(Z > 0.731) \approx 0.232

(For example, \text{normalcdf(0.731, 1000)} \approx 0.232 or \text{normalcdf(0.10, 1.0, 0.06, 0.0547)} \approx 0.232.)
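A simulation sketch of Example 5.5 (the seed and repetition count are arbitrary); the simulated frequency should land near the Normal-model answer of about 0.232:

```python
import random

# Simulate independent sample proportions from each restaurant and
# count how often the difference exceeds 0.10.
random.seed(7)
reps = 20_000
hits = 0
for _ in range(reps):
    p_hat1 = sum(random.random() < 0.25 for _ in range(110)) / 110  # revamped
    p_hat2 = sum(random.random() < 0.19 for _ in range(120)) / 120  # unrevamped
    if p_hat1 - p_hat2 > 0.10:
        hits += 1

print(round(hits / reps, 3))  # the Normal model gave about 0.232
```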

Exam Focus
  • Typical question patterns:
    • Compute the mean and standard deviation of \hat{p}_1 - \hat{p}_2, then find a probability.
    • Explicitly check independence, the 10% condition for each sample, and Large Counts in each group.
    • Interpret the probability in context as a long-run statement about repeated sampling.
  • Common mistakes:
    • Forgetting that the variances add (not the standard deviations) when forming the SD of a difference.
    • Using Large Counts checks with the wrong proportions or mixing up which group is which.
    • Treating paired data as independent samples.

The Sampling Distribution of the Sample Mean \bar{x}

Where \hat{p} summarizes categorical data, \bar{x} summarizes quantitative data. You use \bar{x} to estimate the population mean \mu.

What is \bar{x}?

From a sample of size n with observations x_1, x_2, \dots, x_n, the sample mean is:

\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}

Like \hat{p}, \bar{x} is random because the sample is random.

Center: the mean of \bar{x}

For random sampling:

\mu_{\bar{x}} = \mu

So \bar{x} is an unbiased estimator of \mu.

Spread: the standard deviation (standard error) of \bar{x}

If the population standard deviation is \sigma, then:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

A variance-based way to remember where this comes from is:

  • If the population variance is \sigma^2, then the sum of n independent observations has variance n\sigma^2.
  • Dividing that sum by n to make a mean divides the variance by n^2.

That leaves variance \sigma^2/n and SD \sigma/\sqrt{n}.

This formula also highlights diminishing returns: to cut standard error in half, you need to multiply n by about 4.
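The \sigma/\sqrt{n} formula, including the "quadruple n to halve the standard error" consequence, can be checked by simulation (the Normal population, seed, and repetition count here are illustrative):

```python
import random
import statistics

# Empirically measure the SD of x-bar for two sample sizes and compare
# with sigma/sqrt(n).
random.seed(3)
mu, sigma, reps = 50, 10, 10_000

def sd_of_sample_means(n):
    """Empirical SD of x-bar across `reps` random samples of size n."""
    means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
             for _ in range(reps)]
    return statistics.stdev(means)

sd25 = sd_of_sample_means(25)    # theory: 10/sqrt(25)  = 2.0
sd100 = sd_of_sample_means(100)  # theory: 10/sqrt(100) = 1.0
print(round(sd25, 2), round(sd100, 2))
```

Quadrupling n from 25 to 100 cuts the standard error roughly in half, as the formula predicts.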

Shape: when is \bar{x} approximately Normal?

There are two common routes to a Normal sampling distribution for \bar{x}:

  1. If the population is Normal, then \bar{x} is Normal for any n.
  2. If the population is not Normal, then for large enough n the sampling distribution of \bar{x} becomes approximately Normal by the Central Limit Theorem (CLT).

Worked example 1: probability about \bar{x} when population is Normal

Suppose website load times are Normally distributed with mean \mu = 8.0 minutes and standard deviation \sigma = 1.5 minutes. You randomly sample n = 36 load times and compute \bar{x}.

Because the population is Normal, \bar{x} is Normal.

\mu_{\bar{x}} = 8.0

\sigma_{\bar{x}} = \frac{1.5}{\sqrt{36}} = 0.25

Find P(\bar{x} > 8.4):

z = \frac{8.4 - 8.0}{0.25} = 1.6

P(\bar{x} > 8.4) = P(Z > 1.6) \approx 0.055

Example 5.6 (mean and SD of sample means)

One particular energy drink has an average of 200 mg of caffeine with a standard deviation of 10 mg. A store sells boxes of six bottles each. What are the mean and standard deviation of the average caffeine content (in mg) consumers should expect from the six bottles in each box?

Assuming random sampling and that n = 6 is less than 10% of all such bottles:

\mu_{\bar{x}} = 200

\sigma_{\bar{x}} = \frac{10}{\sqrt{6}} \approx 4.08

Interpretation: For all random samples of size 6 from this population, the sample mean caffeine content will have a mean of 200 mg and will typically vary by about 4.08 mg from 200 mg.

Worked example 2: what changes with n (and what does not)

A common misconception is that a bigger sample makes the population more Normal or makes the data less variable. The population distribution stays the same, and the variability of individual observations stays the same.

What changes is the variability of the sample mean. If \sigma = 10:

  • For n = 25:

\sigma_{\bar{x}} = \frac{10}{\sqrt{25}} = 2

  • For n = 100:

\sigma_{\bar{x}} = \frac{10}{\sqrt{100}} = 1

So averages become more tightly clustered around \mu as n increases.

Exam Focus
  • Typical question patterns:
    • Given \mu, \sigma, and n, compute \mu_{\bar{x}} and \sigma_{\bar{x}} and then a probability involving \bar{x}.
    • Decide whether the sampling distribution of \bar{x} is Normal (population Normal vs. CLT).
    • Compare the effect of different sample sizes on the spread of \bar{x}.
  • Common mistakes:
    • Using \sigma/n instead of \sigma/\sqrt{n}.
    • Claiming \bar{x} is Normal just because the population is “sort of” mound-shaped without addressing skew/outliers or sample size.
    • Confusing the population SD \sigma with the SD of \bar{x}.

The Central Limit Theorem (CLT): Why Averages Tend to Look Normal

The Central Limit Theorem is one of the most important ideas in statistics because it explains why Normal models show up so often, even when the original data are not Normal.

What the CLT says (conceptually and formally)

Start with a population with mean \mu, standard deviation \sigma, and any shape. Pick a sufficiently large sample size n (a common AP rule-of-thumb is n \ge 30, but context matters). Consider all samples of size n and compute the mean of each sample.

For sufficiently large n:

  1. The set of all sample means is approximately Normally distributed (equivalently, the sampling distribution of \bar{x} is approximately Normal).
  2. The mean of the set of sample means equals the population mean:

\mu_{\bar{x}} = \mu

  3. The standard deviation of the set of sample means is approximately the population SD divided by the square root of the sample size:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

A helpful intuition is “noise cancellation by averaging”: random deviations above and below the mean tend to cancel when you average many observations.

Six key ideas to keep in mind

  • Averages vary less than individual values.
  • Averages based on larger samples vary less than averages based on smaller samples.
  • The CLT states that when the sample size is sufficiently large, the sampling distribution of the mean will be approximately Normal.
  • As n increases, the sample distribution (the distribution of the observed data values in one sample) tends to resemble the population distribution more closely.
  • As n increases, the sampling distribution of \bar{x} becomes closer to a Normal distribution.
  • If the original population is Normal, then the sampling distribution of \bar{x} is Normal no matter what the sample size n is.

What the CLT does and does not mean

  • The CLT is about the sampling distribution of \bar{x}, not the distribution of the raw data.
  • The CLT does not say the population becomes Normal.
  • The CLT does not guarantee Normality for small n, especially with strong skewness or outliers.

Conditions you still need: randomness and (approximate) independence

The CLT relies on independence (or close enough), which is why AP Statistics emphasizes:

  • a random sample (or random assignment in an experiment)
  • the 10% condition when sampling without replacement

CLT and sums (another common form)

If you look at the sum:

S = x_1 + x_2 + \cdots + x_n

then for large n, S is approximately Normal with:

\mu_S = n\mu

\sigma_S = \sigma\sqrt{n}

Since \bar{x} = S/n, that leads back to \mu_{\bar{x}} = \mu and \sigma_{\bar{x}} = \sigma/\sqrt{n}.
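A minimal CLT check using a strongly right-skewed Exponential population (which has mean 1 and SD 1); the sample size and repetition count are illustrative:

```python
import random
import statistics

# Sample means from a skewed population should be centered at mu,
# have SD near sigma/sqrt(n), and look roughly Normal in shape.
random.seed(11)
n, reps = 50, 10_000

means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
         for _ in range(reps)]

m, s = statistics.mean(means), statistics.stdev(means)
share = sum(m - s < x < m + s for x in means) / reps

print(round(m, 3))      # CLT center: mu = 1
print(round(s, 3))      # CLT spread: 1/sqrt(50), about 0.141
print(round(share, 2))  # a Normal shape puts about 68% within 1 SD
```

The within-1-SD share landing near 68% is a rough shape check: even though individual Exponential values are badly skewed, averages of 50 of them behave approximately Normally.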

Example 5.2 (CLT with a large sample)

The naked mole rat has a life expectancy of 21 years with a standard deviation of 3 years.

1) In a random sample of 40 such rats, what is the probability that the mean life expectancy is between 20 and 22 years?

Assume the sample is random and less than 10% of the population. With n = 40, a CLT-based Normal model for \bar{x} is typically reasonable.

\mu_{\bar{x}} = 21

\sigma_{\bar{x}} = \frac{3}{\sqrt{40}} \approx 0.474

Compute z-scores:

z_{20} = \frac{20 - 21}{0.474} \approx -2.110

z_{22} = \frac{22 - 21}{0.474} \approx 2.110

So:

P(20 < \bar{x} < 22) \approx 0.965

(For example, \text{normalcdf(-2.110, 2.110)} \approx 0.965 or \text{normalcdf(20, 22, 21, 0.474)} \approx 0.965.)

2) With probability 0.90, the sample mean life expectancy will be at least how many years?

A probability of 0.90 to the right means 0.10 to the left, so use:

z = \text{invNorm}(0.10) \approx -1.282

Then the correct cutoff uses the standard error \sigma_{\bar{x}}:

x = 21 + (-1.282)(0.474) \approx 20.39

A very common mistake is to multiply by the population SD 3 instead of the standard error. If you (incorrectly) did that, you would compute:

21 - 1.282(3) \approx 17.15

That value is included here because it is a classic “wrong SD” error to watch for.
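Part 2 can be checked directly with an inverse-Normal calculation (a sketch using Python's statistics.NormalDist, with the standard error from above):

```python
from math import sqrt
from statistics import NormalDist

# Model for x-bar in Example 5.2: mean 21, SD 3/sqrt(40).
xbar_dist = NormalDist(mu=21, sigma=3 / sqrt(40))

# 0.90 probability to the RIGHT means area 0.10 to the LEFT of the cutoff.
cutoff = xbar_dist.inv_cdf(0.10)
print(round(cutoff, 2))  # about 20.39
```

Note that the SD passed to the model is the standard error, not the population SD of 3, which is exactly the "wrong SD" trap described above.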

Worked example: CLT with a skewed population

Suppose customer spending per visit is strongly right-skewed, with mean \mu = 12.50 and standard deviation \sigma = 9.00. Take a random sample of n = 64 customers and compute \bar{x}.

Even though individual spending is skewed, with n = 64 the CLT often justifies a Normal model for \bar{x} in AP settings (assuming random sampling and independence).

\mu_{\bar{x}} = 12.50

\sigma_{\bar{x}} = \frac{9.00}{\sqrt{64}} = 1.125

What goes wrong: outliers and dependence

Two classic ways CLT reasoning can fail:

  1. Extreme outliers: rare, enormous values can dominate averages, requiring very large n for a Normal approximation.
  2. Dependence: correlated observations reduce effective information and can make the sampling distribution wider than formulas predict.

Exam Focus
  • Typical question patterns:
    • Decide whether it is reasonable to treat \bar{x} as approximately Normal using the CLT, given the population shape and n.
    • Use CLT-based Normal modeling to compute a probability for \bar{x}.
    • Explain why increasing n makes the sampling distribution of \bar{x} less variable and (often) more nearly Normal.
  • Common mistakes:
    • Claiming the data become Normal as n increases (it’s the sampling distribution of the statistic that becomes Normal).
    • Using the CLT without checking independence (randomness and 10% condition).
    • Assuming n = 30 is always enough, even with extreme skew/outliers.
    • Using \sigma where \sigma/\sqrt{n} belongs (as highlighted in Example 5.2).

The Sampling Distribution of Differences in Sample Means

Often the statistic of interest is the difference between two sample means, \bar{x}_1 - \bar{x}_2. To judge how “significant” one observed difference is, you need to understand how differences vary across repeated sampling.

A key fact is that (for independent samples) the variance of a difference equals the sum of the variances.

Center and spread

For independent samples from populations with means \mu_1 and \mu_2, standard deviations \sigma_1 and \sigma_2, and sample sizes n_1 and n_2:

\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2

\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}

Shape and conditions

A Normal model for \bar{x}_1 - \bar{x}_2 is justified when:

  • Samples are independent and random (or result from random assignment to independent groups).
  • Each sample meets the 10% condition if sampling without replacement.
  • Each group’s sampling distribution of the mean is Normal or approximately Normal (population Normal or large n via CLT).

Example 5.7 (difference in means probability)

It is estimated that 40-year-old men contribute an average of 65 genetic mutations to their new children, whereas 20-year-old men contribute an average of 25. Assuming standard deviations of 15 and 5 mutations for the 40- and 20-year-olds, what is the probability that the mean number of mutations in a random sample of thirty-five 40-year-old new fathers is between 35 and 45 more than the mean number in a random sample of forty 20-year-old new fathers?

Assume independent random samples, each less than 10% of their respective populations, and note both sample sizes are over 30, so a CLT-based Normal model is reasonable.

Center:

\mu_{\bar{x}_1 - \bar{x}_2} = 65 - 25 = 40

Spread:

\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{15^2}{35} + \frac{5^2}{40}} \approx 2.656

Compute z-scores for 35 and 45:

z_{35} = \frac{35 - 40}{2.656} \approx -1.883

z_{45} = \frac{45 - 40}{2.656} \approx 1.883

Then:

P(35 < \bar{x}_1 - \bar{x}_2 < 45) \approx 0.940

(For example, \text{normalcdf(-1.883, 1.883)} \approx 0.940 or \text{normalcdf(35, 45, 40, 2.656)} \approx 0.940.)
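A quick recomputation of Example 5.7 using the stated means, SDs, and sample sizes:

```python
from math import sqrt
from statistics import NormalDist

mu_diff = 65 - 25                       # mu1 - mu2
sd_diff = sqrt(15**2 / 35 + 5**2 / 40)  # variances add, then square-root

diff_dist = NormalDist(mu_diff, sd_diff)
prob = diff_dist.cdf(45) - diff_dist.cdf(35)

print(round(sd_diff, 3), round(prob, 3))  # about 2.656 and 0.940
```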

Exam Focus
  • Typical question patterns:
    • Compute the mean and standard deviation of \bar{x}_1 - \bar{x}_2, then find a probability.
    • Justify a Normal model using population Normality or CLT for each group, plus independence checks.
  • Common mistakes:
    • Adding standard deviations instead of adding variances.
    • Forgetting that independence is required for the “variances add” rule.
    • Using \sigma_1/n_1 instead of \sigma_1/\sqrt{n_1} inside the SD formula.

Biased and Unbiased Estimators (and How to Choose Among Them)

Bias means the sampling distribution is not centered on the population parameter. A statistic used to estimate a population parameter is unbiased if the mean of the sampling distribution of the statistic equals the true value of the parameter being estimated.

In many AP Statistics settings, sampling distributions of proportions, means, and slopes are treated as unbiased. But other statistics can be biased. The sample maximum is a classic illustration: for a given sample size, the distribution of sample maxima tends to sit below the population maximum.

Example 5.3 (evaluating estimators)

Five new estimators are being evaluated for quality control in manufacturing professional baseballs of a given weight. Each estimator is tested every day for a month on samples of sizes n = 10, n = 20, and n = 40. The baseballs actually produced that month had a consistent mean weight of 146 grams.

1) Which estimators appear to be unbiased estimators of the population parameter?

Estimators B, C, and D appear to be unbiased because they appear to have means equal to the population mean of 146.

2) Which estimator exhibits the lowest variability for n = 40?

For n = 40, estimator A exhibits the lowest variability, with a range of only 2 grams compared to the other ranges of 6 grams, 4 grams, 4 grams, and 4 grams.

3) Which is the best estimator if the selected estimator will eventually be used with a sample of size n = 100?

Choose estimator D because you should choose an unbiased estimator with low variability. From part (1), B, C, and D are unbiased. Looking at variability as n increases, D shows tighter clustering around 146 than B. While C looks better than D for n = 40, the estimator will be used with n = 100, and D is clearly converging as sample size increases while C remains about the same. Therefore, choose D.

Exam Focus
  • Typical question patterns:
    • Determine whether an estimator is biased by comparing the center of its sampling distribution to the true parameter.
    • Choose a “best” estimator by balancing low bias (ideally unbiased) and low variability, especially as n changes.
  • Common mistakes:
    • Picking an estimator solely because it has the smallest variability even if it is clearly biased.
    • Assuming increasing n eliminates bias (it reduces variability, not bias from the method/statistic).

How Sample Size Drives Precision: Standard Error, Variability, and Planning

A major theme of sampling distributions is that larger samples give more stable statistics. This unit makes that idea precise through standard error formulas.

What “more precise” means statistically

When people say a larger sample is “more accurate,” they often blur two different ideas:

  • Less variable: repeated samples give estimates closer together.
  • Less biased: estimates are centered at the truth.

Increasing n reduces variability (standard error). It does not fix bias caused by a bad sampling method.

A huge convenience sample can be very biased, while a smaller well-designed random sample can be unbiased and scientifically useful.

How standard error changes with n

For proportions:

\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}

For means:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

In both cases, standard error shrinks like 1/\sqrt{n}. That square-root relationship is why reducing uncertainty by a factor of 2 requires multiplying sample size by about 4.
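The square-root relationship is easy to verify numerically. The sketch below uses a hypothetical population standard deviation of 10 grams (an illustrative value, not from the text) and shows that quadrupling n halves the standard error of \bar{x}:

```python
import math

# Hypothetical population SD, chosen only for illustration.
sigma = 10.0

for n in [25, 100, 400]:
    se = sigma / math.sqrt(n)  # standard deviation of x-bar
    print(f"n = {n:4d}: SE of x-bar = {se:.2f}")
# Each quadrupling of n (25 -> 100 -> 400) cuts the SE in half
# (2.00 -> 1.00 -> 0.50), matching the 1/sqrt(n) relationship.
```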

Planning idea: choosing n to target a standard error

To target a standard deviation of \bar{x} at most k:

\frac{\sigma}{\sqrt{n}} \le k

which gives:

n \ge \left(\frac{\sigma}{k}\right)^2

For proportions, to have:

\sqrt{\frac{p(1-p)}{n}} \le k

you need:

n \ge \frac{p(1-p)}{k^2}

Because p is often unknown in planning, a conservative strategy is to use p = 0.5 because p(1-p) is largest at 0.25, producing the largest required sample size for a given target standard error.
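Both planning formulas translate directly into code. The helper names below (`n_for_mean`, `n_for_proportion`) are hypothetical, and the example inputs are invented for illustration; `math.ceil` rounds up because n must be a whole number at least as large as the bound:

```python
import math

def n_for_mean(sigma, k):
    # Smallest n with sigma/sqrt(n) <= k, i.e. n >= (sigma/k)^2
    return math.ceil((sigma / k) ** 2)

def n_for_proportion(k, p=0.5):
    # Smallest n with sqrt(p(1-p)/n) <= k.
    # p = 0.5 is the conservative default: it maximizes p(1-p).
    return math.ceil(p * (1 - p) / k ** 2)

print(n_for_mean(sigma=12, k=1.5))   # (12/1.5)^2 = 64
print(n_for_proportion(k=0.02))      # 0.25/0.0004 = 625
```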

Worked example: comparing standard errors

A survey organization is deciding between n = 400 and n = 1600 for estimating a proportion (assume p is around 0.5).

For n = 400:

\sigma_{\hat{p}} \approx \sqrt{\frac{0.25}{400}} = 0.025

For n = 1600:

\sigma_{\hat{p}} \approx \sqrt{\frac{0.25}{1600}} = 0.0125

Increasing sample size by a factor of 4 cuts the standard error in half.

What goes wrong: “bigger n fixes everything”

Larger samples help with random error, not systematic error. Common sources of systematic error include:

  • undercoverage (missing part of the population)
  • voluntary response bias
  • poorly worded questions
  • nonresponse

A massive biased sample can confidently estimate the wrong value.

Exam Focus
  • Typical question patterns:
    • Compare sampling variability for two different sample sizes using standard error formulas.
    • Explain, in context, why a larger sample gives more consistent estimates.
    • Solve for n needed to achieve a desired standard error (when \sigma or an assumed p is given).
  • Common mistakes:
    • Saying a larger sample reduces bias (it reduces variability, not bias).
    • Forgetting the square root and thinking doubling n halves the standard error.
    • Using p = 0.5 without explaining why (maximizes p(1-p) for conservative planning).

Seeing Sampling Distributions in Action: Simulation and Long-Run Behavior

Sampling distributions are defined using “all possible samples,” which you can’t literally list in realistic situations. Two tools help you understand them anyway:

  1. Theoretical results (formulas and Normal approximations for statistics like \hat{p} and \bar{x}).
  2. Simulation (using technology or repeated randomization to approximate what would happen in many samples).

Normal-based results cover the sampling distributions of the statistics we care about most, the sample proportion and the sample mean. For other statistics, simulation can give a rough picture of the sampling distribution.

Why simulation is legitimate

A sampling distribution is about long-run behavior under a chance process. Simulation imitates that chance process many times and records the statistic each time. With enough repetitions, the histogram of simulated statistics approximates the true sampling distribution.

Simulation:

  • reinforces that the statistic is random
  • shows the effect of n visually
  • can be used when theoretical conditions (like Normal approximations) aren’t met

Example: simulating a sampling distribution for \hat{p}

Suppose p = 0.30 of voters favor a ballot measure. You plan to poll n = 50 voters.

A simulation plan:

  • Repeat many times (say 1000 trials):
    • Generate 50 independent “voters,” each with probability 0.30 of “favor.”
    • Compute \hat{p}.
  • Plot the 1000 values of \hat{p}.

Predictions to check:

  • Center should be near 0.30.
  • Spread should be close to:

\sqrt{\frac{0.30(0.70)}{50}}

  • Shape should be roughly Normal if Large Counts holds:

np = 50(0.30) = 15

n(1-p) = 50(0.70) = 35
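The simulation plan above can be sketched with only the Python standard library. The seed value is arbitrary (it just makes the run reproducible), and the check is that the simulated center and spread land near the theoretical predictions:

```python
import random
import statistics

random.seed(1)  # arbitrary seed for reproducibility

p, n, trials = 0.30, 50, 1000
phats = []
for _ in range(trials):
    # Generate 50 independent "voters," each favoring with probability 0.30.
    favors = sum(1 for _ in range(n) if random.random() < p)
    phats.append(favors / n)

print(f"mean of p-hats: {statistics.mean(phats):.3f}")   # should be near 0.30
print(f"SD of p-hats:   {statistics.stdev(phats):.3f}")  # near sqrt(0.3*0.7/50) = 0.065
```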

Example: simulating when Normal approximation fails

If p is very small and n is modest, the Large Counts condition fails. The sampling distribution of \hat{p} may be skewed and clumped at values like 0, 1/n, 2/n. Simulation gives a more honest picture than forcing a Normal curve.
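A quick sketch makes the failure visible. The values p = 0.02 and n = 25 are hypothetical choices that clearly violate Large Counts (np = 0.5); the simulated \hat{p} values pile up at 0 and 1/25 rather than forming a Normal curve:

```python
import random
from collections import Counter

random.seed(2)  # arbitrary seed for reproducibility

p, n, trials = 0.02, 25, 1000   # np = 0.5, far below the Large Counts threshold
phats = []
for _ in range(trials):
    successes = sum(1 for _ in range(n) if random.random() < p)
    phats.append(successes / n)

# Most p-hat values clump at 0, 1/25 = 0.04, 2/25 = 0.08, ...:
# the distribution is strongly right-skewed, not Normal.
print(Counter(round(ph, 2) for ph in phats).most_common(4))
```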

Example: simulating sampling distributions for “nonstandard” statistics

A study examines the number of dreams high school students remember each night. In the population, the median number is 3.41, the variance is 1.46, and the minimum is 0. Take a large number of random samples of 15 students, compute the median, variance, and minimum for each sample, and graph the simulated sampling distributions.

A typical simulated result is:

  • the simulated sampling distribution of the medians is roughly bell-shaped
  • the simulated sampling distribution of the variances is skewed right
  • the simulated sampling distribution of the minimums is very roughly bell-shaped
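A simulation of this kind can be sketched as follows. The population of dream counts below is invented (the study's actual data are not given), so it only illustrates the mechanics: fix n = 15, repeat many times, and record each statistic:

```python
import random
import statistics

random.seed(3)  # arbitrary seed for reproducibility

# Hypothetical population of dream counts (0 to 6); illustrative only.
population = [random.choice([0, 1, 2, 2, 3, 3, 3, 4, 4, 5, 6])
              for _ in range(5000)]

medians, variances, minimums = [], [], []
for _ in range(1000):
    sample = random.sample(population, 15)   # n stays fixed at 15 each repetition
    medians.append(statistics.median(sample))
    variances.append(statistics.variance(sample))
    minimums.append(min(sample))

# Each list approximates the sampling distribution of that statistic for n = 15;
# histograms of these lists would show the shapes described above.
print("median of the 1000 sample medians:", statistics.median(medians))
```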

Connecting simulation to the formulas

Before running a simulation, predict what it should show:

  • Mean of simulated \hat{p} values should be close to p.
  • SD of simulated \hat{p} values should be close to \sqrt{p(1-p)/n}.

If results don’t match, suspect:

  • the random process wasn’t simulated correctly
  • independence is violated
  • the scenario is not truly repeated sampling from the same population

Exam Focus
  • Typical question patterns:
    • Describe how you could simulate a sampling distribution for \hat{p} or \bar{x}.
    • Interpret a dotplot or histogram of simulated statistics (center, spread, shape) and connect it to p or \mu.
    • Decide whether simulation results support a particular claim about a parameter.
  • Common mistakes:
    • Simulating the distribution of individual outcomes rather than the statistic across repeated samples.
    • Forgetting to keep n fixed across repetitions.
    • Treating simulated results as exact rather than approximate (simulation has random variation too).

Pulling It Together: A Unified View of Sampling Distributions

By the end of Unit 5, you should see \hat{p} and \bar{x} (and their two-sample differences) as versions of the same big idea:

  • They are statistics computed from random samples.
  • They have sampling distributions with predictable centers (often unbiasedness).
  • They have standard errors that shrink as n grows.
  • They are often approximately Normal under checkable conditions.

Unified “center-spread-shape” framework

Whenever you’re asked about a sampling distribution, organize your thinking with:

1) Center (mean of the statistic)

  • For \hat{p}:

\mu_{\hat{p}} = p

  • For \bar{x}:

\mu_{\bar{x}} = \mu

2) Spread (standard deviation of the statistic, also called standard error)

  • For \hat{p}:

\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}

  • For \bar{x}:

\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

3) Shape (Normal or approximately Normal?)

  • For \hat{p}: check Large Counts and independence.
  • For \bar{x}: population Normal or CLT with sufficiently large n, plus independence.
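The center-spread-shape framework can be packaged as two small helpers. The function names are hypothetical, and the example inputs reuse values from earlier in the unit (p = 0.30 with n = 50; the 146-gram mean with an assumed \sigma of 4 grams):

```python
import math

def phat_model(p, n):
    """Mean, standard error, and Large Counts check for the
    sampling distribution of p-hat."""
    mean = p
    se = math.sqrt(p * (1 - p) / n)
    large_counts = n * p >= 10 and n * (1 - p) >= 10
    return mean, se, large_counts

def xbar_model(mu, sigma, n):
    """Mean and standard error for the sampling distribution of x-bar."""
    return mu, sigma / math.sqrt(n)

print(phat_model(0.30, 50))    # mean 0.30, SE about 0.065, Large Counts holds
print(xbar_model(146, 4, 40))  # mean 146, SE = 4/sqrt(40)
```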

Why Unit 5 is the engine behind later units

In later units, you build confidence intervals and perform significance tests. Conceptually, those procedures rely on the same structure:

  • Assume a parameter value (or estimate it).
  • Model the statistic’s sampling distribution.
  • Measure how far the observed statistic is from what the model expects.
  • Convert that “distance” into a probability or a margin of error.

Unit 5 is where the sampling distribution models are developed and justified.

What goes wrong most often: mixing up distributions

A recurring AP Statistics challenge is keeping three distributions straight:

  • Population distribution: distribution of all individuals in the population.
  • Sample distribution: distribution of the observed data values in one sample.
  • Sampling distribution: distribution of a statistic across many samples.

A fast self-check: if you’re describing a distribution of many values of \hat{p}, many values of \bar{x}, or many values of differences like \hat{p}_1 - \hat{p}_2, you’re in the sampling distribution world.

Exam Focus
  • Typical question patterns:
    • Multi-part questions that require conditions, then a model, then a probability, then an interpretation.
    • Conceptual free-response prompts asking you to explain why larger n reduces variability and why Normal models can be used.
    • Identify which distribution (population, sample, sampling) is being shown in a graph.
  • Common mistakes:
    • Stating formulas without context or without checking conditions.
    • Using the wrong standard deviation (population vs. sampling distribution).
    • Interpreting a probability about \bar{x} or \hat{p} as if it were about an individual observation.
    • Confusing the sampling distribution of a statistic with the distribution of the population or the distribution of one sample.