APS Section 7.3: Sample Means Study Guide

Learning Targets for Section 7.3: Sample Means

  • Calculate Mean and Standard Deviation: Calculate the mean and standard deviation of the sampling distribution of a sample mean \bar{x}, and interpret the standard deviation.
  • Examine Shape Influences: Explain how the shape of the sampling distribution of \bar{x} is affected by the population distribution shape and the sample size.
  • Difference in Sample Means: Calculate the mean and standard deviation of the sampling distribution of a difference in sample means \bar{x}_1 - \bar{x}_2 and interpret that standard deviation.
  • Normality Assessment: Determine if the sampling distribution of \bar{x}_1 - \bar{x}_2 is approximately Normal.
  • Probability Calculations: If appropriate, use a Normal distribution to calculate probabilities involving \bar{x} or \bar{x}_1 - \bar{x}_2.

Introduction to Quantitative Variables and Sample Statistics

  • Categorical vs. Quantitative Variables:
    - Sample proportions (\hat{p}) typically arise when investigating categorical variables (e.g., "What proportion of adults watched a specific show?").
    - Quantitative variables (e.g., household income, blood pressure, lifetime of car brake pads) require different statistics such as the median, mean, or standard deviation.
  • The Sample Mean: The sample mean \bar{x} is the most common statistic computed from quantitative data.

The Sampling Distribution of \bar{x}

  • Definition: The sampling distribution of the sample mean \bar{x} describes the distribution of values taken by \bar{x} in all possible samples of the same size from the same population.
  • Activity Case Study: "Penny for Your Thoughts":
    - Activity conducted by Mrs. Gallas’s class using a population of pennies.
    - Students produced dotplots for samples of size n = 5 and samples of size n = 20.
  • Comparison of Sampling Distributions from the Activity:
    - Shape: The distribution is slightly skewed to the left when n = 5, but becomes roughly symmetric when n = 20.
    - Center: The distribution is centered at approximately the year 2002 for both sample sizes (μ_{\bar{x}} \approx 2002).
    - Variability: The distribution of \bar{x} is about half as variable for samples of size n = 20 (σ_{\bar{x}} \approx 2.6) as for samples of size n = 5 (σ_{\bar{x}} \approx 5.2).
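
The pattern from the activity can be reproduced with a small simulation. The sketch below is only an illustration: the population of penny years is invented (it is not the class data), and the names `population` and `simulate_sample_means` are hypothetical. It draws many SRSs of size 5 and size 20 and compares the center and spread of the resulting sample means.

```python
import random
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 penny mint years (NOT the class data):
# most pennies are recent and a few are old, so the years are skewed left.
population = [2015 - min(int(rng.expovariate(1 / 10)), 60) for _ in range(10_000)]

def simulate_sample_means(n, reps=5_000):
    """Return the means of `reps` SRSs of size n drawn without replacement."""
    return [mean(rng.sample(population, n)) for _ in range(reps)]

means_5 = simulate_sample_means(5)
means_20 = simulate_sample_means(20)

print("population mean:", round(mean(population), 2))
for n, means in ((5, means_5), (20, means_20)):
    print(f"n = {n:2d}: center = {mean(means):.2f}, spread = {pstdev(means):.2f}")
print("ratio of spreads:", round(pstdev(means_5) / pstdev(means_20), 2))
```

Both simulated distributions should be centered near the population mean, and the ratio of spreads should land close to 2, matching the "about half as variable" observation for a sample four times as large.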

Mean and Standard Deviation of the Sampling Distribution of \bar{x}

  • General Principle: Suppose \bar{x} is the mean of an SRS of size n drawn from a large population with mean μ and standard deviation σ.
  • Mean of the Sampling Distribution (μ_{\bar{x}}):
    - μ_{\bar{x}} = μ
    - This indicates that the sample mean \bar{x} is an unbiased estimator of the population mean μ.
  • Standard Deviation of the Sampling Distribution (σ_{\bar{x}}):
    - σ_{\bar{x}} = \frac{σ}{\sqrt{n}}
    - This formula is approximately correct as long as the 10% condition is satisfied: n < 0.10N.
    - Interpretation: The value σ_{\bar{x}} measures the typical distance between a sample mean and the population mean.

Behavior of \bar{x} in Repeated Samples

  • Variability Factors: The variability of \bar{x} depends on both the variability of the population (σ) and the sample size (n).
    - Populations with higher variability result in more variable values of \bar{x}.
    - Larger samples result in less variable values of \bar{x}.
    - The Inverse Square Root Relationship: Because σ_{\bar{x}} = \frac{σ}{\sqrt{n}}, multiplying the sample size by 4 cuts the standard deviation of the sampling distribution of \bar{x} in half.
  • Sampling With vs. Without Replacement:
    - With replacement: The standard deviation is exactly \frac{σ}{\sqrt{n}}.
    - Without replacement: Observations are not independent, making the actual standard deviation smaller than the formula suggests. However, if the sample size is less than 10% of the population size, the formula is considered nearly correct.
    - Note: Larger samples provide more information. If the sample size exceeds 10% of the population size, a "finite population correction" is required (though avoided in this specific text).
  • Shape Consistency: These facts regarding the mean and standard deviation of \bar{x} hold true regardless of the shape of the population distribution.
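
The claims about replacement and about quadrupling the sample size can be checked exactly on a tiny example by listing every possible sample. This is a minimal sketch with an invented five-value population; the helper `sd_of_sample_means` is a hypothetical name. The sample size n = 2 is 40% of the population, which deliberately violates the 10% condition so that the without-replacement gap is easy to see.

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5]   # hypothetical tiny population, N = 5
sigma = pstdev(population)     # population standard deviation

def sd_of_sample_means(samples):
    """Standard deviation of the sample mean over all equally likely samples."""
    return pstdev([mean(s) for s in samples])

n = 2
formula = sigma / sqrt(n)
# Every ordered sample drawn with replacement (values may repeat).
with_replacement = sd_of_sample_means(product(population, repeat=n))
# Every sample drawn without replacement (no value repeats).
without_replacement = sd_of_sample_means(combinations(population, n))

print(f"formula sigma/sqrt(n): {formula:.4f}")
print(f"with replacement:      {with_replacement:.4f}")
print(f"without replacement:   {without_replacement:.4f}")

# Quadrupling the sample size (n = 1 to n = 4, with replacement) halves the spread.
sd_n1 = sd_of_sample_means(product(population, repeat=1))
sd_n4 = sd_of_sample_means(product(population, repeat=4))
print(f"sd for n = 1: {sd_n1:.4f}, sd for n = 4: {sd_n4:.4f}")
```

With replacement the enumerated value agrees with the formula, while without replacement it comes out smaller, which is the gap a finite population correction would account for.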

AP® Exam Tip: Proper Notation

  • Correct notation is vital on the AP Exam. Different symbols have specific meanings: p, \bar{x}, n, μ, μ_{\bar{x}}, σ, σ_{\bar{x}}, \hat{p}.
  • Using incorrect notation can cause a loss of credit. If unsure, it is better to avoid the notation than to use it incorrectly.

Example Problem: Movie Attendance Statistics

  • Problem Context: A large high school has a population mean of 19.3 movies viewed in the last year with a standard deviation of 15.8 movies.
  • Sample Details: An SRS of n = 100 students is taken.
  • Questions and Solutions:
    - (a) Identify the mean of the sampling distribution:
        - Solution: μ_{\bar{x}} = μ = 19.3 movies.
    - (b) Calculate and interpret the standard deviation; verify the 10% condition:
        - Condition Verification: Assume n = 100 is less than 10% of the total students at the large high school.
        - Calculation: σ_{\bar{x}} = \frac{15.8}{\sqrt{100}} = \frac{15.8}{10} = 1.58 movies.
        - Interpretation: In SRSs of size 100, the sample mean number of movies viewed will typically vary by about 1.58 movies from the true population mean of 19.3 movies.
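
The two calculations in this example can be wrapped in a short helper. This is a sketch, not part of the original solution: the function name `sampling_distribution_of_mean` is hypothetical, and the school size of 2000 students is an assumed value used only to illustrate the 10% condition check (the problem says only that the school is large).

```python
from math import sqrt

def sampling_distribution_of_mean(mu, sigma, n, population_size=None):
    """Return (mean, standard deviation) of the sampling distribution of the sample mean.

    If a population size is supplied, the 10% condition n < 0.10 * N is checked first.
    """
    if population_size is not None and n >= 0.10 * population_size:
        raise ValueError("10% condition not met: sigma / sqrt(n) may overstate the spread")
    return mu, sigma / sqrt(n)

# Movie example: mu = 19.3, sigma = 15.8, n = 100.
# population_size=2000 is an assumption; the source does not give the enrollment.
mean_xbar, sd_xbar = sampling_distribution_of_mean(19.3, 15.8, 100, population_size=2000)
print(f"mean of sampling distribution: {mean_xbar:.1f} movies")
print(f"standard deviation of sampling distribution: {sd_xbar:.2f} movies")
```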

Mathematical Derivation of Mean and Standard Deviation (Think About It)

  • The Independent Random Variable Model: Let measurements in a sample of size n be X_1, X_2, \dots, X_n. For a large population, these are independent random variables with mean μ and standard deviation σ.
  • Definition of Sample Mean: \bar{x} = \frac{X_1 + X_2 + \dots + X_n}{n}.
  • Let T be the sum: T = X_1 + X_2 + \dots + X_n.
  • Mean Calculation for T and \bar{x}:
    - μ_T = μ + μ + \dots + μ = nμ
    - Since \bar{x} = (\frac{1}{n})T, then μ_{\bar{x}} = (\frac{1}{n})μ_T = (\frac{1}{n})(nμ) = μ
  • Standard Deviation Calculation for T and \bar{x}:
    - Using variance rules: σ_T^2 = σ^2 + σ^2 + \dots + σ^2 = nσ^2
    - Standard deviation of the sum: σ_T = \sqrt{nσ^2} = σ\sqrt{n}
    - Since \bar{x} = (\frac{1}{n})T, then σ_{\bar{x}} = (\frac{1}{n})σ_T = \frac{σ\sqrt{n}}{n} = \frac{σ}{\sqrt{n}}

Section 7.3 Practice and Review Questions

  • Exercise 54: Sample Size and Standard Deviation
    - Decreasing sample size from 750 to 375 (halving the size) would multiply the standard deviation by:
        - (a) 2
        - (b) \sqrt{2}
        - (c) 1/2
        - (d) 1/\sqrt{2}
        - (e) none of these.
  • Exercise 55: Normality of \hat{p}
    - The sampling distribution of \hat{p} is approximately Normal because:
        - (a) there are at least 7500 Division I college athletes.
        - (b) np = 225 and n(1-p) = 525 are both at least 10.
        - (c) a random sample was chosen.
        - (d) the responses are quantitative.
        - (e) it always has this shape.
  • Exercise 56: Party Affiliation Probabilities
    - In a district where 55% of voters are Democrats, which expression represents the probability of getting less than 50% Democrats in a sample of size 100?
    - Target probability: P(Z < \frac{0.50 - 0.55}{\sqrt{\frac{0.55(0.45)}{100}}}).
  • Exercise 57: Music Sharing Table (Recycle and Review)
    - Data provided: 29% download music, 21% share music, 12% do both.
    - (a) Create a two-way table.
    - (b) What percent of users neither download nor share?
    - (c) Given a user downloads music, what is the probability they also share?
  • Exercise 58: Whole Grains Observational Study
    - Context: People eating 3 servings of whole grains daily have a lower risk of heart disease (-20%), stroke, or cancer (-15%).
    - (a) Explain how confounding makes establishing cause-and-effect difficult.
    - (b) Explain how researchers could establish a cause-and-effect relationship (suggesting an experimental design).
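
For readers who want to evaluate the target expression in Exercise 56 numerically, the sketch below standardizes 0.50 using the values stated in the exercise and looks up the area to the left under a standard Normal curve. It uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

p = 0.55   # proportion of Democrats in the district
n = 100    # sample size

# Standardize 0.50 using the standard deviation of the sample proportion.
z = (0.50 - p) / sqrt(p * (1 - p) / n)

# Area to the left of z under the standard Normal curve.
probability = NormalDist().cdf(z)
print(f"z = {z:.2f}, P(Z < z) = {probability:.3f}")
```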