Calculate Mean and Standard Deviation: Calculate the mean and standard deviation of the sampling distribution of a sample mean \bar{x}, and interpret the standard deviation.
Examine Shape Influences: Explain how the shape of the sampling distribution of \bar{x} is affected by the population distribution shape and the sample size.
Difference in Sample Means: Calculate the mean and standard deviation of the sampling distribution of a difference in sample means \bar{x}_1 - \bar{x}_2 and interpret that standard deviation.
Normality Assessment: Determine if the sampling distribution of \bar{x}_1 - \bar{x}_2 is approximately Normal.
Probability Calculations: If appropriate, use a Normal distribution to calculate probabilities involving \bar{x} or \bar{x}_1 - \bar{x}_2.
Introduction to Quantitative Variables and Sample Statistics
Categorical vs. Quantitative Variables:
- Sample proportions (\hat{p}) typically arise when investigating categorical variables (e.g., "What proportion of adults watched a specific show?").
- Quantitative variables (e.g., household income, blood pressure, lifetime of car brake pads) require different statistics such as the median, mean, or standard deviation.
The Sample Mean: The sample mean \bar{x} is the most common statistic computed from quantitative data.
The Sampling Distribution of \bar{x}
Definition: The sampling distribution of the sample mean \bar{x} describes the distribution of values taken by the sample mean \bar{x} in all possible samples of the same size from the same population.
Activity Case Study: "Penny for Your Thoughts":
- Activity conducted by Mrs. Gallas’s class using a population of pennies.
- Students produced dotplots for samples of size n=5 and samples of size n=20.
Comparison of Sampling Distributions from the Activity:
- Shape: The distribution is slightly skewed to the left when n=5, but becomes roughly symmetric when n=20.
- Center: The distribution is centered at approximately the year 2002 for both sample sizes (μ_{\bar{x}} \approx 2002).
- Variability: The distribution of \bar{x} is about half as variable when using samples of size n=20 (σ_{\bar{x}} \approx 2.6) compared to samples of size n=5 (σ_{\bar{x}} \approx 5.2).
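The pattern from the activity can be explored with a small simulation. The sketch below is only an illustration: the population of penny years is a made-up, left-skewed stand-in (not Mrs. Gallas's class data), and the helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics

def simulate_sample_means(population, n, reps, rng):
    """Return `reps` sample means, each from an SRS of size n (no replacement)."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(reps)]

rng = random.Random(1)  # fixed seed so the run is repeatable

# ASSUMPTION: a synthetic population of 10,000 penny years. Ages are roughly
# exponential, so years pile up near the most recent year (left-skewed).
population = [2015 - min(int(rng.expovariate(1 / 12)), 55) for _ in range(10_000)]

means_5 = simulate_sample_means(population, 5, 2000, rng)
means_20 = simulate_sample_means(population, 20, 2000, rng)

sd_5 = statistics.pstdev(means_5)
sd_20 = statistics.pstdev(means_20)

# Centers: both sets of sample means should sit near the population mean.
print(statistics.mean(population), statistics.mean(means_5), statistics.mean(means_20))
# Variability: the ratio should be close to sqrt(20 / 5) = 2.
print(sd_5 / sd_20)
```

Because n=20 is four times n=5, the ratio of the two simulated standard deviations should land near 2, which mirrors the "about half as variable" observation above.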
Mean and Standard Deviation of the Sampling Distribution of \bar{x}
General Principle: Suppose \bar{x} is the mean of an SRS of size n drawn from a large population with mean μ and standard deviation σ.
Mean of the Sampling Distribution (μ_{\bar{x}}):
- μ_{\bar{x}} = μ
- This indicates that the sample mean \bar{x} is an unbiased estimator of the population mean μ.
Standard Deviation of the Sampling Distribution (σ_{\bar{x}}):
- σ_{\bar{x}} = \frac{σ}{\sqrt{n}}
- This formula is approximately correct as long as the 10% condition is satisfied: n < 0.10N.
- Interpretation: The value σ_{\bar{x}} measures the typical distance between a sample mean and the population mean.
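The two formulas and the 10% condition can be bundled into one small helper. This is a minimal sketch: the function name `sampling_dist_of_mean` and the numbers in the usage line (μ = 50, σ = 12, n = 36, N = 5000) are illustrative assumptions, not values from the text.

```python
import math

def sampling_dist_of_mean(mu, sigma, n, population_size=None):
    """Return (mean of x-bar, SD of x-bar, 10% condition result) for an SRS of size n.

    The third value is True/False when population_size is given,
    and None when the population size is unknown and cannot be checked.
    """
    if n < 1 or sigma < 0:
        raise ValueError("need n >= 1 and sigma >= 0")
    # 10% condition: n < 0.10 * N
    ten_percent_ok = None if population_size is None else n < 0.10 * population_size
    return mu, sigma / math.sqrt(n), ten_percent_ok

# Illustrative (assumed) values, not taken from the text.
mean_xbar, sd_xbar, ok = sampling_dist_of_mean(mu=50.0, sigma=12.0, n=36, population_size=5000)
print(mean_xbar, sd_xbar, ok)
```

Returning the condition check alongside the two numbers keeps the reminder that σ/√n is only approximately right when sampling without replacement.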
Behavior of \bar{x} in Repeated Samples
Variability Factors: The variability of \bar{x} depends on both the variability of the population (σ) and the sample size (n).
- Populations with higher variability result in more variable values of \bar{x}.
- Larger samples result in less variable values of \bar{x}.
- The Inverse Square Root Relationship: Because σ_{\bar{x}} = \frac{σ}{\sqrt{n}}, multiplying the sample size by 4 cuts the standard deviation of the sampling distribution in half.
Sampling With vs. Without Replacement:
- With replacement: The standard deviation is exactly \frac{σ}{\sqrt{n}}.
- Without replacement: Observations are not independent, making the actual standard deviation smaller than the formula suggests. However, if the sample size is less than 10% of the population size, the formula is considered nearly correct.
- Note: Larger samples provide more information. If the sample size exceeds 10% of the population size, a "finite population correction" is required (though avoided in this specific text).
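To see why the formula overstates the variability when sampling without replacement, the sketch below enumerates every possible sample from a tiny population. The population values are hypothetical, and the sample is deliberately half the population so the effect is easy to see. The finite population correction used here is the standard factor \sqrt{(N-n)/(N-1)}; it is shown only for comparison, since the text itself avoids it.

```python
import itertools
import math
import statistics

population = [2, 4, 6, 8, 10, 12]   # ASSUMPTION: hypothetical tiny population, N = 6
N, n = len(population), 3            # n is 50% of N, far above the 10% guideline
sigma = statistics.pstdev(population)

# Every possible SRS of size n drawn without replacement, with its sample mean.
means = [sum(s) / n for s in itertools.combinations(population, n)]
exact_sd = statistics.pstdev(means)

formula_sd = sigma / math.sqrt(n)                          # sigma / sqrt(n)
corrected_sd = formula_sd * math.sqrt((N - n) / (N - 1))   # finite population correction

print(exact_sd, formula_sd, corrected_sd)
```

The enumerated standard deviation matches the corrected value and is smaller than σ/√n, which is the direction the notes describe.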
Shape Consistency: These facts regarding the mean and standard deviation of \bar{x} hold true regardless of the shape of the population distribution.
AP® Exam Tip: Proper Notation
Correct notation is vital on the AP Exam. Different symbols have specific meanings: p, \bar{x}, n, μ, μ_{\bar{x}}, σ, σ_{\bar{x}}, \hat{p}.
Using incorrect notation can cause a loss of credit. If unsure, it is better to avoid the notation than to use it incorrectly.
Example Problem: Movie Attendance Statistics
Problem Context: Among students at a large high school, the number of movies viewed in the last year has population mean μ = 19.3 movies and standard deviation σ = 15.8 movies.
Sample Details: An SRS of n=100 students is taken.
Questions and Solutions:
- (a) Identify the mean of the sampling distribution:
- Solution: μ_{\bar{x}} = μ = 19.3\text{ movies}.
- (b) Calculate and interpret the standard deviation; verify the 10% condition:
- Condition Verification: Because the high school is large, assume n = 100 is less than 10% of all students at the school (that is, the school has more than 1000 students).
- Calculation: σ_{\bar{x}} = \frac{15.8}{\sqrt{100}} = \frac{15.8}{10} = 1.58\text{ movies}.
- Interpretation: In SRSs of size 100, the sample mean number of movies viewed will typically vary by about 1.58 movies from the true population mean of 19.3 movies.
Mathematical Derivation of Mean and Standard Deviation (Think About It)
The Independent Random Variable Model: Let the measurements in a sample of size n be X_1, X_2, …, X_n. For a large population, these are independent random variables, each with mean μ and standard deviation σ.
Definition of Sample Mean: \bar{x} = \frac{1}{n}(X_1 + X_2 + \cdots + X_n).
Let T be the sum: T = X_1 + X_2 + \cdots + X_n.
Mean Calculation for T and \bar{x}:
- μ_T = μ + μ + \cdots + μ = nμ
- Since \bar{x} = (\frac{1}{n})T, then μ_{\bar{x}} = (\frac{1}{n})μ_T = (\frac{1}{n})(nμ) = μ
Standard Deviation Calculation for T and \bar{x}:
- Using variance rules: σ_T^2 = σ^2 + σ^2 + \cdots + σ^2 = nσ^2
- Standard deviation of the sum: σ_T = \sqrt{nσ^2} = σ\sqrt{n}
- Since \bar{x} = (\frac{1}{n})T, then σ_{\bar{x}} = (\frac{1}{n})σ_T = \frac{σ\sqrt{n}}{n} = \frac{σ}{\sqrt{n}}
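The derivation can be checked numerically on a case small enough to enumerate completely. The sketch below assumes a hypothetical model in which each observation is one roll of a fair six-sided die, with n = 3 independent rolls. It lists all equally likely outcomes and compares the mean and standard deviation of the sum T and of the sample mean with nμ, σ√n, μ, and σ/√n.

```python
import itertools
import math
import statistics

faces = [1, 2, 3, 4, 5, 6]   # ASSUMPTION: each observation is one fair die roll
n = 3                         # number of independent observations
mu = statistics.mean(faces)
sigma = statistics.pstdev(faces)

# All 6**n equally likely outcomes (X1, ..., Xn) of independent rolls.
outcomes = list(itertools.product(faces, repeat=n))
totals = [sum(o) for o in outcomes]   # T = X1 + ... + Xn
xbars = [t / n for t in totals]       # x-bar = T / n

print(statistics.mean(totals), n * mu)                   # mean of T vs n*mu
print(statistics.pstdev(totals), sigma * math.sqrt(n))   # SD of T vs sigma*sqrt(n)
print(statistics.mean(xbars), mu)                        # mean of x-bar vs mu
print(statistics.pstdev(xbars), sigma / math.sqrt(n))    # SD of x-bar vs sigma/sqrt(n)
```

Each printed pair should agree up to floating-point rounding, since the enumeration covers the whole sampling distribution rather than a random subset of it.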
Section 7.3 Practice and Review Questions
Exercise 54: Sample Size and Standard Deviation
- Decreasing sample size from 750 to 375 (halving the size) would multiply the standard deviation by:
- (a) 2
- (b) \sqrt{2}
- (c) 1/2
- (d) 1/\sqrt{2}
- (e) none of these.
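One way to reason about Exercise 54 is to use the fact that the standard deviation of a sampling distribution is proportional to 1/\sqrt{n}, so everything else cancels in the ratio. The labels σ_{\text{new}} and σ_{\text{old}} below are introduced only for this sketch.

```latex
\frac{\sigma_{\text{new}}}{\sigma_{\text{old}}}
  = \frac{1/\sqrt{375}}{1/\sqrt{750}}
  = \sqrt{\frac{750}{375}}
  = \sqrt{2} \approx 1.41
```

Halving the sample size therefore makes the standard deviation larger, but by a factor smaller than 2.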
Exercise 55: Normality of \hat{p}
- The sampling distribution of \hat{p} is approximately Normal because:
- (a) there are at least 7500 Division I college athletes.
- (b) np=225 and n(1−p)=525 are both at least 10.
- (c) a random sample was chosen.
- (d) the responses are quantitative.
- (e) it always has this shape.
Exercise 56: Party Affiliation Probabilities
- In a district where 55% of voters are Democrats, which expression represents the probability of getting less than 50% Democrats in a sample of size 100?
- Target probability: P(Z < \frac{0.50 - 0.55}{\sqrt{\frac{0.55(0.45)}{100}}}).
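The target expression can be evaluated with Python's standard library. This is a sketch under the usual Normal approximation for a sample proportion, which seems reasonable here because np = 55 and n(1−p) = 45 are both at least 10. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.55, 100                       # values stated in Exercise 56
sd_phat = math.sqrt(p * (1 - p) / n)   # standard deviation of the sample proportion
z = (0.50 - p) / sd_phat               # standardized value for a sample proportion of 0.50
prob = NormalDist().cdf(z)             # P(Z < z) under the standard Normal distribution

print(round(z, 2), round(prob, 3))
```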
Exercise 57: Music Sharing Table (Recycle and Review)
- Data provided: 29% download music, 21% share music, 12% do both.
- (a) Create a two-way table.
- (b) What percent of users neither download nor share?
- (c) Given a user downloads music, what is the probability they also share?
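The three given percents determine all four cells of the two-way table. The sketch below fills them in by subtraction and then computes the quantities asked for in parts (b) and (c). The dictionary layout and variable names are just one possible way to organize the work.

```python
download, share, both = 29, 21, 12   # percents given in Exercise 57

# Each cell is a percent of all users; the four cells must total 100.
table = {
    ("download", "share"): both,
    ("download", "no share"): download - both,
    ("no download", "share"): share - both,
    ("no download", "no share"): 100 - (download + share - both),
}

for (row, col), pct in table.items():
    print(f"{row:12s} {col:9s} {pct:3d}%")

neither = table[("no download", "no share")]   # part (b)
p_share_given_download = both / download        # part (c): P(share | download)
print(neither, round(p_share_given_download, 3))
```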
Exercise 58: Whole Grains Observational Study
- Context: In an observational study, people eating 3 servings of whole grains daily have a lower risk of heart disease (−20%), stroke, or cancer (−15%).
- (a) Explain how confounding makes establishing cause-and-effect difficult.
- (b) Explain how researchers could establish a cause-and-effect relationship (suggesting an experimental design).