Chapter 7: The Central Limit Theorem
Introduction
The Central Limit Theorem (CLT) is a fundamental concept in statistics that describes how sample means behave.
Sampling Distribution
Definition: A sampling distribution is the distribution of a sample statistic, such as the mean, computed from many samples of the same size taken from a population.
Central Limit Theorem (CLT)
The CLT states that:
When samples of size n are large, the sample means approximately follow a normal distribution, regardless of the shape of the population's distribution.
Key elements:
Samples are drawn from a population with known mean (μ) and standard deviation (σ).
As the sample size n increases, the histogram of sample means trends toward a normal bell shape.
Important Note:
The shape of the population distribution does not need to be known.
A sample size of at least 30 is typically seen as "large enough."
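The trend toward a bell shape can be checked with a small simulation. The sketch below is only an illustration: it assumes a strongly right-skewed population (an exponential distribution with rate 1, chosen here and not taken from the notes) and measures how the skewness of the sample means shrinks as n grows. The function name, the number of samples, and the seed are all arbitrary choices.

```python
import random
import statistics


def sample_mean_skewness(n, num_samples=5000, seed=0):
    """Skewness of the sample means when each sample holds n draws
    from an exponential population (an assumed, right-skewed example)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]
    center = statistics.fmean(means)
    spread = statistics.pstdev(means)
    # Third standardized moment: 0 for a perfectly symmetric distribution.
    return statistics.fmean(((m - center) / spread) ** 3 for m in means)


for n in (2, 10, 30, 100):
    print(n, round(sample_mean_skewness(n), 2))
```

A skewness near 0 indicates a symmetric, bell-like histogram. The printed values should fall toward 0 as n increases, even though the population itself is far from normal.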
Sample Size Considerations
The adequacy of sample size (n) for applying the CLT depends on the underlying population distribution:
If the original population is normal, the sample means are normally distributed for any n, so a smaller n suffices.
If the population distribution is unknown or non-normal, n should be at least 30.
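The rule of thumb above can be written as a short check. This is a sketch of the classroom guideline only, not a formal statistical test; the function name and the threshold parameter are hypothetical.

```python
def clt_applies(n, population_is_normal=False, threshold=30):
    """Rule-of-thumb check: a normal population works for any n;
    otherwise the sample size must reach the threshold (30 by convention)."""
    return population_is_normal or n >= threshold


print(clt_applies(50))                            # True: n is at least 30
print(clt_applies(10))                            # False: small n, shape unknown
print(clt_applies(10, population_is_normal=True)) # True: population is normal
```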
Practice Problems
Practice 1
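Sample size n=50, μ=45, σ=8: Can CLT be applied? Yes (n is at least 30).
Sample size n=10 (population distribution unknown or non-normal): Can CLT be applied? No (n is less than 30).
Sample size n=50 (normal distribution): Can CLT be applied? Yes (a normal population works for any n).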
Central Limit Theorem for Sample Means (Averages)
Section 7.1 Learning Objectives
Students will use CLT properties to estimate the means and standard deviations of sampling distributions of sample means.
The Central Limit Theorem for Sample Means
If X is a random variable with mean μX and standard deviation σX, and X̄ is the mean of a sample of size n, then:
As n increases, the distribution of the sample means X̄ approaches a normal distribution.
In symbols: X̄ ~ N(μX, σX/√n).
σX/√n is called the Standard Error of the Mean (SEM).
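As a quick illustration of the formula, the sketch below computes the SEM and builds the approximate sampling distribution of the mean with Python's standard library. The helper names are hypothetical; the inputs μ=45, σ=8, n=50 are the values from Practice 1.

```python
import math
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard Error of the Mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def sampling_distribution(mu, sigma, n):
    """Approximate distribution of the sample mean: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu=mu, sigma=standard_error(sigma, n))


dist = sampling_distribution(mu=45, sigma=8, n=50)
print(dist.mean)             # 45.0, the same as the population mean
print(round(dist.stdev, 4))  # 1.1314, i.e. 8 / sqrt(50)
```

Note that the sampling distribution keeps the population mean but is narrower than the population by a factor of √n.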
Sampling Error
Definition: Variability observed in sample statistics due to random sampling.
"Error" denotes variability, not mistakes.
Examples of Sampling Error
Sampling Error Example 1
When studying behavioral issues in children, variability occurs between different samples due to randomness in selected subjects.
One sample may contain predominantly well-behaved children, while another may show higher instances of behavior problems.
Sampling Error Example 2
Drawing 10,000 samples and recording each sample's mean produces a distribution of means. The spread in those sample averages is due to chance alone.
Sampling Error Example 3
The majority of sample means cluster near the true population mean (in this example, within the range 45-55), so most samples represent the population reasonably well.
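Examples 2 and 3 can be imitated with a simulation. The notes do not give the population or the sample size, so every number below is an assumption made for illustration: a normal population with mean 50 and standard deviation 15, and samples of size 36 (which gives a standard error of 15/√36 = 2.5).

```python
import random
import statistics

# All four values are assumptions for illustration, not from the notes.
POP_MEAN, POP_SD, N, NUM_SAMPLES = 50.0, 15.0, 36, 10_000

rng = random.Random(42)
sample_means = [
    statistics.fmean(rng.gauss(POP_MEAN, POP_SD) for _ in range(N))
    for _ in range(NUM_SAMPLES)
]

# Spread of the sample means (the sampling error), compared with sigma / sqrt(n).
spread = statistics.stdev(sample_means)
# Share of sample means that land in the 45-55 range.
share_near_mean = sum(45 <= m <= 55 for m in sample_means) / NUM_SAMPLES

print("spread of sample means:", round(spread, 2))
print("theoretical standard error:", POP_SD / N ** 0.5)
print("share of means in 45-55:", round(share_near_mean, 3))
```

Under these assumptions the range 45-55 is two standard errors on either side of the mean, so roughly 95% of the sample means should fall inside it.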
Practice Problems
Practice 2
Scenario: Researching game strategies for 29-35 year-olds based on average gamer age.
Given: the mean age of strategy players is 28 (SD = 4.8). For a sample of 100 players, the probability that the sample mean age falls between 29 and 35 is 0.0186.
Question: Is the development strategy viable? Needs analysis of probability outcome.
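The probability in Practice 2 can be reproduced with the CLT. This is a sketch using Python's standard library, with the variable names chosen here for illustration; the inputs (μ=28, σ=4.8, n=100) come from the problem.

```python
import math
from statistics import NormalDist

mu, sigma, n = 28, 4.8, 100

# Sampling distribution of the mean age: N(28, 4.8 / sqrt(100)) = N(28, 0.48).
xbar = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# P(29 < sample mean < 35)
p_between = xbar.cdf(35) - xbar.cdf(29)
print(round(p_between, 4))  # about 0.0186
```

A sample mean of 29 is already more than two standard errors above 28, which is why the probability is so small.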
Practice 3
Scenario: Cola beverage claims 16 ounces.
Sample n=34, sample mean = 16.01, μ = 16.00, σ = 0.143.
Questions:
Do the results indicate that cans are filled with more than 16 ounces?
How would a consumer and the manufacturer each view this result?
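One way to approach Practice 3 is to ask how likely a sample mean of 16.01 or more would be if the true mean really were 16.00. The sketch below is one possible approach, not the only valid analysis; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 16.00, 0.143, 34
observed_mean = 16.01

# n = 34 is at least 30, so the CLT applies: the sample mean is
# approximately N(16.00, 0.143 / sqrt(34)).
xbar = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

z = (observed_mean - mu) / xbar.stdev
p_at_least = 1 - xbar.cdf(observed_mean)
print(round(z, 2), round(p_at_least, 2))
```

The probability comes out at roughly one in three, so a sample mean of 16.01 is not unusual when the cans truly average 16.00 ounces.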
Practice 4
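Data: Females aged 18-24 have an average systolic BP of 114.8 (SD = 13.1).
For one randomly selected female, the probability that BP > 120 is 0.3457 (z = (120 - 114.8)/13.1 ≈ 0.40).
For a sample of 40 females, the probability that the sample mean BP > 120 is about 0.006 (z = (120 - 114.8)/(13.1/√40) ≈ 2.51).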
Questions:
Interpret the probability outcome.
If the sample has only 4 females and the population distribution is unknown, can the CLT be applied?
Answer: No, the sample size is too small (n is less than 30).
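The probabilities in Practice 4 can be checked as follows. The sketch assumes individual readings are approximately normal for the single-person case (an assumption made here) and uses the CLT for the sample of 40. It shows that 0.3457 is the probability for one randomly selected individual, while the probability for the mean of 40 is far smaller.

```python
import math
from statistics import NormalDist

mu, sigma = 114.8, 13.1

# One individual: uses the population SD directly
# (assumes individual readings are roughly normal).
single = NormalDist(mu=mu, sigma=sigma)
p_single = 1 - single.cdf(120)

# Mean of 40 individuals: uses the standard error 13.1 / sqrt(40).
mean_of_40 = NormalDist(mu=mu, sigma=sigma / math.sqrt(40))
p_mean_40 = 1 - mean_of_40.cdf(120)

print(round(p_single, 4))   # about 0.3457
print(round(p_mean_40, 4))  # about 0.006
```

Averaging 40 readings shrinks the spread by a factor of √40, so a mean above 120 is much rarer than a single reading above 120. The n=4 case is left out because, with an unknown distribution, the CLT does not justify the normal approximation.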