Central Limit Theorem

Central Limit Theorem (CLT) Overview

  • The Central Limit Theorem is a fundamental theorem in probability and statistics.

  • Goals of the study:

    • Understand the sampling distribution of a statistic.

    • Know what the CLT states about sampling distributions.

    • Recognize when the CLT applies and when it doesn’t.

Sampling Distribution of a Statistic

  • Initial Experiment: Selecting a real number (x1) uniformly at random from the interval [1,3].

    • Mean (μ) of x1 is 2 (midpoint of the interval).

    • Variance formula: ( \sigma^2 = \frac{(b - a)^2}{12} ) where a=1 and b=3.

    • Standard deviation (σ): ( \sigma = \sqrt{\frac{(3 - 1)^2}{12}} = \sqrt{\frac{4}{12}} = \sqrt{\frac{1}{3}} \approx 0.577 ).

    • Histogram from 500 replications appears roughly flat because every value in the interval is equally likely; the small bumps that remain are random variation.
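The initial experiment can be reproduced with a short simulation. This is a minimal sketch using only the Python standard library; the function names, the seed, and the use of `random.Random` are illustrative assumptions, not part of the original notes.

```python
import math
import random
import statistics

A, B = 1.0, 3.0  # interval endpoints from the text


def uniform_mean_sd(a, b):
    """Theoretical mean and standard deviation of a uniform distribution on [a, b]."""
    mean = (a + b) / 2
    sd = math.sqrt((b - a) ** 2 / 12)
    return mean, sd


def simulate_draws(replications, a, b, seed=0):
    """Draw `replications` independent uniform values on [a, b] (seed is an arbitrary choice)."""
    rng = random.Random(seed)
    return [rng.uniform(a, b) for _ in range(replications)]


theory_mean, theory_sd = uniform_mean_sd(A, B)
draws = simulate_draws(500, A, B)
sample_mean = statistics.fmean(draws)
sample_sd = statistics.stdev(draws)

print(f"theory:     mean={theory_mean:.3f}  sd={theory_sd:.3f}")
print(f"simulation: mean={sample_mean:.3f}  sd={sample_sd:.3f}")
```

With 500 replications the simulated mean and standard deviation should land close to the theoretical values, though not exactly on them.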

Increasing Sample Size

  • Selecting 3 Random Numbers: (x1, x2, x3) and computing their mean (x̄).

    • Histogram shows a peak in the center: large and small values within a sample tend to offset each other when averaged, which pulls the mean x̄ towards the middle.

  • Selecting 30 Random Numbers (x1 to x30):

    • Histogram of means x̄ from 500 replications shows a bell-shaped pattern.

    • Good fit with the normal density function suggests that the sampling distribution of x̄ approaches normality as sample size increases.
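The progression from a flat histogram to a bell shape can be sketched as follows. The code repeats the experiment 500 times for n = 1, 3, and 30, prints a rough text histogram, and compares the spread of the sample means with σ/√n. The helper names, bin count, and seed are assumptions made for illustration.

```python
import math
import random
import statistics

A, B = 1.0, 3.0      # interval from the text
REPLICATIONS = 500   # number of repeated experiments, as in the text


def sample_means(n, replications=REPLICATIONS, seed=0):
    """Return one sample mean per replication, each from n uniform draws on [A, B]."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.uniform(A, B) for _ in range(n)])
        for _ in range(replications)
    ]


def bin_counts(values, bins=10):
    """Count values in equal-width bins across [A, B]."""
    counts = [0] * bins
    width = (B - A) / bins
    for value in values:
        # Clamp so that a value equal to B falls into the last bin.
        index = min(int((value - A) / width), bins - 1)
        counts[index] += 1
    return counts


sigma = math.sqrt((B - A) ** 2 / 12)
spread_by_n = {}
for n in (1, 3, 30):
    means = sample_means(n)
    spread_by_n[n] = statistics.stdev(means)
    print(f"n={n:2d}  sd of means={spread_by_n[n]:.3f}  "
          f"sigma/sqrt(n)={sigma / math.sqrt(n):.3f}")
    for count in bin_counts(means):
        print("  " + "#" * (count // 5))  # one '#' per 5 observations
```

The printed bars should look roughly level for n = 1 and increasingly concentrated around 2 for n = 3 and n = 30.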

Sampling Distribution Characteristics

  • The sampling distribution of a statistic is its probability distribution when treated as a random variable.

    • Common examples:

      • Sum of the faces when rolling two fair dice.

      • Sample mean (x̄), the statistic to which the CLT applies directly.

      • Sample variance (s²) and sample proportion (p̂), to which the CLT also applies under certain conditions.
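For a small discrete case such as the two-dice sum, the sampling distribution can be enumerated exactly instead of simulated. The sketch below lists all 36 equally likely outcomes; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_sum_distribution():
    """Exact probability of each possible sum of two fair six-sided dice."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(count, 36) for total, count in sorted(counts.items())}


dice_distribution = two_dice_sum_distribution()
for total, probability in dice_distribution.items():
    print(f"sum={total:2d}  P={probability}")
```

The resulting table is the sampling distribution of the statistic "sum of the faces": it assigns a probability to every value the statistic can take.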

Assumptions of the Central Limit Theorem

  • Population mean (μ) and variance (σ²) must be defined.

  • Rule of Thumb: Sample size (n) should typically be at least 30 for reliable results from the CLT.

    • Mean of the sampling distribution of x̄: μ (the population mean).

    • Variance of the sampling distribution of x̄: ( \sigma^2/n ) (population variance divided by sample size).

    • The CLT states that x̄ is approximately normally distributed when the sample size is large, and the approximation improves as n increases.
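The mean and variance of x̄ follow from basic properties of expectation and variance. The short derivation below assumes the observations x_1, ..., x_n are independent and share the same mean μ and variance σ²; the variance step relies on that independence.

```latex
\begin{aligned}
E[\bar{x}] &= E\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right]
            = \frac{1}{n}\sum_{i=1}^{n} E[x_i]
            = \frac{1}{n}\cdot n\mu = \mu \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(x_i)
            = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}
\end{aligned}
```

These two results hold for any sample size; the CLT adds the statement about the approximately normal shape for large n.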

Applying the Central Limit Theorem

  • Example from earlier experiments:

    • When 30 random numbers are selected, the expected results using CLT can be calculated:

      • Population mean is 2 and standard deviation is 0.577.

      • Resulting standard deviation of the sample mean: ( \frac{0.577}{\sqrt{30}} \approx 0.105 ).

  • Conditions for Application:

    • Observations must be independent, for example by sampling with replacement.

    • Mean and variance of the population distribution must be finite.

    • Empirically, if the population distribution is symmetric, samples smaller than 30 may still yield good results.
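The calculation for n = 30 can be turned into a probability statement. The sketch below builds the normal approximation for x̄ with `statistics.NormalDist` and compares it with a simulation. The interval [1.8, 2.2], the replication count, and the seed are hypothetical choices for illustration and do not come from the original notes.

```python
import math
import random
from statistics import NormalDist, fmean

MU = 2.0                              # population mean from the text
SIGMA = math.sqrt((3 - 1) ** 2 / 12)  # population standard deviation, about 0.577
N = 30                                # sample size from the text

standard_error = SIGMA / math.sqrt(N)
clt_model = NormalDist(mu=MU, sigma=standard_error)

LOW, HIGH = 1.8, 2.2  # hypothetical interval chosen for illustration
clt_probability = clt_model.cdf(HIGH) - clt_model.cdf(LOW)


def simulated_probability(replications=5000, seed=0):
    """Fraction of simulated sample means that fall inside [LOW, HIGH]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replications):
        mean = fmean([rng.uniform(1, 3) for _ in range(N)])
        if LOW <= mean <= HIGH:
            hits += 1
    return hits / replications


empirical_probability = simulated_probability()
print(f"standard error of the mean: {standard_error:.4f}")
print(f"P({LOW} <= mean <= {HIGH}) by CLT:        {clt_probability:.4f}")
print(f"P({LOW} <= mean <= {HIGH}) by simulation: {empirical_probability:.4f}")
```

Because the uniform population is symmetric, the two probabilities should agree closely at n = 30.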

Evaluating Cases for Application of CLT

  • Population Distribution 1:

    • Sample size n = 15: Not suitable (right skew).

    • Sample size n = 35: Suitable for CLT application.

  • Population Distribution 2:

    • Sample size n = 20: Not suitable (extreme right skew).

    • Sample size n = 100: Suitable for applying CLT, as it is sufficiently large.
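The effect of skew on the required sample size can be explored numerically. The population distributions in the cases above are not specified in these notes, so the sketch below substitutes an exponential distribution with rate 1 as an assumed stand-in for a right-skewed population. It estimates the skewness of the sampling distribution of x̄ at several sample sizes; all names, the replication count, and the seed are illustrative.

```python
import math
import random
import statistics


def skewness(values):
    """Moment-based skewness: third central moment over the 1.5 power of the second."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean([(v - mean) ** 2 for v in values])
    m3 = statistics.fmean([(v - mean) ** 3 for v in values])
    return m3 / m2 ** 1.5


def skewness_of_sample_means(n, replications=10000, seed=0):
    """Skewness of simulated sample means drawn from an exponential(1) population.

    The exponential population is an assumption standing in for the
    unspecified right-skewed populations in the notes.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replications)
    ]
    return skewness(means)


skew_by_n = {n: skewness_of_sample_means(n) for n in (15, 35, 100)}
for n, value in skew_by_n.items():
    # For an exponential population the theoretical skewness of the mean is 2 / sqrt(n).
    print(f"n={n:3d}  simulated skewness={value:.3f}  theory={2 / math.sqrt(n):.3f}")
```

The skewness of the sample mean shrinks as n grows but does not vanish at any fixed n, which is why a more strongly skewed population calls for a larger sample before the normal approximation is trusted.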

Conclusion

  • The Central Limit Theorem is essential for understanding how sampling distributions behave as sample sizes increase.

  • Further study will deepen quantitative applications of the Central Limit Theorem.