Central Limit Theorem

The Central Limit Theorem (CLT) is a fundamental theorem in probability and statistics.
Goals of the study:
- Understand the sampling distribution of a statistic.
- Know what the CLT states about sampling distributions.
- Recognize when the CLT applies and when it doesn’t.

Initial Experiment: Selecting a random real number (x1) uniformly from the interval [1,3].
- Mean (μ) of x1 is 2 (midpoint of the interval).
- Variance formula: ( \sigma^2 = \frac{(b - a)^2}{12} ), where a=1 and b=3.
- Standard deviation (σ): ( \sigma = \sqrt{\frac{(3 - 1)^2}{12}} = \sqrt{\frac{4}{12}} = \sqrt{\frac{1}{3}} \approx 0.577 ).
- Histogram from 500 replications appears roughly flat, because every value in [1,3] is equally likely; the small bumps are due to randomness.
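
The initial experiment can be reproduced with a short simulation. This is a minimal sketch, assuming the draws are uniform on [1,3]; the seed, the choice of four bins, and all variable names are illustrative and not part of the original notes.

```python
import math
import random

A, B = 1.0, 3.0          # interval endpoints from the experiment
REPLICATIONS = 500       # number of draws, matching the 500 replications

# Theoretical values for a uniform distribution on [A, B].
theoretical_mean = (A + B) / 2
theoretical_sd = math.sqrt((B - A) ** 2 / 12)

rng = random.Random(0)   # seed chosen arbitrarily so the run is repeatable
draws = [rng.uniform(A, B) for _ in range(REPLICATIONS)]

sample_mean = sum(draws) / len(draws)
sample_sd = math.sqrt(sum((x - sample_mean) ** 2 for x in draws) / (len(draws) - 1))

# Count draws in four equal-width bins; a flat histogram means similar counts.
NUM_BINS = 4
bin_counts = [0] * NUM_BINS
for x in draws:
    index = min(int((x - A) / (B - A) * NUM_BINS), NUM_BINS - 1)
    bin_counts[index] += 1

print(f"theoretical mean={theoretical_mean:.3f}, sd={theoretical_sd:.3f}")
print(f"simulated   mean={sample_mean:.3f}, sd={sample_sd:.3f}")
print("bin counts:", bin_counts)
```

The simulated mean and standard deviation should land close to 2 and 0.577, and the four bin counts should each be near 125.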

Selecting 3 Random Numbers: (x1, x2, x3) and computing their mean (x̄).
- Histogram shows a peak in the center: each sample tends to mix large and small values, and averaging them pulls the mean x̄ toward the middle.
Selecting 30 Random Numbers (x1 to x30):
- Histogram of means x̄ from 500 replications shows a bell-shaped pattern.
- Good fit with the normal density function suggests that the sampling distribution of x̄ approaches normality as sample size increases.
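
The progression from 1 to 3 to 30 numbers can be checked numerically. The sketch below assumes uniform draws on [1,3] and 500 replications, as in the experiments; the helper name `simulate_sample_means` and the seed are hypothetical.

```python
import random
import statistics

def simulate_sample_means(n, replications=500, low=1.0, high=3.0, seed=0):
    """Return `replications` sample means, each from n uniform draws on [low, high]."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.uniform(low, high) for _ in range(n))
        for _ in range(replications)
    ]

spread_by_n = {}
for n in (1, 3, 30):
    means = simulate_sample_means(n)
    spread_by_n[n] = statistics.stdev(means)
    print(f"n={n:2d}: mean of x-bar={statistics.fmean(means):.3f}, "
          f"sd of x-bar={spread_by_n[n]:.3f}")
```

The center of the sample means stays near 2 for every n, while their spread shrinks as n grows, which is why the histogram narrows into a bell shape.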

The sampling distribution of a statistic is its probability distribution when treated as a random variable.
- Common examples:
  - Sum of the faces when rolling two fair dice.
  - Sample mean (x̄), the statistic to which the CLT applies directly.
  - Sample variance (s²) and sample proportion (p̂), whose sampling distributions are also approximately normal under certain conditions.
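
For a small discrete case such as the two-dice sum, the sampling distribution can be listed exactly by enumerating all 36 equally likely outcomes. A minimal sketch; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_sum_distribution():
    """Exact probability of each sum when rolling two fair six-sided dice."""
    counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(count, 36) for total, count in sorted(counts.items())}

distribution = two_dice_sum_distribution()
for total, probability in distribution.items():
    print(f"sum={total:2d}  P={probability}")
```

The probabilities rise from 1/36 at a sum of 2 to a peak at 7 and fall back to 1/36 at 12, already hinting at the centered shape seen with sample means.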

The population mean (μ) and variance (σ²) must be defined and finite.
Rule of Thumb: Sample size (n) should typically be at least 30 for reliable results from the CLT.
- Mean of the sampling distribution of x̄: μ (the population mean).
- Variance of the sampling distribution of x̄: ( \sigma^2/n ) (population variance divided by sample size), so its standard deviation is ( \sigma/\sqrt{n} ).
- The CLT states that the distribution of x̄ becomes approximately normal as the sample size increases.
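
Written out formally, the three bullets correspond to the standard textbook statement of the theorem for independent, identically distributed observations. The symbols X_1, ..., X_n for the observations are an assumption, since the notes do not name them.

```latex
% Assumption: X_1, \dots, X_n are i.i.d. with mean \mu and finite variance \sigma^2 > 0.
\[
  \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
  \mathbb{E}[\bar{X}_n] = \mu, \qquad
  \operatorname{Var}(\bar{X}_n) = \frac{\sigma^2}{n}
\]
\[
  \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \xrightarrow{\;d\;} N(0, 1)
  \quad \text{as } n \to \infty
\]
% In practice: for large n, \bar{X}_n is approximately N(\mu, \sigma^2/n).
```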

Example from earlier experiments:
- When 30 random numbers are selected, the expected results using CLT can be calculated:
  - Population mean is 2 and standard deviation is 0.577.
  - Resulting standard deviation of the sample mean: ( \frac{0.577}{\sqrt{30}} \approx 0.105 ).
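
The same calculation can be carried one step further to get approximate probabilities for x̄. This sketch uses Python's standard-library `statistics.NormalDist`; the interval from 1.9 to 2.1 is a hypothetical question chosen for illustration, not one posed in the notes.

```python
import math
from statistics import NormalDist

mu = 2.0                                  # population mean of uniform [1, 3]
sigma = math.sqrt((3 - 1) ** 2 / 12)      # population sd, about 0.577
n = 30

standard_error = sigma / math.sqrt(n)     # sd of the sample mean
approx_xbar = NormalDist(mu, standard_error)  # CLT approximation for x-bar

# Hypothetical question: how often does x-bar land within 0.1 of the mean?
prob_within = approx_xbar.cdf(2.1) - approx_xbar.cdf(1.9)

print(f"standard error = {standard_error:.3f}")
print(f"P(1.9 <= x-bar <= 2.1) is approximately {prob_within:.3f}")
```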
Conditions for Application:
- Observations should be independent; sampling with replacement ensures this.
- Mean and variance of the population distribution must be finite.
- Empirically, if the population distribution is symmetric, samples smaller than 30 may still yield good results.
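
The effect of population shape can be seen by measuring how lopsided the simulated sample means are. The sketch below compares the symmetric uniform population on [1,3] with an exponential population, which is an assumed stand-in for a right-skewed distribution and does not come from the notes. The helper names, seed, and 5000 replications are likewise illustrative.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the sd cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=m)
    return statistics.fmean((v - m) ** 3 for v in values) / sd ** 3

def skew_of_sample_means(draw, n, replications=5000, seed=0):
    """Skewness of simulated sample means; draw(rng) returns one observation."""
    rng = random.Random(seed)
    means = [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(replications)]
    return skewness(means)

def draw_uniform(rng):      # symmetric population, as in the [1, 3] experiment
    return rng.uniform(1.0, 3.0)

def draw_exponential(rng):  # right-skewed population (assumed for illustration)
    return rng.expovariate(1.0)

results = {}
for label, draw in (("uniform", draw_uniform), ("exponential", draw_exponential)):
    for n in (5, 30):
        results[(label, n)] = skew_of_sample_means(draw, n)
        print(f"{label:11s} n={n:2d}: skewness of x-bar = {results[(label, n)]:+.3f}")
```

A skewness near zero indicates a symmetric, bell-like sampling distribution. The uniform population should get there even at n = 5, while the exponential population should still show visible right skew at n = 5 and noticeably less at n = 30.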

Population Distribution 1:
- Sample size n = 15: Not suitable (right skew).
- Sample size n = 35: Suitable for CLT application.
Population Distribution 2:
- Sample size n = 20: Not suitable (extreme right skew).
- Sample size n = 100: Suitable for applying CLT, as it is sufficiently large.

The Central Limit Theorem is essential for understanding how sampling distributions behave as sample sizes increase.
Further study will develop quantitative applications of the Central Limit Theorem.