Sample Mean and Central Limit Theorem

Central Limit Theorem (CLT) Overview

The Central Limit Theorem (CLT) is a fundamental statistical principle that describes how the mean of a random sample drawn from a population is distributed.

Key Concepts

Sample Mean (X̄) Distribution:

  • When a random sample of size n is drawn from a population with mean (µ) and standard deviation (σ):

    • The distribution of X̄ approaches a normal distribution as n increases, regardless of the shape of the population distribution (provided σ is finite).

    • Mean of sample means = µ

    • Standard deviation of sample means (σX̄) = σ/√n
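
The two properties above can be checked with a small simulation. The sketch below is only an illustration: the exponential population (µ = 1, σ = 1), the sample size n = 25, the number of repetitions, and the helper name simulate_sample_means are all assumptions chosen for the example, not part of the original notes.

```python
import random
import statistics

def simulate_sample_means(n, num_samples=20000, seed=0):
    """Draw num_samples samples of size n from an exponential population
    (mean 1, standard deviation 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

# Hypothetical settings: a skewed population and a moderate sample size.
means = simulate_sample_means(n=25)

mean_of_means = statistics.fmean(means)  # expected to be close to mu = 1
sd_of_means = statistics.pstdev(means)   # expected to be close to sigma/sqrt(n) = 0.2

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"SD of sample means:   {sd_of_means:.3f}")
```

Even though the exponential population is strongly skewed, the simulated sample means should center near µ with a spread near σ/√n.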

Behavior of Sample Means:

  • If the population distribution is exactly normal:

    • The sample mean will also follow a normal distribution, for any sample size n.

  • As sample size n increases:

    • The distribution of sample means narrows around the population mean (µ).

    • A large enough sample size results in an approximately normal distribution of the sample means, even if the population isn't normal.

Importance of Central Limit Theorem:

The Central Limit Theorem is crucial because it allows statisticians to make inferences about a population from sample data. It ensures that the sampling distribution of the sample mean will be approximately normal if the sample size is sufficiently large, which facilitates hypothesis testing and confidence interval estimation. This theorem underpins many statistical methods, making it foundational in statistics and data analysis.

Midpoint of the Sampling Distribution:

The midpoint of the sampling distribution is equal to the population mean (µ). For any n > 1, the distribution of the sample means is narrower than the population distribution, and it narrows further as n increases because σX̄ = σ/√n. The sample means are therefore clustered more tightly around the population mean than individual observations are.

Example Illustration

Uniform Population Case:

  • Consider a uniform population consisting of integers {0, 1, 2, 3}.

  • A table of all possible samples of size n = 2 lists the sample combinations (e.g., (0,0), (1,0), (0,1), etc.). With replacement and with order counted, there are 4 × 4 = 16 such samples.

Means of Samples:

  • Compute the mean of each sample. Although the population is uniform, the sample means are not: they range from 0 to 3 in steps of 0.5, and the middle value 1.5 occurs most often.
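
The enumeration can be sketched in a few lines. This assumes sampling with replacement and counts order, which matches the listed samples (0,0), (1,0), and (0,1); the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [0, 1, 2, 3]

# All ordered samples of size 2, drawn with replacement (assumed).
samples = list(product(population, repeat=2))

# Count how many samples produce each sample mean.
mean_counts = Counter(Fraction(a + b, 2) for a, b in samples)

for m in sorted(mean_counts):
    print(f"mean {float(m):.1f}: {mean_counts[m]} of {len(samples)} samples")

# Average of all sample means, which should equal the population mean.
mean_of_means = sum(m * c for m, c in mean_counts.items()) / len(samples)
print("mean of sample means:", float(mean_of_means))
```

The counts rise from 1 sample with mean 0 to 4 samples with mean 1.5 and fall back to 1 sample with mean 3, so the sample means pile up in the middle even though every population value is equally likely.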

Graphical Representation

Histogram Observations:

  • Figure 1: Histogram of data shows the distribution of sample combinations.

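  • Figure 2: Histogram of sample means shows a symmetric, peaked distribution centered on the mean 1.5. With n = 2 the shape is triangular rather than exactly normal, but it already illustrates the tendency described by the CLT.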

Applying the Central Limit Theorem

Expected Range of Sample Means:

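  • According to the CLT, the interval in which sample means are expected to fall can be predicted using:

    • Expected Range: µ ± z * (σ/√n), where z is the standard normal critical value for the desired coverage (e.g., z ≈ 1.96 for 95%).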
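
As a rough sketch, the range formula can be evaluated with Python's standard library. The function name expected_range, the 95% default coverage, and the input values (µ = 100, σ = 15, n = 25) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def expected_range(mu, sigma, n, coverage=0.95):
    """Return (low, high) for mu ± z * sigma/sqrt(n), where z is the
    two-sided standard normal critical value for the given coverage."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width

# Hypothetical population: mu = 100, sigma = 15, samples of size 25.
low, high = expected_range(mu=100, sigma=15, n=25)
print(f"95% of sample means expected between {low:.2f} and {high:.2f}")
```

Quadrupling n halves the half-width, since the standard error σ/√n shrinks with the square root of the sample size.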

Example Problem

Population Characteristics:

  • Population Mean (µ) = 8

  • Population Standard Deviation (σ) = 3

  • Random Sample Size (n) = 36

Probability Calculation:

  • Determine the probability that the sample mean falls between 7.8 and 8.2:

  • Sampling distribution of X̄ approximated as normal with:

    • Mean (µX̄) = 8

    • Standard Deviation (σX̄) = σ/√n = 3/√36 = 0.5

  • Convert to Z-scores:

    • P((7.8 - 8) / (3/√36) < Z < (8.2 - 8) / (3/√36))

    • P(-0.4 < Z < 0.4)

Result Calculation:

  • P(-0.4 < Z < 0.4) = 0.6554 - 0.3446 = 0.3108
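
The same calculation can be reproduced numerically as a check. This is a minimal sketch using statistics.NormalDist from the Python standard library, with the values from the example problem; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Sampling distribution of the sample mean: normal with SD sigma/sqrt(n) = 0.5.
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(7.8 < X-bar < 8.2) as a difference of cumulative probabilities.
prob = sampling_dist.cdf(8.2) - sampling_dist.cdf(7.8)
print(round(prob, 4))  # 0.3108
```

Working directly with the sampling distribution avoids the explicit Z-score conversion, but the two routes are equivalent.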