M4 2 Normal Distribution & Empirical Rule

Normal Distribution Overview

  • The normal distribution is essential for understanding probability in statistics.

  • Key features of the normal distribution include:

    • Unimodal: It has one peak or mode.

    • Symmetric: The left side of the curve mirrors the right side.

    • Bell Curve Shape: It resembles a bell with a smooth curve.

    • Area Under the Curve: Total area equals one, representing total probability.

  • Unlike the binomial distribution, which can take only a finite set of values (e.g., the whole numbers 0 through 10 when there are 10 trials), the normal distribution is unbounded: in theory, its values extend infinitely in both directions.

Characteristics of Normal Distribution

  • The normal distribution is a continuous probability distribution, in contrast to discrete distributions such as the binomial.

  • Be cautious with the term "normal": in statistics it names this specific distribution and does not mean "typical" or "ordinary."

  • Anecdote: Using "normal" in statistical contexts can be humorous among non-statisticians when discussing distributions that appear abnormal.

Empirical Rule (68-95-99.7 Rule)

  • The empirical rule describes how much of normally distributed data lies within one, two, and three standard deviations of the mean:

    • 68% of data falls within one standard deviation from the mean (±1σ).

    • 95% of data falls within two standard deviations from the mean (±2σ).

    • 99.7% of data falls within three standard deviations from the mean (±3σ).

  • Each normal distribution can be uniquely defined by its mean (µ) and standard deviation (σ).

    • Smaller standard deviations lead to a taller and narrower curve, while larger standard deviations create a flatter and wider curve.
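A short sketch can make the two points above concrete: the share of data within k standard deviations is the same for every normal distribution, while the height of the peak depends on σ. The example uses Python's standard `statistics.NormalDist`; the helper name `coverage` and the parameter values (mean 100, standard deviations 5 and 15) are made up for illustration.

```python
from statistics import NormalDist


def coverage(dist, k):
    """Probability that a value lies within k standard deviations of the mean."""
    lo = dist.mean - k * dist.stdev
    hi = dist.mean + k * dist.stdev
    return dist.cdf(hi) - dist.cdf(lo)


# Two illustrative normal distributions with the same mean (values are assumptions).
narrow = NormalDist(mu=100, sigma=5)
wide = NormalDist(mu=100, sigma=15)

# The coverage within 1, 2, and 3 standard deviations matches for both curves.
for k in (1, 2, 3):
    print(k, round(coverage(narrow, k), 4), round(coverage(wide, k), 4))

# The peak height is the density at the mean; it shrinks as sigma grows.
print(narrow.pdf(narrow.mean), wide.pdf(wide.mean))
```

The printed coverages should be close to the 68%, 95%, and 99.7% figures for both distributions, and the narrow curve should show the higher peak.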

Properties of the Curve

  • The normal distribution curve extends infinitely to both the left and the right, approaching the x-axis asymptotically: it gets ever closer but never actually touches the axis.

  • While the curve goes on forever, the area in the far tails is very small: only about 0.3% of the total area lies beyond three standard deviations from the mean, roughly 0.15% in each tail.

  • The empirical rule's percentages are rounded values: good for rough calculations, but not exact (for example, the area within ±2σ is closer to 95.45%).
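To see how close the rounded percentages are, the exact areas can be computed from the error function, since for a normal distribution the area within k standard deviations equals erf(k/√2). This is a minimal sketch; the function names `within` and `one_tail_beyond` are illustrative, not from the notes.

```python
from math import erf, erfc, sqrt


def within(k):
    """Exact area within k standard deviations of the mean."""
    return erf(k / sqrt(2))


def one_tail_beyond(k):
    """Exact area in a single tail more than k standard deviations out."""
    return 0.5 * erfc(k / sqrt(2))


# Compare the rule-of-thumb percentages with the exact areas.
for k, rule in [(1, 68.0), (2, 95.0), (3, 99.7)]:
    print(f"within {k} sd: rule {rule}%  exact {100 * within(k):.2f}%")

# Area left over in each tail beyond three standard deviations.
print(f"each tail beyond 3 sd: {100 * one_tail_beyond(3):.3f}%")
```

The tail figure comes out slightly below the 0.15% implied by the rounded 99.7%, which is the kind of small discrepancy the rule trades for convenience.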

Relation to Chebyshev's Theorem

  • Chebyshev's theorem states that regardless of the distribution shape, at least 75% of data lies within two standard deviations of the mean (in general, at least 1 − 1/k² of data lies within k standard deviations, for k > 1).

  • Because the normal distribution has a known shape, a sharper figure applies: about 95% of data lies within two standard deviations.
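The gap between the two statements can be checked numerically. The sketch below compares Chebyshev's guaranteed minimum, 1 − 1/k², with the actual normal-distribution area for the same k; the function names are illustrative.

```python
from math import erf, sqrt


def chebyshev_lower_bound(k):
    """Minimum fraction within k standard deviations for any distribution (k > 1)."""
    return 1 - 1 / k**2


def normal_within(k):
    """Fraction within k standard deviations for a normal distribution."""
    return erf(k / sqrt(2))


# Chebyshev gives a floor; the normal distribution sits well above it.
for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 4), round(normal_within(k), 4))
```

Chebyshev's bound is weaker because it has to hold for every distribution shape, while the normal figure uses the specific bell curve.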

Applications of Area Under the Curve

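  • Areas under the normal distribution curve correspond to probabilities: the area over an interval is the probability that a value falls in that interval. Z-scores (the number of standard deviations a value lies from the mean) are used to look up these areas, and a known area can be used to find the matching z-score.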

  • Example: The probability of a value falling within ±1 standard deviation of the mean is 68%, directly linked to the area of that section of the curve.
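As a rough illustration of both directions (from values to areas, and from an area back to a z-score), here is a small sketch using `statistics.NormalDist`. The scenario (scores with mean 70 and standard deviation 10) and the helper names `z_score` and `prob_between` are assumptions made up for the example.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


def prob_between(a, b, mu, sigma):
    """Area under the normal curve between a and b, via z-scores."""
    return STANDARD_NORMAL.cdf(z_score(b, mu, sigma)) - STANDARD_NORMAL.cdf(
        z_score(a, mu, sigma)
    )


# Hypothetical data: scores with mean 70 and standard deviation 10.
print(z_score(85, 70, 10))                      # 1.5 standard deviations above the mean
print(round(prob_between(60, 80, 70, 10), 4))   # area within one standard deviation

# Going the other way: the z-score with 97.5% of the area to its left.
print(round(STANDARD_NORMAL.inv_cdf(0.975), 2))
```

The second print reproduces the roughly 68% figure from the example above, and the last line shows why about 1.96 (close to 2) standard deviations on each side captures 95% of the area.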
