The normal distribution is essential for understanding probability in statistics.
Key features of the normal distribution include:
Unimodal: It has one peak or mode.
Symmetric: The left side of the curve mirrors the right side.
Bell Curve Shape: It resembles a bell with a smooth curve.
Area Under the Curve: Total area equals one, representing total probability.
Unlike the binomial distribution, which can take only a finite set of values (e.g., the whole numbers 0 through 10 when there are 10 trials), the normal distribution is unbounded: its values can, in theory, extend infinitely in both directions.
Characteristics of Normal Distribution
The normal distribution is a continuous probability distribution, in contrast with discrete distributions such as the binomial.
Be cautious with the term "normal": in statistics it names this specific distribution and does not simply mean "typical" or "usual."
Anecdote: Using "normal" in statistical contexts can be humorous among non-statisticians when discussing distributions that appear abnormal.
Empirical Rule (68-95-99.7 Rule)
The empirical rule describes approximately how much of normally distributed data falls within one, two, and three standard deviations of the mean:
About 68% of data falls within one standard deviation of the mean (±1σ).
About 95% of data falls within two standard deviations of the mean (±2σ).
About 99.7% of data falls within three standard deviations of the mean (±3σ).
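The three percentages above can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name prob_within is an assumption made for illustration, and it relies on the standard identity that the probability of lying within k standard deviations of the mean is erf(k/√2).

```python
import math

def prob_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of its mean (hypothetical helper name)."""
    return math.erf(k / math.sqrt(2))

# Compare the computed areas with the rounded 68-95-99.7 figures.
for k in (1, 2, 3):
    print(f"within {k} sd of the mean: {prob_within(k):.4f}")
```

The result does not depend on the particular mean or standard deviation, which is why one rule covers every normal distribution.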
Each normal distribution can be uniquely defined by its mean (µ) and standard deviation (σ).
Because the total area must stay equal to one, smaller standard deviations lead to a taller and narrower curve, while larger standard deviations create a flatter and wider curve.
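The effect of the standard deviation on the curve's height can be illustrated with a short sketch. It uses statistics.NormalDist from the Python standard library; the function name peak_height and the sample σ values are assumptions chosen for illustration.

```python
from statistics import NormalDist

def peak_height(sigma: float, mu: float = 0.0) -> float:
    """Height of the normal curve at its mean, where the peak sits
    (hypothetical helper name)."""
    return NormalDist(mu, sigma).pdf(mu)

# Illustrative sigma values: the peak drops as sigma grows.
for sigma in (0.5, 1.0, 2.0):
    print(f"sigma = {sigma}: peak height = {peak_height(sigma):.4f}")
```

The peak height equals 1/(σ√(2π)), so doubling σ halves the height while the curve spreads out to keep the area at one.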
Properties of the Curve
The normal distribution curve extends infinitely to both left and right but approaches the x-axis asymptotically, meaning it gets closer but never actually touches the x-axis.
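While the curve goes on forever, the area in the far tails is very small: beyond three standard deviations from the mean, each tail holds only about 0.15% of the total area (about 0.3% for both tails combined).
The empirical rule's percentages are rounded, so they are best used for rough calculations; more precise values are about 68.27%, 95.45%, and 99.73%.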
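How little area sits in the far tails can be computed directly. This is a minimal sketch, assuming the standard normal distribution (mean 0, standard deviation 1) and a hypothetical helper named tail_area_beyond.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def tail_area_beyond(k: float) -> float:
    """Area in ONE tail lying more than k standard deviations
    from the mean (hypothetical helper name)."""
    return std_normal.cdf(-k)

one_tail = tail_area_beyond(3)
print(f"one tail beyond 3 sd:   {one_tail:.5f}")
print(f"both tails beyond 3 sd: {2 * one_tail:.5f}")
```

Each tail beyond three standard deviations comes out at roughly 0.00135 of the total area, which matches what is left over after the 99.7% in the empirical rule is split evenly between the two sides.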
Relation to Chebyshev's Theorem
Chebyshev's theorem states that, regardless of the distribution's shape, at least 1 − 1/k² of the data lies within k standard deviations of the mean (for k > 1). For k = 2, that is at least 75% of the data.
For the normal distribution, a much more precise value can be given: about 95% of data lies within two standard deviations.
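The gap between Chebyshev's guarantee and the normal distribution's actual coverage can be shown side by side. This is a rough sketch; the function names are assumptions for illustration, and the bound used is the standard form of Chebyshev's theorem, 1 − 1/k² for k standard deviations.

```python
import math

def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of data within k standard deviations of the mean,
    valid for any distribution shape (meaningful for k > 1)."""
    return 1 - 1 / k**2

def normal_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (2, 3):
    print(f"k = {k}: Chebyshev guarantees at least {chebyshev_lower_bound(k):.4f}, "
          f"normal gives {normal_within(k):.4f}")
```

Chebyshev's figure is a floor that holds for every distribution, so it is necessarily more conservative than the value for a distribution whose shape is known.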
Applications of Area Under the Curve
Areas under the normal distribution curve correspond to probabilities. A z-score, the number of standard deviations a value lies from the mean, is used to find the area associated with that value.
Example: The probability of a value falling within ±1 standard deviation of the mean is about 68%, which is exactly the area of that section of the curve.
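The link between z-scores, areas, and probabilities can be sketched in a few lines. The mean of 100 and standard deviation of 15 below are made-up values for illustration only, and the function names z_score and prob_between are likewise assumptions.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

def prob_between(a: float, b: float, mu: float, sigma: float) -> float:
    """Area under the normal curve between a and b, i.e. P(a < X < b)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical example values: mean 100, standard deviation 15.
mu, sigma = 100, 15
print("z-score of 115:", z_score(115, mu, sigma))
print(f"P(85 < X < 115) = {prob_between(85, 115, mu, sigma):.4f}")
```

Here 85 and 115 sit exactly one standard deviation either side of the mean, so the computed area is the same roughly 68% given by the empirical rule.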