Normal Distribution – Comprehensive Study Notes
Historical Development of the Normal Distribution
Abraham de Moivre (French mathematician & astronomer)
Authored The Doctrine of Chances (1718).
Showed how probabilities from repeated binary events (e.g. coin tosses) gradually take on a bell-shaped (normal) form.
The book became popular with gamblers because it let them estimate odds.
Carl Friedrich Gauss (German mathematician)
Derived the exact mathematical function for the normal curve once the population mean (\mu) and standard deviation (\sigma) are known.
Hence the normal curve is often called the Gaussian distribution.
Adolphe Quetelet (Belgian mathematician & statistician)
Transferred astronomical/physical uses of the normal distribution into the social sciences (crime rates, marriage rates, etc.).
Argued that all people possess “average social traits,” positioned along a normal curve.
Sir Francis Galton (English polymath, cousin of Darwin)
Applied normal-curve reasoning to human intelligence.
Asserted intelligence ranges from very low to very high with most people in the middle (bell-shaped).
Proposed controversial eugenic ideas (only the “more intelligent” should reproduce).
Famous quote from Natural Inheritance praising the “cosmic order” of the normal law.
From Binary Events to a Normal Curve
Single coin toss
Outcomes: Heads or Tails; probability P=0.5 for each.
Two tosses
Four ordered sequences (HH, HT, TH, TT).
Distribution of heads counts:
• 0 heads: 1/4 • 1 head: 2/4 • 2 heads: 1/4.
Four tosses
Enumerating all 2^4=16 sequences and plotting “number of heads” produces a histogram that visibly approaches a bell shape.
General principle
As the number of independent, identically distributed (i.i.d.) binary events grows, the sampling distribution of the count/mean tends toward a normal curve (early intuitive glimpse of the Central Limit Theorem).
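The coin-toss enumeration can be reproduced in a few lines. The following is a minimal sketch, assuming a fair coin; the helper name heads_distribution is made up for illustration. It lists every ordered sequence, counts heads, and converts the counts to probabilities.

```python
from collections import Counter
from itertools import product


def heads_distribution(n_tosses):
    """Return {number_of_heads: probability} for n fair coin tosses.

    Every ordered sequence (e.g. HH, HT, TH, TT) is enumerated explicitly,
    so this is only practical for small n.
    """
    counts = Counter(seq.count("H") for seq in product("HT", repeat=n_tosses))
    total = 2 ** n_tosses
    return {k: counts[k] / total for k in range(n_tosses + 1)}


two_tosses = heads_distribution(2)    # 1/4, 2/4, 1/4 as in the notes
four_tosses = heads_distribution(4)   # built from all 16 sequences

# Crude text histogram for 10 tosses: one '#' per 1% of probability.
for heads, p in heads_distribution(10).items():
    print(f"{heads:2d} {'#' * round(p * 100)}")
```

With 10 tosses the text histogram already looks roughly bell-shaped, which is the intuition behind the general principle above.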
Normal-Curve Examples in Psychology & Beyond
Wechsler Intelligence Scale (IQ)
Mean (M) =100, SD =15.
A score \ge 130 (two SDs above the mean) places an individual above roughly 98\% of the population (about the top 2\%).
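The IQ percentile can be checked with Python's standard library. This is a sketch that assumes IQ scores follow an exact normal curve with M = 100 and SD = 15, which real test data only approximate; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumed idealised IQ population: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

# Proportion of the population scoring below 130 (two SDs above the mean).
share_below_130 = iq.cdf(130)
share_at_or_above_130 = 1 - share_below_130

# Score needed to sit exactly at the 98th percentile.
cutoff_top_2_percent = iq.inv_cdf(0.98)

print(round(share_below_130, 4), round(cutoff_top_2_percent, 1))
```

Under these assumptions a score of 130 sits just under the 98th percentile, so "top 2%" is a convenient rounding rather than an exact figure.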
Depression scores (DASS subscale)
Skewed, not normal: most people show very low depression, minority show high scores (positive skew).
Illustrates that not all real-world data are normal; checking distributional shape is essential.
Student height example
Raw histogram not perfectly symmetrical, yet overlaying a fitted normal curve estimates the underlying population distribution.
Core Mathematical Formulation
Probability-density function (pdf):
f(x)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac12\left(\frac{x-\mu}{\sigma}\right)^2}
Inputs: only the population mean (\mu) and population standard deviation (\sigma).
Constants: \pi and e are fixed, so the normalising factor \sqrt{2\pi} never changes.
Area under the curve = 1 (total probability).
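The pdf translates directly into code. The sketch below writes the formula out by hand and checks numerically that the area is 1; the function names normal_pdf and area_under_curve, the integration width of 8 SDs, and the step count are all choices made for this illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal curve at x, written out term by term."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def area_under_curve(mu, sigma, width=8.0, steps=20000):
    """Midpoint Riemann sum over [mu - width*sigma, mu + width*sigma]."""
    lower = mu - width * sigma
    dx = 2 * width * sigma / steps
    heights = (normal_pdf(lower + (i + 0.5) * dx, mu, sigma) for i in range(steps))
    return sum(heights) * dx


area = area_under_curve(mu=0.0, sigma=1.0)
peak = normal_pdf(0.0, mu=0.0, sigma=1.0)   # height at the mean
print(round(area, 6), round(peak, 4))
```

The numerical area only approximates 1 because the tails are cut off at 8 SDs, but the part left out is negligibly small.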
Effect of Changing Parameters
Fix \mu, vary \sigma (red, blue, yellow curves):
Larger \sigma ⇒ curve spreads horizontally and the peak lowers; the tails reach further out, but kurtosis is unchanged because every normal curve is a rescaled copy of the same bell.
Smaller \sigma ⇒ curve narrows, peak height increases.
Fix \sigma, vary \mu (green curve demo):
Entire curve shifts left/right without shape change.
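Both parameter effects can be checked numerically. Setting x = \mu in the pdf gives a peak height of 1/(\sigma\sqrt{2\pi}), so doubling \sigma halves the peak. The sketch below uses arbitrary illustrative values of \mu and \sigma, not the ones behind the coloured curves in the original figures.

```python
import math
from statistics import NormalDist

# Fix mu, vary sigma: the peak height is 1 / (sigma * sqrt(2*pi)).
mu = 0.0
peaks = {sigma: NormalDist(mu, sigma).pdf(mu) for sigma in (0.5, 1.0, 2.0)}
for sigma, height in peaks.items():
    print(f"sigma={sigma}: peak height {height:.4f}")

# Fix sigma, vary mu: the height at a distance d from the mean is the same
# wherever the mean sits, so the curve only slides left or right.
d = 0.7
height_centred_at_0 = NormalDist(0.0, 1.0).pdf(0.0 + d)
height_centred_at_5 = NormalDist(5.0, 1.0).pdf(5.0 + d)
same_shape = math.isclose(height_centred_at_0, height_centred_at_5)
print(same_shape)
```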
Idealised (Population) vs Sample Descriptions
Graphs/derivations assume true population parameters (\mu,\sigma).
Empirical data give sample statistics (\bar{x}, s), which estimate the population values.
Many inferential tests (t-test, ANOVA, regression, etc.) rely on an assumption of normality (either of raw data or of residuals).
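If the assumption holds → p-values and confidence intervals from these tests are accurate, and the tests are generally more powerful than non-parametric alternatives.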
Non-normal data can be handled with transforms or non-parametric tests.
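The population-versus-sample distinction can be illustrated by simulation. The sketch below assumes a "true" population with \mu = 34 and \sigma = 8.5 (values borrowed from the worked example in these notes) and draws simulated samples of increasing size; the seed is arbitrary and the data are simulated, not real.

```python
from statistics import NormalDist, mean, stdev

# Assumed population parameters (normally unknown in real research).
population = NormalDist(mu=34, sigma=8.5)

# Sample statistics x-bar and s for increasingly large simulated samples.
for n in (10, 100, 5000):
    sample = population.samples(n, seed=42)   # seed chosen arbitrarily
    print(n, round(mean(sample), 2), round(stdev(sample), 2))

big_sample = population.samples(5000, seed=42)
x_bar, s = mean(big_sample), stdev(big_sample)
```

Small samples can land noticeably away from \mu and \sigma, while large samples tend to estimate them closely.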
Defining Properties of a Normal Distribution
Symmetrical about the mean.
Unimodal (single highest peak).
Bell-shaped with tails extending to \pm\infty (the pdf never actually reaches zero).
Measures of central tendency coincide:
\text{mean} = \text{median} = \text{mode}.
Visual Diagnostics (SPSS examples)
Histogram + fitted curve shows approximate symmetry.
Box-and-whisker plot: roughly equal whisker lengths either side of a median centred in the box suggest symmetry.
Q-Q plot: points falling close to a straight line against the theoretical normal quantiles indicate normality (mentioned implicitly by “three different ways”).
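The notes use SPSS for these plots. As a rough stand-in, the coordinates behind a Q-Q plot can be computed by hand. The sketch below assumes simulated data and the plotting position (i + 0.5)/n, which is one common convention among several; qq_pairs is a hypothetical helper, not an SPSS function.

```python
from statistics import NormalDist, mean, stdev


def qq_pairs(sample):
    """Pair each ordered observation with its theoretical normal quantile."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))   # normal curve fitted to the sample
    # (i + 0.5) / n is one common choice of plotting position.
    return [(fitted.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]


# Simulated scores (not real data); seed chosen arbitrarily.
simulated = NormalDist(100, 15).samples(200, seed=1)
pairs = qq_pairs(simulated)

# Print every 40th pair: for normal data the two columns stay close.
for theoretical, observed in pairs[::40]:
    print(round(theoretical, 1), round(observed, 1))
```

Plotting observed against theoretical values gives the Q-Q plot; skewed data would bend away from the straight line at one or both ends.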
The 68\;–\;95\;–\;99.7 Empirical Rule
In any normal distribution:
About 68\% of observations lie within \pm1\,\sigma of \mu.
About 95\% lie within \pm2\,\sigma (the exact 95\% band is \pm1.96\,\sigma).
About 99.7\% lie within \pm3\,\sigma.
Provides quick probability estimates & aids in outlier detection.
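The three percentages are rounded values of areas under the standard normal curve. A short check using Python's standard library (variable names are illustrative):

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mu = 0, sigma = 1

# Area between -k and +k standard deviations for k = 1, 2, 3.
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"within ±{k} sigma: {share:.4f}")
```

The same proportions hold for any \mu and \sigma, because standardising (x - \mu)/\sigma maps every normal curve onto this one.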
Worked Example (Test Scores)
Given: \mu=34, \sigma=8.5.
\pm1\sigma band:
Lower bound =34-8.5=25.5.
Upper bound =34+8.5=42.5.
Contains 68\% of students.
\pm2\sigma band:
Lower bound =34-2(8.5)=17.0.
Upper bound =34+2(8.5)=51.0.
Contains 95\% of students.
\pm3\sigma band:
Lower bound =34-3(8.5)=8.5.
Upper bound =34+3(8.5)=59.5.
Contains 99.7\% of students; only 0.3\% (≈3 in 1000) fall outside → flag as potential outliers.
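The band arithmetic can be scripted so the bounds are not computed by hand. This is a minimal sketch using the given \mu and \sigma; the is_outlier helper and its "beyond 3 SDs" cut-off follow the rule of thumb in these notes rather than any universal standard.

```python
mu, sigma = 34, 8.5

# (lower, upper) bounds for the 1-, 2- and 3-sigma bands.
bands = {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

for k, (lower, upper) in bands.items():
    print(f"±{k} sigma: {lower} to {upper}")


def is_outlier(score):
    """Flag scores lying more than 3 standard deviations from the mean."""
    return abs(score - mu) > 3 * sigma


print(is_outlier(60), is_outlier(45))
```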
Practical, Ethical & Philosophical Implications
Comparing individuals to reference populations (e.g. IQ, height): enables percentile ranks, cut-scores for diagnostics, giftedness, etc.
Policy & ethics: misuse of “average” or “extreme” labels (e.g. Galton’s eugenic ideas) highlights need for ethical caution.
Real-world nuance: Many psychological traits (e.g. depression) are skewed or multi-modal; always check empirical shape rather than assume normality.
Key Take-aways
Normal distributions are common yet not universal; always verify shape.
The curve is fully determined by two parameters \mu,\sigma.
Historical development links gambling, astronomy, social science & psychology.
Gauss’s formula plus the 68–95–99.7 rule give powerful shortcuts for probability & outlier assessment.
Understanding normality underpins many classical statistical tests and interpretations.