Notes on the Normal Distribution and Standard Deviation
Notation and core ideas
- Sigma (σ) denotes the standard deviation; it measures the spread of a random variable around its mean. Roughly, it answers: how far do values typically fall from the mean? Formally, it is the root-mean-square deviation from the mean, not the average absolute deviation.
- The standard deviation is a measure of spread with the same units as the variable X.
- Common shorthand: the normal distribution is often written as X ~ N(μ, σ^2), where
- μ is the mean (center of the distribution),
- σ is the standard deviation, and
- σ^2 is the variance.
- The statement “X is normal” typically means X is distributed normally, i.e., X ~ N(μ, σ^2).
- Observed values are written in lowercase x; random variables are written in uppercase X.
- “Normalizing” (more precisely, standardizing) a normal variable means rescaling it to the standard normal form N(0, 1).
Normal distribution basics
- Distribution notation: X \sim \mathcal{N}(μ, σ^2) which means X has a normal distribution with mean μ and variance σ^2.
- Mean and variance:
- Mean: E[X] = μ
- Variance: \mathrm{Var}(X) = σ^2
- Standard deviation: \mathrm{SD}(X) = σ
- The normal distribution is symmetric and bell-shaped; many real-valued phenomena are modeled as approximately normal when they are the sum of many small independent effects (Central Limit Theorem).
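The Central Limit Theorem remark above can be checked with a small simulation. This is a rough sketch, not a proof: the choice of Uniform(0, 1) terms, the number of terms (30), the sample count, and the helper name `sum_of_uniforms` are all illustrative assumptions.

```python
import random
import statistics

def sum_of_uniforms(n_terms: int, rng: random.Random) -> float:
    """Return the sum of n_terms independent Uniform(0, 1) draws."""
    return sum(rng.random() for _ in range(n_terms))

rng = random.Random(0)  # fixed seed so the run is repeatable
n_terms = 30            # assumed number of small independent effects
samples = [sum_of_uniforms(n_terms, rng) for _ in range(20_000)]

# Theory: each Uniform(0, 1) term has mean 1/2 and variance 1/12,
# so the sum has mean n/2 and variance n/12.
expected_mean = n_terms / 2
expected_sd = (n_terms / 12) ** 0.5

sample_mean = statistics.fmean(samples)
sample_sd = statistics.stdev(samples)

# Fraction of sums within one SD of the mean; a normal variable gives about 0.68.
within_one_sd = sum(
    abs(s - expected_mean) <= expected_sd for s in samples
) / len(samples)

print(f"sample mean {sample_mean:.3f}, theoretical mean {expected_mean:.3f}")
print(f"sample SD   {sample_sd:.3f}, theoretical SD   {expected_sd:.3f}")
print(f"fraction within one SD: {within_one_sd:.3f}")
```

Each individual term is flat rather than bell-shaped, yet the sums should land close to the normal benchmark of roughly 68% within one standard deviation.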
Standardization (normalization) to the standard normal
- Standardization transforms X into a standard normal variable Z with mean 0 and variance 1:
- Z = \frac{X - μ}{σ}
- Z \sim \mathcal{N}(0, 1)
- Purpose: enables comparison across different normal distributions and facilitates use of standard normal tables (z-tables) for probabilities.
- If you know μ and σ, you can convert any X to Z and read probabilities from the standard normal distribution.
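The standardization formula can be sketched in a few lines of Python. The function name `z_score` and the two example scales (an exam with μ = 70, σ = 8 and another with μ = 500, σ = 100) are hypothetical, chosen only to show how z-scores make values on different scales comparable.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x using Z = (X - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical scores from two differently scaled exams.
z_exam_a = z_score(82.0, mu=70.0, sigma=8.0)
z_exam_b = z_score(650.0, mu=500.0, sigma=100.0)

print(z_exam_a)  # 1.5
print(z_exam_b)  # 1.5
```

Although 82 and 650 look very different as raw numbers, both sit 1.5 standard deviations above their respective means, so they occupy the same relative position.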
Worked example
- Suppose a measurement X has parameters μ = 100 and σ = 15.
- If the observed value is x = 115, the standardized value is
- z = \frac{115 - 100}{15} = 1
- Interpretation: the observed value 115 is 1 standard deviation above the mean.
- If x is exactly 100, then z = 0 (at the mean).
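The worked example can be reproduced with `statistics.NormalDist` from the Python standard library. This sketch also computes P(X ≤ 115), which the notes do not state explicitly; it is included only to show that the direct calculation and the z-score route agree.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0
x = 115.0

# z-score of the observed value: (115 - 100) / 15
z = (x - mu) / sigma

# P(X <= 115) computed directly on N(100, 15^2) ...
p_direct = NormalDist(mu=mu, sigma=sigma).cdf(x)

# ... and via the standard normal N(0, 1) evaluated at z.
p_via_z = NormalDist().cdf(z)

print(f"z = {z}")
print(f"P(X <= 115) directly: {p_direct:.4f}")
print(f"P(Z <= z) via N(0,1): {p_via_z:.4f}")
```

Both routes should give about 0.8413, the value a z-table lists for z = 1.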
Probability density and cumulative concepts (normal case)
- PDF of X when X ~ N(μ, σ^2):
- f_X(x) = \frac{1}{σ \,\sqrt{2\pi}} \exp\left(-\frac{(x - μ)^2}{2σ^2}\right)
- CDF (probability up to x):
- F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t) \, dt
- Probability within an interval:
- P(a \le X \le b) = \int_{a}^{b} f_X(x) \, dx
- For many practical purposes, probabilities are computed via standard normal tables or software using the z-score transformation.
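The PDF, CDF, and interval formulas above can be sketched as follows. The function names are illustrative. The CDF is written with the error function through the standard identity Φ(z) = (1 + erf(z/√2))/2, which is a swapped-in closed form rather than a literal evaluation of the integral. A simple trapezoid rule (an assumed choice of numerical method) then checks that integrating the PDF gives the same interval probability.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density f_X(x) of N(mu, sigma^2)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """F_X(x) = P(X <= x), using the z-score and the error function."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_probability(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a <= X <= b) = F_X(b) - F_X(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

def trapezoid_integral(a: float, b: float, mu: float, sigma: float,
                       steps: int = 1000) -> float:
    """Approximate the integral of the PDF over [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, mu, sigma)
    return total * h

mu, sigma = 100.0, 15.0  # parameters from the worked example
p_exact = interval_probability(85.0, 115.0, mu, sigma)
p_numeric = trapezoid_integral(85.0, 115.0, mu, sigma)

print(f"P(85 <= X <= 115) via CDF:      {p_exact:.4f}")
print(f"P(85 <= X <= 115) via integral: {p_numeric:.4f}")
```

The interval from 85 to 115 is one standard deviation either side of the mean, so both numbers should be close to 0.6827.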
Notational pitfalls and clarifications
- Distinguish: uppercase X (random variable) vs lowercase x (a realized value).
- When we write X ~ N(μ, σ^2), we mean the distribution of the random variable X, not a specific observed value.
- The term “normalized” commonly refers to converting to the standard normal form via Z = (X - μ)/σ.
Connections and relevance
- The normal model is a foundational assumption in many statistical methods (confidence intervals, hypothesis tests, regression residuals) due to the Central Limit Theorem and mathematical convenience.
- Real-world relevance: measurement error, natural phenomena with many small additive effects, and standardization for comparability across datasets.
Formula summary
- Variance and standard deviation:
- \mathrm{Var}(X) = σ^2
- \mathrm{SD}(X) = σ
- Normal distribution notation:
- X \sim \mathcal{N}(μ, σ^2)
- Standardization:
- Z = \frac{X - μ}{σ}, \quad Z \sim \mathcal{N}(0, 1)
- Probability density function (PDF):
- f_X(x) = \frac{1}{σ \sqrt{2\pi}} \exp\left(-\frac{(x - μ)^2}{2σ^2}\right)
- CDF and interval probability:
- F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t) \, dt
- P(a \le X \le b) = \int_{a}^{b} f_X(x) \, dx