Study Notes on The Normal Distribution
Section 5.5: The Normal Distribution
Continuous Probability Distributions
A continuous random variable can assume any numerical value across one or more intervals.
Examples:
Car mileage.
Temperature.
A continuous probability distribution is used to assign probabilities to intervals of values.
Density Curves
A density curve is a mathematical model representing a distribution.
Key Properties:
The total area under the density curve is defined to equal 1 (or 100%).
The area under the curve for any range of values represents the proportion of observations within that range.
Visualization Example:
A histogram of a sample may include a smoothed density curve that depicts the theoretical population distribution.
Characteristics of Density Curves
The function f(x) denotes the height of the density curve at a specific value of x.
Two Essential Requirements of Density Curves:
f(x) must be greater than or equal to 0 (f(x) ≥ 0).
The total area under the curve must equal 1.
A density curve illustrates the overall pattern of a distribution, with the area under the curve representing the proportion of all observations that are located within a specified range.
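The two requirements, and the reading of area as a proportion, can be checked numerically. This is a minimal sketch: the triangular density and the helper name area_under are made up for illustration and do not come from the notes.

```python
def triangular_density(x: float) -> float:
    """Hypothetical example density: a triangle on [0, 2] that peaks at x = 1."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0  # the height is never negative, so f(x) >= 0 holds everywhere


def area_under(f, lo: float, hi: float, n: int = 100_000) -> float:
    """Approximate the area under f between lo and hi with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


total_area = area_under(triangular_density, 0.0, 2.0)
share_half_to_one = area_under(triangular_density, 0.5, 1.0)

print(round(total_area, 4))         # total area under the curve
print(round(share_half_to_one, 4))  # proportion of observations between 0.5 and 1.0
```

The second number is the proportion of observations expected between 0.5 and 1.0 under this assumed density.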
Graphical Representation:
An example plot might shade the whole region under the curve and label it area = 1.
Variability in Density Curves
Density curves can take on virtually any shape:
Some shapes are well-known mathematically; others are less common.
Mean and Median of Density Curves
Median:
The median of a density curve is defined as the equal-areas point, which divides the area under the curve into two equal halves.
Mean:
The mean is referred to as the balance point; it is the point at which the curve would balance if it were made of solid material.
For symmetric density curves, the mean and median are identical.
For skewed curves, the mean and median differ:
The mean is pulled toward the long tail, so it lies farther out in the tail than the median.
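To see the balance point and the equal-areas point separate on a skewed curve, the sketch below uses a right-skewed density, f(x) = e^(-x) for x ≥ 0. That density is an assumption chosen for illustration, not one the notes use. Both quantities are approximated on a fine grid.

```python
import math


def density(x: float) -> float:
    """Assumed right-skewed example density: f(x) = exp(-x) for x >= 0."""
    return math.exp(-x)


n = 80_000
step = 40.0 / n                               # the area beyond x = 40 is negligible
xs = [(i + 0.5) * step for i in range(n)]     # midpoints of the grid cells

# Mean: the balance point, approximated as the sum of x * f(x) * width.
mean = sum(x * density(x) for x in xs) * step

# Median: the equal-areas point, where the accumulated area first reaches 0.5.
cumulative = 0.0
median = None
for x in xs:
    cumulative += density(x) * step
    if cumulative >= 0.5:
        median = x
        break

print(f"mean   = {mean:.3f}")
print(f"median = {median:.3f}")
```

The long tail runs to the right, and the computed mean is larger than the median, as the rule above predicts.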
The Normal Probability Distribution
Natural Constants:
e = 2.71828… (base of the natural logarithm).
π (pi) ≈ 3.14159…
Definition:
Normal (or Gaussian) distributions are a family of symmetrical, bell-shaped density curves defined specifically by:
Mean (μ).
Standard deviation (σ).
A normal random variable can take any real value spanning from -infinity to +infinity.
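The constants e and π listed above enter through the formula for the height of the curve. The notes do not state that formula. The standard textbook form is given here, with the mean written as \mu and the standard deviation as \sigma:

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty
```

The exponent is zero at x = \mu, so the curve peaks at the mean with height 1/(\sigma\sqrt{2\pi}).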
Families of Density Curves
Variations in means and standard deviations:
Examples with different means (μ = 10, 15, and 20) while keeping the standard deviation constant (σ = 3).
Other examples maintain the same mean (μ = 15) but have varying standard deviations (σ = 2, 4, and 6).
Position and Shape of the Normal Curve
Different Means with Equal Standard Deviations:
If mean μ₁ is greater than mean μ₂, the curve with mean μ₁ is shifted to the right compared to the curve with mean μ₂.
Same Mean with Different Standard Deviations:
When standard deviation σ₁ is greater than σ₂, the curve corresponding to σ₁ is flatter and more spread out than the curve for σ₂.
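Both effects can be checked with NormalDist from Python's standard statistics module. This is a sketch only: the parameter values follow the examples in these notes, and comparing peak heights is just one convenient way to summarize "shifted" versus "flatter".

```python
from statistics import NormalDist

# Same standard deviation, different means: the curve shifts but keeps its shape.
same_sd = [NormalDist(mu, 3) for mu in (10, 15, 20)]
peak_heights_same_sd = [d.pdf(d.mean) for d in same_sd]

# Same mean, different standard deviations: a larger sigma gives a lower, wider curve.
same_mean = [NormalDist(15, sigma) for sigma in (2, 4, 6)]
peak_heights_same_mean = [d.pdf(d.mean) for d in same_mean]

for d, h in zip(same_sd, peak_heights_same_sd):
    print(f"mean {d.mean:>4}, sd {d.stdev}: peak at x = {d.mean}, height {h:.4f}")
for d, h in zip(same_mean, peak_heights_same_mean):
    print(f"mean {d.mean:>4}, sd {d.stdev}: peak at x = {d.mean}, height {h:.4f}")
```

The first group shares one peak height and differs only in where the peak sits. In the second group the peak drops as the standard deviation grows, because the total area must stay equal to 1.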
Normal Distribution Characteristics
Notation:
Represented as X ~ N(μ, σ), where μ is the mean and σ is the standard deviation.
Probability Analysis:
In a continuous distribution there are infinitely many possible values, and a single point has no area above it, so the probability of any one specific value is zero:
P(X = x) = 0
Consequently, we only determine probabilities over ranges of values:
P(X < x)
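One way to see why a single value has probability zero is to shrink an interval around it and watch the area vanish. The distribution and the point x below are hypothetical choices for illustration.

```python
from statistics import NormalDist

X = NormalDist(mu=15, sigma=3)   # hypothetical example distribution
x = 16.0                         # the single value of interest

widths = [1.0, 0.1, 0.01, 0.001]
interval_probs = []
for w in widths:
    # P(x - w/2 <= X <= x + w/2) as a difference of two cumulative areas
    p = X.cdf(x + w / 2) - X.cdf(x - w / 2)
    interval_probs.append(p)
    print(f"width {w}: probability {p:.6f}")
```

As the interval narrows toward the single point, its probability falls toward zero.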
Normal Probabilities Calculation
Specific Probability Expression:
To find the probability that a random variable X falls between two values a and b:
P(a ≤ X ≤ b)
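An interval probability of this kind is the area to the left of b minus the area to the left of a. The sketch below uses NormalDist from the standard library; the mean, standard deviation, and endpoints are illustrative assumptions.

```python
from statistics import NormalDist

X = NormalDist(mu=15, sigma=3)   # assumed example: mean 15, standard deviation 3
a, b = 12.0, 18.0

prob_between = X.cdf(b) - X.cdf(a)   # area under the curve between a and b
prob_below_a = X.cdf(a)              # area to the left of a

print(f"P({a} <= X <= {b}) = {prob_between:.4f}")
print(f"P(X < {a}) = {prob_below_a:.4f}")
```

Because a single point has probability zero, using < or ≤ at the endpoints gives the same answer.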
Empirical Rule (68-95-99.7 Rule)
For a normal distribution:
Approximately 68.27% of observations fall within one standard deviation of the mean (μ ± 1σ).
About 95.45% fall within two standard deviations (μ ± 2σ).
Nearly 99.73% fall within three standard deviations (μ ± 3σ).
Graph Representation:
Illustrates the percentage of observed values within specified intervals around the mean.
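The three percentages can be reproduced from the error function in Python's math module. The helper name within_k_sd is made up for this sketch. The identity it relies on, that the area within k standard deviations equals erf(k / √2), holds for every normal distribution.

```python
import math


def within_k_sd(k: float) -> float:
    """Area under a normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


coverage = {k: within_k_sd(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} SD of the mean: {100 * p:.2f}%")
```

The printed values round to 68.27%, 95.45%, and 99.73%. Figures read from a rounded z-table can differ in the last digit.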
Standardization and Z-Scores
A z-score quantifies the number of standard deviations a data value x lies from the mean μ.
Standardizing Steps:
When x is above the mean, the z-score is positive.
When x is below the mean, the z-score is negative.
If x is one standard deviation above the mean, z = 1; if two standard deviations above the mean, z = 2.
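The sign rules follow directly from the usual formula z = (x - mean) / (standard deviation). The sketch below uses a made-up helper named z_score and illustrative values (mean 15, standard deviation 3).

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


mean, sd = 15.0, 3.0             # illustrative values only
for x in (12.0, 15.0, 18.0, 21.0):
    print(f"x = {x}: z = {z_score(x, mean, sd):+.1f}")
```

Here 12 is one standard deviation below the mean (z = -1), while 18 and 21 are one and two standard deviations above it (z = +1 and z = +2).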
Finding Normal Curve Areas Using Z-Scores
Z-score formula:
z = \frac{x - \mu}{\sigma}
Example Parameters:
Reference points on the horizontal axis, measured from the mean \mu in units of the standard deviation \sigma: \mu - 3\sigma, \mu + \sigma, \mu + 2\sigma.
Standard Normal Curve:
Any normal curve can be standardized by converting its x values to z-scores:
The resulting standard normal curve has mean 0 and standard deviation 1, so a single set of areas serves every normal distribution.
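The practical payoff of standardizing is that a probability found on the original scale matches the one found on the standard normal scale. The parameters below (mean 100, standard deviation 15) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0          # hypothetical parameters
x = 130.0

z = (x - mu) / sigma                          # standardize x
direct = NormalDist(mu, sigma).cdf(x)         # P(X < x) on the original scale
standardized = NormalDist(0, 1).cdf(z)        # P(Z < z) on the standard normal scale

print(f"z = {z}")
print(f"P(X < {x}) = {direct:.5f}")
print(f"P(Z < {z}) = {standardized:.5f}")
```

The two probabilities agree, which is why one standard normal table is enough for any mean and standard deviation.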
Graphical Reference:
Standard normal curve illustration with z-scores marked from -3 to +3 relative to the mean.