Density Curves and Normal Distribution
Module 3 - Section 2: Density Curves and Normal Distribution
Instructor: Rosana Fok
1. Introduction to Density Curves and Normal Model
Data from continuous random variables are commonly displayed using histograms.
Histograms break down the measurement scale into class intervals.
The area of the rectangle for each interval is proportional to the relative frequency of the data in that interval.
A smooth curve fitted to the histogram is called a density curve.
Example: Histogram of Walking Shoe Prices
Prices are grouped into class intervals spanning 0 to 100, with the frequency noted for each interval.
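The link between bar area and relative frequency can be sketched in a few lines of Python. The prices below are made-up values (the lecture's actual data are not reproduced here), and the helper name `relative_frequencies` is hypothetical. The sketch rescales bar heights so that each bar's area equals its relative frequency, which makes the total area 1, the same scale a density curve uses.

```python
# Hypothetical shoe prices (made-up values, not the lecture's data).
prices = [22, 35, 38, 41, 47, 52, 55, 58, 63, 66, 71, 74, 79, 85, 93]

# Class intervals [0, 20), [20, 40), ..., [80, 100].
edges = [0, 20, 40, 60, 80, 100]

def relative_frequencies(data, edges):
    """Return the fraction of observations falling in each class interval."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            # Left-closed intervals; the last interval also includes its right edge.
            if edges[i] <= x < edges[i + 1] or (last and x == edges[i + 1]):
                counts[i] += 1
                break
    return [c / len(data) for c in counts]

rel_freq = relative_frequencies(prices, edges)

# Height on the density scale = relative frequency / interval width,
# so bar area (height * width) equals the relative frequency.
heights = [rf / (edges[i + 1] - edges[i]) for i, rf in enumerate(rel_freq)]
areas = [h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights)]

print(rel_freq)
print(sum(areas))  # total area of all bars
```

Because the bar areas sum to 1, a smooth curve drawn over this rescaled histogram also encloses a total area of about 1.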
2. Properties of Density Curves
Conditions for Density Curves:
Must always remain on or above the horizontal axis.
Total area under the curve equals 1.
Area under the curve over an interval represents the proportion of total observations within that range.
Formal expression for probability between values:
P(a < X < b) is represented by the area under the curve between a and b.
No probability is assigned to an exact value: P(X = a) = 0 (this is not true for discrete random variables).
The following equalities hold:
P(a < X < b) = P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b)
3. Example of a Density Curve
Uniform Distribution Example:
Consider a uniform density curve on the interval [0, 5] with constant height 0.2 (that is, 1/5).
Verify area under curve:
Area = length × width = 5 × 0.2 = 1
Probability calculations using this density curve:
P(X > 3) = area under curve from 3 to ∞ = 2 × 0.2 = 0.4 (the density is zero beyond 5, so the base of the rectangle has length 5 − 3 = 2)
Alternatively, calculate using complementary probability:
P(X > 3) = 1 - P(X ≤ 3) = 1 - 0.6 = 0.4
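The rectangle-area calculations above can be checked with a short Python sketch. The function name `uniform_prob` is hypothetical, and the height 0.2 is taken from the calculation 2 × 0.2 shown above; the sketch simply clips the requested interval to [0, 5] and multiplies width by height.

```python
def uniform_prob(a, b, low=0.0, high=5.0):
    """P(a < X < b) for a uniform density on [low, high], as a rectangle area."""
    height = 1.0 / (high - low)   # constant density height (0.2 on [0, 5])
    left = max(a, low)            # clip the interval to where the density is positive
    right = min(b, high)
    width = max(0.0, right - left)
    return width * height

total = uniform_prob(0, 5)                # whole rectangle
p_gt_3 = uniform_prob(3, float("inf"))    # area from 3 to 5
p_le_3 = uniform_prob(float("-inf"), 3)   # area from 0 to 3

print(total, p_gt_3, 1 - p_le_3)
```

Both routes to P(X > 3), the direct area and the complement 1 − P(X ≤ 3), give the same value up to floating-point rounding.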
4. Understanding Normal Distribution
Many numerical variables exhibit bell-shaped histograms, such as heights, weights, and lifetimes of bulbs.
The normal distribution serves as an effective model for such types of data.
It is the most important and most widely used probability distribution.
5. Properties of Normal Distributions
Normal distribution curves have specific characteristics:
Symmetric, unimodal, and bell-shaped.
Each curve is determined by its mean (µ) and standard deviation (σ).
The mean (µ) marks the center of the distribution and the location of the peak of the density curve.
The standard deviation (σ) determines the spread (width) of the curve.
Notation for normal models:
N(µ, σ) corresponds to a Normal distribution with mean µ and standard deviation σ.
6. Z-scores and Standard Normal Distribution
Standardized values of Normal data are termed z-scores.
The formula for computing z-scores is given by:
z = (x − µ) / σ
Z-scores follow a standard normal distribution, represented as N(0, 1).
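As a brief illustration of standardizing, the sketch below computes a z-score by subtracting the mean and dividing by the standard deviation, then checks that a probability under the original normal model matches the corresponding probability under the standard normal model. The model N(100, 15) and the observation 130 are hypothetical values chosen for illustration; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma

# Hypothetical normal model and observation, chosen for illustration only.
mu, sigma = 100.0, 15.0
x = 130.0

z = z_score(x, mu, sigma)                   # (130 - 100) / 15 = 2.0

# The same probability computed on the original scale and on the z scale.
p_original = NormalDist(mu, sigma).cdf(x)   # P(X < 130) under N(100, 15)
p_standard = NormalDist(0, 1).cdf(z)        # P(Z < 2) under N(0, 1)

print(z, p_original, p_standard)
```

The two probabilities agree, which is why any normal model can be handled through the single standard normal distribution.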
7. Assessing Normality Assumption
Applying a normal model rests on the assumption that the data distribution is indeed normal.
Because this assumption cannot be verified directly, check the following condition instead:
Nearly Normal Condition: The distribution should appear unimodal and symmetric.
Validation methods include:
Creating a histogram.
Generating a Normal probability plot.
Drawing a Q-Q plot.
8. Normal Probability Plot
A normal probability plot is a specialized graph for evaluating whether a normal model is appropriate.
If the data are approximately normal, the plotted points fall close to a straight diagonal line.
Systematic deviations from this line suggest that the data are not normal.
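One way to see what such a plot compares is to build its points by hand. The sketch below pairs each sorted observation with the z-score a normal model would predict for its rank, then uses the correlation of those pairs as a rough measure of straightness. The plotting positions (i − 0.5)/n are one common convention and an assumption here; statistical software may use a slightly different formula. Both samples are made up, and the function names are hypothetical.

```python
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Pair each sorted observation with the z-score a normal model predicts."""
    n = len(data)
    std_normal = NormalDist(0, 1)
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    expected_z = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected_z, sorted(data)))

def straightness(points):
    """Pearson correlation of the plotted points; near 1 suggests a straight line."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Made-up samples for illustration only.
roughly_symmetric = [48, 52, 55, 57, 59, 60, 61, 63, 65, 68, 72]
right_skewed = [1, 1, 2, 2, 3, 3, 4, 6, 9, 15, 40]

r_sym = straightness(normal_plot_points(roughly_symmetric))
r_skew = straightness(normal_plot_points(right_skewed))
print(r_sym, r_skew)
```

The symmetric sample gives a correlation much closer to 1 than the skewed sample, matching the visual impression of points hugging or bending away from the line. The correlation is only a rough summary; the plot itself should still be inspected.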
9. Visual Representations of Normal Probability Plots
Histogram and Normal Probability Plot:
For nearly normal data, the histogram is roughly symmetric and bell-shaped, and the points in the normal probability plot lie close to a straight line.
Skewed Distribution:
For skewed data, the histogram is asymmetric, and the points in the normal probability plot curve away from a straight line.
10. The Empirical Rule (68-95-99.7 Rule)
The Empirical Rule takes its name from the observation that normal curves model many real-world variables well.
The rule applies specifically to normal distributions, encapsulating the following:
Approximately 68% of observations are within 1 standard deviation (σ) of the mean (µ).
Approximately 95% are within 2 standard deviations.
Approximately 99.7% are within 3 standard deviations.
Visually, these percentages correspond to areas under the bell curve centered at µ.
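The three percentages can be checked numerically as areas under the standard normal curve. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `area_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The computed areas are about 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%. Because z-scores standardize any normal model, the same areas apply for every µ and σ.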
11. Example of Applying the Empirical Rule
Case Study:
Heights of 112 children follow a normal distribution characterized by:
Mean (µ) = 104.5
Standard Deviation (σ) = 16.3
Fill in the intervals µ ± kσ given by the Empirical Rule for k = 1, 2, 3:
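The intervals for this case study follow directly from µ ± kσ. The sketch below computes them for k = 1, 2, 3 using the mean and standard deviation given above; the function name `empirical_interval` is hypothetical, and the coverage labels are the approximate percentages from the Empirical Rule.

```python
def empirical_interval(mu, sigma, k):
    """Return (mu - k*sigma, mu + k*sigma), rounded to 1 decimal place."""
    return (round(mu - k * sigma, 1), round(mu + k * sigma, 1))

mu, sigma = 104.5, 16.3                       # values from the case study
coverage = {1: "68%", 2: "95%", 3: "99.7%"}   # approximate, per the Empirical Rule

intervals = {k: empirical_interval(mu, sigma, k) for k in (1, 2, 3)}
for k, (lower, upper) in intervals.items():
    print(f"k = {k}: {lower} to {upper} (about {coverage[k]} of heights)")
```

Under the normal model, about 68% of the heights are expected between 88.2 and 120.8, about 95% between 71.9 and 137.1, and about 99.7% between 55.6 and 153.4.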
12. Closing
Thank you for engaging with this module regarding density curves and the principles of normal distribution!