Normal Distribution
Introduction to Normal Distribution
- The binomial probability distribution, when plotted, often shows a pattern where probabilities are low on the outside and high in the middle.
- This pattern closely resembles a continuous curve known as the normal distribution, also called the Gaussian distribution after Carl Friedrich Gauss.
- The normal distribution is also known as a bell-shaped curve due to its visual appearance.
Key Concepts
Definition
- Normal Distribution: A continuous probability distribution that is symmetric about the mean, in which values near the mean occur more frequently than values far from the mean.
Parameters
To define a normal distribution, only the mean and standard deviation are needed.
- Mean: The average value of the data set, located at the center of the curve.
- Standard Deviation: A measure of the spread or dispersion of the data around the mean.
Example
Consider the math portion of the SAT test given to a large group of students.
- Let's say the mean score is 500 and the standard deviation is 100.
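For reference, the two parameters enter the standard density formula for the normal curve as shown here. The symbols μ (mean) and σ (standard deviation) are notation introduced for this formula and do not appear in the notes; for the SAT example, μ = 500 and σ = 100.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}
```

The curve peaks at x = μ, and a larger σ makes it wider and flatter.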
Properties of the Normal Curve
Symmetry
- The normal curve is symmetric around the mean.
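- In a normal distribution the mean, median, and mode are all equal. For a large dataset that is approximately normal, the sample mean and median are approximately the same.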
Spread
The curve comes very close to the x-axis by three standard deviations to the right and left of the mean.
- The x-axis is a horizontal asymptote: the curve approaches it in both directions but never touches it.
Universality
- Many real-world measurements, such as heights, IQ scores, and SAT test scores, are approximately normally distributed, although not every set of numbers is.
- In such data, most of the values cluster around the middle, with a few extremes on both ends.
Percentages within Standard Deviations
- About 68% of the data falls within one standard deviation of the mean.
- About 95% of the data falls within two standard deviations of the mean.
- About 99.7% of the data falls within three standard deviations of the mean.
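These percentages are rounded values, and they can be checked numerically. A minimal sketch, assuming Python 3.8+ and its standard-library `statistics.NormalDist` class (neither is mentioned in the notes); the helper name `within` is made up for illustration:

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
z = NormalDist(mu=0, sigma=1)

def within(k: float) -> float:
    """Probability of landing within k standard deviations of the mean."""
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

Because every normal curve has the same shape once distances are measured in standard deviations, the same three values hold for any mean and standard deviation.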
Calculating Probabilities
Area Under the Curve
- The area under the entire curve represents 100% probability.
- The area between any two points on the curve represents the probability of a value falling within that range.
Example Problem
What is the probability that a student scored between 600 and 700 on the SAT test (mean 500, standard deviation 100)?
- About 68% of students score between 400 and 600 (within one standard deviation of the mean).
- By symmetry, half of that, 34%, score between 500 and 600.
- About 95% of students score between 300 and 700 (within two standard deviations of the mean).
- Subtracting 68% from 95% leaves 27% for the two outer bands combined. By symmetry that splits evenly, so about 13.5% score between 600 and 700, and about 13.5% score between 300 and 400.
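The arithmetic in this example can be compared with the exact area under the curve. This is a sketch under the assumption that Python's standard-library `statistics.NormalDist` is available (Python 3.8+); the variable names are illustrative only:

```python
from statistics import NormalDist

# SAT math scores from the example: mean 500, standard deviation 100.
sat = NormalDist(mu=500, sigma=100)

# Rule-of-thumb answer: half of what is left after removing 68% from 95%.
rule_of_thumb = (0.95 - 0.68) / 2

# Exact area under the curve between 600 and 700.
exact = sat.cdf(700) - sat.cdf(600)

print(f"rule of thumb: {rule_of_thumb:.4f}")  # rule of thumb: 0.1350
print(f"exact area:    {exact:.4f}")          # exact area:    0.1359
```

The two answers differ only in the third decimal place, which is why the rounded rule is good enough for problems like this one.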
General Rule
- For any normally distributed data, the 68-95-99.7 rule can be used to answer probability questions.
Detailed Breakdown of Percentages
The mean, x̄ (x-bar), is at the center of the curve.
The breakdown covers a span six standard deviations wide (three to the right of the mean and three to the left).
Within one standard deviation of the mean:
- 34% on each side, totaling 68%.
Between one and two standard deviations from the mean:
- 13.5% on each side.
Between two and three standard deviations from the mean:
- 2.35% on each side.
Outside three standard deviations from the mean:
- 0.15% on each side (since 99.7% is within three standard deviations).
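Each band in this breakdown follows from the 68-95-99.7 figures by subtraction and halving. The short sketch below reproduces that arithmetic in Python; the names `within`, `bands`, and `tail` are chosen for illustration and are not part of the notes:

```python
# Percentages within 1, 2, and 3 standard deviations (the 68-95-99.7 rule).
within = [68.0, 95.0, 99.7]

bands = []       # percentage in each band on ONE side of the mean
previous = 0.0
for pct in within:
    # New area gained by widening the interval, split evenly left and right.
    bands.append(round((pct - previous) / 2, 2))
    previous = pct

# Whatever lies beyond three standard deviations, again split over two sides.
tail = round((100.0 - within[-1]) / 2, 2)

print(bands)  # [34.0, 13.5, 2.35]
print(tail)   # 0.15
```

As a check, doubling every one-sided value and adding them up gives 2 × (34 + 13.5 + 2.35 + 0.15) = 100.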
Application Example: Height of Adult Females
Scenario
- The average height of adult females in the United States is 5 feet 4 inches (64 inches).
- The standard deviation is 2 inches.
Calculations
Create a normal curve with 64 inches in the middle.
Mark standard deviations:
- 62 inches (one standard deviation below the mean)
- 66 inches (one standard deviation above the mean)
- 60 inches (two standard deviations below the mean)
- 68 inches (two standard deviations above the mean)
- 58 inches (three standard deviations below the mean)
- 70 inches (three standard deviations above the mean)
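Each mark is the mean plus or minus a whole number of standard deviations. A small sketch of that calculation in Python, where the dictionary name `marks` is made up for illustration:

```python
mean = 64  # inches, from the scenario
sd = 2     # inches, from the scenario

# Map k (number of standard deviations from the mean) to a height in inches.
marks = {k: mean + k * sd for k in range(-3, 4)}

for k, height in marks.items():
    print(f"{k:+d} SD: {height} inches")
```

The loop runs from three standard deviations below the mean (58 inches) to three above (70 inches), with the mean itself at k = 0.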
Probability Question
What is the probability that a randomly chosen female is between 68 and 70 inches tall?
- This corresponds to the area between two and three standard deviations above the mean, which is about 2.35% by the 68-95-99.7 rule.
Generalization
- When the endpoints of a range fall exactly one, two, or three standard deviations from the mean, probabilities can be read directly from the 68-95-99.7 rule.
- For values that fall between those marks, a calculator or a table of normal-curve areas can be used to find the area (probability).
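As one possible stand-in for the calculator or table, the sketch below uses Python's standard-library `statistics.NormalDist` (Python 3.8+), which is an assumption and not something the notes prescribe. The helper names `z_score` and `prob_between`, and the 65 to 69 inch range, are hypothetical choices for illustration:

```python
from statistics import NormalDist

# Heights from the scenario: mean 64 inches, standard deviation 2 inches.
heights = NormalDist(mu=64, sigma=2)

def z_score(x: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - heights.mean) / heights.stdev

def prob_between(low: float, high: float) -> float:
    """Area under the curve between low and high."""
    return heights.cdf(high) - heights.cdf(low)

# 65 and 69 inches do not line up with whole standard deviations.
print(z_score(65), z_score(69))       # 0.5 2.5
print(f"{prob_between(65, 69):.4f}")  # 0.3023

# The 68-to-70-inch question, computed exactly instead of by the rule.
print(f"{prob_between(68, 70):.4f}")  # 0.0214
```

The exact area for 68 to 70 inches is about 2.14%, slightly below the 2.35% given by the rule, because 95% and 99.7% are themselves rounded figures.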