Comprehensive Notes on Continuous Probability Distributions and the Normal Distribution
Comparison of Discrete and Continuous Probability Distributions
Conceptual Distinction
Discrete Variables (Chapter 5 Recap): Involve countable outcomes that are integer values. A key example is the number of people in a room; a fractional number of people is impossible. On a number line, the outcomes are distinct, separated points.
Continuous Variables (Chapter 6): Involve uncountable outcomes that can take on any real value within a range. For instance, between any two points on a number line, a variable could take any of infinitely many intermediate values. The possibilities are endless because the range can be partitioned into infinitely small segments. These typically involve measurements such as volume, height, length, and time.
Function Types
Probability Mass Function (PMF): Used for discrete distributions. On a graph where the x-axis represents outcomes ($x$) and the y-axis represents probability, the probability of an outcome is equal to the height of the line at that point.
Probability Density Function (PDF): Used for continuous distributions. On a graph where the x-axis represents outcomes ($x$) and the y-axis represents a function of x ($f(x)$), the graph takes on a specific shape (e.g., a bell curve). Probability is no longer height; it is measured as the area under the curve between two points ($a$ and $b$).
Mathematical Rules
Discrete Rule: The sum of all probabilities in the PMF must equal one: $\sum_{x} P(X = x) = 1$.
Continuous Rule: The entire area under the PDF curve must equal one. Mathematically, the integral of the function from negative infinity to infinity equals one: $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
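Both rules can be checked numerically. The sketch below is illustrative only: the fair six-sided die and the standard normal density are assumed examples that do not come from the lecture, and the integral is approximated with a simple midpoint rule over a finite range rather than computed exactly.

```python
import math

# Discrete rule: PMF of a fair six-sided die (assumed example).
die_pmf = {face: 1 / 6 for face in range(1, 7)}
pmf_total = sum(die_pmf.values())


def standard_normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def midpoint_area(f, a, b, n=100_000):
    """Approximate the area under f between a and b with the midpoint rule."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


# Continuous rule: the tails beyond +/-10 contribute a negligible amount,
# so a finite range stands in for (-infinity, infinity).
pdf_total = midpoint_area(standard_normal_pdf, -10, 10)

print(f"Sum of PMF heights: {pmf_total:.6f}")
print(f"Area under PDF:     {pdf_total:.6f}")
```

Both totals should come out at 1 up to rounding error, which is the defining requirement for any valid PMF or PDF.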
Specific Point Probabilities
In discrete distributions, we can calculate probabilities at exact points, such as $P(X = 5)$.
In continuous distributions, the probability at an exact point is always zero: $P(X = 5) = 0$. Because there are infinitely many possible values and probability is measured as area (width times height), a single vertical line has no width and thus no area. Instead, we calculate probabilities over intervals, such as $P(4.5 < X < 5.5)$, to look at the "neighborhood" of a value.
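The shrinking-interval idea can be seen numerically. This is a minimal sketch assuming a hypothetical variable that is normal with mean 5 and standard deviation 1 (the notes do not specify a distribution here); it uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Hypothetical continuous variable: normal with mean 5 and standard deviation 1.
X = NormalDist(mu=5, sigma=1)


def interval_probability(dist, low, high):
    """P(low < X < high), computed as a difference of cumulative areas."""
    return dist.cdf(high) - dist.cdf(low)


# As the interval around 5 narrows, its area (probability) shrinks toward zero.
for half_width in (0.5, 0.05, 0.005, 0.0005):
    p = interval_probability(X, 5 - half_width, 5 + half_width)
    print(f"P({5 - half_width} < X < {5 + half_width}) = {p:.6f}")

# An interval of zero width has zero area.
print("P(X = 5) =", interval_probability(X, 5, 5))
```

The narrower the neighborhood, the smaller the area, and at zero width nothing is left.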
The Normal Distribution
Significance and Occurrence
The normal distribution is considered a "rule of the universe" because it emerges frequently in real-world data across various fields. If plotted on a histogram, many datasets naturally form a symmetric bell shape.
It is central to the Central Limit Theorem, which states that the distribution of sample means tends toward a normal distribution as the sample size grows, regardless of the shape of the original population's distribution.
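A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under assumptions not found in the lecture: the population is exponential with rate 1 (strongly right-skewed, mean 1, standard deviation 1), and the sample size of 30 and the 2000 repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable


def sample_means(sample_size, num_samples):
    """Means of repeated samples drawn from an exponential(1) population."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


means = sample_means(sample_size=30, num_samples=2000)
center = statistics.fmean(means)
spread = statistics.stdev(means)

# Theory predicts the sample means cluster around the population mean (1)
# with standard deviation 1 / sqrt(30), even though the population is skewed.
print(f"Mean of sample means:               {center:.3f}")
print(f"Standard deviation of sample means: {spread:.3f}")
print(f"Predicted standard deviation:       {1 / 30 ** 0.5:.3f}")
```

Plotting `means` as a histogram should show a roughly symmetric bell shape, in contrast to the skewed population the samples came from.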
Theoretical Properties and Notation
Shape: Symmetric bell-shaped curve with high density in the center and fading tails.
Notation: $X \sim N(\mu, \sigma^2)$.
Parameters:
The Mean ($\mu$): Determines the center of the distribution. Shifting the mean moves the entire curve left or right along the x-axis without changing its shape.
The Variance ($\sigma^2$) or Standard Deviation ($\sigma$): Determines the variability or width of the curve. A larger standard deviation results in a wider, flatter curve, while a smaller standard deviation creates a narrower, taller curve. Standard deviation measures the tendency of data to deviate from the mean.
Conversion: To get standard deviation from variance, take the square root: $\sigma = \sqrt{\sigma^2}$. To get variance from standard deviation, square it: $\sigma^2 = (\sigma)^2$.
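The conversion and the effect of the standard deviation on the curve can be sketched as follows. The variance of 25 and the three standard deviations are hypothetical values chosen for illustration; the peak height is the density at the mean.

```python
import math
from statistics import NormalDist

variance = 25.0                # hypothetical value of sigma squared
sigma = math.sqrt(variance)    # variance -> standard deviation
variance_again = sigma ** 2    # standard deviation -> variance

print(f"variance = {variance}, sigma = {sigma}, back to variance = {variance_again}")

# A larger sigma spreads the same total area (1) over a wider range,
# so the peak of the curve must be lower.
for s in (0.5, 1.0, 2.0):
    peak = NormalDist(mu=0, sigma=s).pdf(0)
    print(f"sigma = {s}: peak height = {peak:.4f}")
```

Because the total area is fixed at 1, widening the curve necessarily flattens it, which is why the peak height falls as the standard deviation rises.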
The Probability Density Function (PDF) Formula
The mathematical form governing the bell shape is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.
Components: Includes the mathematical constant ($\pi$) and the natural base $e$. Students are required to recognize this formula as the normal distribution PDF but generally do not solve it by hand.
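For readers who want to see the formula evaluated, here is a minimal sketch that codes it directly and compares it with `statistics.NormalDist` from the Python standard library. The parameters (mean 10, standard deviation 2) and the evaluation points are hypothetical.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate the normal density f(x) for mean mu and standard deviation sigma."""
    coefficient = 1 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Compare the hand-coded formula with the library for a hypothetical N(10, 2^2).
reference = NormalDist(mu=10, sigma=2)
for x in (6, 10, 13):
    print(f"x = {x}: formula = {normal_pdf(x, 10, 2):.6f}, library = {reference.pdf(x):.6f}")
```

Note that the values returned are heights of the curve, not probabilities; probabilities come from areas under it.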
Standardizing the Distribution (Z-Scores)
Integration and Closed Form Solutions
Finding the area under the normal curve theoretically requires integration: $P(a < X < b) = \int_{a}^{b} f(x)\,dx$. However, the normal PDF has no closed-form antiderivative (the integral cannot be solved with standard calculus formulas), so it must be evaluated numerically via computers or calculators.
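To illustrate what "solved numerically" means, the sketch below approximates the area under the standard normal curve between -1 and 1 with composite Simpson's rule and compares it with a value computed from the error function `math.erf`. This is one possible numerical approach chosen for illustration; it is not a description of how the HP calculator works internally.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def simpson_area(f, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X < x) expressed through the error function, itself evaluated numerically."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))


area = simpson_area(normal_pdf, -1, 1)
reference = normal_cdf(1) - normal_cdf(-1)

print(f"Simpson's rule: {area:.6f}")
print(f"Via erf:        {reference:.6f}")
```

Both approaches should agree to many decimal places, giving about 0.6827 for the area within one standard deviation of the mean.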
The Standard Normal Distribution ($Z$)
To simplify calculations, every normal distribution can be converted to the "Universal Scale" known as the Standard Normal Distribution.
Notation: $Z \sim N(0, 1)$, where the mean is always $0$ and the standard deviation is always $1$.
Transformation Formula: $Z = \frac{X - \mu}{\sigma}$.
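The reason standardizing works is that the "less than" area is unchanged by the transformation. A minimal sketch, assuming a hypothetical variable with mean 100 and standard deviation 15:

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations it lies from the mean."""
    return (x - mu) / sigma


# Hypothetical variable: normal with mean 100 and standard deviation 15.
mu, sigma = 100, 15
x = 130
z = z_score(x, mu, sigma)

p_original = NormalDist(mu, sigma).cdf(x)   # P(X < 130) on the original scale
p_standard = NormalDist(0, 1).cdf(z)        # P(Z < z) on the standard scale

print(f"z = {z}")
print(f"P(X < {x}) = {p_original:.4f}")
print(f"P(Z < {z}) = {p_standard:.4f}")
```

The two probabilities match, so a single standard normal routine (or table) serves every normal distribution.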
Calculating Probabilities Using the HP Calculator
The Procedure
Step 1: Convert the given $x$ value to a $z$ value using the standardization formula.
Step 2: Use the calculator to find the probability $P(Z < z)$. The HP calculator is designed specifically to work with "less than" ($<$) probabilities.
Calculator Key Sequences
To find probability from a Z-value:
Enter the $z$ value.
Press the Blue Shift button.
Press button 3 (labeled "zed to p").
To adjust decimal display:
Press the Orange Shift button (downward arrow).
Press the Equals (=, Display) button.
Type the desired number of decimals.
Handling "Greater Than" (>) Problems
A "greater than" sign is a "reason for pause" because the calculator only provides "less than" area.
Complement Rule: Since the total area is $1$, the probability of the right tail is calculated as: P(X > x) = 1 - P(X < x).
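The complement rule can be sketched in code, with `statistics.NormalDist` standing in for the calculator's "less than" function. The z value of 1.5 is a hypothetical input, and the function names are illustrative.

```python
from statistics import NormalDist


def left_tail(z):
    """P(Z < z): the 'less than' area, which is all the calculator reports."""
    return NormalDist().cdf(z)


def right_tail(z):
    """P(Z > z), obtained from the complement rule."""
    return 1 - left_tail(z)


z = 1.5  # hypothetical z value
print(f"P(Z < {z}) = {left_tail(z):.4f}")
print(f"P(Z > {z}) = {right_tail(z):.4f}")
```

The two areas always add up to 1, so only the left-tail computation is ever needed.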
Equality Notation in Chapter 6
In discrete distributions (Chapter 5), the difference between $<$ and $\le$ is critical.
In continuous distributions (Chapter 6), they are effectively the same because the probability at any specific point is zero. Thus, $P(X < 18)$ is indistinguishable from $P(X \le 18)$.
Example Case Study: Image Download Time
Scenario: $X$ represents the time in seconds to download an image file. $X$ is normally distributed with $\mu = 18$ seconds and $\sigma = 5$ seconds.
Problem A: Calculate P(X < 18.6)
Convert to Z: $Z = \frac{18.6 - 18}{5} = 0.12$.
Calculator Entry: Input $0.12$, press Blue Shift, then 3.
Result: P(X < 18.6) = 0.5478. This represents the area to the left of $x = 18.6$ on the distribution curve.
Problem B: Calculate P(X > 18.6)
Apply Complement Rule: P(X > 18.6) = 1 - P(X < 18.6).
Substitution: $P(X > 18.6) = 1 - 0.5478 = 0.4522$.
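Problems A and B can be cross-checked in software. The sketch below takes the mean as 18 seconds and the standard deviation as 5 seconds; these are assumed here, and they are consistent with the stated result of 0.5478. `statistics.NormalDist` plays the role of the calculator.

```python
from statistics import NormalDist

# Assumed parameters (see the note above): mean 18 s, standard deviation 5 s.
mu, sigma = 18.0, 5.0
x = 18.6

z = (x - mu) / sigma             # Step 1: standardize
p_less = NormalDist().cdf(z)     # Step 2: "less than" area, P(Z < z)
p_greater = 1 - p_less           # Problem B: complement rule

print(f"z          = {z:.2f}")          # z          = 0.12
print(f"P(X < {x}) = {p_less:.4f}")     # P(X < 18.6) = 0.5478
print(f"P(X > {x}) = {p_greater:.4f}")  # P(X > 18.6) = 0.4522
```

The two probabilities sum to 1, which is a quick sanity check on any "greater than" answer.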
Questions & Discussion
Question: How did you get more decimals on the calculator?
Response: Use the Orange Shift button, then the Equals button (which corresponds to "Display" in orange text), and then press the number corresponding to the decimals you want.
Question/Note on Tables: The lecturer demonstrated a standard table, showing that for $z = 0.12$, you find the intersection of row $0.1$ and column $0.02$ to get $0.5478$.
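The row-and-column lookup in a printed table can be mimicked in code. This is a sketch of how a cumulative standard normal table is typically organized (rows give z to one decimal place, columns give the second decimal); the function name is hypothetical, and the sketch handles non-negative z only. It uses z = 0.12, the value that corresponds to the area 0.5478 in the download-time example.

```python
from statistics import NormalDist


def table_lookup(z):
    """Return (row, column, area) as read from a cumulative z-table.

    Rows hold z to one decimal place, columns hold the second decimal,
    and table entries are areas rounded to four decimals.
    """
    if z < 0:
        raise ValueError("this sketch handles non-negative z only")
    hundredths = round(z * 100)          # z rounded to two decimals, as an integer
    row = (hundredths // 10) / 10        # e.g. 0.1
    column = (hundredths % 10) / 100     # e.g. 0.02
    area = round(NormalDist().cdf(hundredths / 100), 4)
    return row, column, area


row, column, area = table_lookup(0.12)
print(f"row {row}, column {column} -> {area}")  # row 0.1, column 0.02 -> 0.5478
```

A table is limited to two decimal places of z, which is one practical reason the calculator gives slightly more precise answers.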
Note Regarding Examinations: Traditional printed tables are being phased out. Students are explicitly required to have and use the HP calculator for exams, as it is also necessary for the second-semester course, Theory of Interest.