Chapter 5: The Normal Distribution and Other Continuous Distributions
Chapter 5 Learning Objectives
Compute probabilities from the normal distribution and understand its characteristics.
Utilize the normal distribution to solve complex business problems.
Determine if a dataset is approximately normally distributed using normal probability plots.
Compute probabilities from the uniform distribution.
Compute probabilities from the exponential distribution.
Continuous Probability Distributions
Definition: A continuous random variable can assume any value within a defined interval or on a continuum (uncountable number of values).
Real-World Examples: * Thickness of a manufactured item. * Time required to complete a specific task. * Temperature of a chemical solution. * Height measured in inches.
A continuous variable can take any value within its range; the values actually recorded are limited only by how precisely and accurately it can be measured.
Cumulative Distribution Function (CDF): Let F(x) be the CDF for a continuous random variable X. It expresses the probability that X does not exceed a value x: * F(x) = P(X \leq x) * For two possible values a and b where a < b, the probability that X lies between them is: P(a < X < b) = F(b) - F(a).
Probability Density Function (PDF): The density function f(x) for random variable X has the following core properties: 1. f(x) \geq 0 for all values of x. 2. The total area under the probability density function over all possible values of X equals 1. 3. The probability that X lies between two values is the area under the density graph between those values. 4. The cumulative distribution function F(x_0) is defined as the area under the PDF from the minimum value up to x_0: * F(x_0) = \int_{x_m}^{x_0} f(x) \, dx * Where x_m is the minimum value of X. 5. The probability of any individual point is always zero: P(X = a) = 0. Therefore: P(a \leq X \leq b) = P(a < X < b).
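The area properties above can be checked numerically. The sketch below uses a hypothetical density, f(x) = 2x on the interval from 0 to 1 (chosen only for illustration, it does not come from the chapter), and a simple midpoint rule to approximate areas.

```python
def density(x: float) -> float:
    """Hypothetical density f(x) = 2x on [0, 1], zero elsewhere (illustration only)."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def area(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Midpoint-rule approximation of the area under f between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


def cdf(x0: float) -> float:
    """F(x0): area under the density from the minimum value (0 here) up to x0."""
    return area(density, 0.0, x0)


total_area = area(density, 0.0, 1.0)    # property 2: should be close to 1
prob_between = cdf(0.75) - cdf(0.25)    # P(0.25 < X < 0.75) = F(0.75) - F(0.25)

print(f"Total area under the density: {total_area:.4f}")
print(f"P(0.25 < X < 0.75): {prob_between:.4f}")
```

For this density the exact CDF is F(x) = x^2, so the second value should be close to 0.5625 - 0.0625 = 0.5.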
The Normal Distribution
Characteristics: * Bell-shaped and perfectly symmetrical around the mean. * The Mean, Median, and Mode are all equal. * Infinite theoretical range spanning from -\infty to +\infty.
Parameters: * Mean (\mu): Determines the location (center) of the distribution. Changing \mu shifts the distribution left or right. * Standard Deviation (\sigma): Determines the spread (width). Increasing \sigma increases the spread and flattens the curve.
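Normal Density Function Formula: * f(X) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2}\left(\frac{X - \mu}{\sigma}\right)^2} * e \approx 2.71828 (mathematical constant). * \pi \approx 3.14159 (mathematical constant). * X = any value of the continuous variable.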
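As a rough check on the density formula, the sketch below evaluates it directly and compares the result with `statistics.NormalDist.pdf` from the Python standard library. The mean and standard deviation used are hypothetical values chosen for illustration.

```python
import math
from statistics import NormalDist


def normal_density(x: float, mu: float, sigma: float) -> float:
    """Normal density: 1 / (sqrt(2*pi) * sigma) * exp(-0.5 * ((x - mu) / sigma) ** 2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coefficient * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


# Hypothetical parameters, for illustration only.
mu, sigma = 10.0, 2.0

for x in (6.0, 10.0, 14.0):
    by_formula = normal_density(x, mu, sigma)
    by_library = NormalDist(mu, sigma).pdf(x)
    print(f"x = {x:5.1f}  formula = {by_formula:.6f}  library = {by_library:.6f}")
```

The values at 6 and 14 should match each other, reflecting the symmetry of the curve around the mean.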
Standardized Normal Distribution (Z): * Any normal distribution can be transformed into the standardized normal distribution. * Mean \mu = 0. * Standard Deviation \sigma = 1. * Translation formula (Z-score): Z = \frac{X - \mu}{\sigma}. * Z-values specify the number of standard deviations a value is from the mean. Positive Z-values are above the mean, and negative Z-values are below the mean.
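Z-score Example: If X is normally distributed with mean \mu and standard deviation \sigma, then for a value X = x: * Z = \frac{x - \mu}{\sigma} * A positive result indicates the value x is Z standard deviations (increments of \sigma) above the mean of \mu.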
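The translation to a Z-score can be sketched as a one-line function. The mean, standard deviation, and observed values below are hypothetical, chosen only to show a positive and a negative Z-value.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical distribution: mean 50, standard deviation 10 (illustration only).
above = z_score(65.0, mu=50.0, sigma=10.0)   # 1.5 standard deviations above the mean
below = z_score(40.0, mu=50.0, sigma=10.0)   # 1.0 standard deviation below the mean

print(f"Z for 65: {above:+.2f}")
print(f"Z for 40: {below:+.2f}")
```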
Probability Calculations Using the Z-Table
Cumulative Table: The Standardized Normal Table (e.g., Appendix E.2) provides probabilities for the area to the left of a specific Z-score (from -\infty to Z). * Example: P(Z < 2.00) = 0.9772.
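Entries of a cumulative table like this one can be reproduced in software. The sketch below uses `statistics.NormalDist` from the Python standard library; the particular Z-values listed are an arbitrary selection.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def cumulative_area(z: float) -> float:
    """Area under the standardized normal curve to the left of z, rounded to 4 places."""
    return round(STANDARD_NORMAL.cdf(z), 4)


# A few table rows, including the P(Z < 2.00) entry quoted above.
table_rows = {z: cumulative_area(z) for z in (-1.00, 0.00, 1.00, 2.00)}

for z, left_area in table_rows.items():
    print(f"P(Z < {z:5.2f}) = {left_area:.4f}")
```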
General Procedure: 1. Draw the normal curve for the problem in terms of X. 2. Translate X-values to Z-values. 3. Use the Z-table to find the required area.
Download Time Example: * Scenario: Mean download time \mu = 18.0, \sigma = 5.0. * Problem: Find P(X < 18.6). * Step: Z = \frac{X - \mu}{\sigma} = \frac{18.6 - 18.0}{5.0} = 0.12. * Result: P(Z < 0.12) = 0.5478.
Upper Tail Probabilities: To find P(X > 18.6), use the complement rule: * P(Z > 0.12) = 1.0 - P(Z \leq 0.12) = 1.0 - 0.5478 = 0.4522.
Probabilities Between Two Values: To find P(18 < X < 18.6), calculate Z-scores for both points: * Z = \frac{18 - 18}{5} = 0; Z = \frac{18.6 - 18}{5} = 0.12. * P(0 < Z < 0.12) = P(Z < 0.12) - P(Z < 0) = 0.5478 - 0.5000 = 0.0478.
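The three download-time probabilities can be cross-checked without a table. The sketch below assumes a mean of 18 and a standard deviation of 5, which are the values implied by Z = 0 at X = 18 and Z = 0.12 at X = 18.6.

```python
from statistics import NormalDist

# Assumed parameters: implied by Z = 0 at X = 18 and Z = 0.12 at X = 18.6.
download_time = NormalDist(mu=18.0, sigma=5.0)

p_less = download_time.cdf(18.6)                                  # P(X < 18.6)
p_more = 1.0 - p_less                                             # P(X > 18.6), complement rule
p_between = download_time.cdf(18.6) - download_time.cdf(18.0)     # P(18 < X < 18.6)

print(f"P(X < 18.6)      = {p_less:.4f}")     # 0.5478
print(f"P(X > 18.6)      = {p_more:.4f}")     # 0.4522
print(f"P(18 < X < 18.6) = {p_between:.4f}")  # 0.0478
```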
Finding X for a Known Probability: Given a probability, find the corresponding X value. 1. Find the Z-value for the known probability in the table. 2. Convert to units of X using the formula: X = \mu + Z\sigma. 3. Example: Find X such that 20\% of download times are less than X. * P(Z < ?) = 0.20 \rightarrow Z = -0.84. * X = 18.0 + (-0.84)(5.0) = 13.8.
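The reverse lookup can be sketched with the inverse CDF. As before, the mean of 18 and standard deviation of 5 are assumed from the Z-scores in the download-time example. Because the table rounds Z to two decimals, the software result differs slightly from the hand calculation.

```python
from statistics import NormalDist

# Assumed parameters for the download-time example.
MU, SIGMA = 18.0, 5.0

# Step 1: Z-value that leaves 20% of the area to its left.
z_exact = NormalDist().inv_cdf(0.20)

# Step 2: convert back to X units with X = mu + Z * sigma.
x_exact = MU + z_exact * SIGMA

# Same conversion using the two-decimal table value Z = -0.84.
x_from_table = MU + (-0.84) * SIGMA

print(f"Z (software) = {z_exact:.4f}")
print(f"X (software) = {x_exact:.2f}")
print(f"X (table Z)  = {x_from_table:.2f}")
```

The one-step form `NormalDist(MU, SIGMA).inv_cdf(0.20)` should give the same X as the two-step calculation.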
Evaluating Normality
Theoretical Properties: Normal data should be bell-shaped (symmetrical), follow the empirical rule, have an Interquartile Range (IQR) of about 1.33 standard deviations, and a range of about 6 standard deviations.
Visual Assessment: * Small data: Use stem-and-leaf displays or boxplots to check symmetry. * Large data: Check histograms or polygons for bell shape.
Descriptive Measures: Compare mean, median, and mode for similarity.
Normal Probability Plot (Q-Q Plot): * Data is arranged into an ordered array. * Standardized normal quantile values (Z) are calculated. * Observed data (X) is plotted on the vertical axis against quantile values (Z) on the horizontal axis. * Linearity indicates a normal distribution. Nonlinear/curved plots indicate deviations like left-skew, right-skew, or rectangular distributions.
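The plotting coordinates can be sketched as follows. This assumes one common convention, in which the i-th ordered value out of n is paired with the standardized normal quantile at cumulative area i / (n + 1); other conventions exist. The sample data and the helper names are hypothetical, and the correlation between the two axes is used here only as a rough numeric stand-in for judging linearity by eye.

```python
import math
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each ordered observation with a standardized normal quantile.

    Returns (Z quantile, observed X) pairs; assumes the i / (n + 1) convention.
    """
    ordered = sorted(data)
    n = len(ordered)
    standard_normal = NormalDist()
    quantiles = [standard_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    return list(zip(quantiles, ordered))


def straightness(points):
    """Pearson correlation of the plotted points; values near 1 suggest a straight line."""
    zs, xs = zip(*points)
    z_bar, x_bar = mean(zs), mean(xs)
    s_zx = sum((z - z_bar) * (x - x_bar) for z, x in points)
    s_zz = sum((z - z_bar) ** 2 for z in zs)
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return s_zx / math.sqrt(s_zz * s_xx)


# Hypothetical sample, for illustration only.
sample = [12.1, 9.8, 10.4, 11.3, 8.7, 10.0, 9.2, 10.9, 11.8, 9.5]
points = normal_plot_points(sample)
plot_correlation = straightness(points)

for z, x in points:
    print(f"Z = {z:+.3f}   X = {x:.1f}")
print(f"Correlation of plotted points: {plot_correlation:.3f}")
```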
The Uniform Distribution
Definition: Outcomes are equally likely over a given range (rectangular distribution).
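Density Function: * f(X) = \frac{1}{b - a} if a \leq X \leq b, and 0 otherwise, where a is the minimum and b the maximum value of X.
Summary Measures: * Mean: \mu = \frac{a + b}{2} * Standard Deviation: \sigma = \sqrt{\frac{(b - a)^2}{12}}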
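Example (Range 2 to 6): * f(X) = \frac{1}{6 - 2} = 0.25 for 2 \leq X \leq 6 * \mu = \frac{2 + 6}{2} = 4 * \sigma = \sqrt{\frac{(6 - 2)^2}{12}} \approx 1.1547 * Probability calculation: P(c \leq X \leq d) = (d - c)(0.25) for 2 \leq c < d \leq 6, i.e. base times height of the rectangle.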
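The uniform calculations for the range 2 to 6 can be sketched with a few small functions. The function names are hypothetical, and the interval from 3 to 5 used for the probability is an illustrative choice rather than a value taken from the example.

```python
import math


def uniform_density(a: float, b: float) -> float:
    """Height of the rectangle: 1 / (b - a)."""
    return 1.0 / (b - a)


def uniform_mean(a: float, b: float) -> float:
    return (a + b) / 2.0


def uniform_std(a: float, b: float) -> float:
    return math.sqrt((b - a) ** 2 / 12.0)


def uniform_probability(a: float, b: float, lo: float, hi: float) -> float:
    """P(lo <= X <= hi): base times height, with the interval clipped to [a, b]."""
    base = max(0.0, min(hi, b) - max(lo, a))
    return base * uniform_density(a, b)


A, B = 2.0, 6.0
print(f"f(X)  = {uniform_density(A, B):.2f}")
print(f"mean  = {uniform_mean(A, B):.1f}")
print(f"sigma = {uniform_std(A, B):.4f}")
# Illustrative interval (an assumption, not taken from the example above).
print(f"P(3 <= X <= 5) = {uniform_probability(A, B, 3.0, 5.0):.2f}")
```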
The Exponential Distribution
Definition: Often used to model time between occurrences/arrivals.
Probability Density Function: * f(X) = \lambda e^{-\lambda X} for X > 0 * \lambda = mean number of arrivals per unit time.
Summary Measures: * Mean time between arrivals (\mu) = 1/\lambda. * Standard deviation (\sigma) = 1/\lambda.
Cumulative Probability: Probability that the arrival time is less than a specified time x: * P(\text{arrival time} < x) = 1 - e^{-\lambda x}.
Example: Arrivals at 15 per hour (\lambda = 15). Probability that the time between customers is less than 3 minutes (0.05 hours): * P(X < 0.05) = 1 - e^{-(15)(0.05)} = 1 - e^{-0.75} \approx 0.5276.
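The exponential calculation can be reproduced in a few lines. The sketch assumes a rate of 15 arrivals per hour and a time of 0.05 hours (3 minutes), the values that appear in the exponent above; the function name is hypothetical.

```python
import math


def exponential_cdf(x: float, rate: float) -> float:
    """P(arrival time < x) = 1 - exp(-rate * x) for x >= 0, and 0 for negative x."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-rate * x)


arrivals_per_hour = 15.0          # lambda
three_minutes_in_hours = 0.05     # 3 / 60

p_within_three_minutes = exponential_cdf(three_minutes_in_hours, arrivals_per_hour)
mean_gap_hours = 1.0 / arrivals_per_hour   # mean (and standard deviation) of the time between arrivals

print(f"P(X < 0.05 h) = {p_within_three_minutes:.4f}")   # 0.5276
print(f"Mean time between arrivals = {mean_gap_hours * 60:.0f} minutes")
```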
Normal Approximation of the Binomial Distribution
Rationale: Binomial calculations become tedious as n grows large. A normal distribution with the same mean and standard deviation can approximate it.
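Requirements: Approximation is valid if np \geq 5 and n(1 - p) \geq 5, where n is the number of trials and p is the probability of an event of interest.
Parameters: * \mu = np * \sigma = \sqrt{np(1 - p)}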
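Continuity Adjustment: Since the binomial is discrete and the normal is continuous, each integer value k of the binomial variable X is widened to the interval from k - 0.5 to k + 0.5 of the approximating normal variable W. * P(X = k) becomes P(k - 0.5 < W < k + 0.5). * P(X \geq 25) becomes P(W > 24.5).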
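Example: tire production (sample size n, defect rate p). Probability of k or fewer defects, P(X \leq k): * \mu = np * \sigma = \sqrt{np(1 - p)} * Z = \frac{(k + 0.5) - \mu}{\sigma} = 2.07 * P(Z < 2.07) = 0.9808.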
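To see how close the approximation can be, the sketch below compares an exact binomial probability with the continuity-adjusted normal value. The sample size, event probability, and cutoff are hypothetical (they are not the tire-production figures), and the validity check assumes the common rule of thumb that np and n(1 - p) are both at least 5.

```python
import math
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for a binomial variable, summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def normal_approx_cdf(k: int, n: int, p: float) -> float:
    """Approximate P(X <= k) as P(W < k + 0.5) with W normal (continuity adjustment)."""
    # Assumed rule of thumb for when the approximation is reasonable.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("np and n(1 - p) should both be at least 5")
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k + 0.5)


# Hypothetical values, for illustration only.
n, p, k = 100, 0.10, 12

exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)

print(f"Exact binomial P(X <= {k}):  {exact:.4f}")
print(f"Normal approximation:        {approx:.4f}")
print(f"Absolute difference:         {abs(exact - approx):.4f}")
```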
Joint Distributions and Sums of Variables
Joint CDF: Defines the probability that variables are simultaneously less than specified values: F(x_1, \dots, x_k) = P(X_1 < x_1 \cap \dots \cap X_k < x_k).
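Independence: Random variables X_1, \dots, X_k are independent if and only if F(x_1, \dots, x_k) = F_1(x_1) F_2(x_2) \cdots F_k(x_k), the product of the marginal CDFs.
Covariance and Correlation: * Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E[XY] - \mu_X \mu_Y. * If X and Y are independent, Cov(X, Y) = 0. * \rho = Corr(X, Y) = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}.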
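Sums and Differences: * E[X + Y] = \mu_X + \mu_Y and Var(X + Y) = \sigma_X^2 + \sigma_Y^2 + 2 Cov(X, Y). * E[X - Y] = \mu_X - \mu_Y and Var(X - Y) = \sigma_X^2 + \sigma_Y^2 - 2 Cov(X, Y).
Linear Combinations: For W = aX + bY: * \mu_W = a\mu_X + b\mu_Y * \sigma_W^2 = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab Cov(X, Y).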
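The mean and variance of a linear combination W = aX + bY follow the standard results E[W] = a E[X] + b E[Y] and Var(W) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y). The sketch below applies them; sums and differences are the special cases a = b = 1 and a = 1, b = -1. All numeric inputs and the function name are hypothetical.

```python
def linear_combination_moments(a, b, mu_x, mu_y, sigma_x, sigma_y, cov_xy):
    """Mean and variance of W = a*X + b*Y.

    mean = a*mu_x + b*mu_y
    variance = a^2*sigma_x^2 + b^2*sigma_y^2 + 2*a*b*cov_xy
    """
    mean_w = a * mu_x + b * mu_y
    var_w = a**2 * sigma_x**2 + b**2 * sigma_y**2 + 2 * a * b * cov_xy
    return mean_w, var_w


# Hypothetical moments, for illustration only.
moments = dict(mu_x=10.0, mu_y=4.0, sigma_x=3.0, sigma_y=2.0, cov_xy=1.5)

sum_mean, sum_var = linear_combination_moments(1, 1, **moments)      # X + Y
diff_mean, diff_var = linear_combination_moments(1, -1, **moments)   # X - Y
w_mean, w_var = linear_combination_moments(2, -1, **moments)         # W = 2X - Y

print(f"X + Y : mean = {sum_mean}, variance = {sum_var}")
print(f"X - Y : mean = {diff_mean}, variance = {diff_var}")
print(f"2X - Y: mean = {w_mean}, variance = {w_var}")
```

Note that a positive covariance raises the variance of the sum and lowers the variance of the difference.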
Applied Problems and Exercises
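Coffee Shop Staffing: X is normally distributed with mean \mu and standard deviation \sigma. 1. Quiet Day (X < 130): P(X < 130) = P(Z < \frac{130 - \mu}{\sigma}). 2. Busy Day (X > 180): Z = \frac{180 - \mu}{\sigma} = 1.5 \rightarrow P(Z > 1.5) = 1 - 0.9332 = 0.0668. 3. Average Day (150 < X < 170): P(150 < X < 170) = P(Z < \frac{170 - \mu}{\sigma}) - P(Z < \frac{150 - \mu}{\sigma}).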
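Mathematics Test: Scores X are normally distributed with mean \mu and standard deviation \sigma. * Percentage below score x: P(X < x) = P(Z < \frac{x - \mu}{\sigma}). * Top distinction: the cutoff score is X = \mu + Z\sigma, where Z is the table value that leaves the required percentage in the upper tail.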
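Cholesterol Classification: Cholesterol level X is normally distributed with mean \mu and standard deviation \sigma. * High cholesterol (X > 240): P(X > 240) = 1 - P(Z \leq \frac{240 - \mu}{\sigma}). * Conditional: P(A \mid B) = \frac{P(A \cap B)}{P(B)}, with both probabilities obtained from the Z-table.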
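Battery Lifetime: Lifetime X (in cycles) is normally distributed with mean \mu and standard deviation \sigma. * Warranty life for failure rate < 2\%: P(Z < ?) = 0.02 \rightarrow Z \approx -2.05 \rightarrow X = \mu + (-2.05)\sigma \approx 418\ cycles.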
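The warranty calculation can be sketched with the inverse CDF. The mean of 500 cycles and standard deviation of 40 cycles used below are assumptions: they are hypothetical values, chosen only because they are consistent with Z of about -2.05 and a result of about 418 cycles, and other parameter pairs would fit equally well.

```python
from statistics import NormalDist

# Hypothetical parameters (assumptions, not given in the exercise).
ASSUMED_MEAN_CYCLES = 500.0
ASSUMED_STD_CYCLES = 40.0

lifetime = NormalDist(ASSUMED_MEAN_CYCLES, ASSUMED_STD_CYCLES)

# Lifetime below which only 2% of batteries are expected to fail.
warranty_cycles = lifetime.inv_cdf(0.02)

# Check: the share of batteries failing before the warranty limit.
failure_share = lifetime.cdf(warranty_cycles)

print(f"Z for a 2% lower tail: {NormalDist().inv_cdf(0.02):.2f}")
print(f"Warranty life: about {warranty_cycles:.0f} cycles")
print(f"Share failing before that point: {failure_share:.2%}")
```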