Comprehensive Notes on Continuous Probability Distributions and the Normal Distribution

Comparison of Discrete and Continuous Probability Distributions

  • Conceptual Distinction

    • Discrete Variables (Chapter 5 Recap): Involve countable outcomes that are integer values. A key example is the number of people in a room; it is impossible to have values like 1.5 or 2.5 people. On a number line, outcomes (e.g., 1, 2, and 3) are distinct points.

    • Continuous Variables (Chapter 6): Involve uncountable outcomes that can take on any real value within a range. For instance, on a number line between 1 and 3, a variable could be exactly 1.5, 1.6, or even 1.999999999. The possibilities are effectively endless because the range can be partitioned into infinitely small segments. These typically involve measurements such as volume, height, length, and time.

  • Function Types

    • Probability Mass Function (PMF): Used for discrete distributions. On a graph where the x-axis represents outcomes (x) and the y-axis represents probability, the probability is equal to the height of the line at a specific point.

    • Probability Density Function (PDF): Used for continuous distributions. On a graph where the x-axis represents outcomes (x) and the y-axis represents a function of x, f(x), the graph takes on a specific shape (e.g., a bell curve). Probability is no longer height; it is measured as the area under the curve between two points (a and b).

  • Mathematical Rules

    • Discrete Rule: The sum of all probabilities in the PMF must equal one: $\sum P(x) = 1$.

    • Continuous Rule: The entire area under the PDF curve must equal one. Mathematically, the integral of the function from negative infinity to infinity equals one: $\int_{-\infty}^{\infty} f(x) \, dx = 1$.
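
Both rules can be checked numerically. The following is a minimal Python sketch using only the standard library. The fair-die PMF and the standard normal PDF are illustrative choices that do not come from the lecture, and the integral is approximated over [-10, 10] rather than the whole real line.

```python
from statistics import NormalDist

# Discrete rule: PMF of a fair six-sided die (an illustrative choice).
die_pmf = {face: 1 / 6 for face in range(1, 7)}
pmf_total = sum(die_pmf.values())


def area_under_curve(pdf, lower, upper, steps=20_000):
    """Approximate the integral of pdf over [lower, upper] with a midpoint sum."""
    width = (upper - lower) / steps
    return sum(pdf(lower + (i + 0.5) * width) for i in range(steps)) * width


# Continuous rule: the area beyond 10 standard deviations is negligible,
# so [-10, 10] stands in for (-infinity, infinity).
pdf_total = area_under_curve(NormalDist(0, 1).pdf, -10.0, 10.0)

print(f"Sum of PMF values: {pmf_total:.6f}")
print(f"Area under PDF:    {pdf_total:.6f}")
```

Both totals should print as 1.000000 up to rounding.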

  • Specific Point Probabilities

    • In discrete distributions, we can calculate probabilities at exact points, such as P(X = 5).

    • In continuous distributions, the probability at an exact point is always zero: P(X = 5) = 0. Because probability is measured as area (width times height), a single vertical line has no width and thus no area. Instead, we calculate probabilities over intervals, such as P(4.5 < X < 5.5), to look at the "neighborhood" of a value.
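
To see why a single point carries no probability, one can shrink an interval around 5 and watch the area vanish. This sketch assumes a hypothetical normal variable with mean 5 and standard deviation 2; those parameters are made up for illustration.

```python
from statistics import NormalDist

# Hypothetical continuous variable, assumed normal with mean 5 and sd 2.
X = NormalDist(mu=5, sigma=2)


def interval_probability(dist, a, b):
    """P(a < X < b), computed as the difference of two cumulative areas."""
    return dist.cdf(b) - dist.cdf(a)


# As the interval narrows toward the single point 5, the area shrinks to zero.
for half_width in (0.5, 0.05, 0.005, 0.0):
    p = interval_probability(X, 5 - half_width, 5 + half_width)
    print(f"P({5 - half_width} < X < {5 + half_width}) = {p:.6f}")
```

The last line of output corresponds to P(X = 5) and shows a probability of zero.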

The Normal Distribution

  • Significance and Occurrence

    • The normal distribution is considered a "rule of the universe" because it emerges frequently in real-world data across various fields. If plotted on a histogram, many datasets naturally form a symmetric bell shape.

    • It is central to the Central Limit Theorem, which states that the distribution of sample means tends toward a normal distribution as the sample size grows, regardless of the shape of the original population's distribution.
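
The Central Limit Theorem can be illustrated with a small simulation. This is only a sketch: the exponential population (mean 1, strongly skewed), the sample size of 50, and the 2000 repetitions are all assumptions chosen for illustration. The check that roughly 68% of values fall within one standard deviation of the center is a standard property of the normal distribution, not something stated in the lecture.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the sketch is reproducible


def draw_exponential():
    """One draw from a skewed population: exponential with mean 1."""
    return random.expovariate(1.0)


def sample_means(draw, sample_size, num_samples):
    """Collect the mean of each of num_samples independent samples."""
    return [mean(draw() for _ in range(sample_size)) for _ in range(num_samples)]


means = sample_means(draw_exponential, sample_size=50, num_samples=2000)

center = mean(means)
spread = stdev(means)
# Share of sample means within one standard deviation of their center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"Center of sample means: {center:.3f}")
print(f"Spread of sample means: {spread:.3f}")
print(f"Share within one sd:    {within_one_sd:.3f}")
```

Even though the population is far from bell-shaped, the sample means cluster symmetrically around the population mean.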

  • Theoretical Properties and Notation

    • Shape: Symmetric bell-shaped curve with high density in the center and fading tails.

    • Notation: $X \sim N(\mu, \sigma^2)$.

    • Parameters:

      • The Mean ($\mu$): Determines the center of the distribution. Shifting the mean moves the entire curve along the x-axis (e.g., from 10 to 20 to 30).

      • The Variance ($\sigma^2$) or Standard Deviation ($\sigma$): Determines the variability or width of the curve. A larger standard deviation results in a wider, flatter curve, while a smaller standard deviation creates a narrower, taller curve. Standard deviation measures the tendency of data to deviate from the mean.

      • Conversion: To get the standard deviation from the variance, take the square root: $\sigma = \sqrt{\sigma^2}$. To get the variance from the standard deviation, square it: $\sigma^2$.
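
The effect of each parameter can be seen by evaluating the curve at its peak. In this sketch the particular values (means of 10, 20, 30; standard deviations of 1, 2, 5; a variance of 25) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Same mean, different standard deviations: a larger sigma lowers the peak,
# which is what "wider and flatter" looks like numerically.
for sigma in (1, 2, 5):
    dist = NormalDist(mu=10, sigma=sigma)
    print(f"sigma={sigma}: variance={dist.variance}, peak height={dist.pdf(10):.4f}")

# Same standard deviation, different means: the peak moves along the x-axis
# but its height does not change.
peaks = {mu: NormalDist(mu=mu, sigma=2).pdf(mu) for mu in (10, 20, 30)}
for mu, height in peaks.items():
    print(f"mu={mu}: peak at x={mu}, height={height:.4f}")

# Converting between variance and standard deviation.
variance = 25
std_dev = sqrt(variance)      # variance -> standard deviation
variance_again = std_dev**2   # standard deviation -> variance
print(f"variance={variance}, standard deviation={std_dev}")
```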

  • The Probability Density Function (PDF) Formula

    • The mathematical form governing the bell shape is: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$.

    • Components: Includes the mathematical constant $\pi$ (approximately 3.14159) and the natural base $e$. Students are required to recognize this formula as the normal distribution PDF but generally do not solve it by hand.
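
For readers who want to see the formula in action, here is a direct transcription into Python, compared against the standard library's `NormalDist.pdf`. The function name `normal_pdf` is hypothetical, and the evaluation point (x = 18.6 with mean 18 and standard deviation 5) simply reuses the numbers from the download-time example in these notes.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x, written term by term from the formula."""
    coefficient = 1 / (sigma * sqrt(2 * pi))
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * exp(exponent)


by_hand = normal_pdf(18.6, mu=18, sigma=5)
from_library = NormalDist(mu=18, sigma=5).pdf(18.6)

print(f"Formula: {by_hand:.6f}")
print(f"Library: {from_library:.6f}")
```

Note that this value is a curve height (density), not a probability; probabilities still come from areas.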

Standardizing the Distribution (Z-Scores)

  • Integration and Closed Form Solutions

    • Finding the area under the normal curve theoretically requires integration: $\int_{a}^{b} f(x) \, dx$. However, this integral has no closed-form solution in terms of elementary functions (it cannot be solved with standard calculus formulas), so it must be evaluated numerically with a computer or calculator.
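
"Solved numerically" can be made concrete with a short sketch. The code below approximates the area with Simpson's rule and compares it to a value built from the error function `math.erf`, which is one common way software evaluates normal areas. Simpson's rule, the helper names, and the interval from 15 to 20 (with mean 18 and standard deviation 5) are assumptions for illustration; the lecture does not say which method the calculator uses.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Normal density at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def simpson(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] using Simpson's rule."""
    if n % 2:  # Simpson's rule needs an even number of subintervals
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def normal_cdf(x, mu, sigma):
    """P(X <= x) expressed through the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


MU, SIGMA = 18, 5
a, b = 15, 20  # hypothetical interval

area_numeric = simpson(lambda x: normal_pdf(x, MU, SIGMA), a, b)
area_via_erf = normal_cdf(b, MU, SIGMA) - normal_cdf(a, MU, SIGMA)

print(f"Simpson's rule: {area_numeric:.6f}")
print(f"Via erf:        {area_via_erf:.6f}")
```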

  • The Standard Normal Distribution (Z)

    • To simplify calculations, every normal distribution can be converted to the "Universal Scale" known as the Standard Normal Distribution.

    • Notation: $Z \sim N(0, 1)$, where the mean is always 0 and the standard deviation is always 1.

    • Transformation Formula: $z = \frac{x - \mu}{\sigma}$.
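
The point of standardizing is that the area to the left of x under the original curve equals the area to the left of z under the standard normal curve. A minimal sketch follows, assuming a hypothetical variable with mean 100 and standard deviation 15; those numbers are made up for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0, sigma=1)


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma


# Hypothetical variable: mean 100, standard deviation 15.
mu, sigma, x = 100, 15, 130

z = z_score(x, mu, sigma)
p_original_scale = NormalDist(mu, sigma).cdf(x)  # P(X <= 130)
p_standard_scale = STANDARD_NORMAL.cdf(z)        # P(Z <= z)

print(f"z = {z}")
print(f"P(X <= {x}) = {p_original_scale:.4f}")
print(f"P(Z <= {z}) = {p_standard_scale:.4f}")
```

The two probabilities agree, which is why a single standard normal table or calculator function can serve every normal distribution.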

Calculating Probabilities Using the HP Calculator

  • The Procedure

    • Step 1: Convert the given x value to a z value using the standardization formula.

    • Step 2: Use the calculator on the z value to find the probability $P(X \le x) = P(Z \le z)$. The HP calculator is designed specifically to work with "less than" ($\le$) probabilities.

  • Calculator Key Sequences

    • To find probability from a Z-value:

      1. Enter the z value (e.g., 0.12).

      2. Press the Blue Shift button.

      3. Press button 3 (labeled "zed to p" or $Z \rightarrow P$).

    • To adjust decimal display:

      1. Press the Orange Shift button (downward arrow).

      2. Press the Equals (=, Display) button.

      3. Type the desired number of decimals (e.g., 4).

  • Handling "Greater Than" (>) Problems

    • A "greater than" sign is a "reason for pause" because the calculator only provides "less than" area.

    • Complement Rule: Since the total area is 1, the probability of the right tail is calculated as: P(X > x) = 1 - P(X < x).
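
The complement rule can be sketched in a few lines. As a cross-check, the code also uses the symmetry of the standard normal curve, P(Z > z) = P(Z < -z), which is a standard fact but is not part of the lecture's procedure. The function name `right_tail` and the value z = 1.0 are illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def right_tail(z):
    """P(Z > z) via the complement rule: total area 1 minus the left area."""
    return 1 - STANDARD_NORMAL.cdf(z)


z = 1.0  # illustrative value
left_area = STANDARD_NORMAL.cdf(z)
right_area = right_tail(z)
by_symmetry = STANDARD_NORMAL.cdf(-z)  # cross-check using symmetry

print(f"P(Z < {z}) = {left_area:.4f}")
print(f"P(Z > {z}) = {right_area:.4f}")
print(f"P(Z < {-z}) = {by_symmetry:.4f}")
```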

  • Equality Notation in Chapter 6

    • In discrete distributions (Chapter 5), the difference between $<$ and $\le$ is critical.

    • In continuous distributions (Chapter 6), they are effectively the same because the probability at any specific point is zero. Thus, $P(X < 18)$ is indistinguishable from $P(X \le 18)$.
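
A side-by-side sketch of the two cases may help. The fair die is an assumed discrete example (not from the lecture); the continuous case uses a normal variable with mean 18 and standard deviation 5.

```python
from fractions import Fraction
from statistics import NormalDist

# Discrete case: a fair six-sided die. Including or excluding the
# endpoint 5 changes the answer by exactly P(X = 5).
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
p_less = sum(p for face, p in die_pmf.items() if face < 5)
p_less_equal = sum(p for face, p in die_pmf.items() if face <= 5)
print(f"Die: P(X < 5) = {p_less}, P(X <= 5) = {p_less_equal}")

# Continuous case: P(X = 18) is zero, so a single cumulative area
# answers both P(X < 18) and P(X <= 18).
X = NormalDist(mu=18, sigma=5)
p_continuous = X.cdf(18)
print(f"Normal: P(X < 18) = P(X <= 18) = {p_continuous:.4f}")
```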

Example Case Study: Image Download Time

  • Scenario: X represents the time in seconds to download an image file. X is normally distributed with $\mu = 18$ seconds and $\sigma = 5$ seconds.

  • Problem A: Calculate P(X < 18.6)

    1. Convert to Z: $z = \frac{18.6 - 18}{5} = \frac{0.6}{5} = 0.12$.

    2. Calculator Entry: Input 0.12, press Blue Shift, then 3.

    3. Result: P(X < 18.6) = 0.5478. This represents the area to the left of 18.6 on the distribution curve.

  • Problem B: Calculate P(X > 18.6)

    1. Apply Complement Rule: P(X > 18.6) = 1 - P(X < 18.6).

    2. Substitution: $1 - 0.5478 = 0.4522$.
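
Both problems can be double-checked away from the calculator. This sketch uses Python's standard library in place of the HP key sequence, so it is a verification aid rather than the exam procedure.

```python
from statistics import NormalDist

MU, SIGMA = 18, 5  # download time parameters from the scenario
x = 18.6

# Problem A: standardize, then take the area to the left of z.
z = (x - MU) / SIGMA
p_less = NormalDist().cdf(z)

# Problem B: complement rule for the right tail.
p_greater = 1 - p_less

print(f"z = {z:.2f}")                      # z = 0.12
print(f"P(X < {x}) = {p_less:.4f}")        # 0.5478
print(f"P(X > {x}) = {p_greater:.4f}")     # 0.4522
```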

Questions & Discussion

  • Question: How did you get more decimals on the calculator?

  • Response: Use the Orange Shift button, then the Equals button (which corresponds to "Display" in orange text), and then hit the number corresponding to the decimals you want (usually 4).

  • Question/Note on Tables: The lecturer demonstrated a standard Z table, showing that for $z = 0.12$, you find the intersection of row 0.1 and column 0.02 to get 0.5478.

  • Note Regarding Examinations: Traditional printed Z tables are being phased out. Students are explicitly required to have and use the HP calculator for exams, as it is also necessary for the second-semester course, Theory of Interest.
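
For reference, the table lookup mentioned in the discussion can be reproduced in code. This sketch rebuilds a single row of a cumulative Z table; the layout (row gives the first decimal of z, column gives the second decimal) follows the lecturer's description, while the function name `z_table_entry` is hypothetical.

```python
from statistics import NormalDist


def z_table_entry(row, column):
    """Area to the left of z = row + column, rounded to 4 decimals like a printed table."""
    return round(NormalDist().cdf(row + column), 4)


columns = [c / 100 for c in range(10)]  # 0.00, 0.01, ..., 0.09
row = 0.1

# Print the header and the single row for z values 0.10 through 0.19.
print("z    " + "    ".join(f"{c:.2f}" for c in columns))
print(f"{row:.1f}  " + "  ".join(f"{z_table_entry(row, c):.4f}" for c in columns))

print(z_table_entry(0.1, 0.02))  # 0.5478
```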