Continuous Probability Distributions Notes

Outline of Chapter 7

  • Review of Probability Theory Concepts
  • Definition of Random Variables
  • Exploration of Discrete Variables
  • Introduction to Discrete Probability Distributions
  • Introduction to Continuous Distributions
  • Characteristics of Continuous Probability Distributions
  • Definition of Uniform Continuous Distribution
  • Understanding Normal Distribution
  • Introduction to Standard Normal Distribution
  • Normal Approximation Techniques

Probability vs. Statistics

  • Statistics: Involves prediction and estimation based on data.
  • Probability: Addresses the likelihood of events occurring based on known distributions.
  • Flow:
    • Probability: Model (Distribution) → Data (Samples)
    • Statistics: Data (Samples) → Model (Distribution)

Review of Data Types

  • Be cautious about rounding continuous data to integer values: rounded measurements can look discrete, which makes the data type ambiguous.
  • Types of Data:
    • Qualitative: Categorical or verbal labels.
    • Quantitative: Numeric values.
      • Discrete: Countable and distinct values (e.g., number of heads in a series of coin tosses).
      • Continuous: Infinitely many possible values within an interval, usually measured (e.g., height, weight).

Discrete Random Variables

  • Definition: A random variable is a function that assigns a numerical value to each outcome of a random experiment. It is discrete when its possible values can be counted.
  • Nomenclature:
    • Uppercase letters (e.g., X, Y) represent random variables.
    • Lowercase letters (e.g., x, y) represent specific values taken by a random variable.
  • Types:
    • Finite set (e.g., number of heads when tossing n coins).
    • Countably infinite set (e.g., number of trials until the first success).

Discrete Probability Distributions

  • Assigns a probability to each possible value of a discrete random variable.
  • Conditions for Valid Distribution:
  1. Each probability must be between 0 and 1, inclusive.
  2. The sum of all probabilities must equal 1.
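
The two conditions above can be checked mechanically. The following is a minimal sketch, assuming the distribution is stored as a Python dictionary mapping each value to its probability; the function name and the coin-toss example are illustrative and not part of the original notes.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Check the two conditions for a discrete probability distribution."""
    values = list(probs.values())
    # Condition 1: every probability lies between 0 and 1.
    in_range = all(0.0 <= p <= 1.0 for p in values)
    # Condition 2: the probabilities sum to 1 (within a small tolerance).
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Hypothetical example: number of heads in two fair coin tosses.
two_coins = {0: 0.25, 1: 0.50, 2: 0.25}
print(is_valid_distribution(two_coins))         # valid distribution
print(is_valid_distribution({0: 0.5, 1: 0.6}))  # invalid: sums to 1.1
```

The tolerance is there because floating-point sums are rarely exactly 1.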

Distribution Functions

  • Probability Distribution Function (PDF): Gives the probability P(X = x) of each value x; the probability of an interval is the sum over the values it contains.
  • Cumulative Distribution Function (CDF): Gives P(X ≤ x), the running sum of probabilities from the smallest value up to x; it never decreases and approaches 1 at the largest value.
Expected Value
  • The mean of a discrete random variable, calculated as the sum of each value multiplied by its probability: E(X) = µ = Σ x·P(x).
Variance
  • A measure of variability around the expected value, calculated as the probability-weighted average of the squared deviations from the mean: Var(X) = σ² = Σ (x − µ)²·P(x).
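
As a small worked sketch of these definitions, the code below computes the expected value, variance, and CDF for a hypothetical distribution (number of heads in two fair coin tosses). The variable names and the example distribution are assumptions chosen for illustration.

```python
from itertools import accumulate

# Hypothetical distribution: number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Expected value: sum of each value times its probability.
mean = sum(x * p for x, p in pmf.items())

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# CDF: running sum of probabilities from the smallest value upward.
xs = sorted(pmf)
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))

print(mean)      # 1.0
print(variance)  # 0.5
print(cdf)       # {0: 0.25, 1: 0.75, 2: 1.0}
```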

Continuous Random Variables

  • Probabilities are assigned to intervals (e.g., P(a < X < b)), because the probability of any single exact value is 0.
  • A probability is the area under the probability density function (PDF) curve over the interval.
Probability Density Function and Cumulative Distribution Function
  • PDF Characteristics:
    • Non-negative, and the total area under the curve equals 1.
    • Parameters determine the mean, variance, and shape.
  • CDF Characteristics: F(x) = P(X ≤ x) is the area under the PDF to the left of x; it rises from 0 to 1, and P(a < X < b) = F(b) − F(a).
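
To make the "probability as area" idea concrete, here is a rough numerical sketch. It assumes a hypothetical density f(x) = 2x on the interval [0, 1], whose CDF is F(x) = x²; neither the density nor the helper names come from the notes. The area is approximated with a simple midpoint sum.

```python
def area_under(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b with a midpoint sum."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

def pdf(x):
    # Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere.
    return 2 * x if 0 <= x <= 1 else 0.0

def cdf(x):
    # Matching CDF: F(x) = x^2 on [0, 1].
    return min(max(x, 0.0), 1.0) ** 2

total_area = area_under(pdf, 0, 1)   # should be close to 1
by_area = area_under(pdf, 0.2, 0.5)  # P(0.2 < X < 0.5) as an area
by_cdf = cdf(0.5) - cdf(0.2)         # same probability from the CDF

print(round(total_area, 6), round(by_area, 6), round(by_cdf, 6))
```

Both routes give the same probability, which is the relationship between the PDF and the CDF stated above.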

Uniform Continuous Distribution

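  • Simplest continuous distribution: every interval of equal width between the bounds is equally likely.
  • Represented as U(a, b) between bounds a and b, with constant PDF height f(x) = 1/(b − a).
  • Mean = (a + b)/2; Standard deviation = (b − a)/√12.
Example:
  • A painkiller's effectiveness: Duration follows a uniform distribution U(15, 30), so the PDF is a rectangle of height 1/15. A probability is the area of that rectangle over the interval of interest (width × height).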
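
The U(15, 30) example can be sketched in a few lines. The notes do not say which interval is of interest, so the interval from 20 to 25 below is an assumption chosen only for illustration, as are the function and variable names.

```python
import math

a, b = 15.0, 30.0  # bounds of U(15, 30) from the example

def uniform_cdf(x, a, b):
    """P(X <= x) for a uniform distribution on [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

mean = (a + b) / 2               # 22.5
std_dev = (b - a) / math.sqrt(12)  # about 4.33

# Assumed interval of interest: P(20 < X < 25) = width * height = 5 * (1/15).
p_20_to_25 = uniform_cdf(25, a, b) - uniform_cdf(20, a, b)

print(mean, round(std_dev, 2), round(p_20_to_25, 4))
```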

Normal Distribution

  • Also known as the Gaussian distribution; characterized by its mean (µ) and standard deviation (σ), and denoted N(µ, σ).
  • Properties: symmetric, unimodal, bell-shaped curve; approximately 99.7% of data falls within three standard deviations of the mean.
Standard Normal Distribution
  • Special case where µ = 0 and σ = 1.
  • Any normal value x can be converted to a z-score, z = (x − µ)/σ, so probabilities for any N(µ, σ) can be found from the standard normal distribution.
Empirical Rule
  • Provides the approximate percentage of normally distributed data within k standard deviations of the mean:
    • k=1: 68.26% within µ ± σ
    • k=2: 95.44% within µ ± 2σ
    • k=3: 99.73% within µ ± 3σ
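
These percentages can be reproduced from the standard normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is an assumption for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Probability that a normal value lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"k={k}: {share_within(k):.2%}")
```

At full precision the first two values come out near 68.27% and 95.45%; figures such as 68.26% and 95.44% typically arise from doubling four-decimal z-table entries.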

Finding Areas with Z-scores

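  • Use z-tables to look up the cumulative probability P(Z ≤ z) for a given z-score; the area between two z-scores is the difference of their cumulative probabilities.
  • Cumulative probabilities can also be computed in Excel, e.g., =NORM.S.DIST(z, TRUE) for the standard normal or =NORM.DIST(x, µ, σ, TRUE) for N(µ, σ).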
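
As an alternative to a z-table or spreadsheet, the same lookup can be sketched in Python with the standard library's `statistics.NormalDist`. The values µ = 75, σ = 7, and x = 82 are assumptions picked for illustration.

```python
from statistics import NormalDist

mu, sigma = 75, 7  # assumed mean and standard deviation
x = 82             # assumed value of interest

# Step 1: convert x to a z-score.
z = (x - mu) / sigma

# Step 2: look up the cumulative probability P(Z <= z) on the standard normal.
p_from_z = NormalDist().cdf(z)

# The same probability computed directly on N(mu, sigma), with no conversion.
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_from_z, 4), round(p_direct, 4))
```

Both routes agree, which is the point of standardizing: one table (or one function) serves every normal distribution.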

Inverse Normal Distribution

  • Determines the value (x) associated with a given cumulative probability.
  • For example, suppose exam scores are normally distributed (µ = 75, σ = 7) and students scoring below the 10th percentile must retake the exam. The 10th percentile corresponds to z ≈ −1.28, so the threshold is x = µ + zσ = 75 − 1.28(7) ≈ 66.
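
The percentile calculation in this example can be sketched with `statistics.NormalDist.inv_cdf` from the Python standard library. The parameters µ = 75, σ = 7 and the 10th percentile come from the example; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=7)  # exam score distribution from the example

# z-score that leaves 10% of the area to its left on the standard normal.
z_10 = NormalDist().inv_cdf(0.10)

# Inverse normal: the score at the 10th percentile.
threshold = scores.inv_cdf(0.10)

# Equivalent by hand: x = mu + z * sigma.
threshold_by_hand = 75 + z_10 * 7

print(round(z_10, 2), round(threshold, 2), round(threshold_by_hand, 2))
```

Rounded to a whole score, the threshold is 66, matching the figure in the example.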

Conclusion

  • Understanding and correctly applying the concepts of continuous probability distributions is critical for statistical analysis and real-world applications.