Continuous Probability Distributions Notes
Continuous Probability Distributions
Outline of Chapter 7
- Review of Probability Theory Concepts
- Definition of Random Variables
- Exploration of Discrete Variables
- Introduction to Discrete Probability Distributions
- Introduction to Continuous Distributions
- Characteristics of Continuous Probability Distributions
- Definition of Uniform Continuous Distribution
- Understanding Normal Distribution
- Introduction to Standard Normal Distribution
- Normal Approximation Techniques
Probability vs. Statistics
- Statistics: Involves prediction and estimation based on data.
- Probability: Addresses the likelihood of events occurring based on known distributions.
- Flow:
- Statistics: Data (Samples) → Model (Distribution)
- Probability: Model (Distribution) → Data (Samples)
Review of Data Types
- Be cautious about rounding continuous data to integer values: rounded values look discrete, which makes the data type ambiguous.
- Types of Data:
- Qualitative: Categorical or verbal labels.
- Quantitative: Numeric values.
- Discrete: Countable and distinct values (e.g., number of heads in coin toss).
- Continuous: Infinite possibilities, often measured (e.g., height, weight).
Discrete Random Variables
- Definition: A random variable is a function assigning a numerical value to each outcome of a random experiment; it is discrete when its possible values are countable.
- Nomenclature:
- Uppercase letters (e.g., X, Y) represent random variables.
- Lowercase letters (e.g., x, y) represent particular values taken by a random variable.
- Types:
- Finite set (e.g., number of heads while tossing n coins).
- Infinite countable set (e.g., number of trials until the first success).
Discrete Probability Distributions
- Assigns a probability to each possible value of a discrete random variable.
- Conditions for Valid Distribution:
- Each probability must be between 0 and 1, inclusive.
- The sum of all probabilities must equal 1.
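As a rough illustration of the two conditions, the Python sketch below checks a candidate distribution. The function name `is_valid_distribution` and the two-coin example are assumptions made for illustration; they are not part of the chapter.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in probs.values())
    sums_to_one = math.isclose(sum(probs.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Hypothetical example: number of heads in two fair coin tosses.
heads = {0: 0.25, 1: 0.50, 2: 0.25}

print(is_valid_distribution(heads))             # satisfies both conditions
print(is_valid_distribution({0: 0.7, 1: 0.5}))  # fails: probabilities sum to 1.2
```

The small tolerance is there because floating-point sums rarely land on exactly 1.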
Distribution Functions
- Probability Distribution Function (PDF): Gives the probability P(X = x) of each specific value x; for a discrete variable it is also called the probability mass function (PMF).
- Cumulative Distribution Function (CDF): Gives P(X ≤ x), the running sum of probabilities from the smallest value up to x; it never decreases and reaches 1 at the largest value.
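The relationship between the two functions can be sketched in Python: the CDF is a running total of the individual probabilities. The two-coin distribution and the variable names (`values`, `pmf`, `cdf`) are assumptions chosen for illustration.

```python
from itertools import accumulate

# Hypothetical example: number of heads in two fair coin tosses.
values = [0, 1, 2]
pmf = [0.25, 0.50, 0.25]   # P(X = x) for each value, smallest to largest

# Running sum of the probabilities gives P(X <= x).
cdf = list(accumulate(pmf))

for x, p, F in zip(values, pmf, cdf):
    print(f"x={x}  P(X=x)={p:.2f}  P(X<=x)={F:.2f}")
```

The last CDF entry is 1 because it has accumulated every probability in the distribution.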
Expected Value
- Defined as the mean of a discrete random variable, calculated as the sum of each value multiplied by its probability: E(X) = µ = Σ x P(x).
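A minimal Python sketch of this calculation, assuming a two-coin example that is not taken from the chapter (the helper name `expected_value` is also hypothetical):

```python
# Hypothetical example: number of heads in two fair coin tosses.
dist = {0: 0.25, 1: 0.50, 2: 0.25}

def expected_value(dist):
    """Sum of each value multiplied by its probability."""
    return sum(x * p for x, p in dist.items())

mu = expected_value(dist)
print(mu)
```

Here the terms are 0(0.25) + 1(0.50) + 2(0.25), so the expected number of heads is 1.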
Variance
- Measures variability around the expected value, calculated as the probability-weighted average of the squared deviations from the mean: Var(X) = σ² = Σ (x - µ)² P(x). The standard deviation σ is its square root.
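The variance calculation can be sketched the same way. The distribution and the helper name `variance` are illustrative assumptions, not the chapter's own example.

```python
import math

# Hypothetical example: number of heads in two fair coin tosses.
dist = {0: 0.25, 1: 0.50, 2: 0.25}

def variance(dist):
    """Probability-weighted average of squared deviations from the mean."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** 2 * p for x, p in dist.items())

var = variance(dist)
sd = math.sqrt(var)   # standard deviation is the square root of the variance
print(var, round(sd, 4))
```

With mean 1, the squared deviations are 1, 0, and 1, giving a variance of 0.25 + 0 + 0.25 = 0.5.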
Continuous Random Variables
- Probabilities are assigned to intervals (e.g., P(a < X < b)); the probability of any single exact value is 0.
- P(a < X < b) is the area under the probability density function (PDF) between a and b.
Probability Density Function and Cumulative Distribution Function
- PDF Characteristics:
- Non-negative and total area under the curve equals 1.
- Parameters determine mean, variance, and shape.
- CDF Characteristics: F(x) = P(X ≤ x) gives the cumulative probability up to x, rising from 0 to 1; interval probabilities follow as P(a < X < b) = F(b) - F(a).
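To make the "area under the curve" idea concrete, the sketch below numerically integrates a made-up density, f(x) = 2x on [0, 1], and compares the result with its CDF, F(x) = x². The density, the helper names, and the midpoint-rule integration are all assumptions chosen for illustration.

```python
def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], zero elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def F(x):
    """CDF of the same hypothetical density: F(x) = x^2 on [0, 1]."""
    return min(max(x, 0.0), 1.0) ** 2

def area(pdf, a, b, n=10_000):
    """Approximate the area under pdf between a and b (midpoint rule)."""
    width = (b - a) / n
    return sum(pdf(a + (i + 0.5) * width) for i in range(n)) * width

total = area(f, 0.0, 1.0)          # total area, should be close to 1
p_area = area(f, 0.25, 0.75)       # P(0.25 < X < 0.75) as an area
p_from_cdf = F(0.75) - F(0.25)     # same probability from the CDF

print(round(total, 6), round(p_area, 6), round(p_from_cdf, 6))
```

Both routes to the interval probability should agree, which is the practical reason the CDF is convenient: it replaces an integration with a subtraction.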
Uniform Continuous Distribution
- Simplest continuous distribution, in which all intervals of equal width between the bounds are equally likely.
- Represented as U(a, b) between bounds a and b.
- Mean = (a+b)/2; Standard deviation = (b-a)/√12.
Example:
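- A painkiller's effectiveness: Duration follows a uniform distribution U(15, 30). Calculate probabilities using area under the PDF: the PDF is a rectangle of height 1/(30 - 15) = 1/15, so P(c < X < d) = (d - c)/15 for 15 ≤ c < d ≤ 30.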
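The painkiller example can be worked through in a few lines of Python. The bounds 15 and 30 come from the example; the helper name `uniform_prob` and the interval from 20 to 25 are assumptions picked only to show the calculation.

```python
import math

a, b = 15.0, 30.0   # bounds from the painkiller example, U(15, 30)

mean = (a + b) / 2
sd = (b - a) / math.sqrt(12)

def uniform_prob(lo, hi, a=a, b=b):
    """P(lo < X < hi) for X ~ U(a, b): rectangle width times height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)        # clip the interval to [a, b]
    return max(hi - lo, 0.0) / (b - a)

print(mean, round(sd, 3))
print(uniform_prob(20, 25))   # interval chosen only for illustration
```

The interval from 20 to 25 covers 5 of the 15 units between the bounds, so its probability is 5/15, or one third.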
Normal Distribution
- Also known as the Gaussian distribution; characterized by its mean (µ) and standard deviation (σ), and denoted N(µ, σ).
- Properties: symmetric about µ, unimodal (bell-shaped), with approximately 99.7% of data falling within three standard deviations of the mean.
Standard Normal Distribution
- Special case where µ = 0 and σ = 1.
- Any value x from N(µ, σ) can be transformed to a z-score, z = (x - µ)/σ, so probabilities for any normal distribution can be calculated from the standard normal.
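A short Python sketch of the transformation, using `statistics.NormalDist` from the standard library. The parameters µ = 75 and σ = 7 are borrowed from the exam-score example elsewhere in these notes; the value x = 82 and the helper name `z_score` are assumptions for illustration.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

mu, sigma = 75, 7   # parameters from the exam-score example in these notes
x = 82              # hypothetical score, chosen for illustration

z = z_score(x, mu, sigma)
p_original = NormalDist(mu, sigma).cdf(x)   # P(X <= 82) on the original scale
p_standard = NormalDist().cdf(z)            # P(Z <= z) on the standard normal

print(z, round(p_original, 4), round(p_standard, 4))
```

The two probabilities should match, which is why a single standard normal table can serve every normal distribution.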
Empirical Rule
- Provides approximate percentages of data within k standard deviations of the mean for a normal distribution.
- k=1: about 68.27% within µ ± σ
- k=2: about 95.45% within µ ± 2σ
- k=3: about 99.73% within µ ± 3σ
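These percentages can be reproduced from the standard normal CDF. The sketch below assumes Python's `statistics.NormalDist`; figures read from a printed z-table may differ in the last digit because table entries are rounded.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"k={k}: {area:.2%} within mu +/- {k} sigma")
```

Because the z-score transformation is the same for every normal distribution, these areas hold for any µ and σ.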
Finding Areas with Z-scores
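- Use a z-table to look up cumulative probabilities P(Z ≤ z); areas to the right of z or between two z-values follow by subtraction.
- The same cumulative probabilities are available from Excel functions, e.g., NORM.S.DIST(z, TRUE) for the standard normal and NORM.DIST(x, µ, σ, TRUE) for a general normal.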
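As an alternative to a z-table or spreadsheet, the same areas can be sketched in Python with `statistics.NormalDist`. The value z = 1.96 is an assumption chosen for illustration.

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal
z = 1.96           # hypothetical z-score, chosen for illustration

left = Z.cdf(z)                  # area to the left:  P(Z <= z)
right = 1 - Z.cdf(z)             # area to the right: P(Z > z)
between = Z.cdf(z) - Z.cdf(-z)   # area between -z and +z

print(round(left, 4), round(right, 4), round(between, 4))
```

Right-tail and between-values areas are both obtained by subtraction from cumulative (left-tail) areas, which is all a cumulative z-table provides.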
Inverse Normal Distribution
- Determines the value x associated with a given cumulative probability: find the z-score for that probability, then convert with x = µ + zσ.
- For example, suppose exam scores are normally distributed (µ=75, σ=7) and students scoring in the bottom 10% must retake the exam. The cutoff is the 10th percentile: z ≈ -1.28, so x ≈ 75 + (-1.28)(7) ≈ 66. Scores below about 66 require a retake.
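The exam-score cutoff can be checked with the inverse CDF in Python's `statistics.NormalDist`. The parameters come from the example above; the variable names are assumptions for illustration.

```python
from statistics import NormalDist

mu, sigma = 75, 7   # exam-score parameters from the example
p = 0.10            # cumulative probability for the 10th percentile

# Direct route: inverse CDF of N(mu, sigma).
cutoff = NormalDist(mu, sigma).inv_cdf(p)

# Two-step route: find z for the probability, then convert x = mu + z * sigma.
z = NormalDist().inv_cdf(p)
cutoff_via_z = mu + z * sigma

print(round(z, 4), round(cutoff, 2), round(cutoff_via_z, 2))
```

Both routes should give a cutoff that rounds to 66, matching the threshold in the example.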
Conclusion
- Understanding and correctly implementing concepts related to continuous probability distributions are critical for statistical analysis and real-world applications.