Week 6 - Comprehensive Study Notes on Discrete Random Variables
The Role and Impact of Random Variables
- Conceptual Foundations: Why Random Variables Matter
- Quantifying Uncertainty: Random variables serve to assign numerical values to random outcomes, which transforms abstract uncertainty into a measurable and manageable format.
- Computing Probabilities: By utilizing probability distributions, it becomes possible to determine the likelihood of specific events, enabling precise calculations rather than estimations.
- Predictions and Decision-Making: Random variables make it possible to forecast outcomes and assess risk, which supports decision-making under uncertainty.
- Modeling Real-World Phenomena: They provide a mathematical framework to represent the complexities of reality through structured distributions, forming the bedrock of statistical inference and modern data science.
- Foundations for Expectation: They define the "expected value," which is crucial for understanding the long-term behavior of a system or process.
- Measuring Variability and Risk: Through metrics like variance and standard deviation, random variables quantify the volatility and spread of potential outcomes.
Learning Objectives for Discrete Probability Distributions
- Relational Understanding: Students must understand the relationship between specific outcomes and their associated probabilities within a distribution.
- Descriptive Methods: Proficiency is required in describing discrete probability distributions using three distinct formats:
- Tables: Listing individual outcomes alongside their probabilities.
- Graphs: Visual representations of probability mass functions.
- Formulas: Mathematical expressions (PMFs) that define the distribution.
- Analytical Computation: Students must know how to compute and interpret the Expected Value and Standard Deviation of these distributions.
Fundamental Definitions and Classifications
- Random Variable (X): A numerical description of the outcome resulting from an experiment.
- Discrete Random Variable:
- Can take either a finite number of values or a countably infinite sequence of values.
- Examples:
- Hotel Reservations: Making 100 reservations; X=number of guests who show up. Possible outcomes: 0, 1, 2, ..., 98, 99, 100.
- Restaurant Operations: Running a restaurant for one day; X=number of customers. Possible outcomes: 0, 1, 2, 3, ...
- Room Sales: Choosing a hotel room; X=1 if it has a sea view, X=0 if it does not.
- Continuous Random Variable:
- Defined by its ability to assume any numerical value within an interval or a collection of intervals.
- Example:
- Hotel Reception: X=time between two consecutive customer arrivals, in minutes. Possible outcomes: x ≥ 0.
The Discrete Probability Distribution
- Definition: This distribution describes how probabilities are assigned across the possible values of a discrete random variable.
- Representations:
- Probability Mass Function (pmf): Denoted as f(x), it provides the probability for each specific value of the random variable.
- Alternative Notation: P(X=x), which explicitly states the probability that the variable X takes the value x.
- Mathematical Requirements: For a discrete random variable, the following conditions must always be satisfied:
- Individual probabilities must be non-negative: f(x) ≥ 0 and P(X=x) ≥ 0.
- The sum of all probabilities in the sample space must equal exactly one: SUM(f(x))=1 or SUM(P(X=x))=1.
- Experiment: Rolling one standard six-sided die.
- Random Variable (X): The number of dots on the top face.
- Possible Values (x): 1,2,3,4,5,6.
- Probability Representation:
- Table: Every outcome (1 through 6) has an identical probability of P(X=x)=1/6.
- Graph: A bar chart where each outcome from 1 to 6 reaches the height of approximately 0.167.
- Formula: f(x)=1/n, where n is the number of possible outcomes. For a die, f(x)=1/6.
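The two pmf requirements can be checked mechanically. Here is a minimal Python sketch for the single-die example; the helper name `is_valid_pmf` is an illustrative assumption and not part of the course material.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability is non-negative and they sum to exactly 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

# Discrete uniform pmf for one fair die: f(x) = 1/n with n = 6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

print(is_valid_pmf(die_pmf))            # True
print(round(float(die_pmf[3]), 3))      # 0.167
```

Exact fractions are used so that the "sum equals one" check is not affected by floating-point rounding.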
Case Study: Distribution of the Sum of Two Dice
- Experiment: Rolling two standard dice simultaneously.
- Random Variable (X): The sum of the number of dots on the top faces of both dice.
- Possible Values: Values range from 2 (rolling two 1s) to 12 (rolling two 6s), specifically: 2,3,4,5,6,7,8,9,10,11,12.
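The probabilities for each sum can be obtained by enumerating all 36 equally likely pairs of faces. The sketch below assumes two fair, independent dice; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely (die1, die2) pairs and count how often each sum occurs.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# P(X = s) = (number of pairs with sum s) / 36.
two_dice_pmf = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

for s, p in two_dice_pmf.items():
    print(s, p)
```

Unlike the single die, this distribution is not uniform: the sum 7 can be formed in six ways and is the most likely value, while 2 and 12 can each be formed in only one way.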
Measures of Central Tendency: Expected Value
- Theoretical Concept: The expected value is the theoretical long-term average of a random variable. It does not necessarily represent an outcome that will occur in a single trial.
- Formula:
- E[X]=SUM(xi×P(X=xi))
- Excel Implementation: Use the function SUMPRODUCT(values; probabilities).
- Practical Use: This is a vital tool for decision-making under uncertainty, as it only requires knowledge of probabilities and outcomes, not historical data.
- Mathematical Properties:
- Linearity: E[aX+b]=a×E[X]+b
- Additivity: E[X+Y]=E[X]+E[Y]
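As a rough illustration of the formula and the linearity property, the sketch below computes E[X] for one fair die and checks E[aX+b]=a×E[X]+b for the arbitrarily chosen values a=2 and b=3. The function name `expected_value` is an assumption made for this example.

```python
from fractions import Fraction

def expected_value(pmf):
    """E[X] = SUM(x * P(X = x)) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}
e_x = expected_value(die_pmf)
print(e_x)  # 7/2

# Linearity check: Y = a*X + b moves each value but keeps its probability.
a, b = 2, 3
transformed_pmf = {a * x + b: p for x, p in die_pmf.items()}
print(expected_value(transformed_pmf) == a * e_x + b)  # True
```

The result 7/2 = 3.5 is not a face of the die, which illustrates that the expected value is a long-run average and not necessarily an outcome of a single trial.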
Real-World Example: Stock Decisions Based on Expected Value
- Scenario: A stock trades at 250. Overnight results will determine tomorrow's price.
- Good News (60% probability): Price rises to 255 (Profit = +5).
- Bad News (40% probability): Price falls to 240 (Profit = −10).
- Expected Price Calculation:
- 255×0.60+240×0.40=153+96=249
- Expected Profit Calculation:
- 5×0.60+(−10)×0.40=3−4=−1
- Decision: With an expected profit of −1, buying has a negative expected return, so an investor who decides on expected value alone should not buy the stock.
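The same arithmetic can be reproduced in a few lines of Python. This is only a sketch of the calculation above; the variable names are illustrative.

```python
price_today = 250
# (tomorrow's price, probability) for each scenario in the example.
scenarios = [(255, 0.60), (240, 0.40)]

expected_price = sum(price * prob for price, prob in scenarios)
expected_profit = sum((price - price_today) * prob for price, prob in scenarios)

print(round(expected_price, 2))   # 249.0
print(round(expected_profit, 2))  # -1.0
```

By linearity, the expected profit is simply the expected price minus today's price: 249−250=−1.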
The Gambler’s Fallacy: Roulette Example
- Observation: The last 6 trials at a roulette table resulted in: 2,32,14,27,30,21. Out of these, 2 outcomes were "Black."
- Question: What is the most likely number of "Black" outcomes considering a total of 8 spins (the past 6 results plus 2 future spins)?
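- Analysis of Future Spins: Spins are independent, so the past 6 results do not affect the next 2. Assuming Red and Black are equally likely (probability 0.5 each, ignoring the green zero), there are four equally likely sequences for the next 2 spins, each with a probability of 0.25 (0.5×0.5):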
- Red, Red (0 Blacks): 25%
- Red, Black / Black, Red (1 Black): 50%
- Black, Black (2 Blacks): 25%
- Conclusion: The most likely outcome for the future spins is 1 "Black." Added to the 2 that already occurred, the most likely total is 3.
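The reasoning can be sketched in Python by enumerating the two future spins only, since the past results are already fixed. The value `p_black = 0.5` is the simplifying assumption used in the analysis above (it ignores the green zero), and the variable names are illustrative.

```python
from collections import Counter
from itertools import product

p_black = 0.5      # simplifying assumption; a real wheel also has a green zero
blacks_so_far = 2  # already observed, cannot change

# Probability of each possible number of Blacks in the next two independent spins.
future = Counter()
for spins in product("RB", repeat=2):
    k = spins.count("B")
    future[k] += p_black**k * (1 - p_black) ** (2 - k)

most_likely_future = max(future, key=future.get)
print(dict(future))                        # {0: 0.25, 1: 0.5, 2: 0.25}
print(blacks_so_far + most_likely_future)  # 3
```

The fallacy would be to expect the wheel to "catch up" to 4 Blacks out of 8; the calculation treats the past spins as known and applies probabilities only to the spins still to come.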
Measures of Variability: Variance and Standard Deviation
- Variance Formula (Var(X)):
- Var(X)=SUM((xi−E[X])^2×P(X=xi))
- Excel Implementation:
SUMPRODUCT((values - expected_value)^2; probabilities).
- Standard Deviation Formula (σ(X)):
- σ(X)=SQRT(Var(X))
- Use and Interpretation:
- Measures the dispersion of outcomes around the expected value.
- Standard Deviation is expressed in the same units as the random variable (X), whereas Variance is in squared units.
- Additivity Property:
- Var(X+Y)=Var(X)+Var(Y), provided X and Y are independent.
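A minimal Python sketch of the variance and standard deviation formulas, applied to one fair die. The function names are assumptions made for this example.

```python
import math

def expected_value(pmf):
    """E[X] = SUM(x * P(X = x))."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = SUM((x - E[X])^2 * P(X = x))."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

die_pmf = {x: 1 / 6 for x in range(1, 7)}
var_x = variance(die_pmf)
sd_x = math.sqrt(var_x)

print(round(var_x, 4))  # 2.9167 (squared units)
print(round(sd_x, 4))   # 1.7078 (same units as X)
```

For the die, the variance equals 35/12, so a typical roll lies roughly 1.7 dots away from the expected value of 3.5.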
Summary of Common Discrete Probability Distributions
| Distribution | Probability Function | Conditions of Use | Hospitality Example | Mean | Variance |
|---|---|---|---|---|---|
| Discrete Uniform | P(X=x)=1/n | Finite set of n outcomes where all outcomes are equally likely and independent. The mean and variance shown assume consecutive integer values x1, ..., xn. | Hotel gives one free upgrade randomly to one of 5 guests. | E[X]=(x1+xn)/2 | Var(X)=(n^2−1)/12 |
| Bernoulli | P(X=1)=p, P(X=0)=1−p | Single trial with only two outcomes (Success/Failure) and constant probability p. | A guest shows up (1) or does not show up (0). | E[X]=p | Var(X)=p×(1−p) |
| Binomial | P(X=x)=C(n,x)×p^x×(1−p)^(n−x), where C(n,x)=n!/(x!×(n−x)!) | Fixed number of independent trials (n) with two outcomes and constant probability p. | Number of guests who show up among 20 bookings. | E[X]=n×p | Var(X)=n×p×(1−p) |
| Poisson | P(X=x)=e^(−λ)×λ^x/x! | Counts events in a fixed interval of time or space; events are independent and occur at a constant rate λ. | Number of complaints received per day. | E[X]=λ | Var(X)=λ |
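The binomial and Poisson probability functions from the table can be written directly with Python's standard library. In this sketch the show-up probability p=0.9 and the complaint rate of 3 per day are hypothetical values chosen only for illustration; they are not given in the notes.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """P(X = x) = e^(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam**x / math.factorial(x)

# Binomial: 20 bookings, hypothetical show-up probability p = 0.9.
n, p = 20, 0.9
binomial_mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
print(round(binomial_mean, 6))  # 18.0, matching n * p

# Poisson: hypothetical rate of 3 complaints per day.
lam = 3.0
print(round(poisson_pmf(0, lam), 4))  # probability of a day with no complaints
```

Summing x×P(X=x) over all values reproduces the mean formula from the table (n×p for the binomial), which is a quick way to sanity-check an implementation.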