Introduction to Random Variables and Expected Values

Introduction to Random Variables and Probability Distributions

What are Random Variables?

  • A random variable is a numerical outcome whose exact value is uncertain.

  • It allows for making predictions about future events rather than merely describing past or present states.

  • Examples:

    • The number of passengers who don't show up for a flight.

    • The number of points scored in a sports championship.

    • The number of students attending a class (e.g., with 65 enrolled, fewer than 65 typically attend any given session because of illness or other reasons).

  • Even though individual outcomes are uncertain, statistical insights and probability models can be applied.

Probability Distribution Function (PDF)

  • The probability distribution function (or probability mass function for discrete variables) defines the probability of each possible outcome of a random variable.

  • Example: Hyde Park Household Size

    • Scenario: A student needs moving help in Hyde Park and knocks on a neighbor's door.

    • The random variable is the number of people living in the house next door.

    • US census data provides probabilities for different household sizes:

      • P(X=1) = 28.0% (one person)

      • P(X=2) = 33.6% (two people)

      • P(X=3) = 15.5% (three people)

      • (Implicitly, probabilities for 4, 5, etc. people also exist.)

    • This distribution helps estimate the likelihood of getting help.

  • Visualization: Similar to a histogram, but the height of each bar represents the probability of an outcome, not just its count.

  • Notation:

    • Big X: Represents the random variable itself (e.g., number of people in a household).

    • Little x: Represents a specific value the random variable X can take (e.g., 1, 2, 3, etc. people).
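The household-size table can be written down as a small lookup from each value x to P(X = x). The following is a minimal sketch rather than a definitive implementation; the names `pmf`, `listed_mass`, `remaining_mass`, and `prob_at_most` are illustrative and do not come from the notes. It relies only on the fact that the probabilities of all outcomes must sum to 1, so the unlisted sizes (4 or more) share whatever is left over.

```python
# Household-size probabilities quoted in the notes (sizes 1, 2, and 3 only).
pmf = {1: 0.280, 2: 0.336, 3: 0.155}

# A valid PMF sums to 1, so the unlisted sizes (4 or more) share the remainder.
listed_mass = sum(pmf.values())
remaining_mass = 1.0 - listed_mass


def prob_at_most(pmf, x):
    """Return P(X <= x), using only the outcomes present in the table."""
    return sum(p for value, p in pmf.items() if value <= x)


print(f"P(X <= 3) = {listed_mass:.3f}")
print(f"P(X >= 4) = {remaining_mass:.3f}")
print(f"P(X <= 2) = {prob_at_most(pmf, 2):.3f}")
```

Under these figures, the three listed sizes cover about 77.1% of households, which leaves about 22.9% for households of four or more people.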

Types of Random Variables

Discrete Random Variables
  • Definition: Outcomes can be listed or counted; there's a finite or countably infinite number of possibilities.

  • Key Criterion: Can you make a list of all possible outcomes?

  • Common Characteristic: Often (but not exclusively) whole numbers.

  • Examples:

    • Number of days in a year with recorded precipitation (any whole number from 0 to 365).

    • Count of tornado warnings (a fixed, finite number).

    • Room numbers (e.g., UTC room numbers are decimals, but finite and listable).

    • Shoe sizes (e.g., whole numbers and halves, but not arbitrary decimals).

  • The probability distribution of a discrete random variable is also referred to as a Probability Mass Function (PMF).

Continuous Random Variables
  • Definition: Outcomes can take any value within a given range; the possible values cannot be listed one by one, and they usually involve decimal places.

  • Key Characteristic: Can be refined indefinitely by adding more decimal places.

  • Examples:

    • Speed of a tennis ball (e.g., 70 mph, 70.1 mph, 70.15 mph, etc.).

    • Annual rain accumulation in inches (e.g., 120.1 inches, 120.15 inches, etc.).

Expected Values (E[X])

  • The expected value of a random variable X is its long-run average value.

  • It's a weighted average, where each possible outcome is weighted by its probability of occurrence.

  • Contrast with Simple Average: A simple average (arithmetic mean) assumes all outcomes are equally likely, which is rarely true in real-world probability distributions (e.g., household sizes). The expected value accounts for varying probabilities.

  • Formula for Discrete Random Variables:
    E[X] = \sum_{x} x \cdot P(X = x), where the sum runs over all possible values x of X.

  • For Continuous Variables: The concept is the same but involves calculus (integration) instead of summation.
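To make the contrast between a weighted and a simple average concrete, here is a hedged sketch that applies the discrete formula to the household-size figures. The helper names `expected_value` and `simple_average` are illustrative. One loud assumption: the notes give no breakdown for households of 4 or more, so this sketch lumps all of that remaining probability (1 - 0.280 - 0.336 - 0.155 = 0.229) onto the single value 4. The result is therefore only a lower bound on the true expected household size.

```python
def expected_value(pmf):
    """Weighted average: the sum of x * P(X = x) over all outcomes."""
    total = sum(pmf.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total}, expected 1")
    return sum(x * p for x, p in pmf.items())


def simple_average(pmf):
    """Unweighted mean of the possible values, ignoring their probabilities."""
    return sum(pmf) / len(pmf)


# ASSUMPTION: households of 4 or more are lumped into the single value 4,
# so the expected value computed here is a lower bound, not the true figure.
household_pmf = {1: 0.280, 2: 0.336, 3: 0.155, 4: 0.229}

print(f"E[X] (lower bound): {expected_value(household_pmf):.3f}")
print(f"Simple average:     {simple_average(household_pmf):.3f}")
```

The weighted result (about 2.33) sits below the simple average of the values 1 through 4 (2.5) because the smaller household sizes carry more probability than the larger ones.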

Business Use Cases for Expected Values
  1. Life and Disability Insurance Company

    • Scenario: A company with 1,000 customers.

    • Events and Payouts:

      • Death: Probability 1/1000 = 0.001, Payout USD 100,000

      • Disability: Probability 2/1000 = 0.002, Payout USD 50,000

      • No event: Probability 997/1000 = 0.997, Payout USD 0

    • Expected Payout per Customer:
      E[\text{Payout}] = (0.001 \times 100,000) + (0.002 \times 50,000) + (0.997 \times 0)
      E[\text{Payout}] = 100 + 100 + 0 = \text{USD } 200

    • Implication: To make a profit, the company must charge customers more than USD 200 per year in premiums.
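The insurance arithmetic can be checked with a short script. This is a sketch under the scenario's stated numbers; the variable names (`events`, `expected_payout`, `expected_total`) are illustrative. It uses `fractions.Fraction` so that 1/1000 and 2/1000 are represented exactly rather than as rounded floats, and it scales the per-customer figure to all 1,000 customers, which is valid because the expected value of a sum is the sum of the expected values.

```python
from fractions import Fraction

# (probability, payout in USD) for each event, as given in the scenario.
events = {
    "death": (Fraction(1, 1000), 100_000),
    "disability": (Fraction(2, 1000), 50_000),
    "no event": (Fraction(997, 1000), 0),
}

# Sanity check: the three events cover every possibility.
assert sum(p for p, _ in events.values()) == 1

# Expected payout per customer: each payout weighted by its probability.
expected_payout = sum(p * payout for p, payout in events.values())

# Expected total across the customer base.
customers = 1_000
expected_total = expected_payout * customers

print(f"Expected payout per customer: USD {expected_payout}")
print(f"Expected total payout: USD {expected_total}")
```

The per-customer figure of USD 200 is the break-even premium on payouts alone; any operating costs, which the scenario does not specify, would push the required premium higher.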

  2. Bookstore Sales

    • Expected value of books a customer buys helps estimate sales, stock bags, and guide store decisions.

    • Provides a