Chapter 4: Probability Theory Basics — Random Experiments, Sample Space, and Counting Rules

Random Experiments and Probability Basics

  • This transcript introduces inferential statistics as a transition from frequency distributions to probability distributions.
  • Probability is described as distributing the “ball of clay” (the total probability mass, equal to 1) across all possible outcomes of an experiment.
  • The ball of clay metaphor helps visualize how probability mass is allocated to all potential events that could occur.
  • Emphasis on the distinction between frequency distributions (describing observed data) and probability distributions (describing all possible outcomes and their probabilities).
  • Chapter four focuses on basic probability theory before diving into probability distributions in later chapters; Bayes’ theorem is noted as more challenging but potentially fun.

Key Definitions and Concepts

  • Random experiment: A process that generates well-defined outcomes. Example: Opening a lemonade stand and counting how many lemonades are sold.
  • Outcome: A single result that can occur from an experiment (e.g., “sold 8 lemonades”).
  • Sample point: One of the outcomes that could happen in the experiment.
  • Sample space: The set of all possible outcomes (all sample points).
  • Event: A subset of the sample space (a collection of sample points that share a property, e.g., getting at least two heads in three coin flips).
  • Probability distribution vs frequency distribution:
    • Frequency distribution describes observed frequencies.
    • Probability distribution assigns probabilities to all possible outcomes and their likelihoods; the sum of all probabilities is 1.
  • Probabilities are bounded: a probability must lie between 0 and 1 (inclusive). 0 corresponds to 0% chance, 1 corresponds to 100% chance.
  • Notation recap:
    • Intersection: P(A \cap B)
    • Conditional probability: P(B|A) (probability of B given A)
    • Event vs sample point: An event is a set of sample points; a sample point is a single outcome.
    • Sum of probabilities: \sum P(s_i) = 1 over all sample points in the sample space.
    • Membership: The symbol \in means “is an element of” (e.g., an outcome that belongs to an event).
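These definitions can be made concrete with a small sketch. The sketch below (names are illustrative, not from the transcript) builds the sample space for three coin flips as a Python set, forms the "at least two heads" event as a subset, and checks membership:

```python
# Sketch: sample space, sample points, and events for three coin flips,
# modeled with plain Python sets.
from itertools import product

# Sample space: every sequence of three flips; each tuple is a sample point.
sample_space = set(product("HT", repeat=3))
assert len(sample_space) == 8

# Event: "at least two heads" is the subset of sample points
# whose outcome contains two or more H's.
at_least_two_heads = {s for s in sample_space if s.count("H") >= 2}

# Membership: a sample point either belongs to the event or not.
print(("H", "H", "T") in at_least_two_heads)                 # True
print(len(at_least_two_heads) / len(sample_space))           # 0.5
```

Representing events as subsets of a set makes the "event vs. sample point" distinction mechanical: an event is a `set`, a sample point is an element.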

Basic Probability Rules (Axioms and Notation)

  • Probability bounds: For every sample point s_i, 0 \le P(s_i) \le 1
  • Normalization: The probabilities across all sample points sum to 1:
    \sum_{i} P(s_i) = 1
  • Event probability: The probability of an event E (a subset of the sample space) is the sum of the probabilities of the sample points in E:
    P(E) = \sum_{s_i \in E} P(s_i)
  • Conditional probability intuition: The probability of B given A focuses on the portion of the sample space where A occurred:
    P(B|A) = \frac{P(A \cap B)}{P(A)} \quad \text{(provided } P(A) > 0)
  • The vertical bar in P(B|A) denotes “given that” in conditional probability.
  • The summation symbol \Sigma (capital sigma) denotes summation over a set of terms.
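The conditional-probability formula can be verified numerically. In this sketch the events A ("even roll") and B ("roll greater than 3") are made up for illustration; exact fractions avoid floating-point noise:

```python
# Numeric check of P(B|A) = P(A ∩ B) / P(A) on a fair six-sided die.
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
p = {s: Fraction(1, 6) for s in sample_space}   # each P(s_i) = 1/6

A = {s for s in sample_space if s % 2 == 0}     # {2, 4, 6}
B = {s for s in sample_space if s > 3}          # {4, 5, 6}

P_A = sum(p[s] for s in A)                      # 1/2
P_A_and_B = sum(p[s] for s in A & B)            # P(A ∩ B) = 2/6
P_B_given_A = P_A_and_B / P_A                   # (1/3) / (1/2)

print(P_B_given_A)  # 2/3
```

Restricting attention to A (the denominator) is exactly the "portion of the sample space where A occurred" intuition above.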

Simple Probability Examples

  • Fair die (six-sided): sample space \{1, 2, 3, 4, 5, 6\}, each outcome with probability P(i) = \frac{1}{6}
    • Probability of rolling less than 3:
      P(\text{<3}) = \frac{2}{6} = \frac{1}{3}
    • Probability of rolling less than 4:
      P(\text{<4}) = \frac{3}{6} = \frac{1}{2}
    • Probability of rolling greater than 4:
      P(\text{>4}) = \frac{2}{6} = \frac{1}{3}
  • Three fair coins flipped:
    • Sample space size: 2^3 = 8 outcomes.
    • Event: getting at least two heads (i.e., 2 or 3 heads).
    • There are 4 favorable outcomes (HHH, HHT, HTH, THH). Hence:
      P(\text{at least 2 heads}) = \frac{4}{8} = \frac{1}{2}
  • A two-step experiment: coin flip (2 outcomes) followed by a die roll (6 outcomes)
    • Total outcomes: 2 \times 6 = 12
    • A related example with two dice: there are 6 \times 6 = 36 total outcomes and only one favorable outcome, (6, 6), so P(\text{double six}) = \frac{1}{36}
    • Counts illustrate the multistep counting rule below.
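All of the worked examples above can be reproduced by brute-force enumeration, since every outcome is equally likely. This is a sketch (the `prob` helper is not from the transcript); `Fraction` keeps the answers exact:

```python
# Brute-force enumeration of the die, coin, and two-dice examples.
from fractions import Fraction
from itertools import product

die = range(1, 7)

def prob(event, space):
    """P(E) = |E| / |S| when every outcome in space is equally likely."""
    space = list(space)
    return Fraction(sum(1 for s in space if event(s)), len(space))

print(prob(lambda r: r < 3, die))                                   # 1/3
print(prob(lambda r: r < 4, die))                                   # 1/2
print(prob(lambda r: r > 4, die))                                   # 1/3
print(prob(lambda f: f.count("H") >= 2, product("HT", repeat=3)))   # 1/2
print(prob(lambda d: d == (6, 6), product(die, repeat=2)))          # 1/36
```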

Counting Rules: Multistep Experiments

  • For a k-step experiment with n_i possible results at step i, the total number of possible outcomes is the product:
    \text{Total outcomes} = \prod_{i=1}^{k} n_i
  • Examples:
    • Flip three coins: each step has 2 outcomes ⇒ total outcomes = 2^3 = 8
    • Flip four coins: total outcomes = 2^4 = 16
    • Coin then die: first step has 2 outcomes, second step has 6 outcomes ⇒ total outcomes = 2\times 6 = 12
  • Key takeaway: Counting rules let us convert a story into a count of equally likely outcomes, which then allows probability calculations.
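The multistep rule is literally a product over the step counts, which a one-liner confirms for each example above:

```python
# The multistep counting rule: total outcomes = product of per-step counts.
import math

print(math.prod([2, 2, 2]))     # three coins  -> 8
print(math.prod([2, 2, 2, 2]))  # four coins   -> 16
print(math.prod([2, 6]))        # coin then die -> 12
```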

Counting Rules: Combinations (n choose k)

  • Problem: From a set of capital N objects, how many distinct groups of size little n can be formed when order does not matter?
  • Notation (two equivalent forms):
    • English-language form: how many groups of \text{little } n from \text{capital } N
    • Mathematical shorthand: \binom{N}{n} or {N \choose n}
  • Definition via factorials:
    \binom{N}{n} = \frac{N!}{n!(N-n)!}
  • Important related concepts:
    • Factorial: N! = N \times (N-1) \times \cdots \times 2 \times 1
    • Special case: 0! = 1
  • Worked example: 5 objects choosing 2 at a time
    • \binom{5}{2} = \frac{5!}{2!\cdot (5-2)!} = \frac{120}{2\cdot 6} = 10
  • Practical card example: How many 5-card hands from a standard deck? \binom{52}{5} = 2{,}598{,}960
  • Real-world-story example (20 kids, 5 to form a team):
    • Number of possible 5-person groups: \binom{20}{5} = 15504
    • If one specific set of 5 kids (your 5 kids) is just one of these possibilities, the probability that your 5 kids are on the randomly formed team is:
      P(\text{your 5 kids on the team}) = \frac{1}{\binom{20}{5}} = \frac{1}{15504} \approx 6.45\times 10^{-5}
  • Quick tips mentioned in the transcript:
    • There are two common ways to write the combination notation; both convey the same concept.
    • Use factorial simplifications to cancel terms when calculating large combinations.
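The factorial definition can be checked against Python's built-in `math.comb` for each example in this section (the helper name is illustrative):

```python
# Combinations via the factorial formula, cross-checked with math.comb.
import math

def comb_from_factorials(N, n):
    """binom(N, n) = N! / (n! * (N - n)!), using integer division."""
    return math.factorial(N) // (math.factorial(n) * math.factorial(N - n))

print(comb_from_factorials(5, 2))    # 10
print(comb_from_factorials(52, 5))   # 2598960 five-card hands
print(comb_from_factorials(20, 5))   # 15504 possible 5-person teams
print(1 / math.comb(20, 5))          # probability one specific team is drawn
```

In practice `math.comb` is preferable to computing full factorials, since it cancels terms internally rather than building huge intermediate integers.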

Putting It Together: What This Lets You Do

  • You can turn a narrative problem into a probability problem by:
    • Identifying the experiment and its steps, the sample space, and the sample points.
    • Using the multistep counting rule to determine the total number of outcomes.
    • Defining the event of interest and summing the probabilities of the relevant sample points (or using combinations if each outcome is equally likely).
  • In many problems, you’ll use P(E) = \sum_{s_i \in E} P(s_i) when outcomes are not equally likely, or
    • If outcomes are equally likely, P(E) = \frac{\text{number of favorable outcomes}}{\text{total number of outcomes}} = \frac{|E|}{|S|}, where S is the sample space.
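The two formulas can be contrasted directly. In this sketch the loaded-die probabilities are invented for illustration: summing sample-point probabilities works for any distribution, while |E|/|S| is a shortcut valid only when outcomes are equally likely:

```python
# Event probability two ways: general sum vs. equally-likely counting.
from fractions import Fraction

# Non-uniform case: a hypothetical loaded die where 6 is twice as likely.
p = {s: Fraction(1, 7) for s in range(1, 6)}
p[6] = Fraction(2, 7)
assert sum(p.values()) == 1          # normalization still holds

E = {s for s in p if s > 4}          # event: roll greater than 4
print(sum(p[s] for s in E))          # 3/7 -- counting would give the wrong answer

# Uniform case: the same event on a fair die reduces to |E| / |S|.
S = set(range(1, 7))
print(Fraction(len(E & S), len(S)))  # 1/3
```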

Connections to Other Topics and Practical Implications

  • Relationship to inferential statistics: This chapter lays the groundwork for distributing probability across all possible events, which is essential when generalizing from a sample to a population.
  • The contrast with frequency distributions: Frequency distributions describe observed data; probability distributions describe the likelihood of various outcomes in a theoretical or long-run sense.
  • Prelude to probability distributions in later chapters (chapters 5–7) and Bayes’ theorem (often considered more challenging but illuminating).
  • Foundational math tools needed:
    • Factorials (N!) and combination notation (\binom{N}{n})
    • Basic probability axioms and conditional probability
  • Real-world relevance: Examples include predicting purchases in a store, evaluating outcomes of games of chance, and understanding how likely certain team selections are when groups are formed randomly.

Practical Tips and Tutor Notes

  • Memorize the core formulas and the logic behind them, but expect to derive them in problems rather than memorize every detail.
  • If a quiz or test offers a formula sheet, recognize the underlying concepts rather than rote memorization of the sheet.
  • When evaluating problems, distinguish between:
    • Word problems requiring counting (use counting rules)
    • Problems with non-uniform probabilities (directly sum probabilities of relevant sample points)
  • Common pitfalls to watch for:
    • Misreading inequalities (e.g., treating “less than 3” as if it included 3)
    • Mixing up order in combinations vs. permutations (the combination formula assumes order does not matter)
    • Assuming all sample points are equally likely unless stated otherwise

Ethical and Practical Implications Mentioned in the Transcript

  • The instructor humorously points out that achieving a perfect score on the first attempt of a quiz might indicate looking up answers (Chegg) rather than doing the work.
  • Two attempts are allowed to encourage practice and learning rather than reliance on answer keys.
  • The approach emphasizes building understanding and the use of a formula sheet as a guide, not a substitute for learning.

Quick Reference Formulas (LaTeX)

  • Probability of B given A: P(B|A) = \frac{P(A \cap B)}{P(A)}
  • Sum of probabilities over sample space: \sum_{i} P(s_i) = 1
  • Event probability: P(E) = \sum_{s_i \in E} P(s_i)
  • Combination: \binom{N}{n} = \frac{N!}{n!(N-n)!}
  • Factorial: N! = N \times (N-1) \times \cdots \times 2 \times 1, \quad 0! = 1
  • Example probabilities with a die:
    • P(\text{<3}) = \frac{2}{6} = \frac{1}{3}
    • P(\text{<4}) = \frac{3}{6} = \frac{1}{2}
    • P(\text{>4}) = \frac{2}{6} = \frac{1}{3}
  • Example: At least two heads with three fair coins:
    • Sample space size: 2^3 = 8
    • Favorable outcomes: 4
    • Hence: P(\text{at least 2 heads}) = \frac{4}{8} = \frac{1}{2}
  • Example: 20 choose 5 (team selection) and probability:
    • Total groups: \binom{20}{5} = 15504
    • Favorable outcome: 1 (your specific 5 kids)
    • Probability: P = \frac{1}{\binom{20}{5}} = \frac{1}{15504}