Notes on Probabilistic Outcomes, Conditional Probability, Bayes' Theorem, and Contingency Tables

Probability Basics and Conditional Probability

  • Probabilistic outcomes organize how likely different results are when you have uncertainty. Key ideas include general probability, conditional probability, and how to combine outcomes across sequences of events.

  • Notation you’ll see:

    • $P(A)$: probability of event A.

    • $P(A|B)$: probability of A given that B has occurred (conditional probability).

    • $P(A\cap B)$: probability that both A and B occur (joint probability).

    • For a sequence of independent events, joint probability can be written as a product of individual probabilities.

  • Basic coin-toss example (two outcomes per toss):

    • A single fair coin has two outcomes: heads (H) or tails (T). So $P(H)=\frac{1}{2}$ and $P(T)=\frac{1}{2}$.

    • If you toss the coin twice, there are four equally likely sequences: HH, HT, TH, TT, each with probability $\frac{1}{4}$.

    • First toss outcome and second-toss conditional probabilities:

    • If the first toss is heads, the second toss can be H or T: $P(H_2|H_1)=\frac{1}{2}$ and $P(T_2|H_1)=\frac{1}{2}$.

    • If the first toss is tails, the second toss can be H or T: $P(H_2|T_1)=\frac{1}{2}$ and $P(T_2|T_1)=\frac{1}{2}$.

    • These conditional probabilities sum to 1 for a given first outcome: e.g., $P(H_2|H_1)+P(T_2|H_1)=1$, and similarly for $T_1$.

  • Independence and conditional vs joint probability:

    • If events A and B are independent, then $P(A\cap B)=P(A)P(B)$.

    • In general (not assuming independence), $P(A\cap B)=P(A)P(B|A)$.

    • The joint probability for a sequence can be written as a product of conditional probabilities, e.g. for two flips:
      $P(H_1\cap H_2)=P(H_1)\,P(H_2|H_1)$.

    • For independent events, $P(H_1)P(H_2|H_1)=P(H_1)P(H_2)$, so the choice of which rule to use doesn’t change the result.

  • Joint probabilities for a sequence (examples):

    • Sequence HH: $P(H_1\cap H_2)=P(H_1)P(H_2|H_1)=\tfrac{1}{2}\cdot\tfrac{1}{2}=\tfrac{1}{4}$.

    • Sequence HT: $P(H_1\cap T_2)=P(H_1)P(T_2|H_1)=\tfrac{1}{2}\cdot\tfrac{1}{2}=\tfrac{1}{4}$.

    • Sequence TH: $P(T_1\cap H_2)=P(T_1)P(H_2|T_1)=\tfrac{1}{2}\cdot\tfrac{1}{2}=\tfrac{1}{4}$.

    • Sequence TT: $P(T_1\cap T_2)=P(T_1)P(T_2|T_1)=\tfrac{1}{2}\cdot\tfrac{1}{2}=\tfrac{1}{4}$.

    • The sum of all joint probabilities equals 1: $\tfrac{1}{4}+\tfrac{1}{4}+\tfrac{1}{4}+\tfrac{1}{4}=1$.
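The four sequences and their joint probabilities can be enumerated directly; a minimal Python sketch using exact fractions:

```python
from fractions import Fraction
from itertools import product

# Enumerate all two-toss sequences of a fair coin.
p = Fraction(1, 2)  # P(H) = P(T) = 1/2 on each toss
sequences = {"".join(seq): p * p for seq in product("HT", repeat=2)}

for seq, prob in sequences.items():
    print(seq, prob)            # each of HH, HT, TH, TT has probability 1/4
print(sum(sequences.values()))  # 1 — the four joint probabilities sum to 1
```

Using `Fraction` keeps the arithmetic exact, matching the advice later in these notes to prefer fractions when the inputs are fractions.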

  • Why conditional probabilities matter:

    • Conditional probabilities describe how likely outcomes are when you already know something about the past. They are often necessary in decision making when outcomes depend on prior events.

    • In a tree diagram, a second-level branch carries a conditional probability: the probability of the second event given the outcome of the first.

    • The tree diagram helps organize thoughts and compute probabilities across sequences, including when there are different numbers of outcomes at each step (not just heads/tails, but also dice, cards, etc.).

  • A practical framework: Bayes’ theorem, prior and likelihood, updating beliefs

    • Bayes’ theorem provides a principled way to update the probability of a hypothesis after observing new evidence.

    • The general idea: start with a prior probability for a hypothesis, multiply by the likelihood of the new evidence under that hypothesis, and normalize by the total probability of the evidence under all hypotheses.

    • The airport-baggage example (illustrative numbers):

    • Let F be the event “bag contains a forbidden item.” Prior: $P(F)=0.05$ (5%). Then $P(\neg F)=0.95$.

    • Alarm given a forbidden item: $P(Alarm|F)=0.98$ (true positive).

    • Alarm given no forbidden item: $P(Alarm|\neg F)=0.08$ (false positive).

    • We want the posterior $P(F|Alarm)$:
      P(F|Alarm)=\frac{P(Alarm|F)P(F)}{P(Alarm|F)P(F)+P(Alarm|\neg F)P(\neg F)}.
      Substituting numbers: P(F|Alarm)=\frac{0.98\cdot 0.05}{0.98\cdot 0.05+0.08\cdot 0.95}=\frac{0.049}{0.125}=0.392.

    • Interpretation: After the alarm sounds, about 39.2% of such alarms correspond to bags with a forbidden item (posterior belief), rather than the prior 5%.

    • Bayes’ theorem key form (posterior):
      P(F|Alarm)=\frac{P(Alarm|F)\,P(F)}{P(Alarm|F)\,P(F)+P(Alarm|\neg F)\,P(\neg F)}.

    • The idea of updating beliefs when new evidence arrives is called posterior updating; Bayes’ rule provides the exact mechanism.

    • Important notes:

    • The prior $P(F)$ encodes what you believed before seeing the alarm.

    • The likelihoods $P(Alarm|F)$ and $P(Alarm|\neg F)$ encode how informative the alarm is about the presence of a forbidden item.

    • As the evidence becomes more reliable (likelihoods move toward 0 or 1), the posterior moves further from the prior.
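The airport-baggage update can be reproduced in a few lines; a minimal Python sketch with the numbers from the notes:

```python
# Numbers from the airport-baggage example in the notes.
p_f = 0.05            # prior P(F): bag contains a forbidden item
p_alarm_f = 0.98      # likelihood P(Alarm | F), the true-positive rate
p_alarm_not_f = 0.08  # likelihood P(Alarm | not F), the false-positive rate

numerator = p_alarm_f * p_f                       # P(Alarm | F) P(F)
evidence = numerator + p_alarm_not_f * (1 - p_f)  # total probability of an alarm
posterior = numerator / evidence                  # P(F | Alarm), by Bayes' theorem
print(round(posterior, 3))  # 0.392
```

The denominator is the law of total probability applied to the alarm: it sums the two ways an alarm can occur, with and without a forbidden item.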

  • Contingency tables, joint/marginal/conditional probabilities

    • A contingency table is a two-way table that tabulates counts for two nominal (categorical) variables, for example, favorite winter sport and college type.

    • Key concepts:

    • Joint probability: the probability that both category A and category B occur. In a table with total N, a cell count c corresponds to joint probability $P(A=a, B=b)=\frac{c}{N}$.

    • Marginal (overall) probabilities: probabilities of a single variable when summing across the other variable (row totals or column totals divided by N).

    • Conditional probability: the probability of one variable given a fixed value of the other, e.g. P(A=a|B=b)=\frac{P(A=a, B=b)}{P(B=b)}.

    • How to read a contingency table:

    • For a fixed row (e.g., four-year college), conditional probabilities given that row are obtained by dividing each count in the row by the row total.

    • For a fixed column (e.g., skiing), conditional probabilities are taken by dividing by the column total.

    • Example outlines (data summarized, not copied exactly from the transcript):

    • Suppose a survey of 545 students asks for favorite winter sport and college type. The joint counts fill a table; from there you can compute:

      • $P(\text{Skiing})=\frac{\text{# skiing}}{545}$ (a marginal probability, obtained by summing the skiing counts across college types).

      • $P(\text{FourYear} | \text{IceSkating})=\frac{\text{# FourYear and IceSkating}}{\text{# IceSkating}}$ (a conditional probability).

      • $P(\text{IceSkating} | \text{FourYear})=\frac{\text{# FourYear and IceSkating}}{\text{# FourYear}}$ (the other direction).

    • Converting between formats:

    • The joint probabilities are the counts divided by the grand total (N).

    • Conditional probabilities are the joint divided by the relevant marginal (row or column total).

    • You can form a tree diagram from contingency table data by treating each level as a branch and deriving conditional probabilities from the appropriate marginals.

    • Marginal vs conditional probabilities

    • Marginal probability refers to the probability of a single variable, ignoring the other (a sum across a row or column when normalizing by N).

    • Conditional probability isolates a row or a column and normalizes by that row/column total.
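These computations can be sketched in Python. The cell counts below are hypothetical (the notes give only the grand total of 545), chosen so the table sums to 545; exact fractions keep the joint, marginal, and conditional probabilities readable:

```python
from fractions import Fraction

# Hypothetical cell counts (rows: college type, columns: winter sport),
# chosen to sum to the survey total N = 545 from the notes.
counts = {
    ("FourYear", "Skiing"): 120, ("FourYear", "IceSkating"): 95, ("FourYear", "Snowboarding"): 85,
    ("TwoYear",  "Skiing"): 90,  ("TwoYear",  "IceSkating"): 80, ("TwoYear",  "Snowboarding"): 75,
}
n = sum(counts.values())  # grand total N

def joint(college, sport):
    """Joint probability: cell count over the grand total."""
    return Fraction(counts[(college, sport)], n)

def marginal_sport(sport):
    """Marginal probability of a sport: column total over the grand total."""
    return Fraction(sum(c for (_, s), c in counts.items() if s == sport), n)

def conditional_college_given_sport(college, sport):
    """Conditional probability: joint divided by the relevant marginal."""
    return joint(college, sport) / marginal_sport(sport)

print(joint("FourYear", "IceSkating"))                            # 95/545 reduced
print(marginal_sport("IceSkating"))                               # 175/545 reduced
print(conditional_college_given_sport("FourYear", "IceSkating"))  # 95/175 reduced
```

Note how the conditional is exactly the joint divided by the marginal, mirroring the formula $P(A=a|B=b)=P(A=a,B=b)/P(B=b)$.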

  • Worked outline: turning a contingency table into a tree diagram (practice outline)

    • Start with the grand total N.

    • Create branches for the first variable (e.g., University A vs University B) with probabilities equal to their marginal proportions.

    • For each branch, create sub-branches for the second variable’s categories with conditional probabilities given the first branch.

    • Compute joint probabilities as products of the marginal and conditional probabilities along each path.

    • Ensure that each set of branches from a node sums to 1 (probabilities along that node's outgoing branches).

    • This framework lets you answer questions about both joint and conditional probabilities, and it provides intuition for the equivalence between a tree and a contingency table.

  • Practical example: a headline-like comprehension problem using a two-university table (structure shown, not exact numbers)

    • Setup: counts of graduates by university (A, B) and by income bracket (<20k, 20–39k, 40k+).

    • Steps:

    • Compute marginal proportions for each university (e.g., $P(A)$, $P(B)$).

    • Compute conditional proportions in each university for each income bracket (e.g., $P(\text{<20k}|A)$, etc.).

    • Compute joint probabilities by multiplying marginals by the corresponding conditionals.

    • Check that the sum of all joint probabilities equals 1.

    • Note: you may also present the same information by swapping the conditioning order (e.g., condition on income bracket first). The joint probabilities stay the same; the conditional probabilities will look different depending on the conditioning variable, but the math is consistent.
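The outline above can be sketched as code. The university and income-bracket counts below are hypothetical (the session shows only the structure), chosen to illustrate the path-product rule: marginal times conditional equals the joint, which equals cell count over N:

```python
# Hypothetical graduate counts by university and income bracket
# (the session shows the table's structure, not exact numbers).
counts = {
    "A": {"<20k": 30, "20-39k": 50, "40k+": 20},  # University A: 100 graduates
    "B": {"<20k": 60, "20-39k": 90, "40k+": 50},  # University B: 200 graduates
}
n = sum(sum(row.values()) for row in counts.values())  # grand total: 300

joints = {}
for uni, row in counts.items():
    row_total = sum(row.values())
    p_uni = row_total / n                      # marginal: first-level branch
    for bracket, c in row.items():
        p_given = c / row_total                # conditional: second-level branch
        joints[(uni, bracket)] = p_uni * p_given  # product along the path
        assert abs(joints[(uni, bracket)] - c / n) < 1e-12  # same as cell / N

print(round(sum(joints.values()), 10))  # 1.0 — joint probabilities sum to 1
```

The in-loop assertion is the consistency check from the steps above: every path product reproduces the direct joint probability, so swapping the conditioning order cannot change the joints.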

  • Practice problems and key strategies highlighted in the session

    • Inclusion–exclusion for unions of events:

    • For two events A and B: P(A\cup B)=P(A)+P(B)-P(A\cap B).

    • Example: If $P(A)=0.8$, $P(B)=0.6$, and $P(A\cap B)=0.5$, then P(A\cup B)=0.8+0.6-0.5=0.9. The complement probability is $1-0.9=0.1$.
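The worked numbers can be checked with exact fractions:

```python
from fractions import Fraction

# Inclusion-exclusion with the example numbers, kept as exact fractions.
p_a, p_b, p_both = Fraction(4, 5), Fraction(3, 5), Fraction(1, 2)
p_union = p_a + p_b - p_both  # P(A or B) = P(A) + P(B) - P(A and B)
print(p_union)      # 9/10
print(1 - p_union)  # 1/10, the probability that neither event occurs
```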

    • Sequential probability without replacement (dependent events):

    • If you have a batch with a certain defect rate, the probability of successive defects depends on prior outcomes.

    • Example with defectives without replacement (typical numbers): if $P(D_1)=0.40$ (say, 40 defectives among 100 items) and 39 defectives remain among the 99 items after the first defective is drawn, then $P(D_2|D_1)=\frac{39}{99}$ and $P(D_1\cap D_2)=0.40\cdot\frac{39}{99}\approx 0.1576$. If instead sampling with replacement, $P(D_2|D_1)=P(D_2)=0.40$.
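A quick check of the without-replacement arithmetic, assuming the implied batch of 100 items with 40 defectives:

```python
from fractions import Fraction

# Implied setup: 100 items with 40 defectives, so P(D1) = 40/100 = 0.40.
p_d1 = Fraction(40, 100)
p_d2_given_d1 = Fraction(39, 99)  # without replacement: 39 defectives left of 99
p_both = p_d1 * p_d2_given_d1
print(p_both)         # 26/165
print(float(p_both))  # ≈ 0.1576

# With replacement the draws are independent, so P(D2 | D1) = P(D2) = 0.40.
print(float(p_d1 * p_d1))  # 0.16
```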

    • Probability of getting all correct on a multiple-choice test: if each question has 4 options, and you guess, then for 10 questions
      P(\text{all correct})=\left(\frac{1}{4}\right)^{10}=\frac{1}{4^{10}}=\frac{1}{1{,}048{,}576}.
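The all-correct guessing probability can be verified with exact arithmetic:

```python
from fractions import Fraction

# Guessing on 10 independent questions, each with 4 options.
p_all = Fraction(1, 4) ** 10
print(p_all)  # 1/1048576
```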

    • General guidance on numerical representations in exams:

    • Use fractions when numbers are given as fractions; use decimals if provided as decimals.

    • Do not convert decimals to fractions unless instructed; decimals are fine to work with.

    • Computation tips and exam strategy:

    • Keep expressions in exact fractions when possible; simplify where reasonable.

    • Expect a mix of problems: simple probability, conditional/probability trees, Bayes’ updates, and contingency-table interpretations.

    • Practice with a variety of example formats to build fluency with joint/conditional/marginal probabilities and transitions between representations.

  • Quick recap of essential formulas to memorize

    • Conditional probability: P(A|B)=\frac{P(A\cap B)}{P(B)}.

    • Joint probability in general: P(A\cap B)=P(A)P(B|A); for independent events this reduces to P(A\cap B)=P(A)P(B).

    • Bayes’ theorem (posterior): P(F|Alarm)=\frac{P(Alarm|F)\,P(F)}{P(Alarm|F)\,P(F)+P(Alarm|\neg F)\,P(\neg F)}.

    • Inclusion–exclusion for two events: P(A\cup B)=P(A)+P(B)-P(A\cap B).

    • Tree-diagram construction: joint probability along a path equals the product of conditional probabilities along that path.

  • Exam preparation takeaway

    • Practice constructing and reading both trees and contingency tables.

    • Be comfortable switching between joint, marginal, and conditional representations.

    • Use Bayes’ theorem to update beliefs when new evidence arrives (posterior probabilities).

    • Expect straightforward application problems (coin tosses, alarm examples), probability with multiple events, and simple contingency-table interpretations.