Discrete Probability & Distributions Comprehensive Study Guide
Units 2A & 2B: Discrete Probability Notes
Introduction to Discrete Probability
Three Branches of Probability and Statistics:
Descriptive Statistics
Inferential Statistics
Probability
Definition of Probability: Probability is used to draw conclusions about the likelihood that an arbitrary sample will have certain characteristics given information about a known population it is drawn from. It assigns a numerical value from 0 to 1 inclusive to an event to measure the degree of uncertainty.
Key Characteristics:
Works from a known population or historical data.
Direction: Population \rightarrow Sample (the inverse of inferential statistics).
Scale: The closer the probability is to 1, the more likely the event is to occur; the closer to 0, the less likely.
Forms: Decimal, Fraction, Percentage, or Ratio.
Basic Probability Concepts
Probability Experiment: A process or activity that has a measurable outcome each time it is repeated. The outcome might vary from trial to trial and is typically unknown in advance.
Trial: Performing the experiment one time.
Simple Event: The most basic individual outcome of an experiment.
Sample Space (S): The set of all possible simple events.
Event: A subset of the sample space.
Power Set (\mathcal{P}(S)): The set of all possible events (all subsets of the sample space). A sample space with n simple events has 2^{n} possible events.
Probability Function Definition: A function P: S \rightarrow [0, 1] that assigns probabilities subject to:
0 \leq P(E) \leq 1 for every event E.
P(S) = 1 (Guaranteed/Certain event).
P(\emptyset) = 0 (Impossible event).
Addition Rule: P(A \cup B) = P(A) + P(B) - P(A \cap B).
Computing Probability with Finite Sample Space:
P(E) = \frac{n(E)}{n(S)} (Size of event divided by size of sample space).
This is equivalent to the population relative frequency.
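The formula P(E) = n(E)/n(S) can be computed by direct enumeration. A minimal sketch (the two-dice experiment here is a hypothetical example, not from the notes):

```python
from fractions import Fraction

# Hypothetical example: roll two fair dice, find P(sum = 7).
sample_space = [(a, b) for a in range(1, 7) for b in range(1, 7)]
event = [outcome for outcome in sample_space if sum(outcome) == 7]

# P(E) = n(E) / n(S): size of event over size of sample space
p = Fraction(len(event), len(sample_space))
print(p)  # 1/6
```

Using Fraction keeps the part-to-whole answer exact instead of a rounded decimal.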
Unions and Intersections
Intersection (AND): Corresponds to symbols \cap. The event occurs only if both A and B occur.
P(A \cap B) = P(A \text{ and } B).
Union (OR): Corresponds to symbols \cup. Inclusive OR: occurs if A occurs, B occurs, or both occur.
P(A \cup B) = P(A) + P(B) - P(A \cap B).
Mutually Exclusive Events: Events that cannot occur at the same time.
A \cap B = \emptyset.
P(A \cap B) = 0.
Addition Rule for mutually exclusive events: P(A \cup B) = P(A) + P(B).
Partition: A set of events E_k forms a partition of sample space S if they are pairwise mutually exclusive and their union equals S. The sum of probabilities of a partition is always 1.
Complementary Events: The complement of A (A^{c} or A') is the event that A does not occur.
P(A) + P(A^{c}) = 1.
P(A) = 1 - P(A^{c}).
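The addition and complement rules can be verified by treating events as sets. A small sketch using one roll of a fair die (the events A and B are hypothetical choices for illustration):

```python
from fractions import Fraction

# One roll of a fair die; events as subsets of the sample space.
S = set(range(1, 7))
A = {2, 4, 6}          # hypothetical event: roll is even
B = {4, 5, 6}          # hypothetical event: roll is at least 4

def P(E):
    return Fraction(len(E), len(S))

# Addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
assert P(A | B) == P(A) + P(B) - P(A & B)

# Complement rule: P(A) = 1 - P(A^c)
assert P(A) == 1 - P(S - A)
print(P(A | B))  # 2/3
```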
Odds and Probabilities
Probability vs. Odds: Probability is a part-to-whole relationship (E/S). Odds is a part-to-part ratio.
Odds Against A: Ratio of failures to successes, denoted A^{c} : A.
Odds For A: Ratio of successes to failures, denoted A : A^{c}.
Conversion Formulas:
If odds against are c:d, then P(A) = \frac{d}{c+d}.
If P(A) = p, then odds against are (1-p) : p.
Gaming context: Payoff "odds" in gambling are usually a payoff ratio, not actual mathematical odds. In a fair game, payoff odds equal actual odds against. Casinos maintain a "house advantage."
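The conversion formulas above translate directly into code. A sketch, using hypothetical odds of 3:2 against:

```python
from fractions import Fraction

# Hypothetical: odds against A are c:d = 3:2.
c, d = 3, 2
p = Fraction(d, c + d)      # P(A) = d / (c + d)
print(p)                    # 2/5

# Converting back: if P(A) = p, odds against are (1 - p) : p.
odds_against = (1 - p) / p  # = 3/2, i.e. the ratio 3:2
print(odds_against)         # 3/2
```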
Empirical Probability and Law of Large Numbers
Empirical (Experimental) Probability: The sample relative frequency of an event.
Simulation (Monte Carlo Methods): Using experiments to find empirical probabilities.
Law of Large Numbers: As the number of trials increases, the experimental relative frequency becomes more likely to be close to the actual theoretical probability.
Fallacy Check: The "Law of Averages" is erroneous; independent trials (like coin flips) are not "due" for a specific result based on previous outcomes.
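A quick Monte Carlo sketch of the Law of Large Numbers: as the number of coin-flip trials grows, the sample relative frequency of heads tends toward the theoretical 0.5 (the seed is an arbitrary choice for reproducibility):

```python
import random

# Simulate fair coin flips at increasing trial counts.
random.seed(1)
for n in (100, 10_000, 1_000_000):
    heads = sum(random.random() < 0.5 for _ in range(n))
    # Relative frequency drifts toward the theoretical probability 0.5
    print(n, heads / n)
```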
Conditional Probability
Definition: The likelihood of event A occurring given that event B has already occurred.
P(A | B) = \frac{P(A \cap B)}{P(B)}.
Finite Sample Space Calculation: P(A | B) = \frac{n(A \cap B)}{n(B)}.
Multiplication Rule: P(A \cap B) = P(B) \cdot P(A | B) = P(A) \cdot P(B | A).
Tables: Cross-tabulation (contingency) tables show marginal probabilities on the outer edges and joint probabilities in the internal cells.
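The finite-sample-space formula P(A | B) = n(A ∩ B)/n(B) can be read straight off a contingency table of counts. A sketch with hypothetical counts (rows = group, columns = result):

```python
from fractions import Fraction

# Hypothetical contingency table of counts.
counts = {("F", "pass"): 30, ("F", "fail"): 10,
          ("M", "pass"): 24, ("M", "fail"): 16}

n_pass_and_F = counts[("F", "pass")]                   # n(A ∩ B)
n_F = counts[("F", "pass")] + counts[("F", "fail")]    # n(B), a marginal total

# P(pass | F) = n(pass ∩ F) / n(F)
print(Fraction(n_pass_and_F, n_F))  # 3/4
```

Note that conditioning restricts the denominator to the row (or column) total rather than the grand total.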
Discrete Random Variables and PDF/CDF
Random Variable (X): A function that assigns a unique number to each outcome in a sample space.
Probability Density Function (PDF) (for discrete variables, also called the probability mass function):
Maps X values to [0, 1].
\sum pdf(x) = 1.
pdf(a) = P(X = a).
Cumulative Distribution Function (CDF):
Gives accumulated probability up to a value.
cdf(a) = P(X \leq a).
Graph is a step function (piecewise constant, jumping at each value of X).
Parameters:
Mean (Expected Value, \mu): \mu = E[X] = \sum [x \cdot pdf(x)].
Variance (\sigma^{2}): \sigma^{2} = E[(X - \mu)^{2}] = \sum [(x - \mu)^{2} \cdot pdf(x)].
Standard Deviation (\sigma): \sigma = \sqrt{\sigma^2}.
Skewness: \sum [(\frac{x-\mu}{\sigma})^{3} \cdot pdf(x)]. Measures deviations from symmetry.
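The parameter formulas above amount to weighted sums over the pdf table. A sketch using a hypothetical pdf for a three-valued random variable:

```python
import math

# Hypothetical pdf table for a discrete random variable X.
pdf = {1: 0.2, 2: 0.5, 3: 0.3}

mu = sum(x * p for x, p in pdf.items())                    # E[X]
var = sum((x - mu) ** 2 * p for x, p in pdf.items())       # E[(X - mu)^2]
sd = math.sqrt(var)
skew = sum(((x - mu) / sd) ** 3 * p for x, p in pdf.items())

print(mu, var, sd)  # ≈ 2.1, 0.49, 0.7
```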
Transformations:
Adding constant c: \mu_{X+c} = \mu_X + c; \sigma^2_{X+c} = \sigma^2_X (variance unchanged).
Multiplying by constant c: \mu_{cX} = c\mu_X; \sigma_{cX} = |c|\sigma_X.
Adding independent variables: \mu_{X+Y} = \mu_X + \mu_Y; \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y.
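The shift and scale rules can be checked numerically on a simulated sample; this sketch uses fair-die rolls and the constant c = 3 (both hypothetical choices):

```python
import random
import statistics

# Simulated sample of fair-die rolls.
random.seed(0)
xs = [random.randint(1, 6) for _ in range(100_000)]
c = 3

# Adding a constant shifts the mean by c and leaves the spread unchanged.
assert abs(statistics.mean(x + c for x in xs) - (statistics.mean(xs) + c)) < 1e-9
assert abs(statistics.pstdev(x + c for x in xs) - statistics.pstdev(xs)) < 1e-9

# Multiplying by a constant scales the standard deviation by |c|.
assert abs(statistics.pstdev(c * x for x in xs) - abs(c) * statistics.pstdev(xs)) < 1e-9
print("transformation rules verified on the sample")
```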
Multistage Probability and Trees
Probability Trees: Used to visualize multiple stages.
Root vertex starts the tree.
Edges from a vertex must sum to 1.
First stage edges are basic probabilities; subsequent edges are conditional.
Multiply probabilities along a path to find the probability of the intersection (joint probability).
Monty Hall Problem: A counterintuitive result where switching doors doubles the winning probability from 1/3 to 2/3 because the host's knowledge changes the conditional probability of the remaining door.
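The Monty Hall result can be checked by simulation. A sketch (the deterministic host choice below is harmless because when the player's pick is the car, switching loses no matter which door the host opens):

```python
import random

random.seed(42)

def play(switch, trials=100_000):
    """Simulate Monty Hall games; return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)
        pick = random.randrange(3)
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

print(play(switch=False))  # ≈ 1/3
print(play(switch=True))   # ≈ 2/3
```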
Independent Events
Definition: Two events are independent if the occurrence of one does not affect the probability of the other.
Conditions (Equivalent):
P(A | B) = P(A)
P(B | A) = P(B)
P(A \cap B) = P(A) \cdot P(B)
Note: Mutually exclusive events are not independent (if they are mutually exclusive, knowing B happened tells you A cannot happen).
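The product condition gives a concrete test for independence. A sketch on the two-dice sample space (the events A and B are hypothetical choices):

```python
from fractions import Fraction

# Sample space for rolling two fair dice.
S = [(a, b) for a in range(1, 7) for b in range(1, 7)]

def P(pred):
    return Fraction(sum(1 for s in S if pred(s)), len(S))

A = lambda s: s[0] == 6        # first die shows 6
B = lambda s: s[1] % 2 == 0    # second die is even

# Independence check: P(A ∩ B) = P(A) · P(B)
assert P(lambda s: A(s) and B(s)) == P(A) * P(B)
print("A and B are independent")
```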
Bayesian Techniques
Core Concept: Comparing two related trees (reversing the condition). One tree starts with the partition \{B_k\} and leads to A; the other starts with A and leads back to the B_k.
Bayes' Theorem: Allows finding P(B_k | A) given the values P(A | B_i) and P(B_i).
Formula: P(B_k | A) = \frac{P(B_k) \cdot P(A | B_k)}{\sum_i [P(B_i) \cdot P(A | B_i)]}.
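A worked sketch of the formula, using a hypothetical diagnostic-test setup (1% prevalence, 95% true-positive rate, 5% false-positive rate; all numbers are illustrative):

```python
from fractions import Fraction

# Hypothetical inputs.
p_B = Fraction(1, 100)               # P(B): has the condition
p_A_given_B = Fraction(95, 100)      # P(A | B): positive test given condition
p_A_given_not_B = Fraction(5, 100)   # P(A | B^c): false-positive rate

# Denominator (total probability): P(A) = P(B)·P(A|B) + P(B^c)·P(A|B^c)
p_A = p_B * p_A_given_B + (1 - p_B) * p_A_given_not_B

# Bayes' theorem: P(B | A)
p_B_given_A = p_B * p_A_given_B / p_A
print(p_B_given_A)  # 19/118, roughly 0.16
```

Despite the accurate test, a positive result here means only about a 16% chance of having the condition, because the partition is dominated by B^c.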
Combinatorics (Counting Techniques)
Factorial (n!): Number of ways to order n distinct objects. n! = n(n-1) \dots (1). Note: 0! = 1.
Power Rule: Choosing k from n with replacement (repeats allowed), order matters: n^{k}.
Permutations (P(n, k)): Choosing k from n without replacement, order matters: P(n, k) = \frac{n!}{(n-k)!}.
Combinations (C(n, k)): Choosing k from n without replacement, order does not matter: C(n, k) = \binom{n}{k} = \frac{n!}{k!(n-k)!}.
Pascal's Triangle: Illustrates the recursive property \binom{n-1}{k-1} + \binom{n-1}{k} = \binom{n}{k}.
Indistinguishable Objects: Arranging n objects of k types with multiplicities n_1, n_2, \dots, n_k: \frac{n!}{n_1! \, n_2! \cdots n_k!}.
Stars and Bars (Extension): Choosing k from n with replacement, order does not matter: C(n+k-1, k).
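The four counting formulas map directly onto the standard library (math.perm and math.comb exist as of Python 3.8). A sketch with the hypothetical values n = 5, k = 2:

```python
import math

n, k = 5, 2

power = n ** k                        # with replacement, order matters
perm = math.perm(n, k)                # without replacement, order matters
comb = math.comb(n, k)                # without replacement, order doesn't matter
stars_bars = math.comb(n + k - 1, k)  # with replacement, order doesn't matter

print(power, perm, comb, stars_bars)  # 25 20 10 15
```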
Named Families of Discrete Distributions
Discrete Uniform: All n outcomes are equally likely (P(X=x) = 1/n).
Bernoulli: Single trial with two outcomes: success (p) and failure (q).
\mu = p, \sigma^{2} = pq.
Geometric: Number of independent Bernoulli trials (p) until the first success.
pdf(x) = pq^{x-1}.
\mu = 1/p, \sigma^{2} = q/p^{2}.
Binomial: Number of successes in n independent Bernoulli trials (p).
pdf(x) = \binom{n}{x} p^{x} q^{n-x}.
\mu = np, \sigma^{2} = npq.
Hypergeometric: Successes in a sample (n) from a finite population (N) with a specific number of successes (M), without replacement.
pdf(x) = \frac{\binom{M}{x} \binom{N-M}{n-x}}{\binom{N}{n}}.
\mu = np, where p=M/N.
\sigma = \sqrt{npq} \cdot \sqrt{\frac{N-n}{N-1}} (Finite Population Correction Factor).
Poisson: Number of occurrences over a fixed interval (time/space).
pdf(x) = \frac{\mu^{x} e^{-\mu}}{x!}.
\mu = \sigma^2.
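As a closing check, any of these families can be verified against its parameter formulas by summing over the pdf. A sketch for the binomial family with hypothetical values n = 10, p = 0.3:

```python
import math

# Binomial distribution with hypothetical parameters.
n, p = 10, 0.3
q = 1 - p

def binom_pdf(x):
    # pdf(x) = C(n, x) p^x q^(n-x)
    return math.comb(n, x) * p**x * q**(n - x)

# Recompute mu and sigma^2 from the definitions and compare to np and npq.
mu = sum(x * binom_pdf(x) for x in range(n + 1))
var = sum((x - mu) ** 2 * binom_pdf(x) for x in range(n + 1))

print(round(mu, 6), round(var, 6))  # ≈ np = 3.0 and npq = 2.1
```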