Probability: Axioms, Independent Events, Binomial and Poisson Approximations

Axioms (basic foundations)
- For any event A, the probability is nonnegative: $P(A) \ge 0$
- The probability of the entire sample space S is 1: $P(S) = 1$
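- Additivity: for pairwise disjoint events $A_1, A_2, \dots$: $P\left( \bigcup_i A_i \right) = \sum_i P(A_i)$
- Consequence (not itself an axiom): the null event (empty set) has probability 0: $P(\emptyset) = 0$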
Events, sample space, and equal likelihood
- An event A is a subset of the sample space S; outcomes are elementary events (atomic outcomes)
- If all outcomes in S are equally likely, then for any A ⊆ S: $P(A) = \frac{|A|}{|S|}$
- The complement of A is A^c (not A): $P(A^c) = 1 - P(A)$
- The null set ∅ has probability 0; the entire space S has probability 1
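
The counting rule and the complement rule above can be checked on a small example. The sketch below assumes a fair six-sided die as the sample space; the helper name `prob` and the chosen event are illustrative only and do not come from the notes.

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
S = frozenset(range(1, 7))

def prob(event, space=S):
    """P(A) = |A| / |S|, valid only when all outcomes are equally likely."""
    event = frozenset(event) & space  # ignore anything outside the sample space
    return Fraction(len(event), len(space))

A = {2, 4, 6}          # event "the roll is even"
A_complement = S - A   # event "the roll is odd"

print(prob(A))               # 1/2
print(prob(A_complement))    # 1/2, equal to 1 - P(A)
print(prob(set()), prob(S))  # 0 1
```

Exact fractions are used instead of floats so that identities such as $P(A^c) = 1 - P(A)$ hold exactly rather than up to rounding.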

Union and intersection
- For events A and B:
- If A and B are disjoint (mutually exclusive): $A \cap B = \emptyset$ and $P(A \cup B) = P(A) + P(B)$
- In general: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- For three events A, B, C that are pairwise mutually exclusive: $P(A \cup B \cup C) = P(A) + P(B) + P(C)$
Partition and total probability
- If {A1, A2, …, An} are pairwise disjoint and their union is S (a partition), then $\sum_{i=1}^n P(A_i) = 1$
- Because the parts do not overlap and together cover S, every outcome is counted exactly once
- Total probability: for any event B, $P(B) = \sum_{i=1}^n P(B \cap A_i)$
Null set and probability of outcomes
- The probability of an event that cannot happen is 0: P(∅) = 0
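
As a quick check of the inclusion-exclusion formula and the partition property above, here is a minimal sketch. It again assumes a fair six-sided die; the events `A`, `B` and the three-part `partition` are hypothetical choices made for illustration.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # assumed sample space: one roll of a fair die

def prob(event):
    """Equally-likely probability of an event given as a set of outcomes."""
    return Fraction(len(frozenset(event) & S), len(S))

A = {2, 4, 6}  # "even"
B = {4, 5, 6}  # "greater than 3"

# General addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(lhs, rhs)  # 2/3 2/3

# A partition of S: pairwise disjoint parts whose union is S.
partition = [{1, 2}, {3, 4}, {5, 6}]
print(sum(prob(part) for part in partition))  # 1
```

Here A and B overlap in {4, 6}, so simply adding P(A) and P(B) would give 1 and overcount the intersection.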

Mutually exclusive vs independent
- Mutually exclusive (disjoint): $A \cap B = \emptyset$, which implies $P(A \cup B) = P(A) + P(B)$
- Independent: A and B are independent if and only if $P(A \cap B) = P(A) \cdot P(B)$
- Note: these are different properties. If A and B are mutually exclusive and both have positive probability, they are dependent, since $P(A \cap B) = 0 \ne P(A) P(B)$. Disjoint events can be independent only in the trivial case P(A) = 0 or P(B) = 0
Multiple events and independence
- A collection of events {Ai} is independent if for every finite subset I, $P\left( \bigcap_{i \in I} A_i \right) = \prod_{i \in I} P(A_i)$
- When events are independent, joint probabilities decompose into products
Examples (conceptual)
- Two independent coin tosses with P(heads) = p: P(heads on both) = p^2
- Mutually exclusive events example: A = {rolling a 1}, B = {rolling a 2} on a fair die: P(A) = P(B) = 1/6, P(A ∪ B) = 1/3
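
The difference between independence and mutual exclusivity can be tested directly from the product definition. The sketch below assumes a single roll of a fair die; the function `is_independent` and the events `A`, `B`, `C`, `D` are illustrative names, not part of the notes.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # assumed sample space: one roll of a fair die

def prob(event):
    """Equally-likely probability of an event given as a set of outcomes."""
    return Fraction(len(frozenset(event) & S), len(S))

def is_independent(A, B):
    """Check the definition P(A ∩ B) = P(A) * P(B) exactly."""
    return prob(A & B) == prob(A) * prob(B)

A = {2, 4, 6}     # "even", P = 1/2
B = {1, 2, 3, 4}  # "at most 4", P = 2/3
C = {1}           # "rolling a 1"
D = {2}           # "rolling a 2"

print(is_independent(A, B))  # True: P(A ∩ B) = 1/3 = (1/2)(2/3)
print(is_independent(C, D))  # False: P(C ∩ D) = 0 but P(C) P(D) = 1/36
```

C and D are the mutually exclusive events from the die example: knowing that one occurred rules out the other, which is exactly why they are dependent.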

Cartesian product as joint sample space
- If two experiments have outcome sets A and B with sizes |A| = m and |B| = r, then the joint outcome space is A × B with size |A × B| = m r
- If the experiments are independent, the joint probability of a pair (a, b) is the product of their marginals: $P((a,b)) = P_A(a) \cdot P_B(b)$
Implication for total outcomes
- When considering multiple experiments, the joint distribution lives on the Cartesian product of their outcome spaces
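
One way to see the joint space concretely is to build it with `itertools.product`. This sketch assumes two independent experiments, a fair coin and a fair die; the dictionaries `coin`, `die` and `joint` are hypothetical names chosen for the example.

```python
from fractions import Fraction
from itertools import product

# Assumed marginal distributions of two independent experiments.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}    # m = 2 outcomes
die = {face: Fraction(1, 6) for face in range(1, 7)}  # r = 6 outcomes

# Joint space is the Cartesian product; under independence each pair
# gets the product of its marginal probabilities.
joint = {(c, d): coin[c] * die[d] for c, d in product(coin, die)}

print(len(joint))           # 12, i.e. m * r
print(sum(joint.values()))  # 1
print(joint[("H", 3)])      # 1/12
```

The product construction only gives the right joint probabilities when the experiments are independent; for dependent experiments the joint distribution still lives on the same Cartesian product but must be specified separately.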

Bernoulli trial
- A single trial with two outcomes: success with probability $p$ and failure with probability $q = 1 - p$
- Random variable X for a single trial: X = 1 if success, X = 0 if failure
- Distribution: $P(X=1) = p, \quad P(X=0) = 1-p = q$
Binomial setting: n independent Bernoulli trials
- Let X be the total number of successes in n independent Bernoulli trials with parameter p
- Then X ~ Binomial(n, p)
- Probability mass function (pmf):
  $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0,1,\dots,n$
- Derivation outline: by independence, any particular arrangement of k successes and n - k failures has probability $p^k (1-p)^{n-k}$, and there are $\binom{n}{k}$ ways to choose which k of the n trials are the successes
- Special cases: $P(X=0) = (1-p)^n$, $P(X=n) = p^n$
- Binomial coefficient: $\binom{n}{k} = \frac{n!}{k!(n-k)!}$
Mean and variance (key properties)
- Expected number of successes: $E[X] = np$
- Variance: $\mathrm{Var}(X) = np(1-p) = npq$
Identical Bernoulli trials
- The binomial model requires trials that are independent and share the same success probability p; if p changes from trial to trial, the number of successes is no longer Binomial(n, p)
Examples
- Example 1: n = 5, p = 0.4, compute P(X=3)
- $P(X=3) = \binom{5}{3} (0.4)^3 (0.6)^2 = 10 \cdot 0.064 \cdot 0.36 = 0.2304$
- Example 2: n = 10, p = 0.2, compute P(X=0) and P(X=10)
- $P(X=0) = (0.8)^{10} \approx 0.1074$, $P(X=10) = (0.2)^{10} = 1.024 \times 10^{-7}$
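
The pmf, mean, and variance formulas above can be verified numerically. The following is a minimal sketch using only the standard library (`math.comb` requires Python 3.8 or later); the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Example 1 from the notes: n = 5, p = 0.4
n, p = 5, 0.4
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                      # should be close to 1
mean = sum(k * pk for k, pk in enumerate(pmf))        # should be close to n*p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))  # close to n*p*(1-p)

print(pmf[3])            # about 0.2304
print(total, mean, var)  # about 1, 2.0, 1.2 (up to floating-point rounding)
```

Computing the mean and variance directly from the pmf, rather than from the closed forms, is a useful sanity check that the pmf was coded correctly.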

Poisson limit for rare events in many trials
- When n is large and p is small, with λ = np of moderate size, the Binomial(n, p) distribution is well approximated by a Poisson distribution with parameter λ
- For X ~ Binomial(n, p): $P(X = k) \approx e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0,1,2,\dots$, where the right-hand side is the Poisson(λ) pmf
- Limit statement: the approximation becomes exact as n → ∞ and p → 0 with np = λ held fixed
- Context: often used in quality control, arrivals, and rare-event modeling
Examples
- If n = 1000, p = 0.01, then λ = 10. Approximate P(X = 3) by $e^{-10} \frac{10^3}{3!} \approx 0.00757$ (the exact binomial value is about 0.00739)
- As n grows with p shrinking so that np = λ stays fixed, the approximation improves for every fixed k
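
To get a feel for how close the approximation is, the exact binomial pmf can be compared with the Poisson pmf side by side. This sketch uses the n = 1000, p = 0.01 example above; the function names and the particular values of k are illustrative assumptions.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson(lam) pmf: exp(-lam) * lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.01
lam = n * p  # lambda = np = 10

# Compare the two pmfs at a few values of k (chosen arbitrarily).
for k in (0, 3, 10, 20):
    exact = binomial_pmf(k, n, p)
    approx = poisson_pmf(k, lam)
    print(f"k={k:2d}  binomial={exact:.6f}  poisson={approx:.6f}")
```

Rerunning the loop with a larger n and a proportionally smaller p (so that λ stays at 10) should show the two columns moving closer together.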

Axioms
- $P(A) \ge 0$ for any event A
- $P(S) = 1$ , where S is the sample space
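- $P(\emptyset) = 0$ (a consequence of the axioms together with additivity for disjoint events, rather than an axiom in its own right)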
Disjoint events
- If A1, A2, …, An are pairwise disjoint (mutually exclusive):
- $P\left( \bigcup_{i=1}^n A_i \right) = \sum_{i=1}^n P(A_i)$
Complement
- $P(A^c) = 1 - P(A)$
Independence
- A and B are independent if: $P(A \cap B) = P(A) P(B)$
- For a finite collection of independent events {Ai}: $P\left( \bigcap_{i} A_i \right) = \prod_{i} P(A_i)$
Binomial distribution
- $X \sim \text{Binomial}(n, p)$
- $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0,1,\dots,n$
- $\binom{n}{k} = \frac{n!}{k!(n-k)!}$
- $E[X] = np, \quad \mathrm{Var}(X) = np(1-p)$
Poisson distribution (approximation)
- If $X \sim \text{Binomial}(n, p)$ with n large, p small, and $\lambda = np$, then
- $P(X = k) \approx e^{-\lambda} \frac{\lambda^k}{k!}$
Notation reminder
- Event A, B, C denote subsets of S; A^c denotes the complement; ∅ is the null event; S is the sample space
- The product rule for independent events extends to any finite or countable collection when independence holds

End of notes