Probability: Axioms, Independent Events, Binomial and Poisson Approximations
Probability Fundamentals
Axioms (basic foundations)
For any event A, the probability is nonnegative: P(A) \ge 0
The probability of the entire sample space S is 1: P(S) = 1
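Additivity: if A and B are disjoint, then P(A \cup B) = P(A) + P(B) (the general formulation requires this for countably many pairwise disjoint events)
Consequence: the probability of the null event (empty set) is 0: P(\emptyset) = 0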
Events, sample space, and equal likelihood
An event A is a subset of the sample space S; outcomes are elementary events (atomic outcomes)
If all outcomes in S are equally likely, then for any A ⊆ S: P(A) = \frac{|A|}{|S|}
The complement of A is A^c (not A): P(A^c) = 1 - P(A)
The null set ∅ has probability 0; the entire space S has probability 1
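The counting rule P(A) = |A| / |S| and the complement rule can be checked on a small example. The following is a minimal sketch, assuming a fair six-sided die as the sample space; the helper name `prob` and the event "roll is even" are illustrative choices, not part of the notes.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (illustrative assumption).
S = {1, 2, 3, 4, 5, 6}

def prob(event, space=S):
    """P(A) = |A| / |S| for equally likely outcomes; event must be a subset of space."""
    event = set(event)
    if not event <= space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(space))

A = {2, 4, 6}      # event "roll is even"
A_c = S - A        # complement of A within S

p_A = prob(A)
p_A_c = prob(A_c)

# Prints P(A), P(A^c), P(empty set), and P(S) as exact fractions.
print(p_A, p_A_c, prob(set()), prob(S))
```

Using `Fraction` keeps the arithmetic exact, so identities such as P(A^c) = 1 - P(A) can be compared with `==` rather than a floating-point tolerance.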
Events and Set Operations
Union and intersection
For events A and B:
If A and B are disjoint (mutually exclusive): A \cap B = \emptyset and P(A \cup B) = P(A) + P(B)
In general: P(A \cup B) = P(A) + P(B) - P(A \cap B)
For three events A, B, C that are pairwise mutually exclusive (no two can occur together): P(A \cup B \cup C) = P(A) + P(B) + P(C)
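The general union rule and the disjoint special case can be verified by direct counting. This is a small sketch under the assumption of a fair six-sided die; the events A, B, C, D, E below are hypothetical examples chosen for illustration.

```python
from fractions import Fraction

S = set(range(1, 7))  # fair six-sided die (illustrative assumption)

def prob(event):
    """Equally likely outcomes: P(A) = |A| / |S|."""
    return Fraction(len(event & S), len(S))

# Overlapping events: "even" and "at least 4".
A = {2, 4, 6}
B = {4, 5, 6}
union_direct = prob(A | B)
union_formula = prob(A) + prob(B) - prob(A & B)

# Pairwise disjoint events: probabilities simply add.
C, D, E = {1}, {3}, {5}
disjoint_direct = prob(C | D | E)
disjoint_sum = prob(C) + prob(D) + prob(E)

print(union_direct, union_formula, disjoint_direct, disjoint_sum)
```

Subtracting P(A \cap B) corrects for the outcomes 4 and 6, which would otherwise be counted twice.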
Partition and total probability
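If {A_1, A_2, …, A_n} are pairwise disjoint and their union is S (a partition), then \sum_{i=1}^n P(A_i) = 1
Exactly one A_i occurs on every trial, so their probabilities account for the whole space. For any event B, the pieces B \cap A_i are disjoint with union B, which gives the law of total probability: P(B) = \sum_{i=1}^n P(B \cap A_i)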
Null set and probability of outcomes
The probability of an event that cannot happen is 0: P(∅) = 0
Independence vs Mutual Exclusivity
Mutually exclusive vs independent
Mutually exclusive (disjoint): A \cap B = \emptyset, which implies P(A \cup B) = P(A) + P(B)
Independent: A and B are independent if and only if P(A \cap B) = P(A) \cdot P(B)
Note: independence does not imply mutual exclusivity. Conversely, mutually exclusive events with P(A) > 0 and P(B) > 0 are always dependent, because P(A \cap B) = 0 while P(A) \cdot P(B) > 0; the only exceptions are the trivial cases P(A) = 0 or P(B) = 0
Multiple events and independence
A collection of events {A_i} is independent if for every finite subset I of indices, P\left( \bigcap_{i \in I} A_i \right) = \prod_{i \in I} P(A_i)
When events are independent, joint probabilities decompose into products
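The requirement that the product rule hold for every finite subset matters: checking pairs alone is not enough. The sketch below illustrates this with two fair coin tosses; the events A, B, C and the helper `mutually_independent` are hypothetical names introduced here for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space of two fair coin tosses: ('H','H'), ('H','T'), ('T','H'), ('T','T').
S = set(product("HT", repeat=2))

def prob(event):
    """Equally likely outcomes: P(A) = |A| / |S|."""
    return Fraction(len(event), len(S))

def mutually_independent(events):
    """Check the product rule on every sub-collection of size >= 2."""
    for r in range(2, len(events) + 1):
        for subset in combinations(events, r):
            joint = set.intersection(*subset)
            expected = Fraction(1)
            for e in subset:
                expected *= prob(e)
            if prob(joint) != expected:
                return False
    return True

A = {s for s in S if s[0] == "H"}    # first toss is heads
B = {s for s in S if s[1] == "H"}    # second toss is heads
C = {s for s in S if s[0] == s[1]}   # both tosses agree

print(mutually_independent([A, B]), mutually_independent([A, C]),
      mutually_independent([B, C]), mutually_independent([A, B, C]))
```

Each pair satisfies the product rule, but all three events together do not, since A \cap B \cap C contains only the outcome (H, H) with probability 1/4 rather than 1/8.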
Examples (conceptual)
Two independent coin tosses with P(heads) = p: P(heads on both) = p^2
Mutually exclusive events example: A = {rolling a 1}, B = {rolling a 2} on a fair die: P(A) = P(B) = 1/6, P(A ∪ B) = 1/3
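The die example also shows why mutually exclusive events with positive probability cannot be independent. A short worked check, written as a LaTeX fragment (it assumes the amsmath package for the align* environment):

```latex
\begin{align*}
P(A \cap B) &= P(\emptyset) = 0, \\
P(A)\,P(B)  &= \tfrac{1}{6} \cdot \tfrac{1}{6} = \tfrac{1}{36} \neq 0 .
\end{align*}
```

Since P(A \cap B) differs from P(A) P(B), rolling a 1 and rolling a 2 are dependent: knowing that one occurred rules out the other.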
Cartesian Product and Joint Experiments
Cartesian product as joint sample space
If two experiments have outcome sets A and B with sizes |A| = m and |B| = r, then the joint outcome space is A × B with size |A × B| = m r
If the experiments are independent, the joint probability of a pair (a, b) is the product of their marginals: P((a,b)) = P_A(a) \cdot P_B(b)
Implication for total outcomes
When considering multiple experiments, the joint distribution lives on the Cartesian product of their outcome spaces
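The joint sample space and the product rule for independent experiments can be built explicitly. This is a minimal sketch assuming one fair coin and one fair die as the two experiments; the dictionary names `P_coin`, `P_die`, and `joint` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Marginal distributions of two independent experiments (illustrative assumption).
P_coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
P_die = {face: Fraction(1, 6) for face in range(1, 7)}

# Joint distribution on the Cartesian product: P((a, b)) = P_A(a) * P_B(b).
joint = {(a, b): P_coin[a] * P_die[b] for a, b in product(P_coin, P_die)}

print(len(joint))            # number of joint outcomes, m * r
print(sum(joint.values()))   # total probability of the joint space
```

The joint space has m r = 2 · 6 outcomes, and the joint probabilities sum to 1 because each marginal distribution does.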
Bernoulli Trials and Binomial Distribution
Bernoulli trial
A single trial with two outcomes: success with probability p and failure with probability q = 1 - p
Random variable X for a single trial: X = 1 if success, X = 0 if failure
Distribution: P(X=1) = p, \quad P(X=0) = 1-p = q
Binomial setting: n independent Bernoulli trials
Let X be the total number of successes in n independent Bernoulli trials with parameter p
Then X ~ Binomial(n, p)
Probability mass function (pmf):
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0,1,\dots,n
Derivation outline: choose which k of the n trials are successes; each such selection has probability p^k (1-p)^{n-k}, and there are \binom{n}{k} such selections
Special cases: P(X=0) = (1-p)^n, P(X=n) = p^n
Binomial coefficient: \binom{n}{k} = \frac{n!}{k!(n-k)!}
Mean and variance (key properties)
Expected number of successes: E[X] = np
Variance: \mathrm{Var}(X) = np(1-p) = npq
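The pmf, mean, and variance formulas can be cross-checked numerically. The sketch below assumes illustrative parameters n = 12 and p = 0.3 (not taken from the notes) and uses a hypothetical helper `binomial_pmf`.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); zero outside 0..n."""
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 12, 0.3  # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(pmf)                                           # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))             # compare with n*p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))  # compare with n*p*(1-p)

print(total, mean, n * p, var, n * p * (1 - p))
```

Computing the moments directly from the pmf, rather than from the closed forms, gives an independent check that E[X] = np and Var(X) = np(1-p) up to floating-point rounding.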
Identical Bernoulli trials
The binomial model requires the trials to be independent and identically distributed: every trial in the sequence has the same success probability p
Examples
Example 1: n = 5, p = 0.4, compute P(X=3)
P(X=3) = \binom{5}{3} (0.4)^3 (0.6)^2 = 10 \cdot 0.064 \cdot 0.36 = 0.2304.
Example 2: n = 10, p = 0.2, compute P(X=0) and P(X=10)
P(X=0) = (0.8)^{10} \approx 0.1074, P(X=10) = (0.2)^{10} = 1.024 \times 10^{-7}
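Example 1 can also be checked by simulation, which ties the formula back to the underlying Bernoulli trials. This is a rough Monte Carlo sketch; the function name, the number of repetitions, and the fixed seed are assumptions made for reproducibility, and the estimate only approximates the exact value.

```python
import random

def simulate_binomial_prob(k, n, p, trials=100_000, seed=0):
    """Estimate P(X = k) by repeating n Bernoulli(p) trials many times."""
    rng = random.Random(seed)  # local generator with a fixed seed
    hits = 0
    for _ in range(trials):
        # Count successes among n independent Bernoulli(p) trials.
        successes = sum(rng.random() < p for _ in range(n))
        if successes == k:
            hits += 1
    return hits / trials

estimate = simulate_binomial_prob(k=3, n=5, p=0.4)
exact = 0.2304  # from Example 1

print(estimate, exact)
```

With 100,000 repetitions the standard error of the estimate is roughly 0.0013, so agreement to about two decimal places is what should be expected.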
Poisson Approximation to the Binomial
Poisson limit for rare events in many trials
When n is large and p is small, with λ = np held at a moderate value, the Binomial(n, p) distribution can be approximated by a Poisson distribution with parameter λ
Poisson pmf: P(X = k) \approx e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0,1,2,\dots
Condition: λ = np, and p is small enough that np remains moderate
Context: often used in quality control, arrivals, and rare-event modeling
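The reason the approximation works can be sketched by substituting p = λ/n into the binomial pmf and letting n grow with k fixed. The following LaTeX fragment outlines the standard limit argument (it assumes the amsmath package):

```latex
\begin{align*}
\binom{n}{k} p^k (1-p)^{n-k}
  &= \frac{n(n-1)\cdots(n-k+1)}{k!}
     \left(\frac{\lambda}{n}\right)^{k}
     \left(1-\frac{\lambda}{n}\right)^{n-k} \\
  &= \frac{\lambda^k}{k!}
     \cdot \frac{n(n-1)\cdots(n-k+1)}{n^k}
     \cdot \left(1-\frac{\lambda}{n}\right)^{n}
     \left(1-\frac{\lambda}{n}\right)^{-k} \\
  &\xrightarrow[n \to \infty]{}
     \frac{\lambda^k}{k!} \cdot 1 \cdot e^{-\lambda} \cdot 1
   = e^{-\lambda} \frac{\lambda^k}{k!} .
\end{align*}
```

The middle factor tends to 1 because it is a product of k terms each tending to 1, and (1 - λ/n)^n tends to e^{-λ}.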
Examples
If n = 1000, p = 0.01, then λ = 10. Approximate P(X = 3) by e^{-10} \frac{10^3}{3!}
As n grows with p shrinking so that np = λ stays fixed, the binomial probability P(X = k) converges to the Poisson probability e^{-\lambda} \frac{\lambda^k}{k!} for each fixed k
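The quality of the approximation for n = 1000, p = 0.01 can be inspected by computing both pmfs side by side. This is a minimal sketch; the helper names `binomial_pmf` and `poisson_pmf` and the range k = 0..20 are choices made here for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact Binomial(n, p) probability of k successes."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson(lam) probability of k events."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.01
lam = n * p  # lambda = np = 10

# Compare the two pmfs over a range of k around the mean.
rows = [(k, binomial_pmf(k, n, p), poisson_pmf(k, lam)) for k in range(21)]
max_gap = max(abs(b - q) for _, b, q in rows)

for k, b, q in rows:
    print(f"k={k:2d}  binomial={b:.6f}  poisson={q:.6f}")
print("largest absolute gap:", max_gap)
```

The Poisson form avoids the large binomial coefficient and depends on n and p only through λ, which is the practical appeal of the approximation.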
Quick Reference Formulas
Axioms
P(A) \ge 0 for any event A
P(S) = 1, where S is the sample space
P(\emptyset) = 0 (a consequence of the axioms, via additivity over disjoint events)
Disjoint events
If A_1, A_2, …, A_n are pairwise disjoint (mutually exclusive):
P\left( \bigcup_{i=1}^n A_i \right) = \sum_{i=1}^n P(A_i)
Complement
P(A^c) = 1 - P(A)
Independence
A and B are independent if: P(A \cap B) = P(A) P(B)
For a finite collection of independent events {A_i}: P\left( \bigcap_{i} A_i \right) = \prod_{i} P(A_i)
Binomial distribution
X \sim \text{Binomial}(n, p)
P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0,1,\dots,n
\binom{n}{k} = \frac{n!}{k!(n-k)!}
E[X] = np, \quad \mathrm{Var}(X) = np(1-p)
Poisson distribution (approximation)
If X \sim \text{Binomial}(n, p) with n large, p small, and \lambda = np, then
P(X = k) \approx e^{-\lambda} \frac{\lambda^k}{k!}
Notation reminder
Events A, B, C denote subsets of S; A^c denotes the complement of A; ∅ is the null event; S is the sample space
The product rule for independent events extends to any finite or countable collection when independence holds