Law of Total Probability - Comprehensive Notes
Law of Total Probability - Intuition and Setup
- Sample space S with a collection of events B1, B2, …, Bk that are mutually exclusive and collectively exhaustive (a partition of the sample space).
- Mutually exclusive: for i ≠ j, Bi ∩ Bj = ∅.
- Exhaustive: B1 ∪ B2 ∪ … ∪ Bk = S.
- For any event A ⊆ S, A is the union of its intersections with the partition elements:
A = \bigcup_{i=1}^{k} \bigl(A \cap B_i\bigr).
- Because the intersections A ∩ Bi are disjoint (due to the partition), the probability of A is the sum of the probabilities of these intersections:
P(A) = \sum_{i=1}^{k} P\bigl(A \cap B_i\bigr).
- If the Bi are not mutually exclusive, parts of A could be counted more than once; if the Bi are not exhaustive, some parts of A could be missed.
- For the special case of two partition events, B and B^c (the complement of B):
P(A) = P(A \cap B) + P(A \cap B^{c}).
- The above ideas generalize to any countable partition of the sample space.
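- A concrete sanity check of the decomposition (a minimal sketch, not from the transcript; the sets S, B, and A below are invented for illustration):

```python
from fractions import Fraction

# Equally likely sample space of 10 outcomes (illustrative choice).
S = set(range(10))
# A partition of S: mutually exclusive and exhaustive pieces.
B = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}]
A = {1, 4, 5, 9}  # an arbitrary event

def prob(event):
    """Probability of an event under the equally likely measure on S."""
    return Fraction(len(event & S), len(S))

# P(A) computed directly equals the sum of P(A ∩ B_i) over the partition.
assert prob(A) == sum(prob(A & Bi) for Bi in B)
print(prob(A))  # 2/5
```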
Formal Statement and Conditional Form
- Let B1, B2, …, BK be mutually exclusive and exhaustive events in the sample space.
- For any event A in the sample space:
P(A) = \sum_{i=1}^{K} P(A \cap B_i).
- Using the multiplication rule (product rule): P(A \cap B_i) = P(B_i) \cdot P(A \mid B_i).
- Substituting into the sum gives the conditional form of the law:
P(A) = \sum_{i=1}^{K} P(B_i) \cdot P(A \mid B_i).
- For the two-part case (K = 2):
P(A) = P(A \mid B) \cdot P(B) + P(A \mid B^{c}) \cdot P(B^{c}).
- This result follows directly from basic probability rules and the tree/partition intuition; the law of total probability is essentially the practical embodiment of the product rule across a partition.
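- The conditional form maps directly to a few lines of Python (a sketch under my own naming; total_probability is not a standard library function):

```python
import math

def total_probability(priors, conditionals):
    """P(A) = sum_i P(B_i) * P(A | B_i) over a partition B_1, ..., B_K."""
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("the P(B_i) must sum to 1 over a partition")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, conditionals))

# Two-part case: P(A) = P(A|B)·P(B) + P(A|B^c)·P(B^c)
print(total_probability([0.25, 0.75], [0.2, 0.6]))  # 0.25*0.2 + 0.75*0.6 = 0.5
```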
Intuition, Diagrams, and Visualization
- Diagrams help: a partition of the sample space into B1, B2, B3, …; A is dissected into A ∩ B1, A ∩ B2, A ∩ B3, …
- Tree diagrams visualize the same idea: first branch by the partition event (e.g., machine M1, M2, M3 or urn choice), then branch by the conditional outcomes (e.g., defective vs not defective; color conditional on urn).
- The unconditional probabilities live on the top-level branches (P(Bi)); the conditional probabilities live on the subsequent branches (P(A|Bi)). The total probability is the sum of the products along each complete path to A.
- The law provides a way to compute P(A) when A is complex, by breaking the problem into simpler, conditional pieces.
Practical Example 1: Factory Defects (Machines M1, M2, M3)
- Given:
- P(M1) = 0.60, P(M2) = 0.30, P(M3) = 0.10 (these are mutually exclusive and exhaustive).
- P(D \mid M1) = 0.07, P(D \mid M2) = 0.15, P(D \mid M3) = 0.30, where D is the event “part is defective.”
- Compute P(D) using the law of total probability:
P(D) = P(M1) \cdot P(D \mid M1) + P(M2) \cdot P(D \mid M2) + P(M3) \cdot P(D \mid M3).
P(D) = 0.60 \cdot 0.07 + 0.30 \cdot 0.15 + 0.10 \cdot 0.30 = 0.042 + 0.045 + 0.030 = 0.117.
- Note: The transcript lists 0.171 for this example, but the correct calculation above yields 0.117. This appears to be a numerical error in the transcript.
- Alternative viewpoint: a tree diagram would have branches for M1, M2, M3 with probabilities 0.60, 0.30, 0.10; from each branch, a sub-branch for D with probabilities 0.07, 0.15, 0.30 respectively; summing the three path probabilities that lead to D gives 0.117.
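- Verifying the arithmetic in code (a quick check; the variable names are my own):

```python
# Machine mix P(M_i) and per-machine defect rates P(D | M_i) from the example.
machine_prob = {"M1": 0.60, "M2": 0.30, "M3": 0.10}
defect_rate  = {"M1": 0.07, "M2": 0.15, "M3": 0.30}

# Sum the product along each tree path that ends in "defective".
p_defective = sum(machine_prob[m] * defect_rate[m] for m in machine_prob)
print(round(p_defective, 3))  # 0.117 (not 0.171)
```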
Practical Example 2: Urns and Colors
- Setup:
- One of four urns is chosen uniformly at random: P(Urn i) = 1/4 for i = 1,2,3,4.
- Conditional probability of drawing a blue ball from each urn:
- Urn 1: P(B|Urn1) = 4/10
- Urn 2: P(B|Urn2) = 1/7
- Urn 3: P(B|Urn3) = 3/8
- Urn 4: P(B|Urn4) = 5/9
- Apply the law:
P(B) = \sum_{i=1}^{4} P(B \mid Urn i) \cdot P(Urn i) = \frac{1}{4}\left( \frac{4}{10} + \frac{1}{7} + \frac{3}{8} + \frac{5}{9} \right).
- Compute inside the parentheses:
- \frac{4}{10} = 0.4, \quad \frac{1}{7} \approx 0.142857, \quad \frac{3}{8} = 0.375, \quad \frac{5}{9} \approx 0.555556
- Sum ≈ 1.473413
- Therefore:
P(B) \approx \frac{1}{4} \times 1.473413 \approx 0.368353.
- Result: P(Blue) ≈ 0.368 (about 0.37).
- Quick thought: pooling all balls into a single urn and drawing at random would weight each urn by its ball count, whereas choosing an urn uniformly weights each urn by 1/4. The two procedures agree only when the urns hold equal numbers of balls; in general, the law of total probability is what accounts for the mixture weights correctly.
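- Exact fractions for this example (a quick check using Python's fractions module; names are my own):

```python
from fractions import Fraction

# P(Blue | Urn i) for the four urns; each urn is chosen with probability 1/4.
blue_given_urn = [Fraction(4, 10), Fraction(1, 7), Fraction(3, 8), Fraction(5, 9)]
p_urn = Fraction(1, 4)

p_blue = sum(p_urn * p for p in blue_given_urn)
print(p_blue, float(p_blue))  # 3713/10080 ≈ 0.368353
```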
Tree Diagrams and Quick Calculations
- Tree diagrams show the partition on the first level (e.g., which machine or which urn).
- The second level shows conditional probabilities given the first-level choice (e.g., D vs not D, blue vs not blue).
- The probability of an outcome is found by summing the products along each complete path to that outcome.
- Once comfortable, many calculations can be done quickly without drawing the diagram.
Why the Law of Total Probability Matters
- It allows combining conditional information with the likelihoods of the conditioning events to obtain overall probabilities.
- It can simplify complex problems by breaking them into manageable pieces.
- It underpins Bayes’ Theorem, which will be covered in a later topic.
Special Case and Notation Reminders
- Partition: B1, B2, …, BK are mutually exclusive and exhaustive: for i ≠ j, Bi ∩ Bj = ∅ and ∪_{i=1}^{K} B_i = S.
- A is any event in S.
- Key formulas:
A = \bigcup_{i=1}^{K} (A \cap B_i)
P(A) = \sum_{i=1}^{K} P(A \cap B_i)
P(A \cap B_i) = P(B_i) \cdot P(A \mid B_i)
P(A) = \sum_{i=1}^{K} P(B_i) \cdot P(A \mid B_i)
- Two-case form (K = 2):
P(A) = P(A \mid B) \cdot P(B) + P(A \mid B^{c}) \cdot P(B^{c}).
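- A minimal numeric illustration of the two-case form (all probabilities below are invented for illustration):

```python
# Two-case form: P(A) = P(A|B)·P(B) + P(A|B^c)·P(B^c), with P(B^c) = 1 - P(B).
p_b = 0.3            # P(B), illustrative value
p_a_given_b = 0.9    # P(A | B), illustrative value
p_a_given_bc = 0.2   # P(A | B^c), illustrative value

p_a = p_a_given_b * p_b + p_a_given_bc * (1 - p_b)
print(round(p_a, 2))  # 0.9*0.3 + 0.2*0.7 = 0.41
```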
Summary of Key Takeaways
- The law of total probability lets you compute P(A) by conditioning on a partition of the sample space.
- The core identity: P(A) = \sum_{i=1}^{K} P(B_i) \cdot P(A \mid B_i).
- It is essential to ensure the B_i form a partition (mutually exclusive and exhaustive) to avoid double counting or missing parts.
- It connects directly to the product rule and is a foundational tool in probabilistic reasoning and Bayesian methods.
- Practical applications include quality control (defect analysis), reliability, and decision making under uncertainty.