Probability Notes: Simple Events, Notation, and Basic Rules
Simple events vs. events
- A die has six sides; the sample space for a single roll is S = {1,2,3,4,5,6}.
- A simple event is a single outcome from the sample space (e.g., rolling a 4).
- An event is any subset of the sample space (i.e., a collection of simple events).
- Probability basics: when all simple events are equally likely, the probability of an event E is the sum of the probabilities of the simple events in E.
- If E contains k outcomes, then for a fair six-sided die: $P(E) = \frac{k}{6}$.
- The transcript contrasts an event with a simple event: an event can be a combination of several simple outcomes.
- Example scenario: the probability of a specific combination of outcomes when a die is rolled or a coin is flipped several times.
- If you flip a coin three times, the sample space has 2^3 = 8 equally likely sequences of H/T.
- The event "H on the first flip and T on the third flip" consists of the two sequences HHT and HTT, so $P(\text{H on 1st and T on 3rd}) = \frac{2}{8} = \frac{1}{4}$.
- Summary: simple event is one outcome; event is a set of outcomes; probabilities add over the outcomes in the event.
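The three-flip example above can be checked by listing the sample space and counting. This is a minimal sketch that assumes a fair coin; the helper names `probability` and `h_first_t_third` are illustrative and not from the transcript.

```python
from fractions import Fraction
from itertools import product

# Sample space for three flips: all 2^3 = 8 sequences of H/T.
sample_space = list(product("HT", repeat=3))

def probability(event, space):
    """P(event) when every outcome in `space` is equally likely."""
    favorable = [outcome for outcome in space if event(outcome)]
    return Fraction(len(favorable), len(space))

def h_first_t_third(seq):
    """Event: H on the first flip and T on the third flip."""
    return seq[0] == "H" and seq[2] == "T"

p_event = probability(h_first_t_third, sample_space)
print(len(sample_space), p_event)  # 8 1/4
```

Each tuple in `sample_space` is a simple event; the function `h_first_t_third` picks out the subset that forms the event.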
Notation and symbols used in probability
- P(A) denotes the probability of event A.
- Intersection: P(A∩B) is the probability that both A and B occur.
- Union: P(A∪B) is the probability that at least one of A or B occurs.
- Complement: $A^c$ (or sometimes $\bar{A}$) is the event that A does not occur; $P(A^c) = 1 - P(A)$.
- Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$.
- The vertical bar "|" denotes conditioning; a superscript c (or an overbar) denotes complement; the symbol ∩ between events denotes "and".
- Key takeaway on notation:
- A occurs given that B has occurred? use P(A∣B).
- Both A and B occur? use P(A∩B).
- Either A or B (or both)? use P(A∪B).
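The four notations can be tried out on a single roll of a fair die using Python sets. The events A (even roll) and B (roll greater than 3) are hypothetical choices made for illustration, not events from the transcript.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a fair die
A = {2, 4, 6}            # hypothetical event: the roll is even
B = {4, 5, 6}            # hypothetical event: the roll is greater than 3

def P(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

p_both = P(A & B)              # intersection: both A and B occur
p_either = P(A | B)            # union: at least one occurs (here `|` is Python's set union)
p_not_a = P(S - A)             # complement of A
p_a_given_b = P(A & B) / P(B)  # conditional probability of A given B

print(p_both, p_either, p_not_a, p_a_given_b)  # 1/3 2/3 1/2 2/3
```

Note that Python's `|` operator on sets means union, which is unrelated to the vertical bar used for conditioning in P(A∣B).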
Fundamental rules and examples
- Normalization axiom for a finite sample space: the probabilities of all simple events sum to 1.
- If S is the sample space for a finite experiment, then $\sum_{e \in S} P(e) = 1$.
- For a finite event E (a subset of S): $P(E) = \sum_{e \in E} P(e)$.
- If all simple outcomes are equally likely, and E contains k outcomes, then $P(E) = \frac{k}{|S|}$.
- Complement rule: if A is an event, then $P(A^c) = 1 - P(A)$.
- Addition (Union) rule: P(A∪B)=P(A)+P(B)−P(A∩B).
- If A and B are independent, then P(A∩B)=P(A)P(B).
- Relationship between independence and conditional probability: if A and B are independent and both have positive probability, then P(A∣B)=P(A) and P(B∣A)=P(B).
- Example intuition: the probability of having a dog or a cat (A or B) involves considering overlap (A and B) if both conditions can occur simultaneously.
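The addition rule and the product test for independence can be verified by counting on a fair die. In this sketch the events A, B, and C and the helper `is_independent` are hypothetical, chosen so that one pair is independent and another is not.

```python
from fractions import Fraction

S = set(range(1, 7))  # one roll of a fair six-sided die

def P(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event), len(S))

def is_independent(X, Y):
    """Product test: X and Y are independent iff P(X and Y) == P(X) * P(Y)."""
    return P(X & Y) == P(X) * P(Y)

A = {2, 4, 6}      # hypothetical event: even roll
B = {1, 2, 3, 4}   # hypothetical event: roll is at most 4
C = {4, 5, 6}      # hypothetical event: roll is greater than 3

# Addition rule: subtract the overlap so it is not counted twice.
union_direct = P(A | B)
union_by_rule = P(A) + P(B) - P(A & B)

print(union_direct, union_by_rule)   # 5/6 5/6
print(is_independent(A, B))          # True
print(is_independent(A, C))          # False
print(P(A & B) / P(B) == P(A))       # True: conditioning on B leaves P(A) unchanged
```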
Worked example framework from the transcript
- Example setup used in teaching:
- A contingency table with 2 attributes (e.g., gender: Male/Female; smoking status: Smoker/Non-smoker).
- Total individuals: e.g., 250 employees.
- Known margin: number who smoke cigarettes: 130.
- Therefore, non-smokers: 120.
- The point: with some margins, you can fill the rest of a 2x2 table and then convert counts to proportions.
- Important caution: the transcript indicates a question about the probability that a randomly chosen person is a nonsmoker conditional on gender (e.g., female). To compute this, you need the total number of females together with the count of female nonsmokers (or, equivalently, the count of female smokers, since the two cells sum to the female total).
- Probability of a nonsmoker given female would be: $P(\text{Non-smoker} \mid \text{Female}) = \frac{P(\text{Female} \cap \text{Non-smoker})}{P(\text{Female})}$. If you do not know the joint count, you cannot compute it without an independence assumption or additional data.
- Independence assumption (if justified): if gender and smoking status are independent, then $P(\text{Non-smoker} \mid \text{Female}) = P(\text{Non-smoker}) = \frac{120}{250} = 0.48$.
- Without independence, you would need the actual joint counts to compute the conditional probability.
- The process students should follow:
- Fill margins with known totals.
- Use the relationships a+b=F, c+d=M, a+c=130, b+d=120 (where a=Female & Smoker, b=Female & Non-smoker, c=Male & Smoker, d=Male & Non-smoker).
- Compute the desired probability (e.g., nonsmoker given female) as b/F.
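The fill-in procedure above can be sketched in a few lines. The totals 250 and 130 come from the notes; the number of females (100) and the number of female smokers (40) are hypothetical values invented here only so the table can be completed, and `fill_table` is an illustrative helper.

```python
from fractions import Fraction

def fill_table(total, smokers, females, female_smokers):
    """Complete a 2x2 table from its margins plus one known cell.

    a = Female & Smoker, b = Female & Non-smoker,
    c = Male & Smoker,   d = Male & Non-smoker.
    """
    a = female_smokers
    b = females - a              # from a + b = F
    c = smokers - a              # from a + c = smokers
    d = (total - smokers) - b    # from b + d = non-smokers
    cells = {"a": a, "b": b, "c": c, "d": d}
    if min(cells.values()) < 0:
        raise ValueError("margins and known cell are inconsistent")
    return cells

# 250 and 130 are from the notes; 100 and 40 are HYPOTHETICAL illustration values.
females = 100
cells = fill_table(total=250, smokers=130, females=females, female_smokers=40)

p_nonsmoker_given_female = Fraction(cells["b"], females)  # b / F
p_nonsmoker = Fraction(250 - 130, 250)                    # value implied by independence

print(cells)                     # {'a': 40, 'b': 60, 'c': 90, 'd': 60}
print(p_nonsmoker_given_female)  # 3/5
print(p_nonsmoker)               # 12/25
```

With these made-up counts the conditional probability (0.60) differs from the 0.48 implied by independence, which illustrates why the joint count matters.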
Practical concepts: years, attrition, and interpreting percentages
- The transcript mentions attrition percentages: e.g., 13.9% leave in the first year and 7% in the second year.
- A note on interpretation: yearly percentages can be added directly only when each one is measured as a share of the initial cohort. If each year's percentage is instead a rate among those still present at the start of that year, adding them overstates the total and can even exceed 100%.
- Correct cumulative interpretation: if $p_1$ is the probability of leaving in year 1 and $p_2$ is the probability of leaving in year 2 given that the person stayed through year 1, the probability of leaving by the end of year 2 is:
- $P(\text{leave by year 2}) = p_1 + p_2 - p_1 p_2 = 1 - (1 - p_1)(1 - p_2)$.
- In the transcript, the statement that the ten-year total equals 250% is inconsistent with standard probability, since no probability can exceed 100%; interpret percentages as probabilities (0 to 1) or as properly scaled rates (e.g., annual attrition rates combined with a cumulative formula).
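The cumulative formula can be applied to the two attrition figures quoted above. This sketch assumes each yearly rate applies to the people still present at the start of that year, which the transcript does not state explicitly; `leave_by_end` is an illustrative helper name.

```python
def leave_by_end(rates):
    """Probability of having left by the end of the last year.

    ASSUMPTION: each rate is the chance of leaving during that year
    among those who were still present when the year began.
    """
    stay = 1.0
    for rate in rates:
        stay *= 1.0 - rate   # chance of surviving every year so far
    return 1.0 - stay

p1, p2 = 0.139, 0.07         # year-1 and year-2 attrition from the notes
cumulative = leave_by_end([p1, p2])

print(round(cumulative, 5))                 # 0.19927
print(round(p1 + p2 - p1 * p2, 5))          # 0.19927, the same value
```

Because the result is one minus a product of numbers between 0 and 1, it can never exceed 100%, however many years are chained together.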
The probability of A or B (union) and common examples
- General form of the union rule for two events: P(A∪B)=P(A)+P(B)−P(A∩B).
- Example: if you own a dog (A) or a cat (B): the probability of having either a dog or a cat is given by the sum of their individual probabilities minus the overlap where you have both.
- If A and B are independent, an alternative form: P(A∪B)=P(A)+P(B)−P(A)P(B).
- This is a practical way to connect independence to the union probability.
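A short numeric check of the independent-events form of the union rule, using the dog/cat example. The probabilities 2/5 and 3/10 are hypothetical, and independence of dog and cat ownership is assumed purely for illustration.

```python
from fractions import Fraction

p_dog = Fraction(2, 5)    # HYPOTHETICAL P(A): owns a dog
p_cat = Fraction(3, 10)   # HYPOTHETICAL P(B): owns a cat

def union_if_independent(p_a, p_b):
    """P(A or B) when A and B are assumed independent."""
    return p_a + p_b - p_a * p_b

p_either = union_if_independent(p_dog, p_cat)
p_neither = (1 - p_dog) * (1 - p_cat)   # neither event occurs

print(p_either)        # 29/50
print(1 - p_neither)   # 29/50, the complement of "neither" gives the same answer
```

The second line of output shows the equivalent route through the complement: "A or B" is the opposite of "neither A nor B".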
Key takeaways for exams
- Distinguish simple events from events:
- Simple event: a single outcome.
- Event: a set of outcomes.
- Master the core formulas:
- Complement: $P(A^c) = 1 - P(A)$
- Intersection: P(A∩B)
- Union: P(A∪B)=P(A)+P(B)−P(A∩B)
- Conditional: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$
- Independence: if A and B are independent, P(A∩B)=P(A)P(B) and hence P(A∪B)=P(A)+P(B)−P(A)P(B).
- For a fair six-sided die, if E is any event with k outcomes: $P(E) = \frac{k}{6}$; in general, the probabilities of the simple events must sum to 1: $\sum_{e \in S} P(e) = 1$.
- Practice applying these ideas to (a) simple experiments like coin flips or die rolls, and (b) contingency tables with margins to compute conditional probabilities.
- Always check whether an independence assumption is justified before using P(A|B)=P(A) or P(B|A)=P(B); if not justified, rely on the joint counts or additional data.
Reflective notes and connections
- This material connects basic probability theory to data interpretation (contingency tables, marginals, and conditional probabilities).
- It underpins decision making under uncertainty, risk assessment, and the interpretation of percentages in real-world data (e.g., attrition, demographics).
- Ethical and practical implications: assumptions about independence can significantly change conclusions; ensure transparency about assumptions when reporting probabilities.