Unit 5: Probability Rules, Simulations, and Independence
Basic Probability Rules
Probability Model:
* A probability model must show all possible outcomes in the sample space.
* A probability model must show the probability of each of those outcomes.
* Rule 1: The probability of each individual outcome must be a value between 0 and 1 inclusive (that is, 0% to 100%).
* Rule 2: The sum of all probabilities for the entire sample space must equal exactly 1 (expressing certainty that one of the outcomes in the sample space will occur).
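The two rules above can be checked mechanically. The following is a minimal sketch; the function name `is_valid_probability_model` and both example models are hypothetical and chosen only for illustration.

```python
import math

def is_valid_probability_model(model):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    probs = list(model.values())
    each_in_range = all(0 <= p <= 1 for p in probs)             # Rule 1
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=1e-9)   # Rule 2
    return each_in_range and sums_to_one

# Hypothetical model: one roll of a fair six-sided die.
fair_die = {face: 1 / 6 for face in range(1, 7)}
# Hypothetical broken model: the probabilities sum to 1.1.
broken = {"heads": 0.6, "tails": 0.5}

print(is_valid_probability_model(fair_die))  # True
print(is_valid_probability_model(broken))    # False
```

The sum is compared with a small tolerance because floating-point values such as 1/6 do not always add up to exactly 1.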
Complement:
* Definition: If an event is designated as $A$, then $A^C$ is the complement of event $A$, also known as "not $A$."
* Conceptual Meaning: The complement is the event that event $A$ did not happen.
* Complement Rule Formula: $P(A^C) = 1 - P(A)$.
* Plain Language Explanation: The probability of the complement of $A$ is equivalent to "everything" (100% or 1) minus the probability of event $A$ actually occurring.
Mutually Exclusive Events:
* Definition: Events that have no outcomes in common, so they cannot occur at the same time.
* Addition Rule for Mutually Exclusive Events: When events $A$ and $B$ are mutually exclusive, the probability of either event occurring is the sum of their individual probabilities: $P(A \text{ or } B) = P(A) + P(B)$.
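As a small check of the complement rule and the addition rule for mutually exclusive events, here is a sketch using one roll of a fair die. The sample space, the events `A` and `B`, and the helper `prob` are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {1, 2}  # roll a 1 or a 2
B = {5, 6}  # roll a 5 or a 6
assert A & B == set()  # no outcomes in common, so A and B are mutually exclusive

p_A_or_B = prob(A | B)            # probability of "A or B", counted directly
p_not_A = prob(sample_space - A)  # probability of the complement of A

print(p_A_or_B == prob(A) + prob(B))  # True
print(p_not_A == 1 - prob(A))         # True
```

Using `Fraction` keeps the arithmetic exact, so the equalities hold without rounding concerns.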
Randomness, Probability, and Simulation
Probability:
* Definition: The likelihood that something happens in the "long run," that is, the proportion of times an outcome would occur over a very long series of repetitions.
* Nature of Randomness: Random phenomena are unpredictable in the short run but become predictable in the long run.
Law of Large Numbers:
* Definition: If a chance process is repeated many, many times, the proportion of repetitions that produce a given outcome will approach the actual probability of that outcome.
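The law of large numbers can be seen by tracking a running proportion. This sketch assumes a fair coin (`p_heads=0.5`) and a fixed random seed; the function name and the checkpoint sizes are hypothetical choices, not part of the notes.

```python
import random

def running_proportion(n_flips, p_heads=0.5, seed=0):
    """Flip a simulated coin n_flips times; record the proportion of heads at checkpoints."""
    rng = random.Random(seed)
    heads = 0
    checkpoints = {}
    for i in range(1, n_flips + 1):
        if rng.random() < p_heads:
            heads += 1
        if i in (10, 100, 1_000, 10_000, 100_000):
            checkpoints[i] = heads / i
    return checkpoints

proportions = running_proportion(100_000)
for n, proportion in proportions.items():
    print(f"after {n:>6} flips: proportion of heads = {proportion:.4f}")
```

The early proportions can wander noticeably, while the later ones should settle close to 0.5.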
Simulation:
* Definition: The act of imitating a chance process, often used by statisticians to estimate probabilities when direct calculation is difficult or to verify theoretical models.
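As one illustration of using simulation to check a theoretical model, the sketch below estimates the probability that two fair dice sum to 7 and compares it with the theoretical value of 1/6. The dice scenario, function name, trial count, and seed are assumptions chosen for the example.

```python
import random

def estimate_prob_sum_is_seven(n_trials, seed=0):
    """Estimate P(sum of two fair dice = 7) by repeated simulated rolls."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total == 7:
            hits += 1
    return hits / n_trials

dice_estimate = estimate_prob_sum_is_seven(100_000)
print(f"simulated: {dice_estimate:.4f}, theoretical: {1 / 6:.4f}")
```

With this many trials the estimate should land within about a percentage point of the theoretical value.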
Two-Way Tables and Venn Diagrams
Two-Way Table:
* A grid format (often $2 \times 2$) used to organize data for two categorical variables, arranged in rows and columns.
Venn Diagram Components:
* Circle: Represents a specific event.
* Rectangle: Represents the entire sample space.
* Intersection: The overlapping space in the Venn diagram representing the event where both designated outcomes occur simultaneously ($A \cap B$ or "Both").
* Union: The total space covered by the events ($A \cup B$), representing the occurrence of either event $A$, event $B$, or both.
General Addition Rule:
* This rule is used when events are not necessarily mutually exclusive (there may be an overlap/intersection).
* Formula: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
* Symbolism: The symbol $\cup$ denotes the union ("or"), and the symbol $\cap$ denotes the intersection ("and").
* Logic: One must subtract the intersection, $P(A \cap B)$, because that probability is counted twice if you simply add $P(A)$ and $P(B)$.
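The double-counting argument can be verified on a small example. This sketch assumes a whole number from 1 to 12 is picked at random, with `A` = "even" and `B` = "multiple of 3"; the scenario and names are hypothetical.

```python
from fractions import Fraction

# Hypothetical sample space: pick a whole number from 1 to 12 at random.
sample_space = set(range(1, 13))

def prob(event):
    """P(event) when every outcome in sample_space is equally likely."""
    return Fraction(len(event), len(sample_space))

A = {n for n in sample_space if n % 2 == 0}  # even numbers
B = {n for n in sample_space if n % 3 == 0}  # multiples of 3

direct = prob(A | B)                       # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)  # general addition rule
naive = prob(A) + prob(B)                  # counts 6 and 12 twice

print(direct, by_rule, naive)  # 2/3 2/3 5/6
```

The naive sum overshoots by exactly the probability of the intersection, which is 1/6 here.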
Conditional Probability and Independence
Conditional Probability:
* Definition: The probability that event $A$ will occur given that event $B$ has already occurred.
* Notation: Notated as $P(A \mid B)$, read as "probability of $A$ given $B$."
* Formula: $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$ (the probability of both events occurring divided by the probability of the given condition).
Independence:
* Test for Independence: Events $A$ and $B$ are independent if and only if $P(A \mid B) = P(A)$.
* Interpretation: Independence means that knowing whether or not event $B$ occurred does not change the probability/chance of event $A$ occurring.
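The conditional probability formula and the independence test can be applied directly to two-way table counts. The sketch below uses two hypothetical tables with made-up counts; the function names and the `(in_A, in_B)` key layout are assumptions for illustration.

```python
from fractions import Fraction

def conditional_and_marginal(counts):
    """counts maps (in_A, in_B) -> number of individuals. Returns (P(A|B), P(A))."""
    total = sum(counts.values())
    n_B = counts[(True, True)] + counts[(False, True)]
    n_A = counts[(True, True)] + counts[(True, False)]
    # count(A and B) / count(B) is the same ratio as P(A and B) / P(B).
    p_A_given_B = Fraction(counts[(True, True)], n_B)
    p_A = Fraction(n_A, total)
    return p_A_given_B, p_A

def are_independent(counts):
    """Apply the test: A and B are independent when P(A|B) equals P(A)."""
    p_A_given_B, p_A = conditional_and_marginal(counts)
    return p_A_given_B == p_A

# Hypothetical counts, invented for this example only.
table_1 = {(True, True): 30, (True, False): 20, (False, True): 30, (False, False): 20}
table_2 = {(True, True): 40, (True, False): 10, (False, True): 20, (False, False): 30}

print(are_independent(table_1))  # True
print(are_independent(table_2))  # False
```

In the first table, P(A|B) and P(A) are both 1/2; in the second, P(A|B) is 2/3 while P(A) is 1/2, so knowing that B occurred changes the chance of A.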
Questions & Discussion
Application 5.2: How Prevalent is High Cholesterol?
* Context: American adults chosen at random. Event : high cholesterol (); Event : borderline high cholesterol ( to ). Data: and .
* Question 1: Explain why events and are mutually exclusive.
    * Answer: Events and are mutually exclusive because a randomly chosen American adult cannot have high cholesterol ( or above) and borderline high cholesterol ( to ) at the same time.
* Question 2: Say in plain language what the event " " is, then find .
    * Plain Language: A randomly chosen American adult has either high or borderline high cholesterol.
    * Calculation: .
* Question 3: Let be the event that the person chosen has normal cholesterol (less than ). Find .
    * Logic: Normal is the complement of "borderline or high."
    * Calculation: . A randomly chosen American adult has a chance of having normal cholesterol.
Application 5.1: Will the Train Arrive on Time?
* Context: NJ Transit claims its 8:00 a.m. train has probability of arriving on time.
* Question 1: Explain what probability means in this setting.
    * Answer: The train has a chance of arriving on time in a large sample of many trips.
* Question 2: The train arrived on time 5 days in a row. What is the probability it arrives on time tomorrow?
    * Answer: The probability remains . A short streak of on-time arrivals does not change the long-term probability.
* Question 3: Describe a simulation using a 10-sided die for late arrivals ( of days late).
    * Answer: Assign digits through as "on time" () and digit as "late" (). Roll the die times and record whether the train is on time or late for each roll.
* Question 4: Explain what the dot at on the dotplot represents.
    * Answer: It represents one repetition of the simulation (out of ) where the train arrived late exactly times out of rolls of the die.
* Question 5: Estimate the probability that the train will arrive late on or more of days based on simulation results.
    * Answer: According to the dotplot, there are dots at or higher. or .
* Question 6: Is there convincing evidence that New Jersey Transit's claim is false given late arrivals in days?
    * Answer: No. Because it is fairly likely ( chance) that the train will arrive late or more days out of just by chance, there is not convincing evidence the claim is false.
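The die-based simulation described in this application can be sketched in Python. Marking a single face of a 10-sided die as "late" gives a 10% chance of a late arrival on each roll. Which face counts as "late," the number of days per repetition (`n_days`), the cutoff for "many late days" (`cutoff`), and the number of repetitions (`n_reps`) are hypothetical placeholders here, not the values from the application.

```python
import random

def simulate_late_days(n_days, rng):
    """One repetition: roll a 10-sided die n_days times and count 'late' rolls."""
    late_face = 0  # assumption: face 0 means "late", faces 1-9 mean "on time"
    return sum(1 for _ in range(n_days) if rng.randrange(10) == late_face)

def estimate_prob_at_least(cutoff, n_days, n_reps, seed=1):
    """Estimate P(late on at least `cutoff` of `n_days` days) from n_reps repetitions."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_reps) if simulate_late_days(n_days, rng) >= cutoff)
    return hits / n_reps

# Placeholder values: 20 days per repetition, cutoff of 4 late days, 10,000 repetitions.
train_estimate = estimate_prob_at_least(cutoff=4, n_days=20, n_reps=10_000)
print(f"Estimated P(late on at least 4 of 20 days): {train_estimate:.3f}")
```

Each repetition plays the role of one dot on a dotplot; the estimate is the fraction of dots at or above the cutoff. If that fraction is not small, an observed result at the cutoff is plausible by chance alone.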
Application 5.3: Who Owns a Home?
* Context: Random sample of U.S. adults. Event $G$: High school graduate; Event $H$: Homeowner.
* Table Data:
    * HS Grad ($G$) and Homeowner ($H$):
    * HS Grad ($G$) and Not Homeowner ($H^C$):
    * Not HS Grad ($G^C$) and Homeowner ($H$):
    * Not HS Grad ($G^C$) and Not Homeowner ($H^C$):
    * Total HS Grads: ; Total Not HS Grads: ; Total Homeowners: ; Total Not Homeowners: .
* Question 1: Find .
    * Answer: , which is a chance.
* Question 2: Explain why $P(G \text{ or } H) \neq P(G) + P(H)$, and find $P(G \text{ or } H)$.
    * Answer: They are not equal because it is possible to be both a high school graduate and a homeowner (the events overlap). Simple addition would double-count the individuals who are both.
    * Calculation: .
* Question 3: Structure of the Venn Diagram for this data.
    * G region only:
    * H region only:
    * Overlap (G and H):
    * Outside regions (Neither):
* Question 4: Find .
    * Answer: .
Application 5.4: Who Earns A's in College?
* Context: 10,000 grades from UNH categorized by School (Liberal Arts, EPS, Health) and Grade (A, B, Lower than B). Event : grade from EPS; Event : grade lower than a B.
* Data Table:
    * EPS: (A), (B), (Lower than B); Total EPS = .
    * Total Lower than B (): .
* Question 1: Find and describe it in words.
    * Answer: or . There is a chance that a random course grade is lower than a B, given that it is from an EPS course.
* Question 2: Are events and independent? Justify.
    * Answer: For independence, must equal .
    * Check: (). (). Since , the events are not independent.
* Question 3: Given the grade is not lower than a B, find the probability it came from an EPS course.
    * Answer: Find . . .
    * Calculation: or .