Probability and Random Variables Notes

Probability Models and Simulation

  • Identifying Sample Space (Probability Model): The sample space lists all possible outcomes of a random phenomenon; a probability model also assigns a probability to each outcome.

  • Simulation: A method to model random events, particularly useful when theoretical probabilities are hard to compute. Follow these steps:

    1. State the problem or question: Clearly define what you are trying to investigate.

    2. State the assumptions: Identify if events are independent.

    3. Describe the process for one repetition:

      • Specify possible outcomes.

      • Assign representations (numerical or other) to these outcomes.

      • Define measured variables.

      • Decide which assigned values represent each outcome, whether any values will be skipped, how repeated values will be handled, and how one trial will be simulated.

    4. Simulate repetitions: Show random numbers and corresponding outcomes.

    5. State your conclusion: Interpret the simulation results in the original problem's context.

  • Two-Way Tables: Used to display and analyze the relationship between two categorical variables.

  • Venn Diagrams: Visual representations of events and their relationships, including intersections and unions.

  • Tree Diagrams: Useful for displaying sequences of events and their probabilities.

  • Formulas: Essential tools for calculating probabilities.
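The five simulation steps listed above can be sketched in Python. This is only an illustrative sketch under assumed details: the question (how often at least 3 of 5 independent trials succeed when each trial has a 43% chance of success), the function names, and the fixed seed are all hypothetical. Two-digit labels 00-42 represent a success and 43-99 a failure; repeated labels are allowed because the trials are assumed independent.

```python
import random

def simulate_one_repetition(rng, trials=5, success_labels=43):
    """One repetition: draw `trials` two-digit labels (00-99).

    Labels 00-42 represent a success, labels 43-99 a failure.
    Returns the labels drawn and the number of successes.
    """
    labels = [rng.randrange(100) for _ in range(trials)]
    successes = sum(1 for label in labels if label < success_labels)
    return labels, successes

def estimate_probability(repetitions=10_000, threshold=3, seed=1):
    """Repeat the process many times and report the proportion of
    repetitions with at least `threshold` successes."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        _, successes = simulate_one_repetition(rng)
        if successes >= threshold:
            hits += 1
    return hits / repetitions

estimate = estimate_probability()
print(f"About {estimate:.3f} of repetitions had at least 3 successes.")
```

The final print statement plays the role of step 5: it states the result in context and uses "about" because the value is an estimate.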

Law of Large Numbers

The law of large numbers states that as the number of trials in a random process increases, the proportion of times a specific outcome occurs will approach its true probability.
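A short simulation can illustrate this behavior. The sketch below assumes a chance process with success probability 0.5 (a fair coin, chosen only for illustration) and records the running proportion of successes at a few hypothetical checkpoints.

```python
import random

def running_proportion(p=0.5, checkpoints=(10, 100, 1_000, 10_000, 100_000), seed=42):
    """Simulate independent trials with success probability p and
    record the proportion of successes at each checkpoint."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    successes = 0
    results = {}
    for trial in range(1, max(checkpoints) + 1):
        if rng.random() < p:
            successes += 1
        if trial in checkpoints:
            results[trial] = successes / trial
    return results

proportions = running_proportion()
for n, prop in proportions.items():
    print(f"n = {n:>6}: proportion of successes = {prop:.4f}")
```

The proportions for small n can be far from 0.5, while the proportion after 100,000 trials should be very close to it.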

Discrete Random Variables

  • Consist of a fixed set of possible values with gaps between them (countable).

  • Expected Value (Mean), E(X), of a discrete random variable: The average value over many trials.

    • E(X) = \sum x_i \cdot P(x_i), where x_i are the possible values and P(x_i) are their probabilities.

  • Standard Deviation, σ_X: Measures the typical variation of the variable's values from the mean over many trials.
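As a sketch of both calculations, the code below uses the red-light distribution that appears in the Example at the end of these notes (values 0 through 4 with probabilities 0.10, 0.25, 0.30, 0.20, 0.15). The standard deviation formula used, σ_X = \sqrt{\sum (x_i - μ_X)^2 \cdot P(x_i)}, is the standard definition and is not written out in the notes; the function name is hypothetical.

```python
import math

def mean_and_sd(values, probabilities):
    """Return (E(X), SD(X)) for a discrete random variable."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # E(X) = sum of x_i * P(x_i)
    mean = sum(x * p for x, p in zip(values, probabilities))
    # Var(X) = sum of (x_i - mean)^2 * P(x_i); SD(X) is its square root
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probabilities))
    return mean, math.sqrt(variance)

mu, sigma = mean_and_sd([0, 1, 2, 3, 4], [0.10, 0.25, 0.30, 0.20, 0.15])
print(f"E(X) = {mu:.2f} red lights, SD(X) = {sigma:.4f} red lights")
```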

Approaches to Probability

  • Probability is a number between 0 and 1 that describes the proportion of times an outcome would occur in a very long series of trials.

Key Probability Rules

  • Complement Rule: The probability of an event occurring is one minus the probability it will not occur.

    • P(A) = 1 - P(A^c)

  • Mutually Exclusive (Disjoint) Events: Two events (A and B) with no outcomes in common; P(A \text{ and } B) = 0.

  • Conditional Probability: The probability of event A occurring, given that event B has already occurred; denoted as P(A | B).

    • P(A | B) = \frac{P(A \text{ and } B)}{P(B)}

  • Independent Events: Events A and B are independent if knowing whether one occurred doesn't change the probability of the other.

    • P(A \text{ and } B) = P(A) * P(B)

    • Alternatively, P(A | B) = P(A)
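The rules above can be checked numerically from a two-way table. The counts below are made up purely for illustration, and the helper name `prob` is hypothetical. Exact fractions are used so the independence check does not depend on rounding.

```python
from fractions import Fraction

# Hypothetical two-way table: (gender, occupation) -> count
counts = {
    ("female", "engineer"): 20,
    ("female", "other"): 80,
    ("male", "engineer"): 30,
    ("male", "other"): 70,
}
total = sum(counts.values())

def prob(predicate):
    """Probability that a randomly chosen individual falls in a cell
    satisfying `predicate`."""
    favorable = sum(n for cell, n in counts.items() if predicate(cell))
    return Fraction(favorable, total)

p_female = prob(lambda c: c[0] == "female")
p_engineer = prob(lambda c: c[1] == "engineer")
p_both = prob(lambda c: c == ("female", "engineer"))

# Complement rule: P(not engineer) = 1 - P(engineer)
p_not_engineer = 1 - p_engineer

# Conditional probability: P(female | engineer) = P(female and engineer) / P(engineer)
p_female_given_engineer = p_both / p_engineer

# Independence check: does P(female and engineer) equal P(female) * P(engineer)?
independent = (p_both == p_female * p_engineer)

print("P(not engineer) =", p_not_engineer)
print("P(female | engineer) =", p_female_given_engineer)
print("Independent?", independent)
```

With these made-up counts, P(female | engineer) differs from P(female), so the two events are not independent.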

Continuous Random Variables

  • Can take on any numerical value within an interval on the number line (not countable).

  • The probability of any event involving a continuous random variable is the area under the density curve above the values on the horizontal axis that make up the event.

  • If the density curve is approximately normal, the area can be found using normalCDF on a calculator or a standard normal table.
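Outside a calculator, the same area can be found with Python's standard library. The sketch below assumes a Normal model with mean 64 and standard deviation 2.5 (hypothetical values); the helper `normal_cdf` is a made-up name that mirrors the usual calculator argument order (lower, upper, mean, SD).

```python
from statistics import NormalDist

def normal_cdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the Normal density curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Hypothetical example: values within one standard deviation of the mean
area = normal_cdf(61.5, 66.5, mean=64, sd=2.5)
print(f"P(61.5 < X < 66.5) is about {area:.4f}")
```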

Binomial vs. Geometric Settings

Both involve repeated trials of a chance process, but they differ in what is measured:

  • Binomial: Counts the number of times a particular outcome (success) occurs in a fixed number of observations, n.

  • Geometric: Records how many trials it takes to get to the first success.

Conditions for Binomial and Geometric Settings

Conditions 1-3 must be verified for both settings; condition 4 applies only to binomial settings and condition 5 only to geometric settings:

  1. Two Possible Outcomes: Each trial results in either a “success” or a “failure”.

  2. Trials Must Be Independent: The outcome of one trial does not affect the outcome of any other trial.

  3. Same Probability of Success: The probability of success, p, is constant on each trial.

  4. Binomial: Fixed number of trials, n.

  5. Geometric: No fixed number of trials; trials continue until the first success, which occurs on trial x.

Calculator Functions

  • Binomial:

    • BinomialPDF(n, p, x): Calculates the probability of exactly x successes in n trials.

    • BinomialCDF(n, p, x): Calculates the cumulative probability of at most x successes in n trials.

  • Geometric:

    • GeometricPDF(p, x): Calculates the probability of the first success occurring on trial x.

    • GeometricCDF(p, x): Calculates the probability of the first success occurring within x trials.
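The four calculator functions above correspond to short formulas. The Python functions below are a sketch of what they compute, not the calculator's own implementation; the function names are hypothetical, and the inputs n = 10 and p = 0.2 are illustrative.

```python
from math import comb

def binomial_pdf(n, p, x):
    """P(X = x): exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(n, p, x):
    """P(X <= x): at most x successes in n trials."""
    return sum(binomial_pdf(n, p, k) for k in range(0, x + 1))

def geometric_pdf(p, x):
    """P(X = x): first success occurs on trial x (x - 1 failures first)."""
    return (1 - p)**(x - 1) * p

def geometric_cdf(p, x):
    """P(X <= x): first success occurs on or before trial x."""
    return 1 - (1 - p)**x

print(f"binomial_pdf(10, 0.2, 5)  = {binomial_pdf(10, 0.2, 5):.4f}")
print(f"binomial_cdf(10, 0.2, 5)  = {binomial_cdf(10, 0.2, 5):.4f}")
print(f"geometric_pdf(0.2, 3)     = {geometric_pdf(0.2, 3):.4f}")
print(f"geometric_cdf(0.2, 3)     = {geometric_cdf(0.2, 3):.4f}")
```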

Tips for Using Calculator Functions

  • PDF gives the probability of exactly one value, P(X = x).

  • CDF gives “at most” probabilities, P(X \leq x). For “less than x”, use the CDF at x - 1.

  • To find “greater than x”, use 1 - CDF at x. To find “at least x”, use 1 - CDF at x - 1.
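Boundary values are easy to get wrong, so here is a small sketch for a binomial random variable with n = 10 and p = 0.2 (illustrative inputs). The function name is hypothetical and stands in for the calculator's cumulative function.

```python
from math import comb

def binomial_cdf(n, p, x):
    """P(X <= x) for a binomial random variable (0 if x is negative)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(0, x + 1))

n, p = 10, 0.2

at_most_3 = binomial_cdf(n, p, 3)           # P(X <= 3)
less_than_3 = binomial_cdf(n, p, 2)         # P(X < 3) = P(X <= 2)
at_least_3 = 1 - binomial_cdf(n, p, 2)      # P(X >= 3) = 1 - P(X <= 2)
greater_than_3 = 1 - binomial_cdf(n, p, 3)  # P(X > 3) = 1 - P(X <= 3)

print(f"P(X <= 3) = {at_most_3:.4f}")
print(f"P(X < 3)  = {less_than_3:.4f}")
print(f"P(X >= 3) = {at_least_3:.4f}")
print(f"P(X > 3)  = {greater_than_3:.4f}")
```

Note that "at least 3" and "less than 3" are complements, as are "greater than 3" and "at most 3".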

Normal Approximation for Binomial Probabilities

  • When n is large, a Normal probability model can approximate binomial probabilities if the Large Counts condition is met: np \geq 10 and n(1-p) \geq 10 (at least 10 successes and 10 failures).
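The sketch below checks the Large Counts condition and compares an exact binomial probability with its Normal approximation. It assumes the approximating Normal model uses mean np and standard deviation \sqrt{np(1-p)} (standard binomial facts not written out in these notes), applies no continuity correction, and uses hypothetical values n = 100 and p = 0.3.

```python
from math import comb, sqrt
from statistics import NormalDist

def large_counts_ok(n, p):
    """Large Counts condition: np >= 10 and n(1 - p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

def binomial_cdf(n, p, x):
    """Exact P(X <= x) for a binomial random variable."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def normal_approx_cdf(n, p, x):
    """Approximate P(X <= x) with a Normal model
    (mean np, SD sqrt(np(1-p)), no continuity correction)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(x)

n, p, x = 100, 0.3, 25
if large_counts_ok(n, p):
    exact = binomial_cdf(n, p, x)
    approx = normal_approx_cdf(n, p, x)
    print(f"Exact P(X <= {x}) = {exact:.4f}")
    print(f"Normal approximation = {approx:.4f}")
else:
    print("Large Counts condition not met; use exact binomial probabilities.")
```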

AP Statistics Exam Tips

  • When describing a simulation, provide a clear explanation so the reader can replicate your results from your explanation alone.

  • Ensure every label is of the same length. For example, to assign 43% of outcomes to represent an event, use numbers 00-42, not 0-42.

  • When sampling without replacement, mention that repeated numbers should be ignored.

  • You do not need to simplify fractions or convert them to decimals when finding the probability of an event.

  • When determining a sample space, do not assume all outcomes are equally likely unless stated.

  • Show supporting work for probability problems, even if the calculations are simple.

  • Use the word “about” when interpreting simulation results, because a simulation gives only an estimate of the true probability.

  • Define events A and B clearly when using formulas like P(A | B); if not, use descriptive labels (e.g., P(\text{female} | \text{engineer})).

  • When checking independence, define events and substitute values into either P(A \text{ and } B) = P(A) * P(B) or P(A | B) = P(A).

General Calculation Tips

  • Do not round numbers at intermediate steps to maintain precision; round at the end to at least four decimal places.

  • Consider whether to include boundary values in calculations, especially with discrete random variables.

  • Do not round the mean (expected value) of a random variable to an integer; report a non-integer mean as calculated (e.g., 2.05 red lights) for full credit.

  • Show numerical values substituted into formulas when calculating the mean or standard deviation of a discrete random variable on a free-response question.

  • Avoid “calculator speak”; label inputs clearly (e.g., binomialcdf(trials:10, probability:.2, at most 5 successes)).

  • Consider the appropriateness of a binomial setting before solving a probability question.

  • Check BINS (Binary, Independent, Number of trials fixed, Same probability of success) for binomial distributions.

  • Check BITS (Binary, Independent, To first success, Same probability of success) for geometric distributions.

  • Check the Large Counts condition for Normal Approximation of a Binomial Setting.

Combining Random Variables

  • We can always add/subtract means (expected values) when finding the sum/difference of two random variables.

  • Standard deviations, however, never add or subtract directly. When the random variables X and Y are independent, their variances add for both sums and differences, so take the square root of the sum of the variances: SD(X \pm Y) = \sqrt{Var(X) + Var(Y)}, where Var(X) = SD(X)^2.
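A minimal sketch of these rules, assuming X and Y are independent. The function name and the input values (X with mean 10 and SD 3, Y with mean 4 and SD 4) are hypothetical.

```python
from math import sqrt

def combine_independent(mean_x, sd_x, mean_y, sd_y):
    """Mean and SD of X + Y and X - Y, assuming X and Y are independent."""
    # Variances add for both the sum and the difference
    sd_combined = sqrt(sd_x**2 + sd_y**2)
    return {
        "sum": (mean_x + mean_y, sd_combined),
        "difference": (mean_x - mean_y, sd_combined),
    }

result = combine_independent(mean_x=10, sd_x=3, mean_y=4, sd_y=4)
print("X + Y: mean = {}, SD = {}".format(*result["sum"]))
print("X - Y: mean = {}, SD = {}".format(*result["difference"]))
```

The sum and the difference have different means but the same standard deviation, because the variances are added in both cases.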

Example

Expected Value = 0(0.10) + 1(0.25) + 2(0.30) + 3(0.20) + 4(0.15) = 2.05 red lights