Probability and Random Variables Notes

Probability Models and Simulation

Identifying Sample Space (Probability Model): The sample space lists all possible outcomes of a random phenomenon; a probability model pairs that list with the probability of each outcome.
Simulation: A method to model random events, particularly useful when theoretical probabilities are hard to compute. Follow these steps:
1. State the problem or question: Clearly define what you are trying to investigate.
2. State the assumptions: Identify if events are independent.
3. Describe the process for one repetition:
  - Specify possible outcomes.
  - Assign representations (numerical or other) to these outcomes.
  - Define measured variables.
  - Decide which assigned values represent each outcome, whether any values will be omitted, how repeated values will be handled, and how one trial will be simulated.
4. Simulate repetitions: Show random numbers and corresponding outcomes.
5. State your conclusion: Interpret the simulation results in the original problem's context.
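The five steps above can be sketched in Python. This is only an illustration under assumed values: the question (the chance of at least one success in three independent trials), the function names, and the number of repetitions are hypothetical, and the 43% success chance borrows the two-digit labeling example (00-42) from the exam tips in these notes.

```python
import random

def simulate_one_repetition(rng, n_trials=3, success_labels=43):
    """One repetition: draw n_trials two-digit labels (00-99) and count successes.

    Labels 00-42 represent a success (43% chance); 43-99 represent a failure.
    Repeated labels are allowed because the trials are assumed independent.
    """
    labels = [rng.randrange(100) for _ in range(n_trials)]
    return sum(1 for label in labels if label < success_labels)

def estimate_at_least_one(repetitions=10_000, seed=1):
    """Estimate P(at least one success) as the proportion of repetitions with a success."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(repetitions) if simulate_one_repetition(rng) >= 1)
    return hits / repetitions

estimate = estimate_at_least_one()
exact = 1 - (1 - 0.43) ** 3  # complement rule, shown only for comparison
print(f"simulated: about {estimate:.4f}; exact: {exact:.4f}")
```

The conclusion step would then read something like "there is about an 81% chance of at least one success," with the word "about" signaling that the simulated value is an estimate.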
Two-Way Tables: Used to display and analyze the relationship between two categorical variables.
Venn Diagrams: Visual representations of events and their relationships, including intersections and unions.
Tree Diagrams: Useful for displaying sequences of events and their probabilities.
Formulas: Essential tools for calculating probabilities.

Law of Large Numbers

The law of large numbers states that as the number of trials in a random process increases, the proportion of times a specific outcome occurs will approach its true probability.
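A small simulation can make this concrete. The sketch below assumes a fair coin ($p = 0.5$) and hypothetical checkpoints; the function name and seed are illustrative choices, not part of the notes.

```python
import random

def running_proportions(p=0.5, n_trials=100_000, seed=7,
                        checkpoints=(10, 100, 1_000, 10_000, 100_000)):
    """Return the proportion of successes observed at each checkpoint."""
    rng = random.Random(seed)
    successes = 0
    results = {}
    for trial in range(1, n_trials + 1):
        if rng.random() < p:  # success with probability p
            successes += 1
        if trial in checkpoints:
            results[trial] = successes / trial
    return results

proportions = running_proportions()
for n, prop in proportions.items():
    print(f"after {n:>7} trials: proportion = {prop:.4f}")
```

With few trials the proportion can sit far from 0.5; as the number of trials grows it tends to settle close to the true probability.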

Discrete Random Variables

Consist of a fixed set of possible values with gaps between them (countable).
Expected Value (Mean), $E(X)$, of a discrete random variable: The average value over many trials.
- $E(X) = \sum x_i \cdot P(x_i)$, where $x_i$ are the possible values and $P(x_i)$ are their probabilities.
Standard Deviation, $\sigma_X$: Measures the typical variation of the variable's values from the mean over many trials.

Approaches to Probability

Probability is a number between 0 and 1 that describes the proportion of times an outcome would occur in a very long series of trials.

Key Probability Rules

Complement Rule: The probability of an event occurring is one minus the probability it will not occur.
- $P(A) = 1 - P(A^c)$
Mutually Exclusive (Disjoint) Events: Two events (A and B) with no outcomes in common; $P(A \text{ and } B) = 0$.
Conditional Probability: The probability of event A occurring, given that event B has already occurred; denoted as $P(A | B)$.
- $P(A | B) = \frac{P(A \text{ and } B)}{P(B)}$
Independent Events: Events A and B are independent if knowing whether one occurred doesn't change the probability of the other.
- $P(A \text{ and } B) = P(A) \cdot P(B)$
- Alternatively, $P(A | B) = P(A)$
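As an illustration of the conditional probability and independence rules, the sketch below works from a two-way table. The counts are made up for the example (they do not come from these notes), and exact fractions are used so the independence check is not affected by rounding.

```python
from fractions import Fraction

# Hypothetical two-way table of counts: (sex, occupation) -> count.
counts = {
    ("female", "engineer"): 15,
    ("female", "other"): 35,
    ("male", "engineer"): 45,
    ("male", "other"): 55,
}
total = sum(counts.values())

# Marginal and joint probabilities read off the table.
p_female = Fraction(sum(v for (sex, _), v in counts.items() if sex == "female"), total)
p_engineer = Fraction(sum(v for (_, job), v in counts.items() if job == "engineer"), total)
p_both = Fraction(counts[("female", "engineer")], total)

# Conditional probability: P(female | engineer) = P(female and engineer) / P(engineer).
p_female_given_engineer = p_both / p_engineer

# Complement rule: P(not engineer) = 1 - P(engineer).
p_not_engineer = 1 - p_engineer

# Independence check: does knowing "engineer" change the probability of "female"?
independent = p_female_given_engineer == p_female

print(f"P(female) = {p_female}, P(female | engineer) = {p_female_given_engineer}")
print(f"independent: {independent}")
```

For these counts the two probabilities differ, so the events are not independent.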

Continuous Random Variables

Can take on any numerical value within an interval on the number line (not countable).
The probability of any event involving a continuous random variable is the area under the density curve above the values on the horizontal axis that make up the event.
If the density curve is approximately normal, the area can be found using normalCDF on a calculator or a standard normal table.
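For readers without a calculator at hand, the same area can be sketched with Python's standard-library `statistics.NormalDist`. The mean, standard deviation, and bounds below are hypothetical values chosen for illustration, and `normal_area` is an illustrative helper name.

```python
from statistics import NormalDist

def normal_area(lower, upper, mean, sd):
    """Area under a Normal density curve between lower and upper."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

# Hypothetical example: X is approximately Normal with mean 70 and SD 5.
area = normal_area(lower=60, upper=75, mean=70, sd=5)
print(f"P(60 <= X <= 75) is about {area:.4f}")

# A single value has zero width, so it has zero area (probability 0).
point_area = normal_area(lower=70, upper=70, mean=70, sd=5)
```

Because a single point carries no area, "less than" and "at most" give the same probability for a continuous random variable.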

Binomial vs. Geometric Settings

Both involve repeated trials of a chance process, but they differ in what is measured:

Binomial: Counts the number of times a particular outcome (success) occurs in a fixed number of observations, $n$.
Geometric: Records how many trials it takes to get to the first success.

Conditions for Binomial and Geometric Settings

Both settings require verification of the following conditions:

Two Possible Outcomes: Each trial results in either a “success” or a “failure”.
Trials Must Be Independent: The outcome of one trial does not affect the outcome of any other trial.
Same Probability of Success: The probability of success, $p$, is constant on each trial.
Binomial: Fixed number of trials, $n$.
Geometric: Trials continue until the first success; $x$ is the number of the trial on which it occurs.

Calculator Functions

Binomial:
- BinomialPDF(n, p, x): Calculates the probability of exactly $x$ successes in $n$ trials.
- BinomialCDF(n, p, x): Calculates the cumulative probability of at most $x$ successes in $n$ trials.
Geometric:
- GeometricPDF(p, x): Calculates the probability of the first success occurring on trial $x$.
- GeometricCDF(p, x): Calculates the probability of the first success occurring within $x$ trials.
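To show what these four functions compute, here is a minimal Python sketch using the standard binomial and geometric formulas. The function names are illustrative (they are not calculator commands), and the arguments follow the same order as the list above.

```python
from math import comb

def binom_pdf(n, p, x):
    """P(X = x): exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binom_cdf(n, p, x):
    """P(X <= x): at most x successes in n trials."""
    return sum(binom_pdf(n, p, k) for k in range(x + 1))

def geomet_pdf(p, x):
    """P(first success occurs on trial x): x - 1 failures, then a success."""
    return (1 - p)**(x - 1) * p

def geomet_cdf(p, x):
    """P(first success occurs within x trials): 1 - P(first x trials all fail)."""
    return 1 - (1 - p)**x

print(f"binom_pdf(10, 0.2, 2) = {binom_pdf(10, 0.2, 2):.4f}")
print(f"binom_cdf(10, 0.2, 5) = {binom_cdf(10, 0.2, 5):.4f}")
print(f"geomet_pdf(0.2, 3)    = {geomet_pdf(0.2, 3):.4f}")
print(f"geomet_cdf(0.2, 3)    = {geomet_cdf(0.2, 3):.4f}")
```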

Tips for Using Calculator Functions

PDF gives an exact value.
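CDF gives “at most” probabilities, $P(X \leq x)$. For “less than,” use the CDF at $x - 1$, since $P(X < x) = P(X \leq x - 1)$ for whole-number counts.
To find “greater than” probabilities, use $1 - CDF$ at $x$; for “at least,” use $1 - CDF$ at $x - 1$.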
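The boundary value matters for discrete counts, so it may help to see the four cases side by side. The sketch below assumes a binomial setting with $n = 10$ and $p = 0.2$ and a cutoff of 3 successes; `binom_cdf` is an illustrative helper, not a calculator command.

```python
from math import comb

def binom_cdf(n, p, x):
    """P(X <= x) for a binomial count X; returns 0 when x is below 0."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p = 10, 0.2
at_most_3 = binom_cdf(n, p, 3)            # P(X <= 3): CDF at 3
less_than_3 = binom_cdf(n, p, 2)          # P(X < 3) = P(X <= 2): CDF at 2
greater_than_3 = 1 - binom_cdf(n, p, 3)   # P(X > 3) = 1 - P(X <= 3)
at_least_3 = 1 - binom_cdf(n, p, 2)       # P(X >= 3) = 1 - P(X <= 2)

print(f"at most 3: {at_most_3:.4f}, less than 3: {less_than_3:.4f}")
print(f"greater than 3: {greater_than_3:.4f}, at least 3: {at_least_3:.4f}")
```

Note that "at least 3" and "less than 3" are complements, as are "greater than 3" and "at most 3".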

Normal Approximation for Binomial Probabilities

When $n$ is large, a Normal probability model can approximate binomial probabilities if the Large Counts condition is met: $np \geq 10$ and $n(1-p) \geq 10$ (at least 10 successes and 10 failures).
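A rough comparison of the exact binomial probability with its Normal approximation is sketched below. The values $n = 100$, $p = 0.3$, and the cutoff 35 are hypothetical, the helper names are illustrative, and the approximation uses the standard binomial mean $np$ and standard deviation $\sqrt{np(1-p)}$ without a continuity correction.

```python
from math import comb, sqrt
from statistics import NormalDist

def large_counts_ok(n, p):
    """Large Counts condition: at least 10 expected successes and 10 expected failures."""
    return n * p >= 10 and n * (1 - p) >= 10

def binom_cdf(n, p, x):
    """Exact P(X <= x) for a binomial count X."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def normal_approx_cdf(n, p, x):
    """Approximate P(X <= x) with a Normal model: mean np, SD sqrt(np(1-p))."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(x)

n, p, x = 100, 0.3, 35  # hypothetical setting
if large_counts_ok(n, p):
    exact = binom_cdf(n, p, x)
    approx = normal_approx_cdf(n, p, x)
    print(f"exact: {exact:.4f}, Normal approximation: {approx:.4f}")
```

When the condition fails (for example, $n = 10$ and $p = 0.2$ give only 2 expected successes), the Normal model should not be used.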

AP Statistics Exam Tips

When describing a simulation, provide a clear explanation so the reader can replicate your results from your explanation alone.
Ensure every label is of the same length. For example, to assign 43% of outcomes to represent an event, use numbers 00-42, not 0-42.
When sampling without replacement, mention that repeated numbers should be ignored.
You do not need to simplify fractions or convert them to decimals when finding the probability of an event.
When determining a sample space, do not assume all outcomes are equally likely unless stated.
Show supporting work for probability problems, even if the calculations are simple.
Use the word “about” when interpreting simulation results, because a simulation gives only an estimate of the probability.
Define events A and B clearly when using formulas like $P(A | B)$; if not, use descriptive labels (e.g., $P(\text{female} | \text{engineer})$).
When checking independence, define events and substitute values into either $P(A \text{ and } B) = P(A) \cdot P(B)$ or $P(A | B) = P(A)$.

General Calculation Tips

To maintain precision, do not round at intermediate steps; round only the final answer, to at least four decimal places.
Consider whether to include boundary values in calculations, especially with discrete random variables.
If the mean of a random variable is not a whole number, report it as is (do not round it to an integer) for full credit.
Show numerical values substituted into formulas when calculating the mean or standard deviation of a discrete random variable on a free-response question.
Avoid “calculator speak”; label inputs clearly (e.g., binomialcdf(trials:10, probability:.2, at most 5 successes)).
Consider the appropriateness of a binomial setting before solving a probability question.
Check BINS (Binary, Independent, Number of trials fixed, Same probability of success) for binomial distributions.
Check BITS (Binary, Independent, To first success, Same probability of success) for geometric distributions.
Check the Large Counts condition for Normal Approximation of a Binomial Setting.

Combining Random Variables

We can always add/subtract means (expected values) when finding the sum/difference of two random variables.
Standard deviations, however, are never added or subtracted directly. When the random variables X and Y are independent, add the variances (for both sums and differences) and take the square root: $SD(X \pm Y) = \sqrt{Var(X) + Var(Y)}$, where $Var(X) = SD(X)^2$ and $Var(Y) = SD(Y)^2$.
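A short sketch of these rules, assuming X and Y are independent. The means and standard deviations are hypothetical, and `combine_independent` is an illustrative helper name.

```python
from math import sqrt

def combine_independent(mean_x, sd_x, mean_y, sd_y, sign=+1):
    """Mean and SD of X + Y (sign=+1) or X - Y (sign=-1) for independent X and Y."""
    mean = mean_x + sign * mean_y      # means add or subtract
    sd = sqrt(sd_x**2 + sd_y**2)       # variances always add
    return mean, sd

# Hypothetical values: X has mean 10 and SD 3; Y has mean 4 and SD 4.
sum_mean, sum_sd = combine_independent(10, 3, 4, 4, sign=+1)
diff_mean, diff_sd = combine_independent(10, 3, 4, 4, sign=-1)

print(f"X + Y: mean = {sum_mean}, SD = {sum_sd}")
print(f"X - Y: mean = {diff_mean}, SD = {diff_sd}")
```

The sum and the difference have different means but the same standard deviation, which is smaller than the 3 + 4 = 7 that adding standard deviations would wrongly give.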

Example

Expected Value = $(0)(0.10) + (1)(0.25) + (2)(0.30) + (3)(0.20) + (4)(0.15) = 2.05$ red lights
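The same calculation can be checked in Python, and extended to the standard deviation using the usual formula $\sigma_X = \sqrt{\sum (x_i - \mu_X)^2 \cdot P(x_i)}$ (a standard formula, not written out in these notes). The values and probabilities come from the example above; the variable names are illustrative.

```python
from math import sqrt

# Number of red lights and the probability of each value (from the example above).
values = [0, 1, 2, 3, 4]
probabilities = [0.10, 0.25, 0.30, 0.20, 0.15]

# Expected value: sum of each value times its probability.
mean = sum(x * p for x, p in zip(values, probabilities))

# Variance: probability-weighted average of squared deviations from the mean.
variance = sum((x - mean) ** 2 * p for x, p in zip(values, probabilities))
sd = sqrt(variance)

print(f"mean = {mean:.2f} red lights, SD = {sd:.4f} red lights")
```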