AP Statistics Study Guide Flashcards

General Tips & Chapter 1: One Variable Data

  • Introduction to Statistics

    • Statistics is the science and art of collecting, analyzing, and drawing conclusions from data.
    • Data can be:
      • Categorical: Data that falls into categories.
      • Quantitative: Numerical data representing measurements or counts.
    • Key Terms:
      • Individual: An object described in a set of data (also known as cases/observational units).
      • Variable: An aspect that can take different values for different individuals.
      • Distribution: The pattern of variation of a variable, showing what values the variable takes and how often it takes those values.
    • Types of Statistics:
      • Descriptive Statistics: Summarizing and describing data (Units 1-7).
      • Inferential Statistics: Drawing conclusions about a population based on sample data (Units 8-12).
  • FRQs (Free Response Questions)

    • Always include context: variable and units.
  • MCQs (Multiple Choice Questions) Strategies

    • Process of elimination.
    • Read actively and carefully.
    • Underline important information.
    • Anticipate answers.
  • Types of Variables

    • Categorical:
      • Nominal: Categories with no inherent order.
      • Ordinal: Categories with a specific order.
        • Numbers can be ordinal if they don’t measure anything (e.g., cell phone digits).
    • Quantitative:
      • Discrete: Fixed set of possible values with gaps between them.
        • Whole numbers or defined intervals.
        • Countable (finite or countably infinite).
      • Continuous: Infinite possibilities.
        • Decimals/fractions.
        • Any value in an interval on the number line.
      • The main difference: discrete values are counted, continuous values are measured.

1.1 Analyzing Categorical Data

  • Tables and Graphs
    • Frequency Table: Shows counts for each category.
    • Relative Frequency Table: Shows proportions/percentages for each category.
    • Bar Graph: Graphical representation of categorical data with bars representing frequencies or relative frequencies.
      • Tips:
        • Equal width bars.
        • Leave gaps between bars.
        • Label and scale axes.
        • Indicate whether frequencies or relative frequencies are used.
    • Pie Chart: Circular chart divided into slices proportional to frequencies or relative frequencies.
      • Good for: comparing categories to the whole.
      • Areas of slices proportional to frequencies/relative frequencies.
      • Must include all possible categories in the whole (add an “other” option if necessary).
      • Include a legend key.
    • Two-Way Tables: Summarizes data on the relationship between two categorical variables for a group of individuals.
      • With A = a cell count, B = a row or column total, and C = the grand total:
        • Marginal Relative Frequency: B/C
        • Joint Relative Frequency: A/C
        • Conditional Relative Frequency: A/B
    • Side-by-Side Bar Graph: Bar graphs showing the distribution of a categorical variable for each value of another categorical variable.
      • Tip: Within each value of the first categorical variable, the bars may touch, but leave gaps between the groups of bars for different values.
    • Segmented Bar Graph: Distribution of a categorical variable as segments of a whole (bars stacked on top of each other and proportional to relative frequencies).
      • Uses relative frequencies.
      • Tip: The bars do not touch here!
    • Mosaic Plot: Similar to segmented, except the width of each bar is proportional to the number of individuals in that category.
      • Tip: The bars do touch here!

*Note: Tables are not data, they are summaries of data!
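The three relative frequencies for a two-way table can be sketched in Python; the categories and counts below are made up for illustration, with A = a cell count, B = a row total, C = the grand total:

```python
# Hypothetical two-way table: rows = grade level, columns = preferred subject.
table = {
    ("junior", "math"): 20, ("junior", "english"): 10,
    ("senior", "math"): 15, ("senior", "english"): 25,
}

grand_total = sum(table.values())                                          # C = 70
junior_total = sum(v for (row, _), v in table.items() if row == "junior")  # B = 30

joint = table[("junior", "math")] / grand_total         # A/C: joint relative frequency
marginal = junior_total / grand_total                   # B/C: marginal relative frequency
conditional = table[("junior", "math")] / junior_total  # A/B: conditional relative frequency

print(round(joint, 3), round(marginal, 3), round(conditional, 3))  # 0.286 0.429 0.667
```

Note that the conditional relative frequency divides by the total of the condition's row, not by the grand total.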

  • Avoid Bad Statistics Practices

    • Truncating axes.
    • Using pictograms.
  • Association

    • Knowing the value of one variable helps you predict the value of the other.
    • Association does NOT equal causation!

1.2 Displaying Quantitative Data with Graphs

  • Types of Graphs
    • Dotplot: Each data value is shown as a dot above its location on a number line.
      • Pros: Can see every individual value, easy to see shape.
      • Cons: Difficult to make with large data sets.
    • Stemplot: Separates each data value into a stem (all but the final digit) and a leaf (the final digit).
      • Tips: Always add a key; Split stems to better see distribution if needed (try to get at least 5 stems); Make sure each stem has an equal number of possible leaf digits.
      • Pros: Can see every individual value, easy to see shape.
      • Cons: Difficult to make with large data sets.
    • Histogram: Groups data into bins (intervals) and shows the frequency (or relative frequency) of values within each bin.
      • Tips: Equal size bins; Try to go with a minimum of 5 bins; The bin size will affect the appearance of the distribution (more bins -> more detail but less clear overall pattern); Edges of bins either inclusive or noninclusive (typically the right edge of a bin is noninclusive); Bars touch; Either frequencies or relative frequencies (but need to indicate which one!).
      • Pros: Easier to make for large data sets, easy to see shape (especially for large data sets–simplifies the overall pattern).
      • Cons: Doesn’t show every individual value.
        *Use relative frequencies if you’re using the histogram to compare distributions with different numbers of observations.
    • Boxplot: Represents the five-number summary (minimum, Q1, median, Q3, maximum) and any outliers.
      • Pros: Easy to make for large data sets, shows five-number summaries, splits data into quartiles.
      • Cons: Doesn’t show individual values; can hide features like gaps, clusters, modality, and slight skewness.

1.3 Describing Quantitative Data with Numbers

  • Measures of Center

    • Mean: Average (\bar{x} or µ).
    • Median: Middle value.
    • Mode: Most common value.
  • Measures of Variability

    • Range: Maximum – Minimum.
      • Pros: Easy to calculate.
      • Cons: Nonresistant and doesn’t express variability from the center.
    • IQR: Interquartile Range: Q3 – Q1 (spans the middle 50% of values).
      • Pros: Resistant.
      • Calculating quartiles: Split the sorted data into halves at the median (leave the median out!) and take the median of each half.
    • Standard Deviation: Typical distance from the mean (s_x or σ).
      • s_x = \sqrt{\frac{\sum(x_i - \bar{x})^2}{n-1}}
      • To calculate: calculate all the deviations (value – mean), square each, add up, divide by n-1, take square root.
      • Properties: Only use in tandem with the mean; for the same data set, s_x (dividing by n-1) is always greater than or equal to σ (dividing by n).
      • Cons: Nonresistant.
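The step-by-step calculation above can be sketched in Python (the data set is made up):

```python
from math import sqrt

data = [4, 7, 5, 9, 5]              # small made-up data set
n = len(data)
mean = sum(data) / n                # x-bar = 6.0

# Steps: deviations (value - mean), square each, add them up,
# divide by n - 1, take the square root.
deviations = [x - mean for x in data]
s_x = sqrt(sum(d ** 2 for d in deviations) / (n - 1))

print(mean, s_x)  # 6.0 2.0
```

Interpretation in context: values in this data set are typically about 2 units from the mean of 6.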
  • Describing Quantitative Distributions: SOCV

    • Shape: Skewness & modality, and any clusters/gaps.
    • Outliers: 1.5 x IQR (above Q3 or below Q1).
    • Center: Measures of the typical value: mean/median/mode.
    • Variability: Range/IQR/st dev.
    • Always add context! (variables & units).
  • Comparing Mean & Median

    • Mean: Nonresistant.
    • Median: Resistant (not sensitive to skewness/outliers).
  • Outliers

    • 1.5 x IQR (above Q3 or below Q1).
    • Five Number Summary: minimum, Q1, median, Q3, maximum
    • Also good to know: upper & lower bounds (1.5 x IQR) (might not be actual data values)

*Decide which measures to use based on whether resistance is a concern for the distribution
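The 1.5 × IQR rule can be sketched as follows, using the quartile convention from this chapter (split the data at the median, leaving the median out); the data set is made up:

```python
def five_number_summary(data):
    """Min, Q1, median, Q3, max; quartiles are medians of each half,
    leaving the overall median out when n is odd."""
    xs = sorted(data)
    n = len(xs)

    def median(v):
        m = len(v) // 2
        return v[m] if len(v) % 2 else (v[m - 1] + v[m]) / 2

    half = n // 2
    lower, upper = xs[:half], xs[n - half:]   # drops the middle value if n is odd
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

data = [5, 7, 10, 14, 18, 19, 25, 29, 60]
mn, q1, med, q3, mx = five_number_summary(data)
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # may not be actual data values
outliers = [x for x in data if x < low_fence or x > high_fence]
print((mn, q1, med, q3, mx), outliers)  # (5, 8.5, 18, 27, 60) [60]
```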

  • Additional Vocab
    • Statistic: A value that describes a characteristic of a sample.
    • Parameter: A value that describes a characteristic of a population.

Types of MCQs (or FRQs)

  • Compare mean & median (use knowledge of resistance & skewness to answer).
  • Interpret graphical or tabular representation of data:
    • Using it to answer questions about the variable/distribution of the variable.
  • Interpret summary statistics/a value about the distribution:
    • Put it in context (variable & units!).
    • Especially interpreting standard deviation: “the typical distance from the mean”.
  • Match data with its graphical representation OR match graphical representations of the same data.
  • Determine whether there is an association between two categorical variables given data: graphical/tabular representations OR summary statistics:
    • Calculate distribution of one categorical variable for each value of the other (basically a whole bunch of conditional relative frequencies).
    • And then find whether knowing the value of one variable allows you to predict the value of another (eg. are those distributions you calculated the same or not).
  • Describe a distribution (quantitative):
    • SOCV (shape, outliers, center, variability) (see content overview for more information).
    • Context (variables & units of the distribution).
  • Compare distributions (quantitative):
    • SOCV for both distributions.
    • Use explicitly comparative language that relates the two distributions.
  • Make a claim/argument based on a distribution (quantitative):
    • Refer to specific characteristics (eg. SOCV) of the distribution in your answer.
    • Give specific numbers as much as possible.
    • Context (variables & units of the distribution).
    • Then, explain why those characteristics support your claim/argument.
  • Describe a distribution/compare distributions/make a claim based on a distribution (categorical).
  • Construct a certain type of graph for given data:
    • Follow appropriate guidelines for the type of graph (see content summaries for tips).
    • Always label & scale axes appropriately! with units!
    • Add a title with context & others.
  • What is apparent from the histogram but not from the boxplot?
  • Misrepresenting/manipulating data:
    • Why would it be misleading to only report [insert statistic/parameter here]?
    • What would you want to report in order to [achieve specified goal]?
  • Why does one method for determining outliers give you more outliers than the other?

Language & Wording / General Common Mistakes

  • Language & Wording
    • Always include context: distribution & variables & units.
    • Describing distributions: “appears to be” / “approximately” (bc you cannot be sure).
      • Ex. “approximately normal” & “roughly symmetric” (this is a very important one!).
    • Be VERY careful with relative frequencies vs. frequencies / raw counts! (this is a very important one!).
      • Use relative frequencies with groups of different sizes!
      • & say “a greater percentage” not “more”.
      • Plurality vs majority.
      • Always indicate which one you’re using.
    • For histograms & boxplots: keep in mind that you can’t conclusively determine what the values are.
      • Esp for histograms: need to say that a value is in a certain bin (“between [value] and [value]”).
  • Common Mistakes
    • Range, IQR, and st. dev. are single values! not a range of values.
    • Avoid bad statistics: truncated axes & pictograms.
    • Association is not causation!

Chapter 2: Modeling Distributions of Quantitative Data

2.1 Describing Location in a Distribution

  • Percentile: The pth percentile is the value with p% of observations less than or equal to it.

    • Works well with a frequency table of quantitative data.
  • Cumulative Relative Frequency Graphs / Ogives: Plots points corresponding to the percentile of a value in the distribution & points connected with line segments to create the graph.

    • Another way to describe location in a distribution.
  • Standardized Scores (z-scores): How many standard deviations from the mean a value is (& what direction).

    • (value – mean) / st dev
    • Allows for a standard scale to compare values from different distributions.
    • A way to describe location in a distribution.
  • Transforming Data

    • Adding/Subtracting Constant: Affects measure of center/location (not shape/variability).
    • Multiplying/Dividing Constant: Affects measures of center, location, variability (not shape).
    • Multiple Transformations: Follow order of operations.
    • Transformations Related to Z-scores: In a distribution of z-scores, shape remains the same as original distribution, mean always 0, standard deviation always 1.
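A quick check of the transformation rules above (made-up data): adding a constant shifts the center but not the variability, and standardizing gives mean 0 and standard deviation 1:

```python
from statistics import mean, stdev

data = [10, 12, 15, 19, 24]     # made-up data
m, s = mean(data), stdev(data)

# Adding a constant: center shifts by the constant, variability unchanged
shifted = [x + 100 for x in data]
assert mean(shifted) == m + 100
assert abs(stdev(shifted) - s) < 1e-12

# Z-scores: mean 0, standard deviation 1 (shape unchanged)
z = [(x - m) / s for x in data]
assert abs(mean(z)) < 1e-12 and abs(stdev(z) - 1) < 1e-12
```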

2.2 Density Curves & Normal Distributions

  • Density Curve: Simplified model of a distribution of a quantitative variable
    • Always on or above horizontal axis, has an area of exactly 1 underneath it.
    • Always an approximation of data (not an exact model).
    • Models continuous data but often used to approximate discrete distributions as well.
    • Describing Density Curves:
      • Shape: same ways as usual
      • Center: mean (balance point) (µ) & median (divides area of curve in half) (if symmetric, they’re the same)
      • Variability: same measures as usual (σ)
  • Normal Distributions: Bell shaped & symmetric & unimodal distribution
    • Approximated with a normal curve (density curve).
    • Can fully be described by mean (same as median) (µ) and standard deviation (σ).
    • Useful for: real data, chance outcomes, inference methods
  • Empirical Rule: 68% (within 1σ of µ) - 95% (within 2σ of µ) - 99.7% (within 3σ of µ) (for normal distributions)
    • Standard Normal Distribution: distribution of z-scores (mean 0, st dev 1)
  • Finding areas in Normal Distributions
    • Empirical Rule (when applicable).
    • Find z-score & use Table A to look up the area to the left of the z-score (the proportion of values below it): Table A connects z-scores to percentiles in a normal distribution.
    • Technology (see calculator functions)
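Outside of Table A or a calculator, the area to the left of a z-score can be computed from the error function; a sketch (the distribution's mean and standard deviation are made up):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area to the left of z under the standard normal curve
    (what Table A reports)."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical distribution: approximately Normal with mean 64, st dev 2.5
mu, sigma = 64, 2.5
z = (68 - mu) / sigma                       # z = 1.6

left = normal_cdf(z)                        # area to the left of 68
right = 1 - left                            # area to the right
between = normal_cdf(1) - normal_cdf(-1)    # ~0.68, matching the empirical rule
print(round(left, 4), round(between, 4))    # 0.9452 0.6827
```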
  • Types of Problems: Area to left, area to right, area between, working backwards (z-score given area)
  • Assessing Normality
    • Plot data (see if it looks normal)
    • Check against empirical rule: Check amount of data within 1, 2, and 3 st dev from mean (w/in 3-5% is pretty good!)
    • Normal Probability / Normal Quantile Plot:
      • Plots each data value’s actual z-score (x) against the z-score predicted if the distribution were normal (y).
      • Look for a straight-ish line on the normal probability plot
        *ideally use all three methods to check!

Types of MCQs (or FRQs)

  • Interpret percentile or z-score (see content summary for info):
    • Provide specific values for the percentages / mean & st dev.
    • Use cumulative relative frequency graphs to determine percentile.
  • Describe how distribution of data will change with a given type of transformation.
  • Find area in a normal distribution (see content summary for info on how).
  • Use percentiles or z-scores to evaluate claims about data:
    • Find & interpret percentile / z-score
      • Percentile: value with p% of observations less than or equal to it
      • Z-score: abt that many standard deviations above / below the mean
    • Draw conclusion using percentile / z-score
  • Normal Distribution Questions:
    • Picture
      • Draw normal curve
      • Label specific distribution (context & mean / st dev)
      • Label boundary values & shade area of interest
    • Calculate z-score(s)
    • Calculate areas using Table A or calculator

Language & Wording/General Common Mistakes

Overall: Units 1 & 2 & 3

  • If you use calculator, ALWAYS LABEL VALUES: State answer in context
  • Is it extrapolation / is the answer reasonable?
  • Predict value & comment on whether it’s reliable:
    • E: correct prediction, plugged into formula for LSRL to get predicted value, say whether it is reliable or not (extrapolation)
  • ALWAYS ADD CONTEXT esp when describing location in a distribution
  • Percentile: “at” a percentile NOT “in” a percentile (bc it’s a location!)
  • Z-scores: always provide the direction! not just “away” from the mean (and provide context (units & distribution / variables)!
  • Normal Distributions:
    • Be careful with direction & tails (two-sided vs one-sided)
    • Distributions of real world data always “approximately normal” (never perfect)
  • Quantitative Data:
    • Discrete & Continuous
    • One-Var:
      • Tabular: frequency table, relative / cumulative frequency table
      • Graphical: dotplot, stemplot, boxplot, histogram, cumulative relative frequency graph
      • Numerical: 5-number summary, center, variability, percentile / z-score
    • Two-Var:
      • Graphical: scatterplot
      • Numerical: r, r2 , LSRL, s
    • Simplified model of data: density curves
  • Categorical Data:
    • Nominal & Ordinal
    • One-Var:
      • Tabular: frequency table, relative frequency table
      • Graphical: bar chart, pie chart
      • Numerical: proportions, etc
    • Two-Var:
      • Tabular: two-way table (frequency OR relative frequency)
      • Graphical: side-by-side bar chart, segmented bar chart, mosaic plot
      • Numerical: proportions, association, etc

Chapter 3: Exploring Two-Variable Quantitative Data

3.1 Scatterplots & Correlation

  • Scatterplots: explanatory x-axis, response y-axis; label & scale axes (you CAN truncate the axes here)
  • Describing Scatterplots: CDOFS
    • Context: state variables & units
    • Direction: pos / neg / no correlation
    • Outliers / unusual features: outliers & points outside the general pattern / clusters
    • Form: linear / nonlinear
    • Strength: r (correlation coefficient) / r2 value (measures whether LSRL is a good fit)
  • r value: measures strength & direction (ONLY for linear models)
    • Cautions
      • r is nonresistant
      • only for linear
      • correlation is not causation!
      • no units
      • unaffected by changing units / changing explanatory & response variables
    • | r | less than 0.5: weak, | r | between 0.5 and 0.75: moderate, | r | greater than 0.75: strong
  • Extrapolation: using a regression line to make predictions way outside of the interval of x-values used to generate the line (beyond the scope of your data)
    • won’t be accurate bc it might not remain linear at such extreme points

3.2 Linear Regression

  • Regression Line: model of how response variable (y) changes as explanatory variable (x) changes
  • Residuals: actual value – predicted value (based on line)
  • Least-Squares Regression Line: line that minimizes sum of squared residuals
  • Explanatory & Response Variables: not necessarily causation (though it could be), just which helps to explain the other
  • \hat{y} = a + bx (y-hat is predicted y-value for a given x-value); predicting so it’s okay if y-hat is not an integer for a real world situation (think of it as an average)
  • A good linear regression line: minimizes the residuals; sum of residuals on an LSRL is always 0
  • Residual Plots: scatterplot that plots residuals against explanatory variable
    • Determines whether a linear model is appropriate (check for random scatter & no leftover curved pattern)
  • s: standard deviation of residuals
    • How well does the line work? -> how good will predictions be?
    • Measures typical residual (distance between predicted & actual)
    • Calculate in the same way as st dev but divide by n-2
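A sketch of the s calculation, using made-up points and an assumed prediction line (not necessarily the true LSRL of these points):

```python
from math import sqrt

xs = [1, 2, 3, 4]
ys = [3.1, 4.9, 7.2, 8.8]
predict = lambda x: 1 + 2 * x        # assumed line: y-hat = 1 + 2x

residuals = [y - predict(x) for x, y in zip(xs, ys)]      # actual - predicted
s = sqrt(sum(r ** 2 for r in residuals) / (len(xs) - 2))  # divide by n - 2
print(round(s, 4))  # 0.2236
```

Interpretation in context: predictions from this line are typically off by about 0.22 units of y.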
  • r2: coefficient of determination (value between 0 and 1, usually expressed as a percentage)
    • Square of correlation r
    • When finding r from r2, make sure to consider direction of correlation!
    • How well the LSRL fits the data: percent reduction in sum of squared residuals when using LSRL instead of mean to make predictions
    • The percent of the variability in the response variable that can be explained by the linear association.
  • Regression to the Mean
    • (to calculate LSRL): slope: b = r (sy / sx ); y-int: a = \bar{y} - b(\bar{x})
    • since LSRL passes through (\bar{x}, \bar{y})
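The slope and intercept formulas can be applied directly to summary statistics; the values below are made up:

```python
# b = r * (s_y / s_x),  a = y-bar - b * x-bar
r = 0.8
x_bar, s_x = 10.0, 2.0     # mean & st dev of x (hypothetical)
y_bar, s_y = 50.0, 5.0     # mean & st dev of y (hypothetical)

b = r * (s_y / s_x)        # slope = 2.0
a = y_bar - b * x_bar      # y-intercept = 30.0

predict = lambda x: a + b * x
assert predict(x_bar) == y_bar   # the LSRL passes through (x-bar, y-bar)
print(b, a)  # 2.0 30.0
```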
  • Correlation & Regression Wisdom
    • Correlation and LSRLs only describe linear relationships
    • r, s, r2 , and LSRL are non-resistant (see influential points)
  • Influential Points: points that, if removed, substantially change the slope, y-int, r, r2 , or s
    • Outliers and high-leverage points are very often influential (but not automatically guaranteed to be)
    • Can do regression calculations with & without the points to see how much influence they have
  • Outliers: doesn’t follow pattern of data and has a large residual
  • High Leverage: x-values much larger / smaller than the other x-values in the data set
  • Removing High-Leverage Points
    • Lower than line & slope negative: slope closer to 0 & y-int lower
    • Lower than line & slope positive: slope steeper & y-int lower
    • Higher than line & slope negative: slope steeper & y-int higher
    • Higher than line & slope positive: slope closer to 0 & y-int higher
    • and sometimes effects on r, r2 , or s (use the point to evaluate this)
  • Removing Outliers
    • Heavily impacts r, r2 , and s: r and r2 usually go up (strength of association & fit of LSRL way higher), while s goes down (the typical residual shrinks)
    • Doesn’t generally impact the LSRL itself much, since outliers without high leverage have typical x-values

3.3 Transforming to Achieve Linearity

  • Applying a function to a quantitative variable (changes the scale of measurement) in order to make the scatterplot more approximately linear (in order to use linear regression methods)
  • Transforming with powers & roots (for power models: y = ax^p)
    • Option 1: Raise values of x to power p (graph (x^p, y) (it will be linear))
    • Option 2: pth root of values of y (graph (x, \sqrt[p]{y}) (it will be linear))
    • When p is known: use the above methods; When p is unknown: guess & check OR use log (more universal & works for unknown power models)
  • Transforming with Logs (for power OR exponential models)
    • Apply log transformation (log10 or ln)
    • For power models (y = ax^p): use a log-log (both variables)
    • For exponential models (y = ab^x): take log of y-var
    • To choose a model: most random scatter (tiebreaker highest r2 value)
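A sketch of the exponential-model case: made-up data generated from y = 3 · 2^x becomes exactly linear after taking ln(y), and undoing the transformation recovers the model:

```python
from math import log, exp

xs = [0, 1, 2, 3, 4]
ys = [3 * 2 ** x for x in xs]        # exponential model: y = 3 * 2^x

log_ys = [log(y) for y in ys]        # ln(y) is linear in x for exponential models

# Least-squares line through (x, ln y)
n = len(xs)
x_bar = sum(xs) / n
ly_bar = sum(log_ys) / n
b = (sum((x - x_bar) * (ly - ly_bar) for x, ly in zip(xs, log_ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = ly_bar - b * x_bar

# Transform back: y-hat = e^a * (e^b)^x
coeff, base = exp(a), exp(b)
print(round(coeff, 6), round(base, 6))  # 3.0 2.0
```

When interpreting the fitted line, remember the units are log units until you transform back.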

Types of MCQs (or FRQs)

  • Slope: for every [increase / decrease] of one [unit of x], there is a predicted [increase / decrease] of [slope] [units of y]
  • Y-int: when the [context of x] is 0 [units], the predicted value of [context of y] would be [y-int]
  • r-value / correlation coefficient: the correlation coefficient of [r] indicates that there is a [strong / moderate / weak], [positive / negative] correlation between [context of x] and [context of y]
  • s (standard deviation of residuals): on average, the model mispredicts [context of y] by [s units] using the LSRL
  • r2 value: [r2] percent of the variability in [context of y] can be explained by the linear association with [context of x]
  • Residual plot: the residual plot [is randomly scattered / has a pattern], indicating that a linear model [is / is not] appropriate
  • Effect of outliers / high leverage points on measures of strength or the LSRL
    • What would happen when they are removed?
  • Can you infer causation based on correlation? (no) (might not be worded like this directly though)
  • Which is explanatory & which is response?
  • Interpret a feature of the association / regression line
  • LSRL in general: for every increase of 1 [unit of x], there is a predicted change of [b, the slope] [units of y], starting from a predicted value of [a, the y-int] when x = 0

Tips/Common Errors

  • Be very careful with predicted vs actual values
    • Remember to add a hat on top of any predicted value! And ALWAYS SAY THEY’RE “PREDICTED”
  • When defining LSRL: always define variables & add units for x & y
    • Particularly important when data has been transformed to get the LSRL
  • With transformed data: be mindful of units & always convert back to “regular” units where appropriate / needed!
  • With residuals: pay attention to whether the question is asking for predicted residual (from LSRL / LSRL equation) or actual residual (from residual plot / graph)!
  • Can’t go backwards with LSRL & predict x value given a definite y value (can find x value from y-hat though)
  • How to tell if linear model is a good fit: high r2 value, s is small relative to the data

Chapter 4: Collecting Data

4.1 Sampling & Surveys

  • Sampling: selecting a group of individuals out of a whole population to learn about the population (ideally a sample that’s representative of the population)
  • Sampling Frame: the group of members from the population from which we select our sample
  • Sample Survey: collects data from the individuals in the sample (to learn about the population)
  • Types of Sampling
    • Random Sampling: involves a chance process to determine which individuals are in the sample
      • SRS: every group of n individuals has an equal chance of being selected. Label individuals with numbers, use a random number generator, and select the individuals that correspond. Sample without replacement: don’t include repeated numbers! Calculator: MATH -> PROB -> randIntNoRep(1, N), OR use Table D
      • Stratified: an SRS is selected from each stratum; Stratum: a group with similar characteristics assumed to be associated with the variables being measured: ensures you get some individuals from each stratum (more precise & accurate estimates)
      • Clustered: randomly select entire clusters; Clusters: groups that each (hopefully) mirror the population, so responses vary within a cluster but the clusters resemble one another: no statistical advantage but resource-efficient
      • Systematic: randomly select starting point & select every kth individual after: make sure no patterns coincide with your systematic pattern; allows you not to have to have identifiers (eg. names) for all the individuals in the population (useful w unknown population size)
        *Decide on sampling type based on population & variable & resources available to you
    • Bad Sampling
      • Convenience Sampling: individuals who are easy to reach
      • Voluntary Response Sampling: allows individuals to choose to be in sample: leads to voluntary response bias; individuals who feel strongly / have similar opinions more likely to respond
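The SRS procedure (label individuals, generate random numbers, sample without replacement) can be sketched in Python; the population names are hypothetical:

```python
import random

# Label individuals 1..100; random.sample chooses without replacement,
# so no repeated numbers need to be skipped (like randIntNoRep on a calculator).
population = [f"student_{i}" for i in range(1, 101)]

random.seed(42)                       # fixed seed for a reproducible illustration
srs = random.sample(population, 10)

assert len(set(srs)) == 10            # 10 distinct individuals, no repeats
```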
  • Bias: a sampling method is biased if it is likely to systematically overestimate or underestimate the true value
    • Undercoverage: certain individuals less likely / cannot be chosen in a sample
    • Nonresponse: individual chosen for sample can’t be contacted / doesn’t participate: diff from voluntary response bias bc this is only after sample has been selected
    • Response: systematic pattern of inaccurate answers to a survey question; ex. wording or order of questions
  • Sampling uses resources more efficiently than a census (which collects data from every individual in the population)

4.2 Experiments

  • Observational Studies: observes individuals & measures variables of interest (does not influence responses)
    • Retrospective: uses existing data; Prospective: tracks individuals into the future
    • Pros: fewer ethical concerns; Cons: confounding variables (cannot determine causality with observational studies)
  • Experiments: imposes a treatment on individuals & measures their responses; pros: can establish causality bc avoids confounding; cons: resources, ethics. Important vocab:
    • Placebo: no active ingredient; optional: placebo effect
    • Treatment: condition imposed on individuals
    • Experimental Unit: individual to which treatment applied: Subject: human experimental unit
    • Factor: explanatory var that’s manipulated (may cause change in response var); treatments formed using levels of each of the factors (in a multifactor experiment)
    • Levels: diff possible values of a factor
  • Confounding: when variables are associated in such a way that their effects on a response variable can’t be distinguished from one another
  • Control Group: provides a baseline for comparison; not always required but there does need to be some sort of comparison
  • Random Assignment of Treatments: avoids confounding variables
  • Control: all other variables constant; avoids confounding variables
  • Replication: use enough subjects (so that differences in effects can be distinguished from chance variation); reduces the impact of chance variation
  • Double Blind: neither subjects nor the ppl measuring know the treatment; Triple Blind: statistician doesn’t know either
  • Single-Blind: only one of the groups (above) knows
  • Selection Biases: voluntary, convenience, undercoverage; The other two: nonresponse, response
Designing Experiments (required in a good experiment)
  • Completely Randomized Design: experimental units assigned to treatments completely at random
  • Randomized Block Design: random assignment within each block: Form blocks based on: confounding variable / the variable that’s the best predictor of the response variable; It accounts for variability in [response var] created by [block var]
  • Block: group of experimental units known to be similar in some way that could affect their response to the treatments
  • Matched Pairs Design: a type of RBD where blocks are pairs; pairs of similar experimental units, either true pairs & randomize treatment or each individual receives both treatments in a random order: Can establish causality

4.3 Using Studies Wisely

  • Statistical Significance: observed diff is larger than can be attributed to chance alone: Random assignment of treatments allows a significant difference to be attributed to the treatments (causation)
  • Statistical Inference: generalizing results to a population: Assumes sample is representative of the population (ensured by random sampling)
  • Ethical Data Gathering: informed consent, benefit
  • Sampling Variability: diff random samples (same size, same population) produce diff estimates
  • Larger sample sizes produce more precise estimates (less sampling variability, so estimates tend to be closer to the true value)
Types of MCQs (or FRQs)
  • Is it an observational study or an experiment?
  • Describe the type of bias present
    • Describe how members respond differently, then describe how this leads to overestimated / underestimated values
  • Describe a confounding variable
    • Describe that variable’s association with both explanatory AND response vars
  • Describe an appropriate experiment design
    • Describe how to randomly assign treatments
      • Create groups AND define which group gets which treatment
    • Describe why certain experiment designs would be preferable. Draw a diagram to explain the experiment design

Tips/Common Errors

  • Describing how to use a random number generator / table: ALWAYS account for repeated numbers
  • If you flip coins: make sure everyone flips a coin (not just until you have one group & then put the rest in another; this would not be a completely random assignment)
  • Be really careful not to mix up language for experiments & language for observational studies!

Chapter 5: Probability

5.1 Randomness, Probability, and Simulation

  • Random Process: generates outcomes purely by chance; unpredictable in short-term but predictable in the long run
  • Probability: the likelihood that an event happens; the proportion of times an outcome would occur in the long run; P(A) = (number of outcomes in event A) / (total number of outcomes in sample space)
  • Law of Large Numbers: more trials means proportion approaches true probability (more accurate)
  • Simulation: imitates random process such that simulated outcomes are consistent with real-world outcomes: Options: random number generator, random number table, flipping coin, draw cards, etc; describing simulation: remember to say that repeated numbers will be ignored!; for random number table: remember that numbers need to be the same length digit-wise
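A minimal simulation of the law of large numbers: as the number of coin-flip trials grows, the proportion of heads settles near the true probability 0.5:

```python
import random

random.seed(1)                        # fixed seed for a reproducible illustration

def prop_heads(trials):
    """Simulate fair coin flips and return the proportion of heads."""
    return sum(random.random() < 0.5 for _ in range(trials)) / trials

for n in (10, 1_000, 100_000):
    print(n, prop_heads(n))           # proportions drift toward 0.5 as n grows
```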

5.2 Probability Rules

  • Probability Model: description of a random process that includes a list of all possible outcomes & the probability for each outcome. Sum of all probabilities is 1 (or 100%); each probability is between 0 & 1 (or 0% & 100%)
  • Sample Space: list of all outcomes: options: chart, table, venn diagram, probability tree
  • Event: any collection of outcomes from a random process. Notation & vocab: P(A) = (number of outcomes in event A) / (total number of outcomes in sample space)
  • Complement: the probability that an event does not occur: P(A^C) = 1 – P(A)
  • Intersection: A ∩ B (both A and B must be true): P(A and B) = P(A ∩ B) (joint probability)
  • Union: A ⋃ B (at least one–either A or B, or both–must be true): P(A or B) = P(A ⋃ B)
  • Mutually Exclusive Events: cannot occur simultaneously (no outcomes in common), i.e. P(A and B) = 0 (also known as disjoint):
    • Addition Rule: P(A or B) = P(A) + P(B)
  • Non-Mutually Exclusive Events: can occur simultaneously: use a two-way table!
    • General Addition Rule: P(A or B) = P(A) + P(B) – P(A and B)
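The general addition rule can be checked by brute force over a small sample space (one roll of a fair die):

```python
sample_space = {1, 2, 3, 4, 5, 6}

A = {x for x in sample_space if x % 2 == 0}   # even roll: {2, 4, 6}
B = {x for x in sample_space if x > 3}        # roll above 3: {4, 5, 6}

p = lambda event: len(event) / len(sample_space)

lhs = p(A | B)                     # P(A or B) = 4/6
rhs = p(A) + p(B) - p(A & B)       # 3/6 + 3/6 - 2/6 = 4/6
assert abs(lhs - rhs) < 1e-12      # general addition rule holds
```

Since A and B are not mutually exclusive (they share 4 and 6), subtracting P(A and B) avoids double-counting.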

5.3 Conditional Probability & Independence

  • Conditional Probability: probability that an event happens given that another event is known to have happened: P(A | B) = P(A ∩ B) / P(B); tree diagrams are useful with conditional probability