By Addison Wood || AP Stat student 2025
Note: this review sheet is not broken down into the typical AP review units. It follows the units used by Stats Medic, which go into more depth for each section. NOT FINISHED
Terms to know:
Quantitative variable: takes numerical values for a measured or counted quantity
Discrete: countable number of values
Continuous: infinitely many values (no gaps)
Categorical variable: takes on values that are category names or group labels
these are typically displayed with bar graphs
Mean: the average of the data set
Median: the middle value when the data are arranged in ascending order; if there is an even number of observations, the median is the average of the two middle values.
Association: If knowing the value of one variable helps in predicting the value of another variable, there is said to be an association between them.
1.2: Representing Categorical Data
When given 2 categorical variables, here are the best ways to represent them:
Segmented bar graph: stack up bars to make 100%
Mosaic Plot: a segmented bar graph where the width of each bar is proportional to the group size.
1.3: Describing Quantitative Data
Dot plots & Stem plots: show every individual value
Histogram shows general shape
4 shapes:
Skewed left: looks like your left foot
Skewed right: looks like your right foot
Symmetric: looks like a hill
Bimodal: looks like a two humped camel
To describe:
S: shape (see above)
O: outliers (are there any points that don’t “fit in” with the distribution)
C:center (median / mean)
V: variability (standard deviation)
When describing, always use context. Context makes the problem real. Try to use hedging “-ly” words (approximately, roughly) so your words aren’t held against you.
1.4: Measuring variability:
When asked to interpret standard deviation: “The [context] typically varies by [SD] from the mean of [mean].”
If asked for variance, square the standard deviation
Outliers: these greatly affect the mean and the standard deviation, meaning they are nonresistant. The median however is not affected, so it is resistant to any present outliers.
If the distribution is symmetric use mean and standard deviation
If the distribution is skewed or has outliers use the median and IQR
1.5: Comparing Quantitative Data
To find outliers, use the 1.5 x IQR rule (see the sketch below):
low outlier < Q1 - 1.5 x IQR
high outlier > Q3 + 1.5 x IQR
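A quick sketch of the 1.5 x IQR rule in Python (made-up data; quartile conventions vary slightly between calculators and software, so results may not match a TI-84 exactly):

```python
from statistics import quantiles

data = [4, 7, 8, 10, 12, 13, 15, 30]  # hypothetical data set

# "inclusive" is close to the textbook quartile method; conventions vary
q1, _median, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# any value outside the fences is flagged as an outlier
outliers = [x for x in data if x < low_fence or x > high_fence]
print(outliers)  # [30]
```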
Boxplots show the 5 number summary
To compare use SOCV + context (see 1.3)
2.1: Percentiles
the nth percentile is the value that has n% of the data less than or equal to it
key term is “at” a certain percentile; for example, a student who scores at the 80th percentile scored greater than or equal to 80% of the peers who took the same assessment.
Cumulative Relative Frequency Graph:
Q1 is the 25th percentile
Q3 is the 75th percentile
Median is the 50th percentile
2.2: Location in a Distribution
Z scores: (value - mean) / standard deviation
“[Context] is [z-score] standard deviations above / below the mean.”
Z-scores show position relative to other values in the distribution
2.3: Linear Transformations of Quantitative Data
if adding or subtracting by a constant
shape is the same
center + or - constant
variability is the same
if multiplying or dividing by a constant
shape is the same
center is multiplied or divided by the constant
variability is multiplied or divided by the constant (see the quick check below)
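A quick check of these rules in Python (made-up data):

```python
from statistics import mean, stdev

data = [10, 12, 15, 19, 24]          # hypothetical data

shifted = [x + 5 for x in data]      # add a constant: center shifts, spread doesn't
scaled = [x * 2 for x in data]       # multiply by a constant: both change

print(mean(data), round(stdev(data), 2))        # 16, ~5.61
print(mean(shifted), round(stdev(shifted), 2))  # 21, ~5.61 (center +5, SD unchanged)
print(mean(scaled), round(stdev(scaled), 2))    # 32, ~11.22 (both doubled)
```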
2.4: Normal Distributions and the Empirical Rule
Density Curves:
total area = 1
area = a proportion of values
Skewed right: mean > median
Skewed left: mean < median
Symmetric: mean = median
Empirical Rule:
68% of data is found within 1 SD of the mean
95% within 2 SDs
99.7% within 3 SDs
2.5 Normal Distribution Calculations:
Step 1: find z score
Step 2: use Table A to find the proportion below that z-score
If given a proportion or a z-score instead, back-calculate with algebra: value = mean + z x SD (see the sketch below)
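A minimal sketch of both steps using Python's statistics.NormalDist in place of Table A (made-up mean and SD):

```python
from statistics import NormalDist

mu, sd = 100, 15                      # hypothetical mean and SD
x = 120

z = (x - mu) / sd                     # Step 1: z-score
p_below = NormalDist().cdf(z)         # Step 2: proportion below z (like Table A)
print(round(z, 2), round(p_below, 4)) # 1.33, ~0.91

# back-calculating: given a proportion, find the value
z90 = NormalDist().inv_cdf(0.90)      # z-score for the 90th percentile
x90 = mu + z90 * sd                   # value = mean + z x SD
print(round(x90, 1))                  # ~119.2
```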
Terms to know:
explanatory variable: explains response variable
response variable: measures outcome
3.1: Scatterplots
Scatterplots: explanatory variable on the x axis, response variable on the y axis.
To describe a scatterplot relationship: DUFS + context
Direction (positive / negative / none)
Unusual features (outliers / clusters)
unusual points can strengthen r if in pattern
unusual points can weaken r if out of pattern
Form (linear or non-linear)
Strength (how close to the form)
& context always!
3.2 Correlation
Correlation r:
direction (+ or -)
form: linear
strength: always between -1 and 1; the closer |r| is to 1, the stronger
“The linear relationship between x and y is [strength] and [direction].”
Coefficient of determination r²
“[r² as a percent] of the variation in y is explained by the linear relationship with x.”
NOTE: CORRELATION DOES NOT EQUAL CAUSATION
3.3 Making Predictions
Predictions:
y-hat = a + bx ; where y-hat = predicted y; a = y-intercept; b = slope
** be cautious with extrapolation (predicting beyond the range of the observed x-values)
Residuals:
residual = actual - predicted → R = A-P
“The actual [context] was [residual] above / below the predicted value for x = #.”
Interpretation:
“When x = 0 context, the predicted y-context is y-int.”
“For each additional x-context the predicted y context increases/decreases by slope.”
3.4 Residual Plots
LSRL
the least squares regression line minimizes the sum of the squared residuals
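A minimal sketch of an LSRL fit with Python's statistics.linear_regression (Python 3.10+; made-up data):

```python
from statistics import linear_regression

x = [1, 2, 3, 4, 5]                    # hypothetical explanatory values
y = [2.1, 3.9, 6.2, 7.8, 10.1]         # hypothetical response values

fit = linear_regression(x, y)          # minimizes the sum of squared residuals
b, a = fit.slope, fit.intercept        # y-hat = a + b x

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]  # actual - predicted
print(round(b, 2), round(a, 2))        # slope ~1.99, intercept ~0.05
print([round(r, 2) for r in residuals])
```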
Residual Plots:
randomness is good 🙂
patterns are bad ☹ (ex: tree shapes, frowny faces, smiley faces)
3.5 Outliers, High Leverage, and Influential Points
Outliers: out of pattern (large residuals)
High Leverage: very large or very small x-values
Influential: if removed, big changes to slope, y-intercept, r
Outliers & LSRL:
Horizontal Outlier → tilt the line
Vertical Outliers → shift line up or down
3.6 Transforming Non-linear Data
transformations:
how to transform the original data so the pattern becomes linear (see the sketch below):
linear → graph x vs y
exponential → graph x vs log y
power → graph log x vs log y
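A quick sketch of why logging works (made-up exponential data; after taking log y, the pattern is linear in x):

```python
import math

xs = [0, 1, 2, 3, 4]
ys = [3, 6, 12, 24, 48]               # hypothetical exponential data: y = 3 * 2^x

log_ys = [math.log10(y) for y in ys]

# equal steps in x now give equal steps in log y -> linear pattern
diffs = [round(log_ys[i + 1] - log_ys[i], 4) for i in range(len(log_ys) - 1)]
print(diffs)                          # all ~0.3010 (= log10 of the common ratio 2)
```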
3.7 Choosing the Best Model
Check the scatterplot for a linear pattern
check the residual plot for no leftover pattern
choose the model whose r² is closest to 1
4.1 Simple Random Samples
convenience samples and voluntary response samples can lead to bias
SRSs limit bias
all of these methods select individuals from the population
To take a simple random sample:
Label individuals (ex: assign numbers or slips of paper)
Randomize (ex: random number generator or names in a hat)
Select
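A minimal sketch of the label / randomize / select steps in Python (made-up roster):

```python
import random

# Label: each name in the roster identifies one individual
population = ["Ana", "Ben", "Cy", "Dee", "Eli", "Fay", "Gus", "Hal"]

# Randomize + Select: random.sample chooses without replacement,
# giving every group of k individuals an equal chance
srs = random.sample(population, k=3)
print(srs)
```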
4.2 Stratified Random Samples
Stratified random samples are taken by splitting the population into groups (strata), then choosing an SRS from each stratum.
Note: each stratum contains individuals with shared attributes or characteristics (homogeneous grouping)
4.3 Cluster and Systematic Random Samples
Cluster samples split the population into groups (clusters), but instead of sampling a few individuals from every group, a few whole groups are sampled. Clusters are heterogeneous in this method, meaning the individuals within a group do not need to be similar.
Systematic Random Sample is when a random starting point is chosen and a sample is taken at every nth individual until the sample size is met.
4.4 Potential Problems With Sampling
Undercoverage: some people are less likely to be chosen
only calling landlines, only surveying homeowners
Nonresponse: people cannot be reached or refuse to answer
don’t answer or hang up on phone call
Response bias: problems in the data gathering instrument or process
people lie (self reported responses) or the wording of the question
4.5 Observational Studies and Experiments
If a confounding variable is present, that means there is something present that is related to the explanatory variable that influences the response variable
Observational study: no treatment imposed
Experiment: treatments imposed, which allows us to show causation
Experimental Units: what/who treatment is imposed on
Treatments: what is done (or not done) to experimental units; levels or a combination of levels of the explanatory variable
Control group: a baseline group that does not receive the treatment, used for comparison
Random assignment: the process of assigning experimental units to treatments at random, minimizes bias.
4.6 Designing Experiments
A well-designed experiment has 4 key features:
Comparison → there are two or more treatments imposed
Random assignment → this allows us to draw cause-and-effect conclusions
label
randomize
assign
Replication → there is more than one individual in each treatment group
Control → keeps the other variables constant and allows for a basis of comparison
The placebo effect is when subjects respond to a fake treatment as if it were real
Blinding is when subjects (single blind) and / or experimenters (double blind) don’t know about treatments
4.7 Selecting an Experimental Design
Block Design:
Blocks are groups of experimental units that are similar
Randomized Block Design is when subjects are separated into blocks and then randomly assigned treatments within each block.
Matched Pairs Design:
subjects are paired (blocks of size 2) and then randomly assigned to a treatment
alternatively, each subject receives both treatments
in that case, the order of the two treatments must be randomized
4.8 Inference and Experiments
Statistically significant
when results of an experiment are unlikely (less than 5%) to happen purely by chance
if statistically significant results are obtained, there is convincing evidence the treatment caused the difference.
4.9 Scope of Inference
Random Sample allows us to generalize our conclusions to the population from which we sampled.
Random Assignment allows us to conclude a treatment causes changes in the response variable
5.1 Introducing Probability
A probability is a long-run relative frequency.
always between 0 and 1, where 0 indicates impossibility and 1 indicates certainty.
short run is unpredictable
long run is predictable
Law of Large Numbers:
simulated probabilities tend to get closer to the true probability as the number of trials increases
5.2 Simulation
Simulation is a way to model random events so that simulated outcomes closely match real-world outcomes.
ex: dice roll, coin toss, applet, random number generator
Evidence for a claim:
assuming the claim is true, find the probability of getting the observed result or one more extreme
if that probability is < 5%, the result is:
statistically significant
convincing evidence against the claim
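A minimal sketch of this logic in Python, using a made-up claim (a fair coin) and a made-up observed result (60 heads in 100 flips):

```python
import random

trials = 10_000
at_least_as_extreme = 0
for _ in range(trials):
    # simulate 100 flips assuming the claim (p = 0.5) is true
    heads = sum(random.random() < 0.5 for _ in range(100))
    if heads >= 60:                  # observed result or more extreme
        at_least_as_extreme += 1

print(at_least_as_extreme / trials)  # ~0.03 < 5%: convincing evidence against
```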
5.3 Rules for Probability
Sample Space: List of all possible outcomes
P(X) = (# of outcomes in X) / (total # of outcomes in the sample space)
ALL OUTCOMES MUST BE EQUALLY LIKELY
Rules
Complement Rule: P(not A) = 1- P(A)
Notation:
P(A and B) = P(A ∩ B) → both occur
P(A or B) = P(A ∪ B) → one or the other or both
5.4 The Addition Rule
Two-way tables and Venn diagrams are used a lot, so make sure you are able to interpret and apply them
Addition Rule
P(A or B) = P(A) + P(B) - P(A and B)
if events A and B are mutually exclusive, they cannot occur together, meaning
P(A and B) = 0, therefore P(A or B) = P(A) + P(B)
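A quick check of the addition rule from a made-up two-way table:

```python
# hypothetical counts: A and B, A only, B only, neither
a_and_b, a_only, b_only, neither = 10, 20, 15, 55
total = a_and_b + a_only + b_only + neither     # 100

p_a = (a_and_b + a_only) / total                # 0.30
p_b = (a_and_b + b_only) / total                # 0.25
p_both = a_and_b / total                        # 0.10

# addition rule: subtract the overlap so it isn't counted twice
p_a_or_b = p_a + p_b - p_both
print(p_a_or_b)                                 # 0.45
```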
5.5 Conditional Probability
Conditional Probability is the probability of one event given another event has occurred
P(A | B) = P(A and B) / P(B), where B is the given condition
Independent events:
knowing whether or not one event occurs does not change the probability of the other event
If P(A)= P(A|B)= P(A| not B) → A and B are independent
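A quick check of conditional probability and independence, reusing the made-up counts from the addition-rule sketch above:

```python
p_a, p_b, p_both = 0.30, 0.25, 0.10    # hypothetical values from the table above

p_a_given_b = p_both / p_b             # P(A | B) = P(A and B) / P(B) = 0.40
print(p_a_given_b)

# P(A) = 0.30 but P(A | B) = 0.40: knowing B changes the probability
# of A, so A and B are NOT independent here
print(p_a_given_b == p_a)              # False
```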
5.6 Tree Diagrams
General Multiplication Rule:
P(A and B) = P(A) x P(B|A)
IF A AND B ARE INDEPENDENT
P(B|A) = P(B) so…
P(A and B) = P(A) x P(B)
Tree diagrams:
the probabilities on each set of branches must add up to one
first branches: probability of the first event happening / not happening
next branches: probability of the next event happening / not happening, given the first event has already happened
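A minimal sketch of one tree in Python (made-up probabilities):

```python
# hypothetical tree: 30% of days are rainy; a student bikes on 20% of
# rainy days and on 80% of dry days
p_rain = 0.30
p_bike_given_rain = 0.20
p_bike_given_dry = 0.80

# multiply along each branch: P(A and B) = P(A) x P(B|A)
p_rain_and_bike = p_rain * p_bike_given_rain          # 0.06
p_dry_and_bike = (1 - p_rain) * p_bike_given_dry      # 0.56

# the four branch-end probabilities add to 1; summing the "bike"
# branches gives the total probability of biking
print(p_rain_and_bike + p_dry_and_bike)               # 0.62
```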
6.1 Discrete Random Variables
The probabilities in a probability distribution always add up to one
Discrete Random Variable: takes a countable number of values with gaps between
6.2 Continuous Random Variable
Random Variables:
discrete: has a countable number of values with gaps
continuous: has infinite values with no gaps
Probabilities:
find the area under the curve
uniform:
1/k → height (y-axis); k → width (x-axis)
probability = base x height
Normal curves:
table A
6.3 Transforming Random Variables
Add / Subtract a constant c:
shape: stays the same
center (mean or median): add / subtract c
variability (range, IQR, SD, variance): stays the same
Multiply / Divide by a constant c:
shape: stays the same
center: multiply / divide by c
variability:
range, IQR, SD: multiply / divide by c
variance: multiply / divide by c^2
6.4 Combining Random Variables
When combining random variables: DO NOT ADD STANDARD DEVIATIONS
Addition
M(X+Y) = M(X) + M(Y)
SD(X+Y) = sqrt(SD(X)^2 + SD(Y)^2) (X and Y independent)
Subtraction:
M(X-Y) = M(X) - M(Y)
SD(X-Y) = sqrt(SD(X)^2 + SD(Y)^2) (variances still add)
Normal Distribution calculations
use the formulas above to find M and SD
proceed as usual
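A quick check of these formulas in Python (made-up independent random variables):

```python
import math

# hypothetical independent random variables X and Y
mean_x, sd_x = 50.0, 4.0
mean_y, sd_y = 30.0, 3.0

mean_sum = mean_x + mean_y       # 80.0 (means add)
mean_diff = mean_x - mean_y      # 20.0 (means subtract)

# variances add for BOTH sums and differences; never add SDs directly
sd_either = math.sqrt(sd_x**2 + sd_y**2)   # 5.0 for X+Y and for X-Y
print(mean_sum, mean_diff, sd_either)
```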
6.5 Introduction to the Binomial Distribution
Binomial random variable:
x → number of successes
Binary? Is there success or failure
Independent trials?
Number of trials is fixed? (n)
Same probability of success (p)
Binomial formula:
P(X = k) = nCk x p^k x (1-p)^(n-k)
nCk → the # of ways to get k successes
p^k → probability of success, raised to k = # of successes
(1-p) → probability of failure
n-k → # of failures
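A quick check of the formula in Python using math.comb for nCk (made-up n, p, k):

```python
from math import comb

n, p, k = 10, 0.3, 4            # hypothetical values

# P(X = k) = nCk * p^k * (1-p)^(n-k)
p_exact = comb(n, k) * p**k * (1 - p)**(n - k)
print(round(p_exact, 4))        # ~0.2001, like binompdf(10, 0.3, 4)

# P(X <= k): sum the formula from 0 to k, like binomcdf(10, 0.3, 4)
p_at_most = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))
print(round(p_at_most, 4))      # ~0.8497
```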
6.6 Parameters for Binomial Distribution
Using technology:
P(X = k) → binompdf(n, p, k)
P(X <= k) → binomcdf(n, p, k)
Mean + Standard Deviation for binomial
M = np
“After many, many groups of n trials, the average number of successes is M.”
SD = sqrt(np(1-p))
“The number of successes typically varies by [SD] from the mean of [M].”
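A quick check of the mean and SD formulas (same made-up n and p as above):

```python
import math

n, p = 10, 0.3
mu = n * p                          # 3.0 successes on average
sd = math.sqrt(n * p * (1 - p))     # ~1.45
print(mu, round(sd, 2))
```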
6.7 Conditions for Inference
10% condition:
when taking a random sample (without replacement) of size n from a population of size N, we can use a binomial distribution if n <= 0.10N.
Large Counts Condition:
use a normal distribution to model a binomial distribution if np >= 10 and n(1-p) >= 10
np = expected # of successes
n(1-p) = expected # of failures
6.8 The Geometric Distribution
BITS:
Binary: success or failure
Independent trials
Trials until success
Same probability of success
Geometric formula: P(X = k) = (1-p)^(k-1) x p, where p is the probability of success and k is the trial on which the first success occurs.
shape: skewed right
center: M = 1/p
Variability: SD = sqrt(1-p) / p
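A quick check of the geometric formulas in Python (made-up p and k):

```python
import math

p = 0.25      # hypothetical probability of success
k = 3         # first success on the 3rd trial

p_k = (1 - p) ** (k - 1) * p        # P(X = 3) = 0.75^2 * 0.25
print(round(p_k, 4))                # ~0.1406

mu = 1 / p                          # expect 4 trials until the first success
sd = math.sqrt(1 - p) / p           # ~3.46
print(mu, round(sd, 2))
```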
7.1 Sampling Distributions
A statistic is used to estimate a parameter
Statistic → parameter
p hat → p
x bar → M
s → SD
Populations = parameters
Samples = statistics
Sampling Distribution:
The distribution of values for a statistic for all possible samples of a given size from a given population
7.2 Bias and Variability
Biased VS Unbiased Estimator:
Biased:
consistently overestimates or
consistently underestimates the true population parameter
Unbiased:
mean of the sampling distribution is equal to the population parameter
Sample Size:
as n increases, variability of the sampling distribution decreases
A good statistic has:
low bias (accurate)
low variability (precise)
7.3 Sample Proportions
Sampling Distribution of p hat:
shape: approximately normal if np >= 10 & n(1-p) >= 10
center: M(p-hat) = p
variability: SD(p-hat) = sqrt(p(1-p)/n)
check 10% condition if sampling without replacement
Normal Probability:
z = (p-hat - p) / sqrt(p(1-p)/n)
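A minimal sketch of a sampling-distribution probability in Python (made-up p, n, and observed p-hat):

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.60, 100     # hypothetical population proportion and sample size
p_hat = 0.55         # hypothetical observed sample proportion

# Large Counts: np = 60 >= 10 and n(1-p) = 40 >= 10, so use a normal model
sd = sqrt(p * (1 - p) / n)             # ~0.049
z = (p_hat - p) / sd                   # ~-1.02
print(round(NormalDist().cdf(z), 3))   # P(p-hat < 0.55) ~ 0.154
```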
7.4 Differences in Sample Proportions
Sampling Distribution of p-hat1 - p-hat2:
shape: approximately normal if
n1p1 >= 10 & n2p2 >= 10
n1(1-p1) >= 10 & n2(1-p2) >= 10
center:
M(p-hat1 - p-hat2) = p1 - p2
variability:
SD(p-hat1 - p-hat2) = sqrt(p1(1-p1)/n1 + p2(1-p2)/n2)