6.3 Binomial and Geometric Random Variables
How do you know if something is binomial?
Ask yourself four questions:
Is there a fixed number of trials?
Are the trials independent?
Are there only two outcomes each time (success/failure)?
Is the probability of success constant?
If all four are yes, it’s binomial.
Example:
A basketball player takes 20 free throws and you count how many she makes.
That’s binomial.
How do you calculate binomial probabilities?
You identify:
number of trials
probability of success
how many successes you want
Then use calculator functions.
“Exactly”
→ use PDF
“At most”
→ use CDF
“At least”
→ subtract from 1.
Interpretation:
The answer is the probability that this number of successes occurs in repeated trials.
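The three calculator rules above can be sketched directly. A minimal Python version (the 80% success rate here is a made-up number, not from the notes):

```python
import math

def binom_pdf(n, p, k):
    """P(X = k): exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(n, p, k):
    """P(X <= k): at most k successes."""
    return sum(binom_pdf(n, p, j) for j in range(k + 1))

# Hypothetical example: 20 free throws, assumed 80% success rate.
n, p = 20, 0.80
exactly_15  = binom_pdf(n, p, 15)        # "exactly"  -> PDF
at_most_15  = binom_cdf(n, p, 15)        # "at most"  -> CDF
at_least_15 = 1 - binom_cdf(n, p, 14)    # "at least" -> subtract from 1
```

Note the "at least 15" case subtracts the CDF at 14, not 15, so that 15 itself stays included.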
Mean and standard deviation of binomial
The mean tells you the expected number of successes.
Example:
If the probability of success is 0.70 over 10 trials:
you expect about np = 10 × 0.70 = 7 successes.
Standard deviation tells how spread out the results usually are.
A small SD means results stay close to the mean.
A large SD means more variability.
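The mean and SD formulas are np and √(np(1−p)). A quick sketch using the 10-trial, 0.70 example from above:

```python
import math

def binom_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1-p)) of a binomial count."""
    return n * p, math.sqrt(n * p * (1 - p))

mean, sd = binom_mean_sd(10, 0.70)
# mean = 7 successes expected; sd is about 1.45
```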
Normal approximation to binomial
When the sample size is large enough, binomial distributions start looking Normal.
Check:
expected successes at least 10
expected failures at least 10
If both work, use a Normal model instead of exact binomial calculations.
You:
Find mean and SD
Convert values to z-scores
Use Normal probabilities
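The three steps above, sketched in Python with the Large Counts check built in (n = 100 and p = 0.30 are made-up numbers; no continuity correction is used):

```python
import math

def normal_cdf(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def approx_binom_at_most(n, p, k):
    """Normal approximation to P(X <= k), valid when np >= 10 and n(1-p) >= 10."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("Large Counts fails; use exact binomial instead")
    mean = n * p                          # step 1: mean and SD
    sd = math.sqrt(n * p * (1 - p))
    z = (k - mean) / sd                   # step 2: z-score
    return normal_cdf(z)                  # step 3: Normal probability

# Hypothetical example: n = 100 trials, p = 0.30; P(X <= 25)?
prob = approx_binom_at_most(100, 0.30, 25)
```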
Geometric random variables
Geometric variables count how long until the FIRST success happens.
Example:
How many shots until first basket?
You multiply:
probability of failure for each trial before the first success
then probability of success:
P(X = k) = (1 − p)^(k−1) · p
Key idea:
Binomial counts number of successes.
Geometric counts trials until first success.
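The geometric formula in code, using a made-up 40% shooting percentage:

```python
def geom_pdf(p, k):
    """P(first success on trial k) = (1-p)^(k-1) * p."""
    return (1 - p)**(k - 1) * p

# Hypothetical example: assumed 40% chance of making each shot.
# First basket on the 3rd shot means miss, miss, make.
p_third = geom_pdf(0.40, 3)   # 0.6 * 0.6 * 0.4 = 0.144
```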
7.1 Sampling Distributions
Parameter vs statistic
A parameter describes a population.
A statistic describes a sample.
Population:
all students in school.
Sample:
30 students surveyed.
The true school average is a parameter.
The sample average is a statistic.
Creating a sampling distribution
You:
Take every possible sample
Calculate the statistic for each
Look at the distribution of those statistics
This shows how statistics vary from sample to sample.
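With a tiny made-up population, you can actually carry out all three steps and see the idea work:

```python
from itertools import combinations

# Tiny hypothetical population so every sample can be listed.
population = [2, 4, 6, 8, 10]
pop_mean = sum(population) / len(population)   # parameter: 6.0

# Step 1 & 2: take every possible sample of size 2, compute each sample mean.
sample_means = [sum(s) / 2 for s in combinations(population, 2)]

# Step 3: the distribution of those statistics centers on the parameter,
# which is exactly what "unbiased estimator" means.
mean_of_means = sum(sample_means) / len(sample_means)
```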
Using sampling distributions
You compare your observed statistic to what’s expected.
If your statistic is very unusual under a claim, the claim may not be true.
Population distribution vs sample distribution vs sampling distribution
Population distribution:
all individual values.
Sample distribution:
values from one sample.
Sampling distribution:
statistics from MANY samples.
This distinction is extremely important on AP Stats.
Unbiased estimator
A statistic is unbiased if, on average, it hits the true parameter.
Example:
Sample proportions tend to center around the true population proportion.
Sample size and variability
Larger samples give less variability.
Big samples are more stable and reliable.
Small samples bounce around more.
7.2 Sample Proportions
Mean and SD
The mean of sample proportions equals the true population proportion.
The SD tells how much sample proportions usually vary.
Larger sample size:
smaller spread.
Checking if approximately Normal
You check whether:
expected successes ≥ 10
expected failures ≥ 10
If yes:
sampling distribution is approximately Normal.
Calculating probabilities
Find mean and SD
Convert to z-score
Use Normal table/calculator
Interpret in context.
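Those four steps for a sample proportion, sketched in Python (p = 0.60 and n = 100 are made-up numbers):

```python
import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_phat_at_most(p, n, phat):
    """P(p-hat <= phat) when the sampling distribution is approx Normal."""
    assert n * p >= 10 and n * (1 - p) >= 10, "Large Counts fails"
    mean = p                                # step 1: mean and SD
    sd = math.sqrt(p * (1 - p) / n)
    z = (phat - mean) / sd                  # step 2: z-score
    return normal_cdf(z)                    # step 3: Normal probability

# Hypothetical example: true p = 0.60, n = 100; P(p-hat <= 0.55)?
prob = prob_phat_at_most(0.60, 100, 0.55)
```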
7.3 Sample Means
Mean and SD
The mean of sample means equals the population mean.
The SD measures how much sample means vary.
As sample size increases, variability decreases.
Shape of sampling distribution
If population is Normal:
sample means are Normal.
If population is skewed:
larger sample sizes make sample means more Normal.
This is the Central Limit Theorem.
Calculating probabilities
Treat the sampling distribution like a Normal distribution when conditions are met.
Use z-scores.
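Same recipe for a sample mean, with the key formula SD = σ/√n. The numbers below are made up:

```python
import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def prob_xbar_at_most(mu, sigma, n, xbar):
    """P(x-bar <= xbar): mean mu, SD sigma/sqrt(n), Normal by the CLT
    (or because the population itself is Normal)."""
    sd = sigma / math.sqrt(n)
    z = (xbar - mu) / sd
    return normal_cdf(z)

# Hypothetical example: mu = 100, sigma = 15, n = 36; P(x-bar <= 95)?
prob = prob_xbar_at_most(100, 15, 36, 95)   # sd = 15/6 = 2.5, z = -2
```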
8.1 Confidence Intervals Basics
Point estimate
A point estimate is your best single guess for the parameter.
Example:
sample proportion estimates population proportion.
Interpreting confidence intervals
Correct wording:
“We are 95% confident the true parameter lies between…”
It does NOT mean there’s a 95% chance the parameter is there.
Margin of error
Margin of error tells how far the estimate might reasonably be from the true value.
Larger margin:
less precise.
Smaller margin:
more precise.
Using confidence intervals to decide
If a claimed value falls inside the interval:
it’s plausible.
If outside:
there’s evidence against it.
Confidence level
95% confidence means the method works about 95% of the time over many repeated samples.
Effect of sample size and confidence level
Larger sample:
smaller margin of error.
Higher confidence:
larger margin of error.
Bias issues
Even a mathematically correct interval can still be misleading if the sample is biased.
Examples:
nonresponse
undercoverage
response bias
8.2 Confidence Intervals for a Population Proportion
Conditions
Check:
Random
10% (sample is no more than 10% of the population)
Large Counts
If conditions work:
you can construct the interval.
Critical value
The critical value determines how wide the interval is.
Higher confidence → larger critical value.
Constructing interval
You:
Find sample proportion
Calculate standard error
Multiply by critical value
Add/subtract from estimate
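The four construction steps in code. The 120-out-of-200 data is made up; z* = 1.96 is the standard 95% critical value:

```python
import math

def prop_ci(phat, n, z_star=1.96):
    """Confidence interval for p: phat +/- z* * sqrt(phat(1-phat)/n)."""
    se = math.sqrt(phat * (1 - phat) / n)   # step 2: standard error
    moe = z_star * se                       # step 3: margin of error
    return phat - moe, phat + moe           # step 4: add/subtract

# Hypothetical example: 120 successes in 200 trials, 95% confidence.
low, high = prop_ci(120 / 200, 200)         # step 1: phat = 0.60
```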
Sample size
If you want a smaller margin of error:
increase sample size.
8.3 Confidence Intervals for a Population Mean
Critical value
Use t-distribution because population SD is usually unknown.
Smaller samples use larger t-values.
Conditions
Check:
Random
10%
population roughly Normal OR large sample
Constructing interval
You:
Find sample mean
Find standard error
Multiply by t-value
Create interval
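The t-interval steps, sketched with made-up summary statistics; the t* value would come from a t table or calculator:

```python
import math

def mean_ci(xbar, s, n, t_star):
    """t-interval for mu: xbar +/- t* * s/sqrt(n)."""
    se = s / math.sqrt(n)        # step 2: standard error
    moe = t_star * se            # step 3: multiply by t*
    return xbar - moe, xbar + moe

# Hypothetical example: xbar = 25, s = 4, n = 30.
# t* for 95% confidence with df = 29 is about 2.045 (from a t table).
low, high = mean_ci(25, 4, 30, 2.045)
```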
Sample size
Larger sample gives more precision.
9.1 Significance Tests Basics
Writing hypotheses
Null hypothesis:
statement of no change/no effect.
Alternative:
what you’re testing for.
The null always includes equality.
Interpreting P-value
The P-value measures how unusual your data would be if the null were true.
Small P-value:
strong evidence against null.
Making conclusions
If P-value is smaller than alpha:
reject null.
Otherwise:
fail to reject null.
Always explain in context.
Type I and Type II errors
Type I:
false alarm.
Type II:
missed detection.
Know consequences in context.
9.2 Tests About Population Proportions
Conditions
Check:
Random
10%
Large Counts
Test statistic and P-value
You:
Compare observed proportion to claimed proportion
Standardize with z-score
Find P-value
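Those three steps for a one-proportion z-test (the 52-out-of-80 data and the p = 0.50 claim are made up). Note the standard error uses the claimed p0, because the test assumes the null is true:

```python
import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_prop_ztest(phat, p0, n):
    """z statistic and two-sided P-value for H0: p = p0."""
    se = math.sqrt(p0 * (1 - p0) / n)      # use the CLAIMED p0 in the SE
    z = (phat - p0) / se                   # standardize
    p_value = 2 * (1 - normal_cdf(abs(z))) # two-sided P-value
    return z, p_value

# Hypothetical example: 52 successes in 80 trials against the claim p = 0.50.
z, p_value = one_prop_ztest(52 / 80, 0.50, 80)
```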
Performing test
Always follow:
State → Plan → Do → Conclude.
9.3 Tests About Population Means
Conditions
Check:
Random
10%
Normal/Large Sample
Test statistic
Compare sample mean to claimed mean using a t-score.
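The t statistic itself is one line; the numbers below are made up:

```python
import math

def t_statistic(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n)) for H0: mu = mu0, with df = n - 1."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Hypothetical example: xbar = 98.3, s = 0.73, n = 25 against the claim mu = 98.6.
t = t_statistic(98.3, 98.6, 0.73, 25)   # df = 24; compare |t| to a t table
```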
Confidence interval relationship
If null value is outside the confidence interval:
reject null.
Power
Power measures ability to detect a real effect.
Power increases with:
larger samples
larger alpha
bigger effects
less variability
10.1 Comparing Two Proportions
Sampling distribution
Looks at difference between two sample proportions.
Center equals true difference.
Spread depends on both samples.
Conditions
Check conditions separately for both groups.
Confidence interval
Estimates plausible values for the true difference.
If interval contains 0:
difference may not exist.
If 0 not included:
evidence of difference.
Significance test
Compare observed difference to 0.
Find z-score and P-value.
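A sketch of the two-proportion z-test with made-up counts. Since the null says the two proportions are equal, the standard error uses a pooled proportion:

```python
import math

def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def two_prop_ztest(x1, n1, x2, n2):
    """z statistic and two-sided P-value for H0: p1 = p2 (pooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)                        # pooled proportion
    se = math.sqrt(pc * (1 - pc) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se                                # compare difference to 0
    return z, 2 * (1 - normal_cdf(abs(z)))

# Hypothetical example: 45/100 successes in group 1 vs 30/100 in group 2.
z, p_value = two_prop_ztest(45, 100, 30, 100)
```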
10.2 Comparing Two Means
Sampling distribution
Looks at difference between sample means.
Conditions
Need:
Random
10%
Normal/Large Sample
for both groups.
Confidence interval
Estimate true difference in means.
Interpret in context.
Significance test
Compare observed difference to hypothesized difference.
Use t-score.
10.3 Paired Data
Analyzing paired data
Find differences first.
Then analyze the differences like one sample.
Confidence interval/significance test
Everything is based on mean difference.
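"Differences first, then one-sample analysis" in code, with made-up before/after scores for the same five subjects:

```python
import math

# Hypothetical before/after scores for the same 5 subjects.
before = [72, 68, 80, 75, 70]
after  = [75, 70, 84, 74, 73]

diffs = [a - b for a, b in zip(after, before)]   # step 1: find differences
n = len(diffs)
dbar = sum(diffs) / n                            # step 2: one-sample analysis
s_d = math.sqrt(sum((d - dbar)**2 for d in diffs) / (n - 1))
t = dbar / (s_d / math.sqrt(n))                  # H0: mean difference = 0, df = n - 1
```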
Paired vs two-sample
Paired:
same subjects or matched subjects.
Two-sample:
independent groups.
11.1 Chi-Square Goodness of Fit
Hypotheses
Null:
distribution follows claimed model.
Alternative:
distribution differs.
Expected counts
Find what counts SHOULD be if null is true.
Chi-square statistic
Measures how far observed counts are from expected counts.
Big chi-square:
big differences.
Conditions
Need:
Random
10%
expected counts at least 5
Degrees of freedom and P-value
Degrees of freedom = number of categories − 1.
Use chi-square distribution to find P-value.
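The chi-square statistic in code, with a made-up fair-die example; the critical value quoted in the comment is the standard table value for df = 5 at α = 0.05:

```python
def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e)**2 / e for o, e in zip(observed, expected))

# Hypothetical example: is a die fair? 60 rolls, so each face expects 10.
observed = [8, 12, 9, 11, 6, 14]
expected = [10] * 6
chi2 = chi_square_stat(observed, expected)   # df = 6 - 1 = 5
# Compare to a chi-square table: the 0.05 critical value for df = 5 is about 11.07,
# so this chi2 would NOT be significant.
```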
Follow-up analysis
Look for categories contributing most to differences.
11.2 Two-Way Tables
Expected counts
Expected counts assume variables are unrelated.
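The expected-count formula is (row total × column total) / grand total. A sketch with a made-up 2×3 table:

```python
# Hypothetical 2x3 two-way table of observed counts.
table = [
    [30, 20, 10],
    [20, 30, 10],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

# Expected count = (row total * column total) / grand total,
# i.e. what we'd see if the two variables were unrelated.
expected = [[r * c / grand for c in col_totals] for r in row_totals]
```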
Homogeneity vs independence
Homogeneity:
compare groups.
Independence:
check association in one population.
Choosing test
Ask:
one population or multiple populations?
That determines independence vs homogeneity.
12.1 Inference for Linear Regression
Conditions
Check LINE:
Linear
Independent
Normal residuals
Equal variance
Interpreting regression values
Intercept:
predicted y when x = 0.
Slope:
predicted change in y for each 1-unit increase in x.
Residual SD:
typical prediction error.
Standard error of slope:
how much slope estimates vary.
Confidence interval for slope
Estimates plausible values for true slope.
If interval contains 0:
relationship may not exist.
Significance test for slope
Tests whether slope is truly different from 0.
Small P-value:
evidence of a linear relationship.
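Both the test and the interval for the slope come straight from computer output. A sketch with made-up output values (b = 2.4, SE = 0.75, n = 20); the t* value would come from a t table:

```python
# Hypothetical computer output: slope b = 2.4, SE of slope = 0.75, n = 20.
b, se_b, n = 2.4, 0.75, 20
df = n - 2                       # slope inference uses n - 2 degrees of freedom

t = b / se_b                     # test statistic for H0: true slope = 0

# 95% CI for the true slope: b +/- t* * SE_b,
# with t* about 2.101 for df = 18 (from a t table).
t_star = 2.101
low, high = b - t_star * se_b, b + t_star * se_b
# Here 0 is NOT in the interval, which matches a significant t statistic:
# evidence of a linear relationship.
```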