AP Statistics Cumulative AP Exam Study Guide Notes
Statistics
- Science of collecting, analyzing, and drawing conclusions from data.
- Descriptive: organizing and summarizing data.
- Inferential: making generalizations.
- Population: entire collection.
- Sample: subset.
- Variable: a characteristic whose value changes from one individual to another.
- Data: observations.
Types of Variables
- Categorical (Qualitative): characteristics.
- Numerical (Quantitative): numerical data.
- Discrete: listable sets (counts).
- Continuous: any value (measurements).
- Univariate: one variable.
- Bivariate: two variables.
- Multivariate: many variables.
Distributions
- Symmetrical: left and right sides are mirror images (e.g., bell curve).
- Uniform: equal frequency (rectangle).
- Skewed: one side is longer.
- Bimodal: two large frequencies.
Describing Numerical Graphs (S.O.C.S.)
- Shape: symmetrical, skewed, uniform, or bimodal.
- Outliers: unusual values far from the rest; also note gaps and clusters.
- Center: middle (mean, median, mode).
- Spread: variability (range, standard deviation, IQR).
- Context: in context.
- Comparative Language: when comparing.
- Parameter: population value.
- Statistic: sample value.
Measures of Center
- Median: middle point (50th percentile).
- Mean: \mu (population), \bar{x} (sample).
- Mode: most frequent.
Measures of Spread (Variability)
- Range: (Max - Min)
- IQR: (Q3 - Q1)
- Standard deviation: \sigma (population), s (sample).
- Variance: standard deviation squared.
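The center and spread measures above can be checked with a minimal Python sketch using only the standard library. The data set `scores` is made up for illustration, and the sample (n - 1) versions of standard deviation and variance are assumed.

```python
import statistics

# Hypothetical sample data (made up for illustration).
scores = [2, 4, 4, 5, 7, 8, 12]

mean = statistics.mean(scores)          # x-bar: sum of values divided by n
median = statistics.median(scores)      # middle value of the sorted data
mode = statistics.mode(scores)          # most frequent value
data_range = max(scores) - min(scores)  # Max - Min
s = statistics.stdev(scores)            # sample standard deviation (divides by n - 1)
variance = statistics.variance(scores)  # sample variance, equal to s squared

print(mean, median, mode, data_range, round(s, 3), variance)
```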
Resistant Measures
- Resistant: not affected by outliers.
- Non-Resistant: Affected by outliers
- Mean
- Range
- Variance
- Standard Deviation
- Correlation Coefficient (r)
- Least Squares Regression Line (LSRL)
- Coefficient of Determination (r^2)
- Symmetrical: mean = median.
- Skewed Right: mean > median.
- Skewed Left: mean < median.
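- Trimmed Mean: remove an equal percentage of values from each end of the ordered data, then average the rest (reduces the effect of outliers).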
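To see resistance in action, the sketch below adds one extreme value to a small made-up data set and compares how far the mean and the median move. The numbers are hypothetical.

```python
import statistics

# Hypothetical data, then the same data with one extreme value added.
without_outlier = [10, 12, 13, 15, 16]
with_outlier = without_outlier + [100]

mean_without = statistics.mean(without_outlier)
mean_with = statistics.mean(with_outlier)
median_without = statistics.median(without_outlier)
median_with = statistics.median(with_outlier)

# The mean (non-resistant) is pulled strongly toward the outlier;
# the median (resistant) barely moves.
print("mean shift:", round(mean_with - mean_without, 2))
print("median shift:", median_with - median_without)
```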
Combination of Variables
Z-Score
- z = \frac{x - \mu}{\sigma}
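A short sketch of the z-score formula above. The function name `z_score` and the example values (a score of 85 from a population with mean 70 and standard deviation 10) are assumptions for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical values: x = 85, mu = 70, sigma = 10.
z = z_score(85, 70, 10)
print(z)  # 1.5
```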
Normal Curve
- Bell-shaped and symmetrical.
- Empirical Rule (68-95-99.7).
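The Empirical Rule can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library; the helper `area_within` is a name chosen here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Roughly 68%, 95%, and 99.7% for k = 1, 2, 3.
for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```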
5-Number Summary
- Minimum, Q1, Median, Q3, Maximum
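A sketch of the 5-number summary and IQR. Quartile conventions differ between textbooks and software; this version assumes the median-of-halves convention (the median is left out of both halves when n is odd). The function name and data are made up for illustration.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]           # values below the median position
    upper = xs[half + n % 2:]   # values above the median position
    return (xs[0], statistics.median(lower), statistics.median(xs),
            statistics.median(upper), xs[-1])

# Hypothetical data set.
summary = five_number_summary([2, 4, 4, 5, 7, 8, 12])
iqr = summary[3] - summary[1]   # Q3 - Q1
print(summary, iqr)
```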
Probability Rules
- Sample Space: all outcomes.
- Event: any collection of outcomes from the sample space.
- Complement: not in the event.
- Union: A or B (A \cup B).
- Intersection: A and B (A \cap B).
- Mutually Exclusive: no intersection.
- Independent: knowing one event occurred doesn't change the probability of the other.
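The vocabulary above maps directly onto set operations. The sketch below assumes one roll of a fair die with equally likely outcomes; the events A and B are chosen for illustration. It also uses two standard checks not listed above: the addition rule P(A or B) = P(A) + P(B) - P(A and B), and the independence check P(A and B) = P(A)P(B).

```python
from fractions import Fraction

# Assumed setting: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event: roll is even
B = {1, 2}      # event: roll is at most 2

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

p_union = prob(A | B)                # A or B
p_inter = prob(A & B)                # A and B
p_complement = prob(sample_space - A)  # not A

addition_rule_holds = p_union == prob(A) + prob(B) - p_inter
mutually_exclusive = len(A & B) == 0
independent = p_inter == prob(A) * prob(B)

print(p_union, p_inter, p_complement, mutually_exclusive, independent)
```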
Correlation Coefficient (r)
- Strength and direction of a linear relationship.
- Values: [-1, 1]
- Least Squares Regression Line (LSRL): \hat{y} = a + bx
- Residuals (error): y - \hat{y}. Residual Plot: no pattern means a linear model is appropriate.
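A minimal sketch that computes r, the LSRL coefficients a and b, and the residuals by hand from their usual sum-of-squares definitions. The data are made up; in practice a calculator or statistics package reports these values.

```python
import math

# Hypothetical bivariate data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
b = sxy / sxx                    # slope of the LSRL
a = y_bar - b * x_bar            # y-intercept of the LSRL

# Residual = observed y minus predicted y-hat.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(round(r, 4), round(a, 2), round(b, 2))
print([round(e, 2) for e in residuals])
```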
Coefficient of Determination \bf{r^2}
- Proportion of variation in y explained by the linear relationship between x and y.
Interpretations for LSRL
- Slope (b): for each 1-unit increase in x, the predicted y increases/decreases by the slope.
- Correlation coefficient (r): (strength), (direction), linear association.
- Coefficient of determination (r^2): r^2% of the variation in y is explained by the linear relationship between x and y.
- Influential Points: change LSRL if removed.
- Outliers: large residuals.
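The interpretation templates above can be filled in mechanically. This sketch is only an illustration: the function name `interpret_lsrl`, the variable names, and the values b = 0.6 and r = 0.7746 are all hypothetical.

```python
def interpret_lsrl(b, r, x_name, y_name):
    """Fill in the slope and r^2 interpretation templates in context."""
    direction = "increases" if b > 0 else "decreases"
    slope_text = (f"For each 1-unit increase in {x_name}, the predicted "
                  f"{y_name} {direction} by {abs(b):.2f}.")
    r2_text = (f"About {r ** 2:.1%} of the variation in {y_name} is explained "
               f"by the linear relationship with {x_name}.")
    return slope_text, r2_text

# Hypothetical context and values.
slope_text, r2_text = interpret_lsrl(0.6, 0.7746, "hours studied", "quiz score")
print(slope_text)
print(r2_text)
```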
Sampling
- Census: complete count.
- Sampling Frame: list of population.
- Sampling Design: method to choose a sample.
Types of Samples
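- SRS (Simple Random Sample): every possible sample of size n has an equal chance of being selected.
- Stratified: divide the population into homogeneous groups (strata), then take an SRS from each.
- Systematic: pick a random starting point, then select every kth individual (e.g., every 50th).
- Cluster Sample: divide the population into groups (often by location), randomly select whole clusters, and include everyone in them.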
- Random Digit Table.
- Random # Generator: Calculator or computer program
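The sampling methods above can be sketched with Python's `random` module. The population of ID numbers 1 to 100, the two strata, the cluster size of 10, and the seed are all assumptions made up for this example.

```python
import random

rng = random.Random(42)            # fixed seed (arbitrary) so reruns match
population = list(range(1, 101))   # hypothetical sampling frame: IDs 1..100

# SRS: every group of 10 IDs is equally likely to be chosen.
srs = rng.sample(population, 10)

# Stratified: split into two hypothetical strata, then SRS of 5 from each.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]

# Systematic: random start, then every k-th individual.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Cluster: form clusters of 10, randomly pick 2 whole clusters, keep everyone in them.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [unit for cluster in rng.sample(clusters, 2) for unit in cluster]

print(srs, stratified, systematic, cluster_sample, sep="\n")
```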
Bias
- Error favoring an outcome.
- Sources: Voluntary Response, Convenience Sampling, Undercoverage, Non-response, Response, Wording.
Experimental Design
- Observational Study: observe outcomes without imposing a treatment.
- Experiment: imposes treatment.
- Experimental Unit: receives treatment.
- Factor: explanatory variable.
- Level: specific value for factor.
- Response Variable: what you measure.
- Treatment: experimental condition.
- Control Group: baseline group used for comparison with the treatment groups.
- Placebo: no active ingredients.
- Blinding: subjects are unaware of which treatment they receive.
- Double Blinding: neither subjects nor evaluators know.
Principles of Experimental Design
- Control: keep variables constant.
- Replication: use many subjects.
- Randomization: assign subjects randomly.
- Cause and effect: can be concluded only from a well-designed, controlled experiment.
Experimental Designs
- Completely Randomized: units allocated randomly.
- Randomized Block: units blocked, then assigned.
- Matched Pairs: units matched, then assigned.
- Confounding Variables: variables whose effects on the response cannot be separated from the effect of the explanatory variable.
- Randomization: reduces bias.
- Blocking: reduces variability.
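Random assignment for the first two designs above can be sketched as follows. The 20 subject labels, the two blocks, and the seed are hypothetical; a real experiment would form blocks from a variable expected to affect the response.

```python
import random

rng = random.Random(0)                           # fixed seed (arbitrary)
units = [f"subject_{i}" for i in range(1, 21)]   # hypothetical experimental units

# Completely randomized: shuffle all units, then split into two groups.
shuffled = units[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]

# Randomized block: randomize separately within each (hypothetical) block.
blocks = {"block_A": units[:10], "block_B": units[10:]}
block_assignment = {}
for name, members in blocks.items():
    members = members[:]
    rng.shuffle(members)
    half = len(members) // 2
    block_assignment[name] = {"treatment": members[:half],
                              "control": members[half:]}

print(treatment)
print(block_assignment["block_A"]["treatment"])
```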
Random Variables
- Discrete: a count.
- Continuous: a measure.
- Discrete Probability Distributions.
- \mu_X = \sum x_i p(x_i)
- \sigma_X^2 = \sum (x_i - \mu_X)^2 p(x_i)
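The mean and variance formulas for a discrete random variable can be applied directly. The probability distribution below is made up for illustration; its probabilities sum to 1.

```python
import math

# Hypothetical discrete probability distribution.
values = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

# Mean: sum of each value times its probability.
mu = sum(x * p for x, p in zip(values, probs))

# Variance: sum of squared deviations from the mean, weighted by probability.
var = sum((x - mu) ** 2 * p for x, p in zip(values, probs))
sd = math.sqrt(var)

print(round(mu, 2), round(var, 2), round(sd, 2))
```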
Special Discrete Distributions
- Binomial Distributions: two outcomes, fixed trials, independent, same probability.
- \mu_X = np
- \sigma_X = \sqrt{npq}, where q = 1 - p
- Geometric Distributions
- Poisson Distributions
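For the binomial case in the list above, here is a short sketch of the mean and standard deviation formulas, plus the standard binomial probability formula P(X = k) = C(n, k) p^k (1 - p)^(n - k), which the notes do not state. The values n = 10 and p = 0.5 are assumptions for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical setting: 10 independent trials, success probability 0.5.
n, p = 10, 0.5
q = 1 - p

mu = n * p                     # mean: np
sigma = math.sqrt(n * p * q)   # standard deviation: sqrt(npq)
p_exactly_3 = binomial_pmf(3, n, p)

print(mu, round(sigma, 4), p_exactly_3)
```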
Continuous Random Variables
- Normal Distributions: unimodal, bell-shaped curves.
Normal Distributions
- Assess normality with graphs: dotplots, boxplots, histograms, or a normal probability plot.
Sampling Distributions
- Central Limit Theorem: when n is large (rule of thumb: n > 30), the sampling distribution of the sample mean is approximately normal, regardless of the population's shape.
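A small simulation can make the Central Limit Theorem concrete. The sketch assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), a sample size of 40, and 2000 repetitions; all of these choices, and the seed, are arbitrary. The sample means should center near 1 with standard deviation near 1 / sqrt(40).

```python
import random
import statistics

rng = random.Random(1)   # fixed seed (arbitrary)

def sample_mean(n):
    """Mean of one random sample of size n from an exponential population (mean 1)."""
    return statistics.mean([rng.expovariate(1.0) for _ in range(n)])

n = 40
means = [sample_mean(n) for _ in range(2000)]

center = statistics.mean(means)    # should be close to the population mean, 1
spread = statistics.stdev(means)   # should be close to 1 / sqrt(40), about 0.158

print(round(center, 3), round(spread, 3))
```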
Confidence Intervals
- Steps: Assumptions, Calculations, Conclusion.
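As one possible worked example of the three steps, the sketch below builds a 95% one-proportion z-interval. The choice of interval type, the data (520 successes in 1000 trials), and the large-counts check (at least 10 successes and 10 failures) are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical data: 520 successes out of a random sample of 1000.
successes, n = 520, 1000
confidence = 0.95

# Step 1, Assumptions: random sample (assumed) and large counts.
p_hat = successes / n
assumptions_ok = n * p_hat >= 10 and n * (1 - p_hat) >= 10

# Step 2, Calculations: p-hat plus or minus z* times the standard error.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

# Step 3, Conclusion: state the interval in context.
print(f"We are 95% confident the true proportion is between "
      f"{lower:.3f} and {upper:.3f}.")
```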
T-Distributions
- Compared to standard normal curve: centered around 0, more spread out and shorter, more area in the tails.
Hypothesis Tests
- Null Hypothesis: H_0
- Alternate Hypothesis: H_a
- P-Value: probability, assuming H_0 is true, of getting a result at least as extreme as the one observed.
- Steps: Assumptions, Hypotheses, Calculations, Conclusion.
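The four steps can be sketched with a one-sided, one-proportion z-test. The test type, hypotheses (H_0: p = 0.5, H_a: p > 0.5), data (520 successes in 1000 trials), and significance level 0.05 are all assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypotheses (assumed): H_0: p = 0.5 versus H_a: p > 0.5.
p0 = 0.5
successes, n = 520, 1000   # hypothetical data
alpha = 0.05               # assumed significance level

# Assumptions: random sample (assumed) and large expected counts under H_0.
assumptions_ok = n * p0 >= 10 and n * (1 - p0) >= 10

# Calculations: z statistic uses the standard error computed under H_0.
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)   # upper-tail area for H_a: p > p0

# Conclusion: reject H_0 only if the p-value is below alpha.
reject_h0 = p_value < alpha
print(round(z, 3), round(p_value, 3), reject_h0)
```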
Type I and II Errors
- Type I Error: reject H_0 when H_0 is true (probability is \alpha).
- Type II Error: fail to reject H_0 when H_0 is false (probability is \beta).
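A simulation can show that the Type I error rate matches \alpha. The sketch repeatedly tests a true null hypothesis with a two-sided z-test for a mean with known \sigma; the population (normal, mean 0, standard deviation 1), sample size, number of repetitions, and seed are arbitrary assumptions.

```python
import random
from statistics import NormalDist, mean

rng = random.Random(7)   # fixed seed (arbitrary)
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value

# Assumed setting: H_0: mu = 0 is TRUE, sigma = 1 is known, n = 25.
n, mu0, sigma = 25, 0.0, 1.0
reps = 2000

rejections = 0
for _ in range(reps):
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    z = (mean(sample) - mu0) / (sigma / n ** 0.5)
    if abs(z) > z_crit:
        rejections += 1   # rejecting a true H_0 is a Type I error

type_i_rate = rejections / reps   # should land near alpha
print(type_i_rate)
```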
Chi-Square Tests
- Goodness of Fit (univariate).
- Independence (bivariate).
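For the goodness-of-fit case, the test statistic is the sum of (observed - expected)^2 / expected over all categories, with degrees of freedom equal to the number of categories minus 1. The sketch below assumes a made-up experiment of 120 rolls of a die that is fair under H_0. The Python standard library has no chi-square distribution, so the p-value would come from a table or calculator.

```python
# Hypothetical observed counts for faces 1..6 over 120 rolls.
observed = [18, 22, 20, 25, 15, 20]

# Under H_0 (fair die) each face is expected 120 / 6 = 20 times.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1   # categories minus 1

# Critical value for df = 5 at the 0.05 level, taken from a chi-square table.
critical_value = 11.07
reject_h0 = chi_sq > critical_value

print(round(chi_sq, 2), df, reject_h0)
```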