AP Statistics Unit 2 Notes: Understanding Linear Regression for Two-Variable Data

Last updated 3:08 PM on 3/12/26

25 Terms

1

Regression line

A mathematical model that describes how a response variable (y) tends to change as an explanatory variable (x) changes; summarizes an overall linear trend, not a perfect rule for every point.

2

Explanatory variable (x)

The quantitative variable used to explain or predict changes in another variable; the input of a regression model.

3

Response variable (y)

The quantitative variable being predicted or explained by x; the output of a regression model.

4

Least-squares regression line (LSRL)

The regression line that minimizes the sum of squared residuals; written as ŷ = a + bx.

5

Predicted value (ŷ, “y-hat”)

The value of y predicted by the regression equation for a given x.

6

Least squares

A fitting method that chooses the slope and intercept to make the overall vertical prediction errors as small as possible by minimizing the sum of squared residuals.

7

Residual (e)

The vertical prediction error for a point: e = y − ŷ.

8

Sum of squared residuals

The quantity minimized by the LSRL: Σ(y − ŷ)²; squaring prevents cancellation and penalizes large errors.
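
As a rough illustration of residuals and the sum of squared residuals, the sketch below uses a small made-up data set (the values and the helper name `sum_squared_residuals` are assumptions for illustration, not part of the card set). It compares the least-squares line for that data with two slightly different lines.

```python
# Hypothetical data: x = hours studied, y = quiz score.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def sum_squared_residuals(a, b, xs, ys):
    """Return Σ(y − ŷ)² for the line ŷ = a + b·x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# For this data the LSRL works out to ŷ = 2.2 + 0.6x.
sse_lsrl = sum_squared_residuals(2.2, 0.6, xs, ys)

# Two nearby lines: one with a steeper slope, one with a higher intercept.
sse_steeper = sum_squared_residuals(2.2, 0.7, xs, ys)
sse_higher = sum_squared_residuals(2.4, 0.6, xs, ys)

print(sse_lsrl, sse_steeper, sse_higher)
```

Both alternative lines give a larger sum of squared residuals than the LSRL, which is what "least squares" means.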

9

Slope (b) of the LSRL

The change in predicted y for a 1-unit increase in x; computed by b = r(sy/sx).

10

Intercept (a) of the LSRL

The predicted value of y when x = 0; computed by a = ȳ − b x̄; meaningful only if x = 0 lies within (or close to) the range of observed x-values.
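
The slope and intercept formulas can be checked by hand on a small example. The sketch below assumes a made-up data set and computes r as the sum of z-score products divided by n − 1, then applies b = r(sy/sx) and a = ȳ − b x̄.

```python
import statistics

# Hypothetical data, chosen only to keep the arithmetic simple.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)
s_x = statistics.stdev(xs)  # sample standard deviation of x
s_y = statistics.stdev(ys)  # sample standard deviation of y

# Correlation: sum of the products of z-scores, divided by n − 1.
r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)) / (n - 1)

b = r * (s_y / s_x)    # slope
a = y_bar - b * x_bar  # intercept

print(f"r = {r:.4f}, b = {b:.4f}, a = {a:.4f}")
```

For this data the result is approximately r = 0.7746, b = 0.6, and a = 2.2, so the line is ŷ = 2.2 + 0.6x.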

11

Correlation (r)

A measure of the direction and strength of linear association between x and y; its sign matches the sign of the regression slope.

12

Standard deviation of x (sx)

A measure of the spread of the explanatory variable x; used in the slope formula b = r(sy/sx).

13

Standard deviation of y (sy)

A measure of the spread of the response variable y; used in the slope formula b = r(sy/sx).

14

Point (x̄, ȳ)

The mean point of the data; the LSRL always passes through (x̄, ȳ).
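
One way to see why the line passes through (x̄, ȳ): because a = ȳ − b x̄, plugging x = x̄ into ŷ = a + bx gives ŷ = ȳ. The sketch below checks this numerically on a made-up data set. It computes the slope from deviation sums, which is algebraically equivalent to b = r(sy/sx).

```python
import statistics

# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar = statistics.mean(xs)
y_bar = statistics.mean(ys)

# Slope from deviation sums: Σ(x − x̄)(y − ȳ) / Σ(x − x̄)².
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
b = s_xy / s_xx
a = y_bar - b * x_bar

# Prediction at the mean of x should equal the mean of y.
y_hat_at_mean = a + b * x_bar
print(y_hat_at_mean, y_bar)
```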

15

Horizontal LSRL when r = 0

If r = 0, then b = 0 and the regression line is ŷ = ȳ (a horizontal line at the mean of y).

16

Slope interpretation

For each 1-unit increase in x, the predicted value of y changes by b units, on average; state the interpretation in the context of the problem, including the units of both variables.

17

Intercept interpretation

When x = 0, the predicted value of y is a; can be misleading if x = 0 is outside the observed x-range (extrapolation issue).

18

Coefficient of determination (r²)

The proportion of variability in y explained by the linear regression of y on x (e.g., r² = 0.64 means about 64% of the variability in y is explained by the linear model).
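
A small numeric check of the "proportion explained" reading, assuming a made-up data set: r² can be computed as 1 − (sum of squared residuals) / (total sum of squared deviations of y from ȳ). The variable names `sse` and `sst` are illustrative labels, not notation from the card set.

```python
# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

# Unexplained variation: squared residuals around the line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
# Total variation: squared deviations of y around its mean.
sst = sum((y - y_bar) ** 2 for y in ys)

r_squared = 1 - sse / sst
print(f"r^2 = {r_squared:.2f}")
```

Here r² comes out to about 0.60, so roughly 60% of the variability in y is accounted for by the linear model in this example.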

19

Residual plot

A graph of residuals versus x (or versus ŷ), showing points (x, e); used to assess whether a linear model is appropriate.

20

Random scatter around 0 (in a residual plot)

A desirable pattern indicating the linear model captures the main trend and leftover variation looks like random noise.

21

Nonlinearity (curvature)

A departure from linearity where residuals show a curved pattern (e.g., positive then negative then positive), suggesting a straight-line model is missing structure.

22

Nonconstant variance (changing spread)

A pattern where residuals “fan out” or “fan in,” indicating the variability of y changes across x and prediction reliability may vary by x.

23

Standard deviation of the residuals (s)

A measure of typical prediction error in y-units: s = sqrt(Σe²/(n−2)); describes typical distance of observed y values from the regression line.
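
The formula s = sqrt(Σe²/(n−2)) can be applied directly once the residuals are known. The sketch below assumes a made-up data set whose LSRL is ŷ = 2.2 + 0.6x.

```python
import math

# Hypothetical data; the LSRL for these points is ŷ = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6
n = len(xs)

# Residuals e = y − ŷ for each point.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Standard deviation of the residuals, dividing by n − 2.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
print(f"s = {s:.3f}")
```

For this data s is about 0.894, meaning observed y-values typically fall about 0.9 y-units from the line.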

24

High leverage point

A point with an extreme x-value compared to the rest of the data; can strongly pull the regression line because it is far out in the x-direction.

25

Influential point

A point that noticeably changes the regression line (slope and/or intercept) when it is removed; high-leverage points are often influential, but not always.
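
To see influence in action, one approach is to fit the line with and without a suspect point and compare. The sketch below uses made-up data and a hypothetical helper `fit_lsrl`. The added point (15, 3) has an extreme x-value (high leverage) and does not follow the trend of the other points.

```python
def fit_lsrl(xs, ys):
    """Return (a, b) for the least-squares line ŷ = a + b·x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    return y_bar - b * x_bar, b


# Hypothetical data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Fit without the extreme point, then with (15, 3) added.
a_without, b_without = fit_lsrl(xs, ys)
a_with, b_with = fit_lsrl(xs + [15], ys + [3])

print(f"without: y-hat = {a_without:.2f} + {b_without:.2f}x")
print(f"with:    y-hat = {a_with:.2f} + {b_with:.2f}x")
```

In this example the slope drops from 0.6 to slightly below zero once the extreme point is included, so that point is influential.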