Regression line
a line that describes how a response variable y changes as an explanatory variable x changes. Regression lines are expressed in the form y hat = a + bx, where y hat is the predicted value of y for a given value of x
Extrapolation
the use of a regression line for prediction outside the interval of x-values used to obtain the line. The further we extrapolate, the less reliable the predictions
Residual
the difference between the actual value of y and the value of y predicted by the regression line; residual = actual y - predicted y
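A minimal sketch of the residual calculation, assuming a hypothetical line with a = 2.2 and b = 0.6 (these coefficients and the two points are made up for illustration; `predict` and `residual` are illustrative names):

```python
# Hypothetical line y_hat = a + b*x; the coefficients are made up for illustration.
a = 2.2   # y-intercept
b = 0.6   # slope

def predict(x):
    """Predicted value of y (y hat) for a given x."""
    return a + b * x

def residual(x, actual_y):
    """residual = actual y - predicted y"""
    return actual_y - predict(x)

# A point above the line has a positive residual; a point below it has a negative one.
above = residual(3, 5.0)   # actual y = 5.0, predicted y is about 4.0
below = residual(1, 2.0)   # actual y = 2.0, predicted y is about 2.8
print(above, below)
```

A positive residual means the line underpredicted the actual value; a negative residual means it overpredicted.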
y-intercept
in the regression line y hat = a + bx, a is the y-intercept: the predicted value of y when x = 0
Slope
in the regression line y hat = a + bx, b is the slope: the amount by which the predicted value of y changes when x increases by 1 unit
Least-squares regression line
is the line that makes the sum of the squared residuals as small as possible
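One way to see this is to compute the line directly. The sketch below uses the standard closed-form formulas for the slope and y-intercept (not stated in this list) on a small made-up data set, then checks that nudging either coefficient makes the sum of squared residuals larger. The function names are illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y_hat = a + b*x that minimizes the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # y-intercept: the line passes through (x_bar, y_bar)
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of (actual y - predicted y)^2 for the line y_hat = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
shifted = sum_squared_residuals(xs, ys, a + 0.1, b)   # same slope, line moved up
tilted = sum_squared_residuals(xs, ys, a, b - 0.1)    # same intercept, flatter line
print(a, b, best, shifted, tilted)
```

For this data the fitted line is approximately y hat = 2.2 + 0.6x, and both altered lines give a larger sum of squared residuals.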
Residual plot
a scatterplot that displays the residuals on the vertical axis and the explanatory variable on the horizontal axis
Standard deviation of the residuals s
measures the size of a typical residual; that is, s measures the typical distance between the actual y-values and the predicted y-values
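A short sketch of how s can be computed, assuming the usual formula s = sqrt(sum of squared residuals / (n - 2)), which this list does not state. The data are made up, and the coefficients a = 2.2 and b = 0.6 are the least-squares values for that data.

```python
import math

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares line for this data: y_hat = 2.2 + 0.6x
a, b = 2.2, 0.6

# residual = actual y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
sum_sq = sum(e ** 2 for e in residuals)

# Divide by n - 2 rather than n (assumed formula; two coefficients were estimated).
s = math.sqrt(sum_sq / (len(xs) - 2))
print(s)
```

Here s comes out a little under 0.9, so the actual y-values typically differ from the predicted y-values by roughly 0.9 units.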
Coefficient of determination r²
measures the percent reduction in the sum of squared residuals when using the least-squares regression line to make predictions, rather than the mean value of y; measures the proportion of the variation in the response variable that is accounted for by the least-squares regression line
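The "percent reduction" reading can be sketched directly: compare the sum of squared residuals from the least-squares line with the sum of squared residuals from always predicting the mean of y. The data are made up, and a = 2.2, b = 0.6 are the least-squares coefficients for that data.

```python
# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares line for this data: y_hat = 2.2 + 0.6x
a, b = 2.2, 0.6
y_bar = sum(ys) / len(ys)

# Squared prediction errors when using the regression line.
sse_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Squared prediction errors when always predicting the mean of y.
sse_mean = sum((y - y_bar) ** 2 for y in ys)

# Fractional reduction in squared error from using the line instead of the mean.
r_squared = 1 - sse_line / sse_mean
print(r_squared)
```

For this data r² is about 0.6, so roughly 60% of the variation in y is accounted for by the least-squares regression line.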
High Leverage
points with this in regression have much larger or much smaller x-values than the other points in the data set
Outlier
in regression is a point that does not follow the pattern of the data and has a large residual
Influential point
in regression is any point that, if removed, substantially changes the slope, y-intercept, correlation, coefficient of determination, or standard deviation of the residuals
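A rough way to check whether a point is influential is to fit the line with and without it and compare the results. The sketch below does this for the slope only, using made-up data in which the last point has a much larger x-value than the rest (high leverage); `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b

# Made-up data; the last point (15, 3) sits far to the right of the others.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 4, 5, 4, 5, 3]

a_with, b_with = fit_line(xs, ys)
a_without, b_without = fit_line(xs[:-1], ys[:-1])   # refit with the last point removed

print(b_with, b_without)
```

With the point included the slope is slightly negative; with it removed the slope is about 0.6. A change that large suggests the point is influential.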