Flashcards to review the concept of R-squared, explained/unexplained variation, and the worked example from the notes.
What does R-squared measure in linear regression?
The percentage of variability of y that is explained by the model.
R-squared can be written as 1 minus the ratio of what two quantities?
Unexplained variation to total variation (SSE/SST).
What is the formula for the unexplained variation (SSE) in the notes?
SSE = (1/n) sum (y - y_hat)^2.
Why is the average of prediction errors not included in SSE?
Because the average prediction error is zero for a least-squares fit with an intercept, so subtracting it changes nothing.
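The zero-mean claim can be checked on the worked example from these notes. This is a minimal sketch; the helper name `predict` is made up for illustration, and the data and coefficients are the ones given in the example cards.

```python
# Data and model taken from the worked example in these notes.
x = [1, 2, 3, 4]
y = [5, 6, 6, 9]


def predict(xi):
    """Hypothetical helper: the example model y_hat = 3.5 + 1.2 x."""
    return 3.5 + 1.2 * xi


# Prediction errors (residuals): roughly 0.3, 0.1, -1.1, 0.7.
residuals = [yi - predict(xi) for xi, yi in zip(x, y)]

# The average is zero up to floating-point rounding.
mean_residual = sum(residuals) / len(residuals)
print(residuals)
print(mean_residual)
```

Because the mean error is zero, the variance of the errors reduces to the average of the squared errors, which is the SSE formula on the previous card.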
How is the total variation of y expressed?
As the variance of y: (1/n) sum (y - y_mean)^2.
What are the two variance conventions mentioned, and when do they apply?
1/n vs 1/(n-1); 1/n is population variance, 1/(n-1) is sample variance.
Do 1/n and 1/(n-1) affect the value of R-squared?
No; the factor cancels between numerator and denominator, as long as the same convention is used for both.
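The cancellation can be written out explicitly. This is a short derivation using the notes' definitions, with \hat{y}_i standing for y_hat and \bar{y} for y_mean; the same steps hold if 1/n is replaced by 1/(n-1) in both places.

```latex
R^2
  = 1 - \frac{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
             {\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2}
  = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}
             {\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

For the example data this gives 1 - 0.45/2.25 with the 1/n factor and 1 - 1.8/9 without it; both equal 0.8.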
What are the x-values in the example?
x = 1, 2, 3, 4.
What are the corresponding y-values in the example?
y = 5, 6, 6, 9.
What is the regression model used in the example?
y_hat = 3.5 + 1.2 x.
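The notes do not say how the coefficients were obtained. Assuming the model is an ordinary least-squares fit to the four data points, a minimal sketch using the closed-form slope and intercept formulas reproduces them; the variable names are illustrative.

```python
# Data from the worked example in these notes.
x = [1, 2, 3, 4]
y = [5, 6, 6, 9]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Assumption: the model was fitted by ordinary least squares.
# slope = sum((x - x_mean)(y - y_mean)) / sum((x - x_mean)^2)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)

slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Matches y_hat = 3.5 + 1.2 x up to floating-point rounding.
print(intercept, slope)
```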
What is SSE in the example?
SSE = 1.8 as a raw sum of squared errors (0.45 after applying the 1/n factor from the formula).
How is the mean of y computed in the example?
Mean y = (5+6+6+9)/4 = 6.5.
What is SST in the example?
SST = 9 as a raw sum of squared deviations from the mean (2.25 after applying the 1/n factor).
How is R-squared calculated in the example?
R^2 = 1 - SSE/SST = 0.8.
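The whole calculation can be reproduced in a few lines. This is a sketch using the example data and model from these notes; the function name `r_squared` is made up for illustration, and SSE and SST are computed as raw sums since the 1/n factor cancels.

```python
def r_squared(y, y_hat):
    """Return (R^2, SSE, SST) using raw sums of squares."""
    y_mean = sum(y) / len(y)
    # Unexplained variation: squared distance from y to the prediction.
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    # Total variation: squared distance from y to its mean.
    sst = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - sse / sst, sse, sst


# Data and model from the worked example.
x = [1, 2, 3, 4]
y = [5, 6, 6, 9]
y_hat = [3.5 + 1.2 * xi for xi in x]

r2, sse, sst = r_squared(y, y_hat)

# Rounded to absorb floating-point noise: SSE 1.8, SST 9.0, R^2 0.8.
print(round(sse, 6), round(sst, 6), round(r2, 6))
```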
What is the numerical value of R-squared in the example?
0.8 (or 80%).
What does an R-squared of 0.8 mean?
80% of the variance in y is explained by the model.
What does SSE measure conceptually?
Unexplained variation: the squared prediction errors, summed (or averaged) over all data points.
What does SST measure conceptually?
Total variation of y around its mean.
What is y_hat?
The predicted value of y from the regression model.
How do you compute the predicted values for x=1..4 in the example?
Plug into y_hat = 3.5 + 1.2 x to get 4.7, 5.9, 7.1, 8.3.
What is the broader interpretation of R-squared?
The proportion of total variability in y explained by the model.
How does the example illustrate the R-squared comparison?
It compares how close y is to y_hat (SSE) versus how close y is to y_mean (SST).
What is the mean of y in the dataset?
6.5.
What is the predicted value when x equals 1 in the example?
4.7.
How many data points are used in the example?
Four.
What does an 80% explanation imply about model fit?
The model explains most but not all of the variability.
In summary, how are SSE and SST related to R-squared?
R^2 = 1 - SSE/SST; decreasing SSE relative to SST increases R^2 toward 1.