Cook’s distance
a measure of a single point’s “pull” on the regression line (how much the fitted estimates change if that point is removed)
covariance
extent to which two variables’ deviation scores vary together (positive = when one is high, the other tends to be high; negative = when one is high, the other tends to be low; 0 = no covariance) (do they match?)
correlation [r]
a standardized measure of the strength of an association (covariance divided by the product of the SDs); also the average of the products of paired z-scores (zx × zy)
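The two definitions on this card can be checked directly. A minimal sketch (population formulas, dividing by n; the data in the checks are made up for illustration):

```python
# Correlation computed two ways: as standardized covariance, and as
# the average of the products of paired z-scores.

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return (sum((a - m) ** 2 for a in v) / len(v)) ** 0.5

def correlation(x, y):
    # covariance divided by the product of the standard deviations
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (sd(x) * sd(y))

def correlation_z(x, y):
    # average of the products of paired z-scores
    zx = [(a - mean(x)) / sd(x) for a in x]
    zy = [(b - mean(y)) / sd(y) for b in y]
    return mean([a * b for a, b in zip(zx, zy)])
```

Both functions return the same value for any data set, which is the point of the card.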
Spearman’s rho (ρ)
a rank-order correlation; useful for finding the strength of non-linear (monotonic) relationships. Statistical significance is a separate question, tested with a p-value; by convention, if p < 0.05 the result is significant
Linear regression model
ŷi = b0 + b1xi (predicted y = intercept + slope × x)
Quadratic regression model
ŷi = b0 + b1xi², where b1 determines the shape of the curve
E
residual error: observed minus predicted, e = y − ŷ
Sum of Squared Errors (SSE)
measures the spread of the residuals: SSE = Σ(yi − ŷi)², the sum of the squared residuals
Mean Square Error
the average amount of spread: the SSE divided by the number of observations (MSE = SSE / n); the goal is to find the line with the lowest MSE
Least Squares Estimation
method for finding the b0 and b1 that give the smallest MSE
Brute Force Method
start with a possible value for b1, find the residuals and compute the MSE, then adjust b1 and repeat until you find the lowest MSE
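The brute-force card can be sketched as a grid search over candidate slopes. The data, search bounds, and step size below are made up for illustration, and b0 is held fixed so the sweep stays one-dimensional:

```python
# Brute-force least squares: sweep candidate slopes, compute the MSE
# for each, and keep the slope with the smallest MSE.

def mse(x, y, b0, b1):
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)

def brute_force_slope(x, y, b0=0.0, lo=-5.0, hi=5.0, step=0.01):
    best_b1, best_mse = lo, float("inf")
    b1 = lo
    while b1 <= hi:
        m = mse(x, y, b0, b1)
        if m < best_mse:
            best_b1, best_mse = b1, m
        b1 += step
    return best_b1, best_mse
```

The answer is only as precise as the step size, which is why the analytic method on the next card is preferred.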
Analytic method
first find b1 using b1 = r × (SDy / SDx), then plug it into b0 = My − b1Mx to find b0
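A sketch of the analytic formulas on this card, using population (divide-by-n) statistics; the data in the check are made up:

```python
# Analytic least squares: b1 = r * (SDy / SDx), then b0 = My - b1 * Mx.

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return (sum((a - m) ** 2 for a in v) / len(v)) ** 0.5

def analytic_fit(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    r = cov / (sd(x) * sd(y))
    b1 = r * (sd(y) / sd(x))   # slope
    b0 = my - b1 * mx          # intercept
    return b0, b1
```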
Z-scored regression
always has an intercept of 0 and a slope equal to the correlation, so b0 = 0 and b1 = r. The lowest-MSE prediction line is ẑy = r × zxi (predicted z-score of y = correlation × z-score of x)
Mean centering
subtract the mean of x from every x value (xi − Mx); this makes b0 the predicted y at the mean of x, making it directly interpretable
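A sketch of the mean-centering card: after centering x, the fitted intercept equals My, the predicted y at the mean of x (the least-squares line passes through (Mx, My)). The slope and intercept formulas follow the analytic-method card; the data in the check are made up:

```python
# Mean centering: fit after subtracting Mx from every x value, so the
# intercept b0 becomes the predicted y at the mean of x.

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return (sum((a - m) ** 2 for a in v) / len(v)) ** 0.5

def analytic_fit(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    r = cov / (sd(x) * sd(y))
    b1 = r * (sd(y) / sd(x))
    return my - b1 * mx, b1  # (b0, b1)

def fit_mean_centered(x, y):
    mx = mean(x)
    xc = [a - mx for a in x]   # mean-center x
    return analytic_fit(xc, y) # b0 now equals My
```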
R²
a scaled index of model fit: R² = 1 − MSE/Var(y). When the model is perfect, R² = 1; when it does no better than the mean, R² = 0. ALSO equals the correlation squared (in simple regression)
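Both identities on the R² card can be verified on made-up data with a short sketch (population formulas throughout):

```python
# R² two ways: 1 - MSE/Var(y), and (in simple regression) r squared.

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / len(v)

def corr(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (var(x) ** 0.5 * var(y) ** 0.5)

def r_squared(x, y):
    # fit the least-squares line, then apply R² = 1 - MSE / Var(y)
    r = corr(x, y)
    b1 = r * (var(y) ** 0.5 / var(x) ** 0.5)
    b0 = mean(y) - b1 * mean(x)
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / len(x)
    return 1 - mse / var(y)
```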
Comparing models
compare the R² values of different models; the one closest to 1 is the best fit. Multiple regression allows you to predict y from multiple x variables
Model Errors
E = measurement error plus variables not in the model; the residual left after accounting for the model
Q-Q plot
plot z-scored residuals against z-scores from a standard normal distribution; if the residuals are normal, the points should fall on a straight line with a slope of 1
homoscedasticity
the variance of the residuals should be about equal across all xi values
Linear check
predicted values (ŷ) should be equally likely to be bigger or smaller than the observed values of y across the range of x
overfitting
models with more predictors risk explaining what is actually just noise rather than something systematic, so they fail to describe the relationship between x and y in the real world
Adjusted R²
a modified version of R² that is higher for a more complex model only if the additional parameters improve fit more than would be expected by chance; if adding parameters raises R² but not adjusted R², the added parameters are not adding value
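The card does not give a formula, but one common definition of adjusted R² (an assumption here, not from the deck) penalizes the number of predictors p relative to the sample size n:

```python
# A common adjusted-R² formula (assumed, not stated on the card):
# adj R² = 1 - (1 - R²) * (n - 1) / (n - p - 1),
# where n = sample size and p = number of predictors.

def adjusted_r2(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

Because the penalty term is at least 1, adjusted R² is never larger than R², which is why a complex model must earn its extra parameters.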
interpolation
estimating y^ values for unmeasured xi values that are within the range of the data
extrapolation
estimating ŷ values for unmeasured xi values that are outside the range of the data; limited, because most variables cannot keep increasing or decreasing indefinitely
RMSE
root mean square error: the square root of the MSE (square it to get the MSE back); expressed in the same units as y
SSE
sum of squared errors: the residuals squared and summed
SSR
measures the distance between the predicted values on your regression line and the mean (average) of the data: SSR = Σ(ŷi − My)², the sum of squared predicted deviations; the variance of the predictions is the average of the SSR (SSR / n)
“Standardized” on Table
the standardized b1: the slope computed from z-scored variables (equal to r in simple regression)