Regression
Regression is a statistical tool that uses one or more predictor variables to forecast scores on an outcome variable.
Univariate Regression
Statistical tool using one predictor variable to forecast scores on an outcome variable.
Ingredients
Data represented by an equation for a linear relationship.
Line created from information about a predictor variable and the outcome variable
With that equation, we can take any score from the predictor variable and find a predicted score on the outcome variable
Least Squares
The line with the smallest total of squared deviations between the data points and the line
The line best fits the data when it minimizes the error
Error: the squared deviations between predictions and real observations
Imagine trying to match the data with a line, then looking at the amount of error
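As a rough illustration of the least squares idea above, the sketch below fits a line to a small made-up data set and checks that moving the slope away from the fitted value increases the total squared error. The `hours` and `scores` values and the helper names `fit_line` and `squared_error` are assumptions for this example only.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the line that minimizes the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by variation in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: chosen so the line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


def squared_error(xs, ys, b0, b1):
    """Total squared deviation between observed and predicted scores."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

b0, b1 = fit_line(hours, scores)
best = squared_error(hours, scores, b0, b1)
# Any other line, such as one with a steeper slope, has more error.
worse = squared_error(hours, scores, b0, b1 + 0.5)
print(best < worse)
```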
Univariate Regression Formula
Ŷ = b0 + b1X
Ŷ
Predicted value on outcome variable
b1
slope of the line
X
individual score on predictor variable
b0
y-intercept
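To make the formula concrete, here is a minimal sketch of using a regression line to predict a score. The coefficient values and the function name `predict` are hypothetical, chosen only for illustration.

```python
def predict(x, b0, b1):
    """Predicted outcome score (Y-hat) for a predictor score x."""
    return b0 + b1 * x


# Hypothetical coefficients, not taken from any real data set.
b0 = 2.2  # y-intercept: predicted score when x is 0
b1 = 0.6  # slope: change in the predicted score per one-unit change in x

at_zero = predict(0, b0, b1)  # equals the y-intercept
at_four = predict(4, b0, b1)  # intercept plus four slope steps
print(round(at_zero, 2), round(at_four, 2))
```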
Summary
Regression line
Straight line that best fits the data points in the scatterplot
Minimizes error over the long run
This line represents a model predicting a score on a specific outcome variable
Regression line does not go through every data point
Data points can fall above or below the line
The line always crosses the y-axis at the y-intercept (b0)
The regression line is a model based on the data
Might not reflect reality
Need to test how well the model fits the observed data
Sum of Squares
Summary
SSt
Total variability
Variability between scores and the mean
SSr
Residual/Error variability
Variability between the regression model and the actual data
SSm
Model variability
How much variability the model explains
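The three sums of squares can be sketched in a few lines of code. The example below uses a made-up data set and a hypothetical helper `sums_of_squares`. It takes SSm as the squared distance between the model's predictions and the mean, which is one common way of writing "variability the model explains", so that SSt = SSm + SSr for a least-squares line.

```python
def sums_of_squares(xs, ys):
    """Return (ss_t, ss_r, ss_m) for the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope and intercept.
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    predicted = [b0 + b1 * x for x in xs]

    ss_t = sum((y - mean_y) ** 2 for y in ys)                # scores vs. mean
    ss_r = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # scores vs. model
    ss_m = sum((p - mean_y) ** 2 for p in predicted)         # model vs. mean
    return ss_t, ss_r, ss_m


# Hypothetical data, invented for illustration.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

ss_t, ss_r, ss_m = sums_of_squares(hours, scores)
print(round(ss_t, 2), round(ss_r, 2), round(ss_m, 2))
```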
Testing the Regression
If the model is strong, then we expect SSm to be much greater than SSr
Signal = MSm
Amount of variance in the outcome variable that is explained by the model
Noise = MSr
Difference between the model and the observed data
R²
The proportion of variance accounted for by the regression model
Effect size
The correlation coefficient squared
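The signal-to-noise comparison and R² can be sketched as below. This is an illustration under stated assumptions: the sums of squares are hypothetical numbers, `regression_test` is an invented helper name, and the degrees of freedom follow the usual convention (k for the model, n - k - 1 for the residual), which the notes above do not spell out.

```python
def regression_test(ss_m, ss_r, n, k=1):
    """Return (ms_m, ms_r, f_ratio, r_squared) from the sums of squares.

    n is the number of observations, k the number of predictors.
    """
    df_m = k            # model degrees of freedom
    df_r = n - k - 1    # residual degrees of freedom
    ms_m = ss_m / df_m  # signal: mean square for the model
    ms_r = ss_r / df_r  # noise: mean square for the residual
    f_ratio = ms_m / ms_r
    r_squared = ss_m / (ss_m + ss_r)  # SSm divided by SSt
    return ms_m, ms_r, f_ratio, r_squared


# Hypothetical sums of squares for 5 observations and 1 predictor.
ms_m, ms_r, f_ratio, r_squared = regression_test(ss_m=3.6, ss_r=2.4, n=5)
print(round(f_ratio, 2), round(r_squared, 2))
```

A larger F means the model explains more variance relative to the error left over. Whether it is statistically significant still depends on comparing it with the F distribution for those degrees of freedom.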
• Formula: Ŷ = b0 + b1X
• Ŷ - predicted value on outcome variable
• b1 - slope of the line
• X - individual score on predictor variable
• b0 - y-intercept
• R²: Effect size; % of variability accounted for
• F: Significance of regression equation
• β: Standardized slope; indicates relationship between predictor and outcome variable
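As a small sketch of how the standardized slope relates to the other numbers, the code below rescales the raw slope by the standard deviations of the two variables. With a single predictor, this β equals the correlation coefficient r, so β² matches R². The data and the helper name `standardized_slope` are assumptions made for this example.

```python
import math


def standardized_slope(xs, ys):
    """Return (b1, beta, r) for a one-predictor regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)

    b1 = sxy / sxx                    # raw slope, in the original units
    sd_x = math.sqrt(sxx / (n - 1))   # sample standard deviation of x
    sd_y = math.sqrt(syy / (n - 1))   # sample standard deviation of y
    beta = b1 * sd_x / sd_y           # slope in standard-deviation units
    r = sxy / math.sqrt(sxx * syy)    # Pearson correlation coefficient
    return b1, beta, r


# Hypothetical data, invented for illustration.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

b1, beta, r = standardized_slope(hours, scores)
print(round(b1, 2), round(beta, 3), round(r, 3))
```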
What to know
Understand what the regression line is doing
The best-fitting line
How to use the regression line
What each of the numbers in the line refers to
To predict the value of y given an amount for x
To know what the value of y is when x is zero (the y-intercept)
Understand what R² indicates
Be able to interpret an APA-style write-up of a regression