Regression

Regression is a statistical tool that uses one or more predictor variables to forecast scores on an outcome variable.

Univariate Regression

Statistical tool using one predictor variable to forecast scores on an outcome variable.

Ingredients

  • Data represented by an equation for a linear relationship.

  • Line created from info from a predictor variable and the outcome variable

  • With that equation, we can take any score from the predictor variable and find a predicted score on the outcome variable

Least Squares

The line with the smallest total of squared deviations between the data points and the line

  • Line best fits the data when it minimizes the error

  • The squared deviations between predictions and real observations

Imagine trying to match the data with a line, then looking at the amount of error
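The idea of trying a line and then looking at its error can be sketched in a few lines of Python. This is only an illustration: the data are hypothetical, the helper name sum_squared_error is made up for this example, and the two "guess" lines are arbitrary. The third line (intercept 2.2, slope 0.6) is the least-squares line for this particular data set.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

def sum_squared_error(b0, b1):
    """Total squared deviation between the observed y values and the line b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Two arbitrary guesses at a line, and the least-squares line for this data.
guess_a = sum_squared_error(1.0, 1.0)
guess_b = sum_squared_error(3.0, 0.5)
best = sum_squared_error(2.2, 0.6)

print(guess_a)          # 4.0
print(guess_b)          # 3.75
print(round(best, 2))   # 2.4
```

Any other straight line tried on this data gives a total squared error larger than 2.4.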

Univariate Regression Formula

Ŷ = b0 + b1X

Ŷ

  • Predicted value on outcome variable

b1

  • slope of the line

X

  • individual score on predictor variable

b0

  • y-intercept
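As a minimal sketch of how the pieces of the formula fit together, the code below estimates b1 and b0 and then predicts a score. The data are hypothetical, and the slope and intercept formulas used here are the standard least-squares ones, which these notes do not spell out. The function name predict is made up for this example.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]   # scores on the predictor variable
y = [2, 4, 5, 4, 5]   # scores on the outcome variable

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Slope: how x and y vary together, divided by how much x varies.
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))

# Intercept: the predicted outcome when the predictor is zero.
b0 = mean_y - b1 * mean_x

def predict(score):
    """Predicted outcome (Y-hat) for a given predictor score X."""
    return b0 + b1 * score

print(round(b1, 2), round(b0, 2))   # 0.6 2.2
print(round(predict(6), 2))         # 5.8
```

Here predict(0) returns b0, which is what the y-intercept means: the predicted outcome when X is zero.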

Summary

Regression line

  • Straight line that best fits the data points in the scatterplot

  • Minimizes error over the long run

  • This line represents a model predicting a score on a specific outcome variable

Regression line does not go through every data point

  • Data points can fall above or below the line

  • Will always go through the y-intercept

The regression line is a model based on the data

Might not reflect reality

Need to test how well the model fits the observed data

Sum of Squares

Summary

SSt

  • Total variability

  • Variability between scores and the mean

SSr

  • Residual/Error variability

  • Variability between the regression model and the actual data

SSm

  • Model variability

  • How much variability the model explains
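The three sums of squares can be computed directly from their descriptions above. The sketch below uses hypothetical data and fits the line with the standard least-squares formulas (an assumption, since the notes do not state them). Variable names such as ss_total are made up for this example.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Fit the regression line (standard least-squares formulas).
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

# SSt: variability between the scores and the mean.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
# SSr: variability between the regression model and the actual data.
ss_resid = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# SSm: variability the model explains (predictions versus the mean).
ss_model = sum((pi - mean_y) ** 2 for pi in predicted)

print(round(ss_total, 2), round(ss_model, 2), round(ss_resid, 2))   # 6.0 3.6 2.4
```

For a least-squares line, SSt = SSm + SSr, which is why the model and residual pieces here add up to the total.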

Testing the Regression

If the model is strong, then we expect SSm to be much greater than SSr

Signal = MSm

  • Amount of variance in the outcome variable that is explained by the model

Noise = MSr

  • Difference between the model and the observed data

R²

  • The proportion of variance accounted for by the regression model

    • Effect size

    • The correlation coefficient squared
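The signal-to-noise comparison and the proportion of variance explained can be sketched as follows. The data are hypothetical. The degrees of freedom (1 for the model with one predictor, n - 2 for the residual) and the ratio F = MSm / MSr are the standard choices for univariate regression and are assumptions here, since the notes do not list them.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Fit the regression line (standard least-squares formulas).
b1 = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
      / sum((xi - mean_x) ** 2 for xi in x))
b0 = mean_y - b1 * mean_x
predicted = [b0 + b1 * xi for xi in x]

ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_model = sum((pi - mean_y) ** 2 for pi in predicted)
ss_resid = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

df_model = 1        # one predictor variable
df_resid = n - 2    # n minus the two estimated values (b0 and b1)

ms_model = ss_model / df_model   # signal
ms_resid = ss_resid / df_resid   # noise

f_ratio = ms_model / ms_resid
r_squared = ss_model / ss_total

print(round(f_ratio, 2))     # 4.5
print(round(r_squared, 2))   # 0.6
```

An R² of 0.6 would be read as the model accounting for 60% of the variability in the outcome variable. Whether an F of 4.5 is significant depends on comparing it with the F distribution for these degrees of freedom, which this sketch does not do.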

  • Formula: Ŷ = b0 + b1X

  • Ŷ: predicted value on outcome variable

  • b1: slope of the line

  • X: individual score on predictor variable

  • b0: y-intercept

  • R²: Effect size; % of variability accounted for

  • F: Significance of regression equation

  • β: Standardized slope; indicates relationship between predictor and outcome variable
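To make the standardized slope concrete, the sketch below rescales b1 by the standard deviations of the two variables. The data are hypothetical, and the conversion β = b1 × (SD of X / SD of Y) is the standard one rather than something stated in these notes. With a single predictor, β works out to the same value as the correlation coefficient, and squaring it gives R².

```python
import math

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sum_xx = sum((xi - mean_x) ** 2 for xi in x)
sum_yy = sum((yi - mean_y) ** 2 for yi in y)

b1 = sum_xy / sum_xx                 # unstandardized slope
sd_x = math.sqrt(sum_xx / (n - 1))   # sample standard deviation of X
sd_y = math.sqrt(sum_yy / (n - 1))   # sample standard deviation of Y

beta = b1 * sd_x / sd_y                  # standardized slope
r = sum_xy / math.sqrt(sum_xx * sum_yy)  # Pearson correlation coefficient

print(round(beta, 3), round(r, 3))   # 0.775 0.775
print(round(beta ** 2, 2))           # 0.6
```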

What to know

Understand what the regression line is doing

  • The best-fitting line

How to use the regression line

  • What each of the numbers in the equation refers to

  • To predict the value of y given an amount for x

  • To know what the value of y is when x is zero (the y-intercept)

Understand what R² indicates

Be able to interpret an APA-style write-up of a regression