for multiple regression we look at
multiple predictor variables
What can multiple regression tell us?
tells us the relative importance of the predictor variables and whether the outcome variable is best predicted by a combination of predictor variables
in multiple regression we can control for variables when
testing the predictive power of other variables, i.e. we can get a better picture of the independent contribution of each predictor to the outcome
The general formula for the predictive equation for multiple regression
y = b0 + b1X1 + b2X2 + ... + bnXn
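A minimal sketch of fitting such a model in Python with statsmodels; the data frame and the variable names exam_score, study_hours and sleep_hours are invented for illustration:

```python
# Hypothetical example: predict exam_score from two predictor variables.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 100
df = pd.DataFrame({
    "study_hours": rng.uniform(0, 10, n),
    "sleep_hours": rng.uniform(4, 9, n),
})
# Simulate an outcome with known weights plus random noise.
df["exam_score"] = 40 + 3 * df["study_hours"] + 2 * df["sleep_hours"] + rng.normal(0, 5, n)

model = smf.ols("exam_score ~ study_hours + sleep_hours", data=df).fit()
print(model.params)  # b0 (Intercept) and the partial regression weights b1, b2
```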
the multiple regression predicts
outcome variable from multiple predictor variables
determines the degree of influence (i.e. the weight) each predictor variable has in determining the outcome variable
in a multiple regression the regression weights are partial…
taking into account the relation of each predictor variable with all other predictor variables
in a multiple regression we use intercept and partial regression weights to build
a model
multiple regression answers 2 main questions
how good is the overall model e.g. can the predictor variables predict the outcome variable? - look at variance explained and fit of the model
how good is each individual predictor e.g. can each predictor variable, individually, predict the outcome variable? - regression weight and sig level
variance explained and fit of the model - explains how good the model is - variance explained
variance explained - R squared
how well the regression line approximates the actual data points
it increases every time you add a new predictor variable into the regression model, even when the newly added predictor variable does not really add predictive value - this is a problem in MR - instead we use adjusted R squared
works for simple linear regression
variance explained and fit of the model - explains how good the model is - variance explained - adjusted R squared
only for multiple regression
adjusts for the number of predictor variables in multiple regression
variance explained and fit of the model - explains how good the model is - fit of the model
F and associated p value - whether the model explains a sig amount of variance
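Continuing the hypothetical statsmodels sketch above, these model-level quantities can be read directly from the fitted results object:

```python
# Overall model: variance explained and fit (model from the earlier hypothetical sketch).
print(model.rsquared)                # R squared
print(model.rsquared_adj)            # adjusted R squared = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
print(model.fvalue, model.f_pvalue)  # F statistic and its p value for the whole model
```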
how good is each individual predictor? - we can look at regression weights
focus on the unstandardised regression weight (b)
slope - the change in the outcome variable for a one-unit change in the predictor variable, holding the other predictors constant
standardised regression weight (beta) - when the predictor and the outcome variables are measured in standard scores (e.g. Z scores) - useful for assessing relative importance of predictor variables by putting all variables on the same scale
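One way to obtain standardised (beta) weights, assuming the hypothetical data frame from the earlier sketch, is to z-score every variable and refit the model:

```python
# Standardised (beta) weights: z-score all variables, then refit (hypothetical data).
z = (df - df.mean()) / df.std()
beta_model = smf.ols("exam_score ~ study_hours + sleep_hours", data=z).fit()
print(beta_model.params)  # betas are on the same scale, so their sizes can be compared
```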
how good is each individual predictor? - we can look at sig levels
whether each predictor variable is a sig predictor of the outcome variable
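In the hypothetical sketch above, these significance levels come from the t test on each coefficient:

```python
# t statistic and p value for each coefficient (hypothetical model from the earlier sketch).
print(model.tvalues)
print(model.pvalues)
```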
Multiple regression parametric assumptions - assumptions before data collection (design)
all observations are independent
outcome variable is interval or ratio
Multiple regression parametric assumptions - assumptions after data collection
linearity
minimal outliers
normal distribution of residuals
homoscedasticity
no multicollinearity
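A rough sketch of checking some of these assumptions on the hypothetical fitted model: a residuals-vs-fitted plot for linearity and homoscedasticity, and a Shapiro-Wilk test for normality of the residuals.

```python
# Hypothetical diagnostic checks on the fitted model's residuals.
import matplotlib.pyplot as plt
from scipy import stats

resid = model.resid
fitted = model.fittedvalues

# Residuals vs fitted: look for no curvature (linearity) and constant spread (homoscedasticity).
plt.scatter(fitted, resid)
plt.axhline(0, color="grey")
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()

# Shapiro-Wilk test: a significant p value suggests the residuals are not normally distributed.
print(stats.shapiro(resid))
```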
Multiple regression parametric assumptions - no multicollinearity
little shared variance among the predictors = no multicollinearity (M)
if there is overlap between predictors = shared variance = M
e.g. when the predictor variables are highly correlated with each other
Multiple regression parametric assumptions - no multicollinearity - why is this important
can lead to unreliable coefficient estimates - it is difficult to work out the individual effect of each predictor variable - leads to large standard errors for the coefficients, making them unstable and sensitive to small changes in the model
inflated variance - the variance of the coefficients increases, which reduces the precision of the estimates - leads to unreliable p values for the coefficients
Multiple regression parametric assumptions - no multicollinearity - 3 ways to detect
correlation coefficients between our predictor variables - if larger than 0.7, multicollinearity is likely
variance inflation factor (VIF) - how much the variance of the regression weight is inflated because of M - needs to be less than 10
tolerance - how much of a predictor variable's variance is unique, i.e. not shared with the other predictors (tolerance = 1/VIF) - should be high, above .20
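One common way to get these diagnostics in Python, assuming the hypothetical predictors from the earlier sketch, is statsmodels' variance_inflation_factor (tolerance is then 1/VIF):

```python
# VIF and tolerance for each predictor (hypothetical predictors from the earlier sketch).
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = sm.add_constant(df[["study_hours", "sleep_hours"]])
for i, name in enumerate(X.columns):
    if name == "const":
        continue  # skip the intercept column
    vif = variance_inflation_factor(X.values, i)
    print(name, "VIF =", round(vif, 2), "tolerance =", round(1 / vif, 2))
# Rules of thumb from above: VIF below 10 and tolerance above .20.
```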
if linearity assumption has been violated
use non-linear regression
if minimal outliers assumption has been violated
if 5% or more of the sample are identified as outliers, then you may want to exclude them from the analysis
if the normal distribution of residuals / homoscedasticity assumption has been violated
the removal of outliers often also resolves this
how to deal with a violation of the no multicollinearity assumption
include only one of the multicollinear predictor variables in the analysis, or combine the variables