correlational research
allows us to establish whether an association exists, but does not allow us to establish whether the association is causal.
can explore things that cannot ethically be looked at in experiments (e.g., drug addiction)
can explore things that cannot feasibly be looked at in experiments (e.g., effects too small to detect at practical sample sizes)
can explore things that cannot be put into a classic experimental framework (e.g., age)
Uses for correlational research
Exploring big data
Questionnaires/surveys
Secondary data analysis
Understanding the multivariate world
Predictions
Positive association
as X increases Y increases
Negative association
as X increases Y decreases
strong association
measurements near line of best fit (estimate accurate)
Weak association
lots of variance around line (not accurate)
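A minimal sketch (Python with NumPy/SciPy, made-up illustrative data) of how the sign and size of Pearson's r capture these four ideas: the sign gives the direction, the magnitude gives the strength.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
x = rng.normal(size=100)

# Strong positive: Y rises with X and points sit close to the line.
y_strong_pos = 2.0 * x + rng.normal(scale=0.5, size=100)
# Weak negative: Y falls with X but with lots of scatter around the line.
y_weak_neg = -0.5 * x + rng.normal(scale=2.0, size=100)

r1, _ = pearsonr(x, y_strong_pos)
r2, _ = pearsonr(x, y_weak_neg)
print(f"strong positive: r = {r1:.2f}")  # close to +1
print(f"weak negative:   r = {r2:.2f}")  # negative, closer to 0
```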
Aims of a regression model
Whether our model is a 'good fit', i.e. good at making predictions
Whether there are significant relationships between the predictor variable(s) and the outcome variable
The direction of these relationships
used to make predictions
A regression
identifies and quantifies the relationship between a dependent variable (the outcome) and one or more independent variables (predictors) to make predictions and understand patterns.
line of best fit equation
Y = bX + a (where b is the slope and a is the intercept)
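A short sketch (Python/NumPy, made-up data) of estimating the slope b and intercept a by least squares:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = 3.0 * x + 10.0 + rng.normal(size=50)  # data generated with b = 3, a = 10

# polyfit with degree 1 returns the least-squares slope and intercept.
b, a = np.polyfit(x, y, 1)
print(f"Y = {b:.2f}X + {a:.2f}")

y_hat = b * x + a  # predictions from the fitted line
```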
SSR
Sum of squared differences between the predicted values and the overall mean (the variance we can explain)
SSE
Sum of squared differences between the observed and predicted values, i.e. how far from perfect our line of best fit is (the variance we cannot explain)
we want SSR to be __ than SSE
bigger
SST
The total variance in the outcome; SST = SSR + SSE, and R2 = SSR/SST gives the coefficient of determination
what can R2 range from
0-1
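A sketch (same illustrative Python setup as above) of the variance decomposition, confirming SST = SSR + SSE and that R2 = SSR/SST lands between 0 and 1:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 3.0 * x + 10.0 + rng.normal(size=50)

b, a = np.polyfit(x, y, 1)
y_hat = b * x + a

ssr = np.sum((y_hat - y.mean()) ** 2)  # explained: predictions vs overall mean
sse = np.sum((y - y_hat) ** 2)         # unexplained: observations vs predictions
sst = np.sum((y - y.mean()) ** 2)      # total variance in Y

print(np.isclose(sst, ssr + sse))      # True: SST = SSR + SSE
print(f"R2 = {ssr / sst:.2f}")         # always between 0 and 1
```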
role of Adjusted R2
Penalises the model for containing a lot of variables that add little explanatory value
Will always be equal to or lower than R2
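The standard adjustment formula (not stated in the notes, but implied): adjusted R2 = 1 − (1 − R2)(n − 1)/(n − p − 1), where n is the sample size and p the number of predictors. A tiny sketch:

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Penalise R2 for the number of predictors p, given sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(adjusted_r2(0.50, n=100, p=2))   # barely below R2
print(adjusted_r2(0.50, n=100, p=30))  # many low-value predictors: drops a lot
```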
Linear regression
To understand individual predictors
To understand the direction of the associations
regression coefficients
The number of units the DV changes for each one unit increase in the IV
usually interpreted alongside the standard error (the SE should be smaller than the coefficient)
Standardised regression coefficients
Beta (β) values commonly reported
Explain the association between each IV and DV in terms of standard deviation changes
Allows simple comparison of the strength of the association between your IVs and DV
The larger the absolute value of beta, the stronger the association.
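A sketch using statsmodels (variable names are hypothetical, echoing the self-control/BMI write-up example below): z-scoring the DV and IVs before fitting makes the coefficients standardised betas, which can then be compared directly.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
self_control = rng.normal(size=n)
exercise = rng.normal(size=n)
bmi = -0.6 * self_control - 0.2 * exercise + rng.normal(size=n)

def z(v):
    """Standardise a variable to mean 0, SD 1."""
    return (v - v.mean()) / v.std()

# With everything z-scored, each coefficient is a beta: the SD change in
# the DV per one-SD change in that IV, so absolute sizes are comparable.
X = sm.add_constant(np.column_stack([z(self_control), z(exercise)]))
betas = sm.OLS(z(bmi), X).fit().params
print(betas[1:])  # larger |beta| for self_control = stronger association
```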
Assumptions of simple & multiple regression
Normally distributed (ish) data
Independent Data
Interval/ratio predictors (continuous)
Nominal predictors with two categories (dichotomous)
No multicollinearity for multiple regression
Be careful of influential cases (individual data points that disproportionately affect the fit)
what is not allowed in a regression and why
multicategorical predictors (entered as a single numeric variable): the numeric coding is arbitrary, so the model will give different results depending on how the categories were coded (see the sketch below)
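A small demonstration (Python/SciPy, made-up data) of the problem: assign arbitrary numbers to three categories, renumber them, and the estimated slope changes even though the data have not.

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(4)
groups = rng.integers(0, 3, size=90)              # three categories: A, B, C
y = np.array([5.0, 9.0, 6.0])[groups] + rng.normal(size=90)

coding_1 = np.array([1, 2, 3])[groups]            # A=1, B=2, C=3
coding_2 = np.array([3, 1, 2])[groups]            # same categories, renumbered

print(linregress(coding_1, y).slope)              # one 'effect'...
print(linregress(coding_2, y).slope)              # ...a different 'effect'
```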
simple regression write up
A simple regression was carried out to investigate the relationship between self-control and BMI.
The regression model was significant/not significant and predicted approx. __% of variance
(adjusted R2 = __, F(__, __) = __, p __). __ was a significant negative/positive predictor of __ (b = __ (SE = __), p < __; 95% CI __ to __).
Multicollinearity and effects
occurs when independent variables in a regression model are highly correlated.
e.g., height & weight
Means they do not provide independent information
Can adversely affect regression estimates (b and se)
A warning sign: a large amount of variance explained overall, but no significant individual predictors
Identifying multicollinearity
run a correlation
Look for high correlations between variables in a correlation matrix (rule of thumb r > .80)
r = 1.0 is perfect multicollinearity and likely represents a data issue.
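A sketch of this screening step with pandas (illustrative variable names): build the correlation matrix and flag off-diagonal pairs above the rule-of-thumb threshold.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
height = rng.normal(170, 10, size=100)
weight = 0.9 * height + rng.normal(scale=4, size=100)  # nearly redundant with height
age = rng.normal(40, 12, size=100)

df = pd.DataFrame({"height": height, "weight": weight, "age": age})
corr = df.corr()
print(corr.round(2))

# Flag off-diagonal correlations above |r| = .80:
print((corr.abs() > 0.80) & (corr.abs() < 1.0))
```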
Tolerance statistic
Proportion of variance in an IV not accounted for by the other IVs
Tolerance = 1 – R2 (the R2 from regressing that IV on the other IVs)
High tolerance = low multicollinearity
Low tolerance = high multicollinearity (a value of < .20 or .10)
Variance Inflation Factor
Inverse of tolerance
1/tolerance
Indicates how much the variance of a coefficient (and hence its standard error) is inflated
A VIF over 4 suggests possible multicollinearity; values above 10 suggest substantial multicollinearity.
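A sketch computing both diagnostics with statsmodels (reusing the illustrative height/weight/age data above); variance_inflation_factor regresses each IV on the others internally, and tolerance is just its reciprocal.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(6)
height = rng.normal(170, 10, size=100)
weight = 0.9 * height + rng.normal(scale=4, size=100)
age = rng.normal(40, 12, size=100)

# Include the constant so each VIF is computed against a proper model.
X = sm.add_constant(np.column_stack([height, weight, age]))
for idx, name in enumerate(["height", "weight", "age"], start=1):
    vif = variance_inflation_factor(X, idx)
    tolerance = 1 / vif  # tolerance = 1 - R2 of that IV on the other IVs
    print(f"{name}: VIF = {vif:.1f}, tolerance = {tolerance:.2f}")
# height and weight should show VIF > 4; age should stay near 1.
```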
How to solve multicollinearity issues
Increase sample size – this will stabilise the regression coefficients
Remove redundant variable(s)
If two or more of the collinear variables are important, create a single composite variable that takes them into account (see the sketch below)
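One plausible reading of 'takes them into account' (an assumption, not spelled out in the notes) is a composite of z-scores:

```python
import numpy as np

def composite(*variables):
    """Average the z-scores of two or more collinear variables into one predictor."""
    zs = [(v - v.mean()) / v.std() for v in variables]
    return np.mean(zs, axis=0)

# e.g. body_size = composite(height, weight), then use body_size as the IV
```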
Multiple regression write up
A multiple regression was conducted to investigate the role of __, __, and __ on __. The regression model was significant/not significant and predicted __% of
variance (adjusted R2 = __, F(__, __) = __, p __). __ was a significant negative/positive predictor of __ (b = __ (SE = __), p __; 95% CI __ to __); continue with the other predictors.
Variance inflation factors suggested multicollinearity was/was not a concern (report the VIF for each predictor).