week 2
linear regression
examining the association between at least 1 predictor variable and an outcome variable
Y = a + bX
X relates to Y in an additive, linear way
line of best fit
key components of linear regression
must be able to measure the predictor and outcome variables
eg. predicting attendance from level of motivation
outcome should be continuous and measured on interval/ratio scale
linear vs multiple
linear = 1 predictor
multiple = 2+ predictors
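a minimal sketch of the two model types using R's lm(); the variable names and simulated data are illustrative, not from the notes:

```r
# illustrative simulated data (names are assumptions)
set.seed(1)
hours      <- runif(50, 0, 10)     # predictor 1
attendance <- runif(50, 50, 100)   # predictor 2
score      <- 40 + 3 * hours + 0.2 * attendance + rnorm(50, sd = 5)

simple_model   <- lm(score ~ hours)                # linear: 1 predictor
multiple_model <- lm(score ~ hours + attendance)   # multiple: 2+ predictors

coef(simple_model)    # intercept (a) plus one slope (b)
coef(multiple_model)  # intercept plus one slope per predictor
```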
why use multiple linear regression?
lets us acknowledge and statistically control for the contribution of other variables when examining our measure of interest
can look for a relationship between a predictor and the outcome while accounting for other factors that might affect it
allows us to know:
what predictors are associated with the outcome variable
to what extent they predict the outcome, while controlling for the other predictor variables
how to predict scores on the outcome measure if scores on all predictors are known
line of best fit
Y = a + bX
Y = outcome/DV
a = intercept
b = slope
X = specific value on predictor/IV
the regression line won’t pass through every data point exactly, because the data contain variability
an e term is added for the error (the residuals)
Y = a + bX + e
the intercept
a
value of outcome variable when predictor is 0
where the line of best fit intercepts the Y axis
slope
b
the change in Y for each one-unit increase in X
gradient of the line
a straight line is drawn through the data so that the sum of squared residuals is minimal
most regressions use an ordinary least squares (OLS) approach
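the OLS idea can be checked by hand: with one predictor, the least-squares slope is cov(x, y) / var(x) and the intercept is mean(y) − b·mean(x). a small R sketch with simulated data (names are illustrative):

```r
# hand-rolled OLS vs lm(), to show they agree (illustrative data)
set.seed(2)
x <- rnorm(100)
y <- 5 + 2 * x + rnorm(100)

b <- cov(x, y) / var(x)     # slope that minimises the sum of squared residuals
a <- mean(y) - b * mean(x)  # intercept

fit <- lm(y ~ x)
c(a = a, b = b)
coef(fit)                   # same values, to floating-point precision
```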
multiple linear regression - f-test
used to evaluate the overall significance of the regression model
compares our model against a baseline model
which assumes none of the predictors have an effect on the outcome
aka a model with just the intercept
tells us if the full model explains significantly more variance in the outcome than a model that explains nothing
should expect a decent model to outperform a model with no predictors
if the p-value for the test is <0.05, the overall model is significant
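a sketch of reading the F-test from R's summary() output, with simulated data (the variable names are assumptions); the p-value is recomputed from the stored F statistic and its degrees of freedom:

```r
set.seed(3)
x1 <- rnorm(80)
x2 <- rnorm(80)
y  <- 1 + 0.8 * x1 + 0.5 * x2 + rnorm(80)

fit   <- lm(y ~ x1 + x2)
fstat <- summary(fit)$fstatistic   # F value, numerator df, denominator df

# p-value for the overall model vs the intercept-only baseline
p_value <- pf(fstat[1], fstat[2], fstat[3], lower.tail = FALSE)
p_value   # < 0.05 here, so the model beats the intercept-only baseline
```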
R2
statistic used for evaluating model fit in linear regression
tells us the proportion of variance in the DV that is accounted for by the model
R2 of 0
model explains none of the variation in the DV
R2 of 1
model perfectly explains the variation in the DV
R2 of 0.7
70% of variation in the DV is explained by the model
the other 30% is due to factors not captured by the model eg. error/unmeasured variables
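R2 can be pulled from summary() or computed directly as 1 minus the residual sum of squares over the total sum of squares; a small R sketch with simulated data (names are illustrative):

```r
set.seed(4)
x <- rnorm(60)
y <- 2 + 1.5 * x + rnorm(60)

fit <- lm(y ~ x)
r2  <- summary(fit)$r.squared

# R2 = 1 - (residual variation / total variation in y)
manual_r2 <- 1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)
all.equal(r2, manual_r2)   # TRUE
```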
adjusted R2
modified version that adjusts for the number of predictors in the model
takes number of IVs and sample size into account
contrasts with R2, which can artificially inflate as more variables are added
unlike R2, the adjusted value can be negative, so it can’t be cleanly read as a percentage of variance explained
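a sketch of the inflation problem: adding a pure-noise predictor can never lower R2, but adjusted R2 penalises it. simulated data; the junk variable is illustrative:

```r
set.seed(5)
x    <- rnorm(100)
junk <- rnorm(100)   # pure noise, unrelated to the outcome
y    <- 3 + 2 * x + rnorm(100)

fit1 <- lm(y ~ x)
fit2 <- lm(y ~ x + junk)

summary(fit1)$r.squared      # R2 for the honest model
summary(fit2)$r.squared      # never lower than fit1's: R2 can only rise
summary(fit2)$adj.r.squared  # penalised for the extra, useless predictor
```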
predictor variables
each one has a:
p-value - significance
estimate - relationship between the predictor and outcome
unstandardised coefficients
produced by R’s lm() function
the original output from a regression analysis
expressed in the same units as the variables in the model
represents the actual change in the DV for each unit increase in the IV
eg. when predicting exam scores based on hours studied and attendance, the coefficient may indicate that for each hour studied, the final exam score increases by 0.5 points
standardised coefficients
expressed in terms of SDs, rather than original units
lets us compare the relative impact of different variables, even when they’re measured on different scales
indicate how many SDs the DV will change for a one SD change in the IV
holding other variables constant
the predictor with the largest absolute standardised coefficient has the most influence
compare the magnitudes; ignore the sign
to compute standardised coefficients
both IV and DV are standardised
transformed to have a mean of 0 and SD of 1
typically done by subtracting the mean of the variable from each value, then dividing by the SD
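one common way to get standardised coefficients in R is to standardise each variable with scale() (which centres to mean 0 and divides by the SD) before fitting; the simulated data below reuse the exam-score example names from the notes:

```r
set.seed(6)
hours      <- runif(50, 0, 10)
attendance <- runif(50, 50, 100)
score      <- 40 + 3 * hours + 0.2 * attendance + rnorm(50, sd = 5)

# scale() transforms each variable to mean 0, SD 1
std_fit <- lm(scale(score) ~ scale(hours) + scale(attendance))
coef(std_fit)   # slopes are now in SD units and directly comparable
```

with every variable standardised, the intercept is (up to rounding) zero, so only the slopes matter for comparison.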
relationship between predictors and outcome variable
given the slope and intercept, we can predict a specific value of Y
using the model’s full-precision coefficients (rather than rounded, hand-copied ones) keeps the estimate accurate
so use the predict() function rather than calculating by hand
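a sketch of predict() against the manual a + bX calculation, with simulated data (names are illustrative); when the full-precision coefficients are used, the two agree exactly:

```r
set.seed(7)
hours <- runif(40, 0, 10)
score <- 50 + 4 * hours + rnorm(40, sd = 3)

fit <- lm(score ~ hours)

# predicted score for a student who studied 6 hours
new_data <- data.frame(hours = 6)
predict(fit, newdata = new_data)

# same value as the manual a + b * X calculation
sum(coef(fit) * c(1, 6))
```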