Linear regression



Description and Tags

week 2

Last updated 4:06 PM on 3/13/26

16 Terms

1

linear regression

examining the association between at least 1 predictor variable and an outcome variable

Y = a+bX

  • X relates to Y in an additive, linear way

  • line of best fit

2

key components of linear regression

must be able to measure the predictor and outcome variables

  • e.g. attendance and level of motivation

outcome should be continuous and measured on interval/ratio scale

3

simple vs multiple linear regression

simple = 1 predictor

multiple = 2+ predictors

4

why use multiple linear regression?

lets us acknowledge and statistically control for the contribution of other variables when examining our measure of interest

  • can look for a relationship between a predictor and the outcome while accounting for other factors that may affect it

allows us to know:

  • what predictors are associated with the outcome variable

  • to what extent they predict the outcome, while controlling for the other predictor variables

  • what score to predict on the outcome measure when scores on all predictors are known
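A minimal Python sketch of what "controlling for" another predictor can look like, assuming exactly two predictors and made-up data. The names x1, x2, y, centred_cross_sum and fit_two_predictors are hypothetical, and the notes themselves work in R rather than Python. It fits Y = a + b1*X1 + b2*X2 by least squares using the closed-form solution for two predictors, then compares b1 with the slope from a regression on X1 alone.

```python
from statistics import mean

def centred_cross_sum(us, vs):
    """Sum of (u - mean(u)) * (v - mean(v)) over paired values."""
    u_bar, v_bar = mean(us), mean(vs)
    return sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))

def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a + b1*x1 + b2*x2 (two predictors only)."""
    s11 = centred_cross_sum(x1, x1)
    s22 = centred_cross_sum(x2, x2)
    s12 = centred_cross_sum(x1, x2)
    s1y = centred_cross_sum(x1, y)
    s2y = centred_cross_sum(x2, y)
    det = s11 * s22 - s12 ** 2  # zero if the predictors are perfectly collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    a = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    return a, b1, b2

# Made-up data, generated as y = 1 + 2*x1 + 3*x2 with no noise.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
y = [9, 8, 19, 18, 26]

a, b1, b2 = fit_two_predictors(x1, x2, y)

# Slope of y on x1 alone, ignoring x2.
b1_alone = centred_cross_sum(x1, y) / centred_cross_sum(x1, x1)

print(a, b1, b2, b1_alone)
```

In this made-up data x1 and x2 are correlated, so the slope on x1 alone is larger than the slope on x1 once x2 is included in the model.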

5

line of best fit

Y= a+bX

  • Y = outcome/DV

  • a = intercept

  • b = slope

  • X = specific value on predictor/IV

the regression line won't pass through every point exactly, because the data contain variability

  • an error term e is added to represent the residuals

  • Y=a+bX+e
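The equation above can be illustrated with a short Python sketch. The intercept, slope and observation below are made-up numbers chosen only for illustration, and predict_y is a hypothetical helper name.

```python
def predict_y(a, b, x):
    """Return the fitted value a + b*x for intercept a and slope b."""
    return a + b * x

# Hypothetical intercept, slope and a single observed point.
a, b = 2.0, 0.5
x_obs, y_obs = 4.0, 4.5

y_hat = predict_y(a, b, x_obs)  # point on the line of best fit
e = y_obs - y_hat               # residual: observed minus fitted

print(y_hat, e)
```

Adding the residual back to the fitted value recovers the observed outcome, which is what Y = a + bX + e expresses.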

6

the intercept

a

value of outcome variable when predictor is 0

where the line of best fit crosses the Y axis

7

slope

b

the expected change in Y for each one-unit increase in X

gradient of the line

8

straight line is drawn through the data so the sum of squared residuals is minimal

most regressions use an ordinary least squares approach
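A minimal Python sketch of ordinary least squares for one predictor, assuming made-up data. The function names ols_fit and sum_squared_residuals are hypothetical. It uses the standard closed-form solution, slope = Sxy / Sxx and intercept = mean(Y) - slope * mean(X), and then computes the sum of squared residuals for the fitted line.

```python
from statistics import mean

def ols_fit(xs, ys):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def sum_squared_residuals(xs, ys, a, b):
    """Sum of (observed - fitted)^2 for the line a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = ols_fit(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

print(a, b, best)
```

Any other line, for example one with a slightly shifted intercept or slope, gives a larger sum of squared residuals on the same data.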

9

multiple linear regression - F-test

used to evaluate the overall significance of the regression model

compares our model against a baseline model

  • which assumes none of the predictors have an effect on the outcome

  • aka a model with just the intercept

tells us if the full model explains significantly more variance in the outcome than a model that explains nothing

  • should expect a decent model to outperform a model with no predictors

if the p-value for the test is below the chosen significance level (conventionally 0.05), the overall model is statistically significant
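A rough Python sketch of how the F statistic compares the full model with the intercept-only baseline, assuming n observations and p predictors. The function names and the numbers are hypothetical. It computes only the statistic; the p-value comes from an F distribution with (p, n - p - 1) degrees of freedom and is not calculated here.

```python
def f_statistic(ss_total, ss_residual, n, p):
    """F = (explained variation / p) / (residual variation / (n - p - 1))."""
    df_residual = n - p - 1
    if df_residual <= 0:
        raise ValueError("need more observations than predictors plus one")
    ss_explained = ss_total - ss_residual
    return (ss_explained / p) / (ss_residual / df_residual)

def f_from_r_squared(r2, n, p):
    """The same statistic written in terms of R-squared."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

# Made-up sums of squares: total 6.0, residual 2.4, from 5 cases and 1 predictor.
f_value = f_statistic(6.0, 2.4, n=5, p=1)
f_value_r2 = f_from_r_squared(0.6, n=5, p=1)

print(f_value, f_value_r2)
```

ss_total is the residual sum of squares of the intercept-only model, so a large F means the predictors removed a lot of unexplained variation relative to what remains.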

10

R²

statistic used for evaluating model fit in linear regression

tells us the proportion of variance in the DV that is accounted for by the model

  • R² of 0

    • model explains none of the variation in the DV

  • R² of 1

    • model perfectly explains the variation in the DV

  • R² of 0.7

    • 70% of variation in the DV is explained by the model

    • the other 30% is due to factors not captured by the model, e.g. error/unmeasured variables
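The proportion described above can be computed as 1 - (residual sum of squares / total sum of squares). A minimal Python sketch, assuming made-up outcome values and fitted values; r_squared is a hypothetical helper name.

```python
def r_squared(ys, fitted):
    """Proportion of variation in ys accounted for by the fitted values."""
    y_bar = sum(ys) / len(ys)
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_residual = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1 - ss_residual / ss_total

# Made-up outcome values and three sets of fitted values.
ys = [2, 4, 5, 4, 5]
line_fit = [2.8, 3.4, 4.0, 4.6, 5.2]  # fitted values from a straight line
mean_only = [4, 4, 4, 4, 4]           # baseline: always predict the mean
perfect = [2, 4, 5, 4, 5]             # fitted values equal to the data

print(r_squared(ys, line_fit), r_squared(ys, mean_only), r_squared(ys, perfect))
```

Predicting the mean for every case gives 0, reproducing the data exactly gives 1, and the straight line in this example falls in between.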

11

adjusted R²

modified version that adjusts for the number of predictors in the model

takes number of IVs and sample size into account

  • in contrast to R², which never decreases when predictors are added, even if they contribute nothing

unlike R², the adjusted value is not read directly as a percentage of variance explained (it can even be negative)
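A short Python sketch of the usual adjustment, assuming the common formula adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the sample size and p the number of predictors. The function name and the example numbers are hypothetical.

```python
def adjusted_r_squared(r2, n, p):
    """Penalise R-squared for the number of predictors p given sample size n."""
    df_residual = n - p - 1
    if df_residual <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / df_residual

# Made-up examples.
moderate = adjusted_r_squared(0.6, n=5, p=1)  # shrinks below 0.6
weak = adjusted_r_squared(0.1, n=5, p=2)      # drops below zero

print(moderate, weak)
```

The second example shows why the adjusted value is awkward to read as a percentage: with a weak model and a small sample it falls below zero.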

12

predictor variables

each one has a:

  • p-value - significance

  • estimate - size and direction of the relationship between the predictor and outcome

13

unstandardised coefficients

produced by the lm() function in R

  • the original output from a regression analysis

expressed in the same units as the variables in the model

represents the actual change in the DV for each one-unit increase in the IV, holding the other predictors constant

  • e.g. when predicting exam scores based on hours studied and attendance, the coefficient may indicate that for each additional hour studied, the final exam score increases by 0.5 points

14

standardised coefficients

expressed in terms of SDs, rather than original units

lets us compare the relative impact of different variables, even when they’re measured on different scales

indicate how many SDs the DV will change for a one SD change in the IV

  • holding other variables constant

the coefficient with the largest absolute value has the strongest association with the outcome

  • compare the size of the number only; the sign just shows the direction

15

to compute standardised coefficients

both IV and DV are standardised

  • transformed to have a mean of 0 and SD of 1

typically done by subtracting the mean of the variable from each value, then dividing by the SD
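The standardisation step can be sketched in Python as follows, assuming one predictor, made-up data and the sample SD. The names hours, score, standardise and slope are hypothetical. It converts both variables to z-scores, refits the line, and checks that the result matches rescaling the unstandardised slope by SD(X)/SD(Y).

```python
from statistics import mean, stdev

def standardise(values):
    """Convert values to z-scores: subtract the mean, divide by the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def slope(xs, ys):
    """Least-squares slope of ys on xs (single predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Made-up data for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

z_hours = standardise(hours)
z_score = standardise(score)

b_unstandardised = slope(hours, score)   # change in score per extra hour
b_standardised = slope(z_hours, z_score) # change in SDs of score per SD of hours
b_rescaled = b_unstandardised * stdev(hours) / stdev(score)

print(b_unstandardised, b_standardised, b_rescaled)
```

With a single predictor the standardised slope equals the correlation between the two variables; with several predictors each coefficient is rescaled by its own predictor's SD in the same way.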

16

relationship between predictors and outcome variable

the slope and intercept let us predict a specific value of Y for a given value of X

accounting for error makes the estimate more accurate

  • so use the predict() function in R rather than calculating by hand
