the regression model
Y = a + b1X1 + b2X2 + … + bnXn + e
a - constant
bn - regression coefficients (they represent the change in the dependent variable for a one-unit change in the corresponding independent variable, holding other variables constant)
R2
the proportion of variance in the dependent variable explained by the model
R2 = SSR / SST (regression sum of squares divided by the total sum of squares)
multiple R
the correlation between the observed values of the dependent variable and the values predicted by the model; quantifies the strength of the linear relationship between the criterion and the set of predictors
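The notes refer to jamovi/SPSS; purely as an illustration, here is a minimal Python (statsmodels) sketch of fitting such a model and reading off the constant, the b coefficients, R2 and multiple R. The data and the variable names (x1, x2, y) are simulated and made up for this example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2.0 + 1.5 * df["x1"] - 0.8 * df["x2"] + rng.normal(size=100)

model = smf.ols("y ~ x1 + x2", data=df).fit()
print(model.params)             # a (Intercept) and the b coefficients
print(model.rsquared)           # R2 = SSR / SST
print(np.sqrt(model.rsquared))  # multiple R: correlation between observed and predicted y
```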
Factors determining the SE of the model’s coefficients
worse model fit → larger residuals + standard errors
too many variables or too few cases collected
large collinearity → larger standard errors
worse model fit
if the model does not fit the data well (R2 is low), this results in larger residuals and, consequently, larger standard errors for the coefficient estimates
collecting more cases relative to the number of predictors
can help improve the standard errors of the coefficient estimates
high collinearity
makes it challenging to isolate the individual effects of each predictor
standardized regression coefficients (beta coefficients/weights)
obtained when both the dependent and independent variables are standardized
→ the larger the absolute value of the standardized coefficient, the more impact the corresponding predictor has
Beta = b*[(SD X)/(SD Y)]
Y = a + b1X1 + b2X2 + … + bkXk
Z(Y) = beta1*Z1 + beta2*Z2 + … + betak*Zk
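A hedged sketch of the two equivalent ways to obtain beta weights: refit the model on z-scored variables, or convert the raw slopes with beta = b * (SD_X / SD_Y). The data and variable names are simulated, as in the sketch above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2.0 + 1.5 * df["x1"] - 0.8 * df["x2"] + rng.normal(size=100)

# (1) beta weights from a regression on z-scored variables (the constant becomes ~0)
z = (df - df.mean()) / df.std()
betas = smf.ols("y ~ x1 + x2", data=z).fit().params

# (2) equivalent conversion of the unstandardized slopes: beta = b * (SD_X / SD_Y)
b = smf.ols("y ~ x1 + x2", data=df).fit().params
print(betas[["x1", "x2"]])
print(b[["x1", "x2"]] * df[["x1", "x2"]].std() / df["y"].std())
```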
Collinearity
refers to high correlations between independent variables
inflates standard errors
leads to multicollinearity issues
Solutions:
removing one or more of the highly correlated variables
combining highly correlated variables into a composite variable
Beta Weights
are useful and interpretable if the independent variables are noncollinear
depend on the method used to include the predictors in the equation
the size of the beta weights depends strictly on the variables in the equation
beta weights cannot be compared between different studies because they are very sensitive to sample statistics
Variance Inflation Factor
measures how much the variance of an estimated regression coefficient increases if your predictors are correlated
> 4 - indicates collinearity
> 10 - indicates severe collinearity
Tolerance
~ 1 - little collinearity
~ 0 - high collinearity
< 0.25 - multicollinearity might exist
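An illustrative sketch of computing VIF and Tolerance in Python (statsmodels); Tolerance is simply 1 / VIF. The predictors x1 and x2 are simulated and deliberately correlated so the diagnostics flag collinearity.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + 0.3 * rng.normal(size=100)     # deliberately correlated with x1
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))

# VIF for each predictor (the constant is skipped); Tolerance = 1 / VIF
for i, name in enumerate(X.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(X.values, i)
    print(name, "VIF =", round(vif, 2), "Tolerance =", round(1 / vif, 2))
```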
solutions for collinearity
removing one or more of the highly correlated variables
combining highly correlated variables
model building methods
○ Standard Method (jamovi)
○ Backward Method
○ Forward Method
○ Stepwise Method (SPSS)
○ All Possible Sub-Sets Method
○ Sequential/Hierarchical Regression (optional in jamovi)
○ Blockwise/Factor Scores Method
non-collinear predictors
there is no difference in the results obtained with the different methods
standard method (forced entry)
including all available predictors in a regression model without considering their correlation with the criterion variable or with each other
standard method - advantages
no subjectivity (about excluding or including predictors)
comparability across samples (the same set of predictors is used in every analysis)
standard method - disadvantages
collinearity (if there is a high correlation among predictors, it can lead to inflated standard errors → challenging to identify the individual contribution of each predictor)
inclusion of irrelevant predictors
reduced statistical power (presence of non-informative/collinear predictors)
sequential/hierarchical method
systematic approach of entering predictor variables based on predefined criteria
→ the data analyst uses prior knowledge, domain expertise, or conceptual analyses to determine which predictors should be included in the regression model, and in which order
ΔR2
incremental F-statistics
ΔR2 = R2(k+1) - R2(k)
quantifies the improvement in the goodness of fit of a regression model when an additional predictor or group of predictors is added
assesses increase in the explained variance in the dependent variable due to the inclusion of new variables
incremental F-statistic
assess the significance of adding a predictor or group
determine whether the inclusion of a set of predictors improves the overall fit of the model in a statistically significant way
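A minimal sketch of a two-step hierarchical entry with ΔR2 and the incremental F-test (the notes describe the same procedure in jamovi/SPSS). Data and variable names are simulated for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100),
                   "x2": rng.normal(size=100),
                   "x3": rng.normal(size=100)})
df["y"] = 2.0 + 1.5 * df["x1"] - 0.8 * df["x2"] + 0.4 * df["x3"] + rng.normal(size=100)

# step 1: predictors entered first on theoretical grounds
step1 = smf.ols("y ~ x1 + x2", data=df).fit()
# step 2: the additional predictor whose contribution is being tested
step2 = smf.ols("y ~ x1 + x2 + x3", data=df).fit()

delta_r2 = step2.rsquared - step1.rsquared               # ΔR2
f_stat, p_value, df_diff = step2.compare_f_test(step1)   # incremental F-test
print(delta_r2, f_stat, p_value)
```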
Incorporating categorical predictors into regression analysis
dummy coding (binary) - nominal categories
ordinal coding - assigning numerical values to predictors based on order
the constant in dummy coding
the mean of the dependent variable in the reference category
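A small sketch illustrating dummy coding and the meaning of the constant, with a made-up two-group variable; the intercept reproduces the mean of the dependent variable in the reference category.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
group = np.repeat(["control", "treatment"], 50)          # nominal predictor
y = np.where(group == "treatment", 5.0, 3.0) + rng.normal(size=100)
df = pd.DataFrame({"group": group, "y": y})

# C(group) applies dummy (treatment) coding; the first level ("control") is the reference
model = smf.ols("y ~ C(group)", data=df).fit()
print(model.params)                                   # constant + dummy coefficient
print(df.loc[df["group"] == "control", "y"].mean())   # equals the constant (intercept)
```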
polynomial regression
allows for a more flexible relationship by introducing polynomial terms
Y = a + b1X + b2X^2 + b3X^3 + … + bnX^n + e
common in developmental psychology
choosing the appropriate degree of polynomial regression
exploratory data analysis (scatter plots, correlation analysis, other visualizations)
deciding on a reasonable range of powers
start with simplest model → add higher-order terms
evaluate its performance
use statistical tests (e.g., the incremental F-test) to assess whether the addition of a higher-order power significantly increases the R-squared value (see the sketch below)
until a significant improvement in R-squared is no longer observed
recommendation: centering X
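An illustrative sketch of this procedure: fit increasingly higher-order polynomials and test each added power with an incremental F-test, stopping when R-squared no longer improves significantly. The data (a truly quadratic relationship) and names are simulated.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=150)
y = 1.0 + 0.5 * x + 0.8 * x**2 + rng.normal(size=150)   # a truly quadratic relationship
df = pd.DataFrame({"x": x, "y": y})

# start with the simplest (linear) model, then add higher-order powers one at a time
previous = smf.ols("y ~ x", data=df).fit()
for degree in (2, 3, 4):
    formula = "y ~ " + " + ".join(f"I(x**{d})" for d in range(1, degree + 1))
    current = smf.ols(formula, data=df).fit()
    f_stat, p_value, _ = current.compare_f_test(previous)  # incremental F-test
    print(degree, round(current.rsquared - previous.rsquared, 4), round(p_value, 4))
    previous = current
# stop once adding the next power no longer yields a significant gain in R-squared
```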
centering X
reduces collinearity between X and its higher-order terms (X^2, X^3, …)
X → X - Mean
the new (centered) variable has a mean of 0
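A short sketch showing why centering helps: with simulated data whose values lie far from zero, the correlation between X and X^2 drops sharply once X is centered.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(2, 10, size=200)      # a predictor whose values are far from zero
xc = x - x.mean()                     # centered version (mean of 0)

# correlation between the linear and the quadratic term, before and after centering
print(np.corrcoef(x, x**2)[0, 1])     # typically close to 1 (strong collinearity)
print(np.corrcoef(xc, xc**2)[0, 1])   # much closer to 0 after centering
```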
assumptions of regression analysis
linearity of the relationship
homogeneity of variances
no outliers
normal distribution
prediction errors are independent and distributed randomly
no multicollinearity
high correlations between independent variables in regression analysis
lead to unstable coefficient estimates and inflated standard errors
residuals
the differences between the observed values and the values predicted by the regression model
→ key role in checking assumptions of regression analysis
linearity assumption
to support linearity, we anticipate having 50% positive and 50% negative residuals at each level of the fitted values
homogeneity assumption
supported if the residuals form a roughly rectangular band of constant spread across the fitted values (in a linear relationship)
normality assumption
histogram
Q-Q plot
Shapiro-Wilk test
deviations suggest possible outliers or the need to transform variables
independence assumption
violated if there is a systematic pattern in the residuals
analysing the autocorrelation function of residuals
Durbin-Watson Test
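A sketch of checking the normality and independence assumptions on the residuals of a fitted model; the Shapiro-Wilk and Durbin-Watson statistics correspond to the checks listed above, and the data and variable names are simulated for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 2.0 + 1.5 * df["x1"] - 0.8 * df["x2"] + rng.normal(size=100)
model = smf.ols("y ~ x1 + x2", data=df).fit()

resid = model.resid
# normality: Shapiro-Wilk test on the residuals (also inspect a histogram / Q-Q plot)
print(stats.shapiro(resid))
# independence: Durbin-Watson statistic (values near 2 suggest no autocorrelation)
print(durbin_watson(resid))
# homogeneity: plot resid against model.fittedvalues and look for a band of constant spread
```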