Multiple Regression Formula
y = b0 + b1x1 + b2x2 + e
MR Formula — y =
outcome
MR Formula — e =
error/residuals
MR Formula — x1 =
predictor 1
MR Formula — x2 =
predictor 2
MR Formula — b1 (and b2) =
partial regression coefficient
MR Formula — b0 =
y-intercept
value of y when all predictors (x1 and x2) are 0
The Best Fitting Line
intersection of plane with y-axis = intercept (b0)
slope of the plane with respect to x1 defines b1
slope of the plane with respect to x2 defines b2
Use of Residual Plots
graph residuals against predicted values
if the assumptions are tenable, then the residuals should scatter randomly about a horizontal line of 0
any systematic pattern or clustering of residuals suggests a violation of the assumptions (see the sketch below)
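A minimal sketch of a residual plot in Python; the simulated data, column names (x1, x2, y), and library choices (statsmodels, matplotlib) are illustrative assumptions, not part of the original material.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt

# Hypothetical data: two predictors and a noisy linear outcome
rng = np.random.default_rng(0)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 1.0 + 0.5 * df["x1"] + 2.0 * df["x2"] + rng.normal(size=100)

# Fit y = b0 + b1*x1 + b2*x2 + e
X = sm.add_constant(df[["x1", "x2"]])
model = sm.OLS(df["y"], X).fit()

# Residuals vs. predicted values: should scatter randomly about the 0 line
plt.scatter(model.fittedvalues, model.resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Predicted values")
plt.ylabel("Residuals")
plt.show()
```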
Hypotheses
Overall Model
Specific Predictors
Comparison Among Predictors
Hypotheses — Overall Model
testing all predictor variables:
examines whether a model including all of the predictor variables is better than the model with none of the predictor variables
y = b0 + b1x1 + b2x2 + e vs. y = b0 + e
the F reported in MR tests this hypothesis
H0: R2 = 0
R2
proportion of variance accounted for by the specified model
R
the multiple correlation coefficient: the correlation between the observed values of y and the values of y predicted by the model (large values indicate a strong correlation between observed and predicted values)
if R = 1, the model perfectly predicts the observed data (R is a gauge of how well the model predicts)
R2
represents the amount of variation in the outcome variable (y) that is accounted for by the model
Adjusted R2
tells us the amount of variation in the outcome variable that would be accounted for if the model had been derived from the population from which the sample was taken
considers the number of predictors in the model and penalizes excessive variables, providing a more accurate measure of the model’s goodness of fit, especially with multiple predictors
adjusted R2 gives a more conservative, and therefore more accurate, estimate
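Continuing the hypothetical sketch above (same simulated df and fitted model), the overall-model statistics can be read directly from the statsmodels results; the adjusted R2 formula is checked in the last line for reference.

```python
# Continuing the sketch above (same hypothetical df and fitted model)
print(model.fvalue, model.f_pvalue)   # F test of the overall model (H0: R2 = 0)
print(model.rsquared)                 # R2: variance in y accounted for by the model
print(np.sqrt(model.rsquared))        # R: correlation between observed and predicted y

# Adjusted R2 penalizes extra predictors: 1 - (1 - R2)(n - 1)/(n - k - 1)
n, k = len(df), 2
print(model.rsquared_adj, 1 - (1 - model.rsquared) * (n - 1) / (n - k - 1))
```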
Hypotheses — Specific Predictors
testing of individual predictor variables
examines whether the inclusion of each predictor variable improves prediction
H0: b1 = 0
H0: b2 = 0
Partial Regression Coefficient
can be thought of as the predicted change in y for each 1-unit change in the independent variable when the values of all other independent variables in the model are held constant
b1 = .50 and b2 = 2.00
holding x1 constant, there is on average a 2.00-point increase in the outcome for every 1-unit increase in x2
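Continuing the same hypothetical sketch, the partial regression coefficients and the t tests of the individual-predictor hypotheses (H0: b1 = 0, H0: b2 = 0) come straight from the fitted model.

```python
# Continuing the sketch above: individual predictors
print(model.params)    # b0, b1, b2; each slope is the change in y per 1-unit change
                       # in that predictor, holding the other predictor constant
print(model.pvalues)   # t tests of H0: b1 = 0 and H0: b2 = 0 (and the intercept)
```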
Confidence Intervals
for any partial regression coefficient (slope) we can calculate a confidence interval
the CI for a regression coefficient is calculated and interpreted in the same way as it is in simple linear regression
the interpretation is that we’re 95% confident that the population regression coefficient falls within that range
when a value of 0 falls within the range, the statistical test will not be significant (fail to reject the null)
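Continuing the sketch, 95% confidence intervals for each coefficient are available from conf_int(); the link to significance is noted in the comments.

```python
# Continuing the sketch above: 95% CIs for each regression coefficient
print(model.conf_int(alpha=0.05))   # one row per coefficient: [lower, upper]
# If 0 falls inside a coefficient's interval, that predictor's test is not
# significant at the .05 level (fail to reject H0: bj = 0).
```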
Hypotheses — Comparison Among Predictors
standardized partial regression coefficients: the partial regression coefficients obtained after the IVs and the DV have been standardized; because they share a common scale, they allow comparison among predictors
H0: b1 = b2
H0: b2 = b3
H0: b1 = b3
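One way to obtain standardized partial regression coefficients (a sketch of one common approach, not the only one): z-score the IVs and the DV, refit, and compare the resulting slopes.

```python
# Continuing the sketch above: standardized (beta) coefficients
z = (df - df.mean()) / df.std()          # z-score all variables
Xz = sm.add_constant(z[["x1", "x2"]])
beta_model = sm.OLS(z["y"], Xz).fit()
print(beta_model.params)                 # betas are on a common scale, so predictors can be compared
```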
Assumptions of Multiple Regression
independence of residuals
assumptions of linearity
homoscedasticity
normal distribution: multivariate normality
Assumptions of Multiple Regression — Independence of Residuals
errors are independent of each other
to test:
plot residuals and look for patterns
Assumptions of Multiple Regression — Linearity
linear relationship between each predictor and the outcome (and no perfect linear relationship among the predictors)
to test:
check correlations (predictors individually with the outcome)
create scatterplot to compare IV and DV
residual plot
Assumptions of Multiple Regression — Homoscedasticity
variance of errors should be similar across values
to test:
plot data (cone shape indicates an issue)
Assumptions of Multiple Regression — Multivariate Normality
each variable is normally distributed on its own, and the variables are also jointly normally distributed when considered together
Issues in Multiple Regression
number of predictors
multicollinearity
outliers and influential cases
sample size
Issues in Multiple Regression — Number of Predictors
too many predictors can increase the chances of multicollinearity, add noise, and make it difficult to determine what is predicting what
overfitting
too many predictors can lead to an inflated R2
Overfitting
if you throw enough predictors in, something will appear to stick, but you may just be fitting the noise in the sample
Issues in Multiple Regression — Multicollinearity
redundancy among predictors
when 2 predictors are 100% related they have perfect collinearity (this makes it impossible to get an accurate estimate of the variance in the outcome accounted for by each predictor)
can be tested using VIF or tolerance
Testing Collinearity
if
VIF > 10
Tolerance < 0.1
then there is an issue with multicollinearity
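A sketch of the VIF/tolerance check using statsmodels, continuing the same hypothetical df; tolerance is simply 1/VIF.

```python
# Continuing the sketch above: collinearity diagnostics for each predictor
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = sm.add_constant(df[["x1", "x2"]])
for i, name in enumerate(["x1", "x2"], start=1):   # column 0 is the constant
    vif = variance_inflation_factor(X.values, i)
    print(name, "VIF =", vif, "tolerance =", 1 / vif)
# Rule of thumb from above: VIF > 10 or tolerance < 0.1 signals multicollinearity
```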
Issues in Multiple Regression — Outliers and Influential Cases
can pull the regression line
Issues in Multiple Regression — Sample Size
if the sample size is too small, it is harder to find an effect even when one is there
if you do find an effect, it may be hard to generalize to a larger population
Determining Ideal Sample Size
overall fit: 50 + 8(k)
individual fit: 104 + (k)
k = number of predictors
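e.g., with k = 3 predictors: testing overall fit needs at least 50 + 8(3) = 74 cases, and testing individual predictors needs at least 104 + 3 = 107 cases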
Categorical Variables in Multiple Regression
variables whose levels have no meaningful order; different levels do not reflect equal distances from one another
e.g., religion, ethnicity, gender, academic major
Representing Categorical Variables
use dummy coding
Categorical Variables with only 2 groups/levels in Simple Regression
the constant = mean of y for the group designated as 0
the regression coefficient = difference between the 2 means
Categorical Variables
any categorical variable can be included in a regression analysis
if a categorical variable has more than 2 levels, the number of predictor variables needed will always be:
# of groups - 1
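A sketch of dummy coding with pandas, using a hypothetical 3-level categorical predictor named "major"; drop_first=True keeps # of groups - 1 dummy columns, with the dropped level serving as the reference group (coded 0 on every dummy).

```python
import pandas as pd

# Hypothetical 3-level categorical predictor
data = pd.DataFrame({"major": ["psych", "bio", "math", "psych", "bio"]})

# drop_first=True yields (# of groups - 1) = 2 dummy columns;
# the dropped level ("bio", first in sorted order) is the reference group
dummies = pd.get_dummies(data, columns=["major"], drop_first=True)
print(dummies)
```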
Methods of Multiple Regression
Simultaneous
Hierarchical
Stepwise (usually avoided because the results are hard to replicate)
forward
bidirectional
backward
Strategies for Selecting Predictor Variables — Hierarchical Analyses
order determined by
causal priority
research relevance
Hierarchical Multiple Regression
specifies a series of equations (a priori)
Hierarchical Multiple Regression — Hypotheses
overall model
M1: R2 = 0
M2: R2 = 0
comparison among predictors
M1: b1 = b2
M2: b1 = b2
specific predictors
M1: b1 = 0
M1: b2 = 0
M2: b1 = 0
M2: b2 = 0
difference between models
ΔF = 0 (the predictors added in the later model do not improve prediction)
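A sketch of the model-comparison (ΔF) test using statsmodels' compare_f_test, with the same hypothetical df as above; here Model 1 enters x1 only and Model 2 adds x2, an illustrative ordering rather than a prescribed one.

```python
# Continuing the sketch above: hierarchical (nested) model comparison
m1 = sm.OLS(df["y"], sm.add_constant(df[["x1"]])).fit()        # Model 1: x1 only
m2 = sm.OLS(df["y"], sm.add_constant(df[["x1", "x2"]])).fit()  # Model 2: adds x2

f_change, p_value, df_diff = m2.compare_f_test(m1)   # tests whether the added predictor improves the model
print(m2.rsquared - m1.rsquared, f_change, p_value)  # R2 change and its F test
```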