Variable Selection in Multiple Regression

These flashcards cover key concepts from the lecture on variable selection in multiple regression analysis.

12 Terms

What does the multiple regression model equation y_i = β_0 + β_1 x_{i1} + β_2 x_{i2} + … + β_k x_{ik} + ε_i represent?

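It represents a linear relationship between the response variable y_i and k explanatory variables (regressors) x_{i1}, …, x_{ik} for observation i, where β_0 is the intercept, β_1, …, β_k are the regression coefficients, and ε_i is the random error.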
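
As a rough illustration of the model equation, the sketch below simulates data from it and recovers the coefficients by ordinary least squares. The sample size, the "true" coefficient values, and the noise level are made-up assumptions chosen only for the demonstration; they do not come from the lecture.

```python
import numpy as np

# Hypothetical simulated data: n observations, k = 2 regressors.
rng = np.random.default_rng(0)
n, k = 200, 2
X = rng.normal(size=(n, k))

# Assumed "true" coefficients: beta_0 (intercept), beta_1, beta_2.
beta_true = np.array([1.0, 2.0, -0.5])
eps = rng.normal(scale=0.1, size=n)          # random error term
y = beta_true[0] + X @ beta_true[1:] + eps   # y_i = b0 + b1*x_i1 + b2*x_i2 + e_i

# Design matrix with a leading column of ones for the intercept.
design = np.column_stack([np.ones(n), X])

# Least-squares estimate of (beta_0, beta_1, beta_2).
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
print(beta_hat)
```

With this little noise, the estimates should land close to the assumed coefficients.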

What is overfitting in the context of regression models?

Including too many regressors that explain only a negligible portion of variability, leading to poor generalization to new data.
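
One way to see overfitting is to add irrelevant regressors one at a time and track the error on the training data and on fresh data. The sketch below does this with simulated data; the single real regressor, the 20 noise regressors, and the sample sizes are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_train, n_test, n_noise = 30, 1000, 20

def make_data(n):
    """Simulate one relevant regressor plus n_noise irrelevant ones."""
    x = rng.normal(size=(n, 1))              # the only regressor that matters
    noise = rng.normal(size=(n, n_noise))    # regressors unrelated to y
    y = 1.0 + 2.0 * x[:, 0] + rng.normal(size=n)
    return np.column_stack([np.ones(n), x, noise]), y

X_train, y_train = make_data(n_train)
X_test, y_test = make_data(n_test)

train_mse, test_mse = [], []
# p = number of columns used: intercept, real regressor, then noise columns.
for p in range(2, 2 + n_noise + 1):
    beta, *_ = np.linalg.lstsq(X_train[:, :p], y_train, rcond=None)
    train_mse.append(np.mean((y_train - X_train[:, :p] @ beta) ** 2))
    test_mse.append(np.mean((y_test - X_test[:, :p] @ beta) ** 2))

print("train MSE, first vs last:", train_mse[0], train_mse[-1])
print("test MSE,  first vs last:", test_mse[0], test_mse[-1])
```

The training error can only go down as columns are added, because each model is nested in the next one. The error on new data typically stops improving and then gets worse, which is the poor generalization the card describes.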

What is underfitting in regression analysis?

Including too few regressors, so that the model leaves out relevant predictors and explains only a small portion of the variation in the response variable.

What is the goal of variable selection in modeling?

To consider all possible regressors and determine which subset provides the best model.

What criteria can be used to establish the 'best' model?

Large R² or adjusted R² (R²_adj), small σ̂² (MSE), statistically significant regressors, small standard errors of the coefficients, regressors that make logical sense, and a preference for simpler models.
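
The numeric criteria on this card can be computed directly from the residuals. The following is a minimal sketch using the standard textbook formulas, with n observations and k regressors plus an intercept; the function name `fit_summary` and the four-point toy data are hypothetical.

```python
import numpy as np

def fit_summary(X, y):
    """Return R^2, adjusted R^2, and MSE for an OLS fit with an intercept."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = np.sum((y - design @ beta) ** 2)      # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)           # total sum of squares
    r2 = 1 - rss / tss
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    mse = rss / (n - k - 1)                     # estimate of sigma^2
    return r2, r2_adj, mse

# Toy data: one regressor, four observations.
r2, r2_adj, mse = fit_summary([0, 1, 2, 3], [1, 3, 2, 4])
print(r2, r2_adj, mse)
```

For this toy data the values work out to R² = 0.64, R²_adj = 0.46, and MSE = 0.9. Adjusted R² is lower than R² because it charges the model for each regressor it uses.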

What are some methods of model selection mentioned in the lecture?

Forward Selection, Backward Elimination, Stepwise, All Possible Subsets, Cross Validation, and LASSO.

What is Forward Selection in regression modeling?

A method that starts with no regressors and adds them one at a time, at each step choosing the most significant remaining candidate, until no remaining regressor is significant enough to enter the model.

What is the significance level for entry in Forward Selection called?

α_entry.
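
The forward selection procedure can be sketched in a few lines. To stay dependency-light, this version does not compute p-values against α_entry; it uses a cutoff on the absolute t-statistic instead (`t_entry=2.0`, roughly α_entry = 0.05 for large samples). That substitution, the function names, and the simulated data are assumptions of this sketch, not the lecture's method.

```python
import numpy as np

def t_stats(design, y):
    """t-statistics of the OLS coefficients for a full-rank design matrix."""
    n, p = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)            # residual variance estimate
    return beta / np.sqrt(sigma2 * np.diag(xtx_inv))

def forward_selection(X, y, t_entry=2.0):
    """Add regressors one at a time while the best candidate has |t| >= t_entry."""
    n, k = X.shape
    selected, remaining = [], list(range(k))
    while remaining:
        # |t| of each candidate when added to the current model.
        scores = {}
        for j in remaining:
            design = np.column_stack([np.ones(n), X[:, selected + [j]]])
            scores[j] = abs(t_stats(design, y)[-1])
        best = max(scores, key=scores.get)
        if scores[best] < t_entry:
            break                               # no candidate qualifies
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical data: only columns 0 and 2 actually affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=100)
selected = forward_selection(X, y)
print(selected)
```

Columns 0 and 2 should always be picked up here. A pure-noise column can occasionally slip in too, since a |t| cutoff of 2 lets through about 5% of irrelevant regressors.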

What is Backward Elimination in regression analysis?

A method that starts with the full model and removes the least significant regressor one at a time, until every remaining regressor is significant at a predefined significance level.
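
Backward elimination can be sketched the same way, starting from all regressors and dropping the weakest one each round. As an assumption of this sketch, the predefined significance level is replaced by a cutoff on the absolute t-statistic (`t_stay=2.0`, roughly a 0.05 level for large samples); the function names and data are hypothetical.

```python
import numpy as np

def t_stats(design, y):
    """t-statistics of the OLS coefficients for a full-rank design matrix."""
    n, p = design.shape
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)            # residual variance estimate
    return beta / np.sqrt(sigma2 * np.diag(xtx_inv))

def backward_elimination(X, y, t_stay=2.0):
    """Drop the least significant regressor until all have |t| >= t_stay."""
    n, k = X.shape
    kept = list(range(k))                       # start with the full model
    while kept:
        design = np.column_stack([np.ones(n), X[:, kept]])
        t = np.abs(t_stats(design, y)[1:])      # skip the intercept
        worst = int(np.argmin(t))
        if t[worst] >= t_stay:
            break                               # everything left is significant
        kept.pop(worst)
    return kept

# Hypothetical data: only columns 0 and 2 actually affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=100)
kept = backward_elimination(X, y)
print(kept)
```

Unlike forward selection, this approach needs the full model to be estimable, so it requires more observations than regressors.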

What do AIC and BIC stand for, and how are they used?

AIC is the Akaike Information Criterion and BIC is the Bayesian Information Criterion. Both are model selection metrics that penalize the log-likelihood according to the number of parameters; BIC's penalty also grows with the sample size.

What does a smaller value of AIC indicate?

A better trade-off between goodness of fit and model complexity. AIC values are only meaningful when compared across models fitted to the same data.
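
For a linear model with normal errors, both criteria can be written in terms of the residual sum of squares. The sketch below uses the common form that drops additive constants and counts only the regression coefficients in p; other software may use a slightly different convention, so only differences between models are comparable. The two candidate models and their RSS values are invented for illustration.

```python
import math

def aic_bic(rss, n, p):
    """AIC and BIC for a Gaussian linear model, up to an additive constant.

    rss: residual sum of squares, n: number of observations,
    p: number of estimated coefficients (including the intercept).
    """
    fit_term = n * math.log(rss / n)            # rewards a small RSS
    aic = fit_term + 2 * p                      # penalty: 2 per parameter
    bic = fit_term + p * math.log(n)            # penalty: ln(n) per parameter
    return aic, bic

# Hypothetical comparison on n = 100 observations:
# the larger model lowers RSS only slightly but uses three more coefficients.
aic_small, bic_small = aic_bic(rss=120.0, n=100, p=3)
aic_large, bic_large = aic_bic(rss=118.0, n=100, p=6)
print(aic_small, aic_large)
print(bic_small, bic_large)
```

Here the smaller model has the lower AIC and BIC, because the tiny drop in RSS does not pay for the extra parameters.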

What is RMSE and what does a smaller value suggest?

Root Mean Square Error, the square root of the average squared difference between observed and predicted values; smaller values indicate a better model fit.
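
RMSE is simple enough to compute by hand. The following is a minimal sketch with made-up observed and predicted values; the function name `rmse` is just a label for this example.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))

# Hypothetical observations and predictions from two candidate models.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.0, 2.0, 3.0, 8.0]   # one large miss
model_b = [1.0, 2.0, 3.0, 5.0]   # one small miss

rmse_a = rmse(observed, model_a)
rmse_b = rmse(observed, model_b)
print(rmse_a, rmse_b)
```

Model B has the smaller RMSE (0.5 versus 2.0), so its predictions sit closer to the observed values on this data.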