What are the specific traits of forward selection?
1. less complex
2. better with large data sets
3. higher risk of missing the best subset
4. lower computational cost
Why is a model with too many predictors not useful?
1. Predictors can be correlated
2. Too many predictors lead to overfitting
What is the goal of subset selection?
To find a simple model that performs sufficiently well
(True/False) Predictive accuracy should be assessed only on held-out test data.
True
What are the traits of an exhaustive search?
1. All predictors are assessed
2. Computationally intensive for big sets of data
3. Guaranteed to find the best subset
(True/False) The best use of subsetting will balance model fit and complexity.
True
What is the adjusted R^2?
A version of R² (the coefficient of determination) that adjusts for the number of predictors (higher is better)
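The adjustment can be written out; with n observations and p predictors, a standard form is:

```latex
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-p-1}
```

The penalty term grows with p, so adding a predictor that barely improves R² can lower the adjusted R².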
What is the Akaike Information Criterion (AIC)?
Balances model fit and complexity (lower the better)
What is the Bayesian Information Criterion (BIC)?
Imposes a larger penalty for models with more predictors (strict) (lower the better)
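The standard definitions make the comparison concrete; with k estimated parameters, n observations, and maximized likelihood L̂:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad
\mathrm{BIC} = k\ln n - 2\ln\hat{L}
```

Since ln n > 2 once n > 7, BIC penalizes each extra parameter more heavily than AIC, which is why it is the stricter criterion.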
What is forward selection?
Starts with no predictors, adds them one by one, and stops when the addition doesn't improve the performance
What is backward elimination?
Starts with all predictors, eliminates the least useful ones one by one, and stops when all remaining predictors are significant
What is stepwise selection?
Like forward selection, but at each step considers dropping non-significant predictors
What are the specific traits of backward elimination?
1. more complex
2. worse with large data sets
3. lower risk
4. higher computational cost
What are the specific traits of stepwise selection?
1. Intermediate complexity
2. better with large data sets
3. balances risk
4. moderate to high computational cost
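The forward-selection loop described above can be sketched in a few lines of numpy. This is an illustrative sketch, not the implementation behind any particular R function: it scores each candidate predictor by adjusted R² (AIC or BIC would slot in the same way) and stops when no addition improves the score.

```python
import numpy as np

def adjusted_r2(y, y_hat, p):
    """Adjusted R^2 for a model with p predictors (plus intercept)."""
    n = len(y)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def fit_predict(X, y, cols):
    """OLS fit on the chosen columns (with intercept); return fitted values."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ beta

def forward_select(X, y):
    """Add predictors one by one; stop when adjusted R^2 stops improving."""
    selected, best_score = [], -np.inf
    while len(selected) < X.shape[1]:
        candidates = [c for c in range(X.shape[1]) if c not in selected]
        scores = {c: adjusted_r2(y, fit_predict(X, y, selected + [c]),
                                 len(selected) + 1)
                  for c in candidates}
        best_c = max(scores, key=scores.get)
        if scores[best_c] <= best_score:
            break  # no candidate improves the score: stop
        selected.append(best_c)
        best_score = scores[best_c]
    return selected
```

Backward elimination is the mirror image (start with all columns, drop the one whose removal improves the score most), and stepwise selection alternates the two moves.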
What does the R function regsubsets() work with?
Works with numerical (quantitative) variables; it comes from the leaps package
What does the R function stepAIC() work with?
Works with categorical outcome variables; it comes from the MASS package
When do you use PCAs?
When you want to reduce the number of features while retaining the most information
How is the total information in Principal Component Analysis measured?
By the sum of the variances of the variables; the PCs themselves are weighted averages (linear combinations) of the original variables
What do PCAs create?
Create new variables that are linear combinations of the original variables
In a PCA, are the linear combinations dependent or independent?
Independent of one another (the PCs are uncorrelated)
Can PCs be used for categorical variables?
No, only quantitative variables
How do PCs rank variance?
The first PC explains the most variance, the second explains the most of the remaining variance, and so on; each PC is uncorrelated with the others
What are loadings?
The weight that tells how much each original variable contributes to a PC
What is the magnitude of the loadings?
The variable's influence on the PC (the stronger the influence, the bigger the magnitude)
What does the sign of a loading show?
Whether a variable moves with the PC or against it
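The PC, variance-ranking, and loading definitions above can be sketched with numpy. This is an illustrative eigendecomposition of the covariance matrix, not any particular library's API:

```python
import numpy as np

def pca(X):
    """PCA via eigendecomposition of the covariance matrix.

    Returns (scores, loadings, explained_variance), with components
    sorted so the first PC explains the most variance.
    """
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # covariance of the variables
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # re-sort descending by variance
    loadings = eigvecs[:, order]            # columns = loadings of each PC
    scores = Xc @ loadings                  # PCs = linear combos of originals
    return scores, loadings, eigvals[order]
```

Each column of `loadings` holds the weights of one PC, so a large magnitude means strong influence and the sign shows whether the variable moves with or against that PC; the explained variances sum to the total variance of the original variables.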
Which of the following is true regarding the retained variables from subsetting?
Have a clearer meaning
Which of the following is true regarding the retained variables from PCAs?
Not directly interpretable
Should adjusted R² be high or low for a well-performing model?
High
How does variable selection differ from Principal Component Analysis (PCA)?
Variable selection identifies and retains the most relevant variables whereas PCAs reduce dimensionality while retaining most of the variance
What approach do we use for variable selection?
Select a subset of original variables based on criteria
What approach do we use for PCAs?
Transform original variables into new uncorrelated variables (PCs)
How can we describe the nature of variables in variable selection?
Original features are retained
How can we describe the nature of variables in PCAs?
New variables are created from linear combinations of original variables
How can we describe the interpretability in variable selection?
Retained variables have clear meaning and context
How can we describe the interpretability in PCAs?
PCs are not directly interpretable
How can we describe the data structure in variable selection?
The data structure remains the same with fewer variables
How can we describe the data structure in PCAs?
Data structure is altered with new PCs
What is an example of variable selection?
Selecting 10 relevant variables from 100
What is an example of PCAs?
Transforming 100 variables into only 10 PCs