What is an ensemble method?
A method that combines multiple models to produce a stronger overall model.
What are weak learners?
Simple models that perform only slightly better than random guessing on their own.
What is the intuition behind ensemble methods?
Combining diverse independent models improves prediction accuracy.
What is the “wisdom of the crowds” idea?
Aggregated predictions from many models are more accurate than individual predictions.
Why are decision trees commonly used in ensembles?
They have low bias but high variance and capture complex interactions.
What are the three main ensemble methods?
Bagging, random forests, and boosting.
Why do decision trees have high variance?
Small changes in data can produce very different trees.
What is the key idea behind bagging?
Reduce variance by averaging many independently trained models.
What does bagging stand for?
Bootstrap aggregation.
What is the bootstrap?
A resampling method that samples observations with replacement.
Why is bootstrap useful?
It approximates sampling from the population using one dataset.
How are bootstrap samples constructed?
By randomly sampling observations with replacement from the dataset.
Why can observations repeat in bootstrap samples?
Sampling is done with replacement.
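The two cards above can be sketched numerically; a minimal NumPy example on a hypothetical 10-observation dataset:

```python
import numpy as np

# Hypothetical toy dataset of 10 observations.
rng = np.random.default_rng(0)
data = np.arange(10)

# A bootstrap sample: draw n observations WITH replacement from the
# original n, so some observations repeat and others are left out.
boot = rng.choice(data, size=len(data), replace=True)
```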
What is the purpose of bootstrap in bagging?
To generate multiple training datasets.
What happens after generating bootstrap samples in bagging?
A model is trained on each sample.
How are predictions combined in bagging for regression?
By averaging predictions.
How are predictions combined in bagging for classification?
By majority voting.
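A minimal bagging sketch for regression, assuming scikit-learn is available and using made-up data: train one deep tree per bootstrap sample, then average the predictions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))                  # toy inputs
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)  # noisy toy targets

trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))   # bootstrap sample
    tree = DecisionTreeRegressor()               # deep, unpruned tree
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Regression: average the individual tree predictions.
X_new = np.linspace(-3, 3, 50).reshape(-1, 1)
bagged_pred = np.mean([t.predict(X_new) for t in trees], axis=0)
```

For classification, the same loop would end with a majority vote over the trees' predicted classes instead of a mean.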
Why does averaging reduce variance?
Averaging independent estimates reduces variability.
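A quick numeric check of this idea: the variance of an average of n independent estimates is roughly 1/n of a single estimate's variance (toy normal draws here, standing in for model predictions):

```python
import numpy as np

rng = np.random.default_rng(0)
single = rng.normal(size=100_000)                       # single estimates
averaged = rng.normal(size=(100_000, 25)).mean(axis=1)  # averages of 25 estimates

# averaged.var() should be close to single.var() / 25.
```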
What type of trees are used in bagging?
Deep, unpruned trees.
Why use deep trees in bagging?
They have low bias but high variance, which averaging reduces.
What is the main benefit of bagging?
Variance reduction without increasing bias.
What is a limitation of bagging?
Loss of interpretability due to many trees.
What is out-of-bag (OOB) data?
Observations not included in a bootstrap sample.
What proportion of data is typically OOB?
About one-third of observations.
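The one-third figure follows from sampling with replacement: the chance an observation never appears in a bootstrap sample of size n is (1 - 1/n)^n, which tends to 1/e ≈ 0.368.

```python
import math

n = 1000  # hypothetical dataset size
p_oob = (1 - 1 / n) ** n   # chance a given observation is out-of-bag
# p_oob is close to 1/e ≈ 0.368, i.e. roughly one-third of the data.
```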
How is OOB error estimated?
By predicting each observation using trees where it was not included.
Why is OOB error useful?
It provides validation without a separate test set.
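As a sketch (assuming scikit-learn and synthetic data), `BaggingClassifier` can report this OOB estimate directly via `oob_score=True`:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)  # synthetic data
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50,
                        oob_score=True, random_state=0)
bag.fit(X, y)
# bag.oob_score_ is an accuracy estimate computed without a held-out test set.
```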
What is the relationship between OOB error and cross-validation?
OOB approximates leave-one-out cross-validation.
What is variable importance in bagging?
A measure of how much each variable reduces prediction error.
How is variable importance computed?
By averaging each variable's reduction in error (e.g. RSS or Gini impurity) across all trees.
What does a high variable importance indicate?
The predictor strongly influences predictions.
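A hedged sketch of this with scikit-learn's `feature_importances_` on synthetic data (5 predictors, only 2 of them informative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)
# Importances are normalized to sum to 1; larger values mean a larger
# average impurity reduction across the trees.
importances = rf.feature_importances_
```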
Why does bagging reduce interpretability?
Because results come from many aggregated trees.
What is the limitation of bagging regarding correlation?
Trees can be highly correlated if strong predictors dominate.
Why is correlation between trees a problem?
It reduces the effectiveness of variance reduction.
What is a random forest?
An extension of bagging that decorrelates trees.
How do random forests reduce correlation?
By selecting a random subset of predictors at each split.
What is the parameter m in random forests?
Number of predictors considered at each split.
What is a typical choice for m?
Square root of the total number of predictors (classification); about p/3 is common for regression.
What happens if m equals total predictors?
Random forest becomes equivalent to bagging.
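In scikit-learn terms (a sketch on made-up data), `max_features` plays the role of m; setting it to the total number of predictors recovers plain bagging.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=16, random_state=0)

rf = RandomForestClassifier(max_features="sqrt", random_state=0)  # m = sqrt(p)
bagging_like = RandomForestClassifier(max_features=None,          # m = p
                                      random_state=0)             # i.e. bagging
rf.fit(X, y)
bagging_like.fit(X, y)
```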
What is the key advantage of random forests over bagging?
Lower correlation between trees leading to better variance reduction.
Does each tree in random forests use all predictors?
Yes, but only a subset is considered at each split.
How does randomness improve performance in random forests?
It produces more diverse trees.
What happens to bias in random forests?
Slight increase compared to bagging.
What happens to variance in random forests?
Reduced compared to bagging.
What is boosting?
An ensemble method that builds models sequentially.
How does boosting differ from bagging?
Models are built sequentially instead of independently.
What type of trees are used in boosting?
Shallow trees.
Why are shallow trees used in boosting?
To control complexity and reduce overfitting.
What is the key idea of boosting?
Learn from previous errors and improve predictions iteratively.
What does boosting focus on at each step?
Residual errors from previous models.
What are residuals in boosting?
Differences between observed and predicted values.
How does boosting update predictions?
By adding new models that correct previous errors.
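A bare-bones boosting loop for squared-error regression (a sketch on made-up data, using scikit-learn only for the tree learner): each shallow tree is fit to the current residuals and added with a small shrinkage factor.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)

shrinkage = 0.1
pred = np.zeros_like(y)
for _ in range(100):
    resid = y - pred                             # errors of the current ensemble
    stump = DecisionTreeRegressor(max_depth=1)   # shallow tree
    stump.fit(X, resid)
    pred += shrinkage * stump.predict(X)         # small corrective update

train_mse = np.mean((y - pred) ** 2)
```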
What is the shrinkage parameter?
A parameter controlling learning rate in boosting.
Why is a small shrinkage parameter used?
To ensure slow and stable learning.
What happens if shrinkage is too large?
Model may overfit quickly.
What happens if shrinkage is small?
Requires more trees but improves generalization.
What is the number of trees in boosting?
Number of sequential models added.
Can boosting overfit?
Yes, if too many trees are used.
What is the depth parameter in boosting?
Controls complexity of individual trees.
What does depth represent?
Maximum number of splits per tree.
What is the effect of depth on model?
Higher depth increases interaction complexity.
What is gradient boosting?
A boosting method that fits trees to gradients of the loss function.
What are pseudo-residuals?
Values representing model errors used as targets.
What is the role of loss function in boosting?
Defines what errors are minimized.
How does boosting reduce bias?
By iteratively improving model predictions.
What is stochastic gradient boosting?
A variant that uses random subsamples of data.
Why use subsampling in boosting?
To reduce variance and improve efficiency.
What is a typical subsample size?
About half of the data.
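The boosting parameters above map onto scikit-learn's `GradientBoostingRegressor` (a sketch on synthetic data): `learning_rate` is the shrinkage parameter, `max_depth` the tree depth, and `subsample=0.5` gives stochastic gradient boosting on half the data.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=300, n_features=10, n_informative=5,
                       random_state=0)
gbm = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                max_depth=2, subsample=0.5, random_state=0)
gbm.fit(X, y)
```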
What is XGBoost?
An advanced implementation of gradient boosting.
Why is XGBoost popular?
It is efficient and performs well in practice.
Bias-variance tradeoff in bagging?
Reduces variance with little change in bias.
Bias-variance tradeoff in random forests?
Further reduces variance with slight increase in bias.
Bias-variance tradeoff in boosting?
Primarily reduces bias but can increase variance if overfit.