05 Bagging
Bagging (Bootstrap Aggregating)
Definition: Bagging, also known as bagged trees or bootstrap aggregation, is an ensemble method in machine learning.
Functionality: Involves growing multiple decision trees on variations of the same dataset and aggregating their predictions to improve accuracy and robustness.
reduces variance by averaging predictions → more stable and reliable model
Principles Behind Bagging
Why?
single trees are sensitive to changes in input → high variance
Ensemble Approach: averaging over multiple models mitigates this
General Applicability:
Although commonly applied to decision trees, bagging can be employed with different machine learning models.
Process of Bagging
Take B different training sets by sampling with replacement (bootstrapping)
Grow a separate unpruned tree on each training set
Average the resulting predictions
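The three steps above can be sketched in a few lines of Python. To keep the example self-contained, a 1-nearest-neighbour regressor stands in for the unpruned decision tree (the base learner is interchangeable, as noted above); the data and function names are illustrative, not from the source.

```python
import random
random.seed(0)

def fit_1nn(train):
    """'Fit' a 1-NN regressor: just remember the (x, y) training pairs."""
    def predict(x):
        # Predict with the y of the closest training x.
        return min(train, key=lambda pair: abs(pair[0] - x))[1]
    return predict

def bagged_predict(data, x, B=25):
    """Grow B models on bootstrap samples and average their predictions."""
    n = len(data)
    preds = []
    for _ in range(B):
        # Bootstrap: draw n observations with replacement.
        sample = [random.choice(data) for _ in range(n)]
        preds.append(fit_1nn(sample)(x))
    return sum(preds) / B  # aggregation step: average the B predictions

# Toy data: noisy observations of y = 2x.
data = [(x, 2 * x + random.uniform(-1, 1)) for x in range(10)]
print(bagged_predict(data, 4.5))  # roughly 2 * 4.5 = 9, with lower variance than a single model
```

Swapping `fit_1nn` for a tree learner recovers bagged trees; the bootstrap-and-average structure is unchanged.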
Bootstrapping Explained
Out of Bag (OOB) Observations:
On average, each bootstrapped training set contains only about 2/3 of the distinct observations in the original training data. The remaining ~1/3 becomes that tree's out of bag data, which can be used for predictions.
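The 2/3 figure can be checked directly: the probability that a given observation is never drawn in n draws with replacement is (1 - 1/n)^n, which approaches e^(-1) ≈ 0.368 as n grows. A quick simulation:

```python
import math
import random
random.seed(1)

n = 1000
indices = [random.randrange(n) for _ in range(n)]   # one bootstrap sample
oob = set(range(n)) - set(indices)                  # observations never drawn

print(len(oob) / n)                    # empirical OOB fraction, around 0.37
print((1 - 1 / n) ** n, math.exp(-1))  # exact probability vs. its limit e^(-1)
```

So roughly 37% of observations are out of bag for any single tree, which is where the "about 2/3 used" figure comes from.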
OOB Prediction:
For each training observation, only the trees that did not use that observation during training contribute to its prediction.
Provides a method for evaluating model performance without needing a separate validation set.
OOB error estimates the test error of the bagged model.
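A sketch of the OOB error estimate described above, under the same assumptions as before (a 1-NN regressor standing in for a tree; toy data and names are illustrative): each observation is predicted only by the models whose bootstrap sample never contained it, and those predictions are averaged before computing the error.

```python
import random
random.seed(2)

# Toy data: y = 3x with noise.
xs = list(range(30))
ys = [3 * x + random.uniform(-2, 2) for x in xs]
n, B = len(xs), 50

models = []  # each entry: (set of bootstrap indices, fitted predictor)
for _ in range(B):
    idx = [random.randrange(n) for _ in range(n)]
    sample = [(xs[i], ys[i]) for i in idx]
    def make_predict(sample=sample):
        # 1-NN stand-in for an unpruned tree.
        return lambda x: min(sample, key=lambda p: abs(p[0] - x))[1]
    models.append((set(idx), make_predict()))

# OOB prediction for observation i: average over models that never saw i.
sq_errors = []
for i in range(n):
    preds = [predict(xs[i]) for idx, predict in models if i not in idx]
    if preds:  # i was out of bag for at least one model
        oob_pred = sum(preds) / len(preds)
        sq_errors.append((oob_pred - ys[i]) ** 2)

oob_mse = sum(sq_errors) / len(sq_errors)
print(oob_mse)  # estimates the bagged model's test MSE, no validation set needed
```

Note that no observation is ever scored by a model that trained on it, which is what makes the OOB error an honest stand-in for test error.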
Advantages and Disadvantages
Advantages of Bagging:
Reduces the chances of overfitting.
Improves accuracy compared to single decision trees.
Disadvantages of Bagging:
The ensemble model becomes less interpretable; it is challenging to convey insights from multiple trees to non-technical stakeholders.
Variable Importance in Bagging
Importance Evaluation:
Analyze the contribution of each predictor in the ensemble by tracking how much the predictor reduces the error (e.g., RSS or Gini index).
Aggregate contributions across all trees to compute the average importance of each predictor.
The larger the improvement in predictions due to a predictor, the more important it is considered.
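The aggregation step above can be sketched as follows. The per-tree error reductions here are made-up numbers standing in for the RSS (or Gini) decrease that splits on each predictor achieve; predictor names are hypothetical.

```python
from collections import defaultdict

# Hypothetical: reduction in RSS attributed to each predictor, per tree.
per_tree_reductions = [
    {"age": 12.0, "income": 30.5},
    {"age": 9.5,  "income": 28.0, "city": 1.2},
    {"age": 11.0, "income": 33.0},
]

# Sum each predictor's contributions across all trees...
totals = defaultdict(float)
for tree in per_tree_reductions:
    for predictor, reduction in tree.items():
        totals[predictor] += reduction

# ...then average over the B trees to get its importance.
B = len(per_tree_reductions)
importance = {p: total / B for p, total in totals.items()}
for p, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(p, round(score, 2))  # largest average reduction = most important
```

Here `income` would rank as the most important predictor, since splits on it reduce the error most across the ensemble.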
Conclusion
Bagging effectively reduces variance and increases prediction accuracy in many applications, especially with decision trees, and as a general ensemble method it can be applied to a wide range of models.