12: Decision trees

5 Terms

1

What are advantages of decision trees?

Trees are very flexible and can accommodate different types of responses (quantitative as well as qualitative), resulting in regression and classification trees, respectively.

Different variable types can also be used as explanatory variables, and no specific form for the underlying relationship is assumed. A brief sketch of both response types follows below.
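
A minimal sketch, assuming scikit-learn and NumPy are available, of the same tree machinery fitting a quantitative response (regression tree) and a qualitative response (classification tree); the toy data and variable names are illustrative only.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

rng = np.random.default_rng(0)

# Mixed explanatory variables: one quantitative, one binary indicator
# (a categorical variable already encoded as 0/1).
X = np.column_stack([rng.normal(size=200), rng.integers(0, 2, size=200)])

# Quantitative response -> regression tree
y_quant = 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(scale=0.5, size=200)
reg_tree = DecisionTreeRegressor(max_depth=3).fit(X, y_quant)

# Qualitative response -> classification tree
y_qual = (y_quant > y_quant.mean()).astype(int)
clf_tree = DecisionTreeClassifier(max_depth=3).fit(X, y_qual)

print(reg_tree.predict(X[:3]))  # continuous predictions
print(clf_tree.predict(X[:3]))  # class labels
```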

2

What is bagging?

1. Bagging (Bootstrap Aggregating)

Key Idea: Reduce variance by averaging multiple independent models trained on bootstrapped datasets.

  • How it works:

    • Generate multiple bootstrapped samples from the original dataset.

    • Train a separate model (usually decision trees) on each sample.

    • Aggregate predictions (average for regression, majority vote for classification).

  • Purpose: Decrease overfitting and improve stability.

  • Strengths: Reduces variance, works well with high-variance models (e.g., deep decision trees).

  • Weaknesses: Does not improve bias significantly.

Example: Bagging Decision Trees
Averaging multiple decision trees trained on different subsets of the data to make a more stable prediction.
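
A minimal sketch of the three steps above (bootstrap, fit a tree per sample, average), assuming scikit-learn and NumPy; the synthetic dataset and the number of bootstrap samples are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
rng = np.random.default_rng(0)

# 1. Generate bootstrapped samples and 2. train one deep tree on each.
trees = []
for _ in range(50):
    idx = rng.choice(len(X), size=len(X), replace=True)  # sample with replacement
    trees.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

# 3. Aggregate: average the individual tree predictions (regression case;
# classification would use a majority vote instead).
bagged_pred = np.mean([t.predict(X[:5]) for t in trees], axis=0)
print(bagged_pred)
```

In practice scikit-learn's BaggingRegressor / BaggingClassifier wrap these steps up, but the manual loop makes the variance-averaging idea explicit.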

3

What is random forest?

2. Random Forest

Key Idea: An extension of bagging that introduces additional randomness to reduce correlation among trees.

  • How it works:

    • Same as bagging, but each tree is built using a random subset of features at each split (not all features).

    • This further decorrelates the trees, making the ensemble more robust.

  • Purpose: Reduces both variance and correlation between trees.

  • Strengths: Prevents overfitting better than individual decision trees.

  • Weaknesses: Can be computationally expensive for large datasets.
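
A minimal sketch with scikit-learn's RandomForestClassifier, where max_features controls the random subset of features considered at each split; the synthetic dataset and parameter values are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=20, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_features="sqrt": each split considers only a random subset of features,
# which decorrelates the trees compared with plain bagging.
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt",
                                random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))
```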

4

What does boosting do?

3. Boosting

Key Idea: Reduce both bias and variance by training models sequentially, where each new model focuses on the mistakes of the previous ones.

  • How it works:

    • Train a weak model (e.g., a shallow tree).

    • Identify misclassified instances and assign them higher weights.

    • Train the next model to correct those mistakes.

    • Combine models to make a final prediction.

  • Purpose: Reduces bias, improves predictive power, and works well with weak learners.

  • Strengths: High accuracy, especially for complex datasets.

  • Weaknesses: More prone to overfitting than bagging/random forest, sensitive to noise.
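
A minimal sketch using AdaBoost as one concrete boosting algorithm (it reweights misclassified instances, matching the steps above), assuming scikit-learn; the synthetic data and parameter values are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost's default weak learner is a depth-1 decision tree (a stump).
# Each round upweights the instances the previous rounds misclassified,
# and the final prediction is a weighted combination of all rounds.
boost = AdaBoostClassifier(n_estimators=100, random_state=0)
boost.fit(X_train, y_train)
print(boost.score(X_test, y_test))
```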

5

When do you use bagging, random forest, or boosting?

  • Bagging → When you have a high-variance model (e.g., deep decision trees) and want to stabilize predictions.

  • Random Forest → When you need a strong, robust model that handles high-dimensional data with reduced correlation.

  • Boosting → When you need the highest accuracy and can afford careful tuning to avoid overfitting.
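
A minimal side-by-side sketch comparing the three ensembles by cross-validated accuracy, assuming scikit-learn; the synthetic dataset, estimator counts, and resulting scores are purely illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           random_state=0)

models = {
    "bagging": BaggingClassifier(n_estimators=100, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
}

# Cross-validated accuracy gives a rough comparison on this toy problem.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name:>13}: {scores.mean():.3f}")
```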
