Lecture 9: Decision Trees and Random Forest 2

Last updated 5:19 PM on 4/28/25
10 Terms

1

Q: How do you pick a split in a decision tree?

Choose the variable and cutoff that best separate the data, typically the split that most reduces Gini impurity for classification.
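As a minimal sketch of how a split candidate is scored (function names are illustrative, not from the lecture): compute the Gini impurity of each child node and weight by node size; the best split minimizes this weighted impurity.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_impurity(labels, feature, cutoff):
    """Size-weighted Gini impurity after splitting on feature <= cutoff."""
    left = labels[feature <= cutoff]
    right = labels[feature > cutoff]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# A pure node scores 0; a 50/50 node scores 0.5.
y = np.array([0, 0, 1, 1])
x = np.array([1.0, 2.0, 3.0, 4.0])
# Cutting at 2.0 separates the classes perfectly (impurity 0).
```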

2

Q: When do you stop growing a decision tree?

When nodes are pure (one label) or have too few data points.
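The two stopping rules can be written as one check (the `min_samples` threshold here is an illustrative choice, not a fixed rule from the lecture):

```python
def should_stop(labels, min_samples=5):
    """Stop splitting when the node is pure (one label)
    or holds fewer than min_samples data points."""
    return len(set(labels)) <= 1 or len(labels) < min_samples
```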

3

Q: What is bootstrap aggregation (bagging)?

Resampling the data with replacement, training a model on each bootstrap sample, and aggregating their predictions to reduce overfitting.
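A minimal sketch of both halves of bagging, assuming binary 0/1 labels and base models passed in as callables (both assumptions are for brevity):

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw n rows with replacement: one bootstrap sample."""
    idx = rng.integers(0, len(X), size=len(X))
    return X[idx], y[idx]

def bagged_predict(models, X):
    """Aggregate by majority vote across the models' predictions."""
    preds = np.stack([m(X) for m in models])   # (n_models, n_samples)
    return (preds.mean(axis=0) >= 0.5).astype(int)

# Toy ensemble: three threshold "models" on one feature.
X = np.array([[0.1], [0.4], [0.6], [0.9]])
models = [lambda X: (X[:, 0] > 0.3).astype(int),
          lambda X: (X[:, 0] > 0.5).astype(int),
          lambda X: (X[:, 0] > 0.7).astype(int)]
# The vote smooths out the individual thresholds.
```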

4

Q: Why does a fully grown single decision tree overfit?

It perfectly memorizes the training data, losing generalization.

5

Q: How do random forests improve over single decision trees?

They grow many trees on bootstrapped samples and average or vote across them to reduce variance.

6

Q: What extra randomness is added in random forests?

Each split considers a random subset of predictor variables instead of all variables.
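The feature subsampling can be sketched in a few lines; sqrt of the number of predictors is a common default subset size for classification, though implementations make it tunable:

```python
import numpy as np

def candidate_features(n_features, rng):
    """Sample the random subset of predictors considered at one split."""
    m = max(1, int(np.sqrt(n_features)))
    return rng.choice(n_features, size=m, replace=False)

# With 16 predictors, each split sees only 4 randomly chosen ones,
# which decorrelates the trees in the forest.
feats = candidate_features(16, np.random.default_rng(0))
```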

7

Q: In Machine Learning, what are the three key steps?

Train on past data, predict on new data, evaluate performance.

8

Q: Why use ensemble models?

Because no single model is perfect; combining models can improve accuracy.

9

Q: What is stacking in ensemble modeling?

Using outputs from different models as new features for a final model.

10

Q: How does linear weighted stacking work?

1. Split the training data into two parts (train1 and train2).
2. Train several models (e.g., Random Forest, GLM, GBM, SVM) on train1.
3. Score each model on train2, using the scores as new features.
4. Combine these new features with the original features on train2.
5. Train a final GLM (including interaction terms) on this combined data.
6. Apply the stacked pipeline to the test data.
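The steps above can be sketched end to end on toy data. The lecture's base learners (Random Forest, GLM, GBM, SVM) are swapped for ordinary least squares stand-ins so the sketch stays self-contained; the data and interaction-free final model are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y depends linearly on two features plus noise.
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -2.0]) + rng.normal(scale=0.1, size=200)

# Step 1: split the training data into train1 and train2.
X1, y1, X2, y2 = X[:100], y[:100], X[100:], y[100:]

def fit_linear(X, y):
    """Stand-in base learner: ordinary least squares with intercept."""
    w, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)
    return lambda X: np.c_[np.ones(len(X)), X] @ w

# Step 2: train base models on train1 (each sees one feature here).
m_a = fit_linear(X1[:, :1], y1)
m_b = fit_linear(X1[:, 1:], y1)

# Steps 3-4: score on train2; the scores become new features,
# combined with the original features.
Z2 = np.c_[X2, m_a(X2[:, :1]), m_b(X2[:, 1:])]

# Step 5: train the final linear model on the combined features.
final = fit_linear(Z2, y2)

# Step 6: apply the whole stacked pipeline to test data.
def stacked_predict(X_test):
    Z = np.c_[X_test, m_a(X_test[:, :1]), m_b(X_test[:, 1:])]
    return final(Z)
```

The key detail the sketch preserves is that the base models are scored on data they were not trained on (train2), so the final model learns how much to trust each one rather than fitting their training error.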
