Ensemble Learning & Random Forest


12 Terms

1

Decision trees

Widely used models for classification and regression. Internally they are a flowchart-like structure in which each node represents a test on an attribute. When trained, they learn a hierarchy of if/else questions that leads to a decision.
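To make the "hierarchy of if/else questions" concrete, here is a minimal sketch of what a small trained classification tree amounts to once it is written out by hand. The feature names, thresholds and class labels are hypothetical and chosen only for illustration; they are not taken from a fitted model.

```python
def predict(petal_length, petal_width):
    """Hand-written stand-in for a tiny decision tree (hypothetical thresholds)."""
    # Root node: a test on the first attribute.
    if petal_length < 2.5:
        return "setosa"        # leaf: decision reached after one question
    # Second-level node: a test on another attribute.
    if petal_width < 1.8:
        return "versicolor"    # leaf
    return "virginica"         # leaf


print(predict(1.4, 0.2))
print(predict(4.5, 1.3))
print(predict(6.0, 2.2))
```

Each `if` corresponds to one node of the tree, and each `return` corresponds to a leaf.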

2

Training

3

Structure

Can be visualised with the graphviz package. Each node in the rendering reports its Gini impurity.

4

Gini impurity

Measures a node's impurity. A node is pure (Gini = 0) when all the training samples it contains belong to the same class.
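As a sketch, Gini impurity can be computed from the class proportions in a node as 1 minus the sum of the squared proportions. The function name `gini` and the convention that an empty node counts as pure are assumptions made for this example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption for this sketch: an empty node counts as pure
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


pure = gini(["a", "a", "a", "a"])   # a single class in the node
mixed = gini(["a", "a", "b", "b"])  # two classes, evenly split
print(pure, mixed)
```

The pure node scores 0, and the evenly split two-class node scores 0.5, the largest value possible with two classes.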

5

CART Training algorithm

Classification and Regression Tree

6

CART Objective

Select a feature k and a threshold t_k to form the rule: if (feature k < t_k) then go left, else go right. Search over all pairs (k, t_k) and select the one that minimises the cost function.
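The search over (k, t_k) pairs can be sketched as below. This is a minimal illustration, assuming the cost function is the size-weighted Gini impurity of the two child nodes and that candidate thresholds are the midpoints between consecutive distinct feature values. The helper names `gini` and `best_split` and the toy data are made up for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (k, t_k, cost) minimising the weighted Gini impurity of the children."""
    n = len(y)
    best = None
    for k in range(len(X[0])):                      # every feature k
        values = sorted({row[k] for row in X})
        # Assumed candidate thresholds: midpoints between distinct values.
        for low, high in zip(values, values[1:]):
            t = (low + high) / 2
            left = [y[i] for i in range(n) if X[i][k] < t]    # rule: x_k < t_k
            right = [y[i] for i in range(n) if X[i][k] >= t]  # otherwise go right
            cost = len(left) / n * gini(left) + len(right) / n * gini(right)
            if best is None or cost < best[2]:
                best = (k, t, cost)
    return best


# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = ["a", "a", "b", "b"]
k, t, cost = best_split(X, y)
print(k, t, cost)
```

A full CART implementation would apply the same search recursively to each child node until a stopping condition is met; the sketch covers a single split only.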

7

Entropy

A concept that originated in thermodynamics as a measure of disorder. Here it is a measure of the information in a system, calculated from the probabilities of the different situations within the system.
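For a tree node, the "situations" are the classes and their probabilities are the class proportions in the node. A minimal sketch, assuming the Shannon entropy with a base-2 logarithm (the function name `entropy` is chosen for this example):

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption for this sketch: an empty node has zero entropy
    proportions = [c / n for c in Counter(labels).values()]
    # sum of p * log2(1 / p), which equals -sum of p * log2(p)
    return sum(p * math.log2(1 / p) for p in proportions)


certain = entropy(["a", "a", "a", "a"])    # only one class: no disorder
uncertain = entropy(["a", "a", "b", "b"])  # two equally likely classes
print(certain, uncertain)
```

Like Gini impurity, the entropy is 0 for a node containing a single class and largest when the classes are evenly mixed.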

8

Gini v Entropy

Gini impurity is faster to compute. It tends to isolate the most frequent class in its own branch of the tree.

9

Entropy v Gini

Entropy tends to produce more balanced trees.

10

Hyperparameter optimisation

Automates the process of testing different combinations of the model's hyperparameters. One such method is grid search.

11

Grid search

Exhaustively searches all combinations of the parameter values we want to explore. For each combination, it trains and tests the model with cross-validation to estimate its performance.
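The loop over every combination can be sketched with the standard library alone. Here `evaluate` is a hypothetical stand-in for "train and test with cross-validation"; `toy_score` is a made-up scoring function, and the parameter names `max_depth` and `min_samples_leaf` are borrowed only as familiar examples.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    n_tried = 0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)  # in practice: a cross-validated score
        n_tried += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_tried


def toy_score(params):
    # Hypothetical score that peaks at max_depth=3, min_samples_leaf=5.
    return -abs(params["max_depth"] - 3) - 0.1 * abs(params["min_samples_leaf"] - 5)


param_grid = {"max_depth": [2, 3, 4], "min_samples_leaf": [1, 5, 10]}
best_params, best_score, n_tried = grid_search(param_grid, toy_score)
print(best_params, best_score, n_tried)
```

The number of evaluations is the product of the list lengths (3 x 3 = 9 here), which is why the cost of grid search grows quickly as parameters are added.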

12

Randomised Search

An alternative method that tests random combinations of the hyperparameters. You specify the number of iterations, and in each iteration it selects a random value for each parameter and tests that combination. This is appropriate if you have multiple parameters with a wide range of possible values.
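A minimal sketch of the same idea, assuming integer-valued hyperparameters drawn uniformly from inclusive (low, high) ranges. As before, `evaluate` and `toy_score` are hypothetical stand-ins for a cross-validated score, and the parameter names are only illustrative.

```python
import random


def randomised_search(param_ranges, evaluate, n_iter, seed=0):
    """Evaluate n_iter random combinations and keep the best-scoring one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_params, best_score = None, float("-inf")
    tried = []
    for _ in range(n_iter):
        # Draw one random value per parameter from its inclusive range.
        params = {name: rng.randint(low, high)
                  for name, (low, high) in param_ranges.items()}
        score = evaluate(params)  # in practice: a cross-validated score
        tried.append(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, tried


def toy_score(params):
    # Hypothetical score that peaks at max_depth=3, min_samples_leaf=5.
    return -abs(params["max_depth"] - 3) - 0.1 * abs(params["min_samples_leaf"] - 5)


param_ranges = {"max_depth": (1, 50), "min_samples_leaf": (1, 100)}
best_params, best_score, tried = randomised_search(param_ranges, toy_score, n_iter=20)
print(len(tried), best_params, best_score)
```

The full grid over these ranges would need 50 x 100 = 5000 evaluations, whereas the cost here is fixed at `n_iter` regardless of how wide the ranges are.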