SRM Exam Chapter 5: Decision Trees


59 Terms

1

Regression and Classification Trees:
R

Region of predictor space

2

Regression and Classification Trees:
n_m

Number of observations in node m

3

Regression and Classification Trees:
n_(m,c)

Number of category c observations in node m

4

Regression and Classification Trees:
I

Impurity

5

Regression and Classification Trees:
E

Classification error rate

6

Regression and Classification Trees:
G

Gini index

7

Regression and Classification Trees:
D

Cross entropy

8

Regression and Classification Trees:
T

Subtree

9

Regression and Classification Trees:
|T|

Number of terminal nodes in T

10

Regression and Classification Trees:
λ

Tuning parameter

11

Decision Tree

Visually shows the partitions within a predictor space.

12

The decision tree provides an intuitive way to predict the response for ________________.

new observations

13

Decision Trees:
Left Branch

Statement is true

14

Decision Trees:
Right Branch

Statement is false

15

Regression and Classification Trees:
Algorithm
1. Construct a large tree with g terminal nodes using ______________________.
2. Obtain a sequence of best subtrees, as a function of λ, using _________________________.
3. Choose λ by applying ___________________. Select the λ that results in the lowest _______________________.
4. The best subtree is the subtree created in step 2 with the selected ____________.

recursive binary splitting
cost complexity pruning
k-fold cross-validation
cross-validation error
λ value
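
A minimal sketch of this algorithm in Python, assuming scikit-learn (its ccp_alpha parameter plays the role of λ; the synthetic dataset is purely illustrative):

```python
# Sketch: grow a large tree, get candidate lambdas from cost complexity
# pruning, pick lambda by k-fold cross-validation, then refit the best subtree.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

# Step 1: grow a large tree with recursive binary splitting.
big_tree = DecisionTreeRegressor(min_samples_leaf=5, random_state=0).fit(X, y)

# Step 2: cost complexity pruning yields the sequence of candidate lambdas.
alphas = big_tree.cost_complexity_pruning_path(X, y).ccp_alphas

# Step 3: choose lambda by k-fold cross-validation (lowest cross-validation error).
cv_mse = [
    -cross_val_score(
        DecisionTreeRegressor(min_samples_leaf=5, ccp_alpha=a, random_state=0),
        X, y, cv=5, scoring="neg_mean_squared_error",
    ).mean()
    for a in alphas
]
best_alpha = alphas[int(np.argmin(cv_mse))]

# Step 4: the best subtree is the one grown with the selected lambda.
best_subtree = DecisionTreeRegressor(
    min_samples_leaf=5, ccp_alpha=best_alpha, random_state=0
).fit(X, y)
print(best_alpha, best_subtree.get_n_leaves())
```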

16

Recursive Binary Splitting:
Classification
Minimize

1/๐‘› โˆ‘ (m=1 to g) { ๐‘›_m โ‹… ๐ผ_m }

17

Recursive Binary Splitting:
Classification
p̂_(m,c) =

= n_(m,c) / n_m

18

Recursive Binary Splitting:
Classification
E_m =

= 1 − max_c { p̂_(m,c) }

19

Recursive Binary Splitting:
Classification
G_m =

= ∑ (c=1 to w) { p̂_(m,c) ⋅ (1 − p̂_(m,c)) }

20

Recursive Binary Splitting:
Classification
D_m =

= −∑ (c=1 to w) { p̂_(m,c) ⋅ ln(p̂_(m,c)) }
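
A worked numerical check of the three impurity measures for a hypothetical node with class proportions p̂ = (0.7, 0.2, 0.1); the numbers are made up for illustration:

```python
# Impurity measures for a hypothetical node m with class proportions (0.7, 0.2, 0.1).
import numpy as np

p_hat = np.array([0.7, 0.2, 0.1])

E = 1 - p_hat.max()                 # classification error rate: 0.3
G = np.sum(p_hat * (1 - p_hat))     # Gini index: 0.46
D = -np.sum(p_hat * np.log(p_hat))  # cross entropy (natural log): ~0.802

print(E, G, D)
```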

21

Recursive Binary Splitting:
Classification
deviance =

= −2 ∑ (m=1 to g) ∑ (c=1 to w) { n_(m,c) ⋅ ln(p̂_(m,c)) }

22

Recursive Binary Splitting:
Classification
residual mean deviance =

= deviance / (n − g)
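
A small numerical example of the two deviance formulas, using hypothetical counts n_(m,c) for a tree with g = 2 terminal nodes and w = 2 categories:

```python
# Deviance and residual mean deviance for hypothetical node counts n_{m,c};
# rows are terminal nodes, columns are categories.
import numpy as np

n_mc = np.array([[8.0, 2.0],
                 [3.0, 7.0]])
n_m = n_mc.sum(axis=1, keepdims=True)   # observations per node
p_hat = n_mc / n_m                      # p_hat_{m,c} = n_{m,c} / n_m

deviance = -2 * np.sum(n_mc * np.log(p_hat))
n, g = n_mc.sum(), n_mc.shape[0]
residual_mean_deviance = deviance / (n - g)

print(deviance, residual_mean_deviance)   # ~22.23 and ~1.23
```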

23

Classification Error Rate
1. _______________ able to capture purity improvement.
2. Focuses on _________________________.
3. Preferred for __________________.

Less
misclassified observations
pruning trees (simpler tree with lower variance and higher prediction accuracy)

24

Gini Index and Cross Entropy
1. ______________ able to capture purity improvement.
2. Focuses on ______________________.
3. Preferred for ___________________.

More
maximizing node purity
growing trees

25

Cost Complexity Pruning:
_______________ or _______________ represent the partitions of the predictor space.

Terminal nodes
leaves

26

Cost Complexity Pruning:
__________________ are points along the tree where splits occur.

Internal nodes

27

Cost Complexity Pruning:
Terminal nodes do not have __________________, but internal nodes do.

child nodes

28

Cost Complexity Pruning:
__________________ are lines that connect any two nodes.

Branches

29

Cost Complexity Pruning:
A decision tree with only one internal node is called a ________________.

stump

30

Split Point

Midpoint of two unique consecutive values for a given explanatory variable
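
A quick sketch of how the candidate split points for one explanatory variable can be listed (hypothetical values):

```python
# Candidate split points: midpoints of consecutive unique values of one predictor.
import numpy as np

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])            # hypothetical predictor values
unique_vals = np.unique(x)                               # sorted unique values
split_points = (unique_vals[:-1] + unique_vals[1:]) / 2  # midpoints

print(split_points)   # [2.  3.5 4.5 7. ]
```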

31

Recursive Binary Splitting Stopping Criterion

At least 5 observations in each terminal node.

32

Recursive binary splitting only produces _________________ regions.

rectangular

33

Recursive binary splitting usually produces ___________ and ____________ trees. The bigger the tree, the more terminal nodes there are, which means the more _____________ it is. This also means a higher chance of _____________ the data.

large
complex
flexible
overfitting

34

Advantages of Trees:
1. Easy to ___________ and _______________.
2. Can be presented _______________.
3. Manage ______________ variables without the need for _______________ variables.
4. Mimic _________________________.

interpret
explain
visually
categorical
dummy
human decision-making

35

Disadvantages of Trees:
1. Not _____________.
2. Do not have the same degree of __________________ as other statistical methods.

robust
predictive accuracy

36

Pruning

Remove internal nodes and all nodes following.

37

Linear models good for approximately ______________________.

linear relationships

38

Decision trees good for more _________________________.

complicated relationships

39

Bagging, Random Forests, and Boosting

Improve the predictive accuracy of trees.

40

Bootstrapping

Sampling with replacement to create artificial samples from the set of observations.

41

Bootstrapping
Original Set =
Bootstrap Samples =
Distinct Bootstrap Samples =

= n observations
= n^(n) samples
= (2n-1) choose (n-1) samples
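
A quick numerical check of these counts for a hypothetical original set of n = 5 observations:

```python
# Counting bootstrap samples for n = 5 observations.
from math import comb

n = 5
ordered_samples = n ** n                    # n^n = 3125 possible bootstrap samples
distinct_samples = comb(2 * n - 1, n - 1)   # C(2n-1, n-1) = 126 distinct samples

print(ordered_samples, distinct_samples)
```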

42

Probability an observation is not included in a bootstrap sample =
which converges to ________________ as n approaches ________________.

= (1 - 1/n)^n
1/e
infinity
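
A short check that (1 − 1/n)^n approaches 1/e ≈ 0.368:

```python
# (1 - 1/n)^n converges to 1/e as n grows.
import math

for n in (10, 100, 1000, 10_000):
    print(n, (1 - 1 / n) ** n)
print("1/e =", 1 / math.e)   # ~0.3679
```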

43

Multiple Trees:
Bagging Steps
1. Create __________________________ from the original training dataset.
2. Construct a ______________ for each bootstrap sample using ______________________.
3. Predict the response of a new observation by ________________________ (regression trees) or by ____________________________ (classification trees) across all b trees.

b bootstrap samples
decision tree (called bagged trees)
recursive binary splitting
averaging the predictions
using the most frequent category
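
A minimal sketch of these bagging steps for a regression tree, assuming scikit-learn and a synthetic dataset; for classification trees the average would be replaced by a majority vote.

```python
# Bagging sketch: draw b bootstrap samples, grow an unpruned tree on each,
# then average the b predictions for a new observation (regression case).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
rng = np.random.default_rng(0)
b = 100

trees = []
for _ in range(b):
    idx = rng.integers(0, len(y), size=len(y))   # bootstrap sample (with replacement)
    trees.append(DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))  # not pruned

x_new = X[:1]                                            # a "new" observation
y_hat = np.mean([t.predict(x_new)[0] for t in trees])    # average across all b trees
print(y_hat)
```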

44

Bagged trees are not _______________.

pruned

45

Bagging:
As b increases, model accuracy _____________, variance ________________ (due to bagging)

increases
decreases

46

Bagging:
Bagging makes it more _____________ to interpret the bagged model as a whole since we cannot visualize all b bagged trees with a _________________. If we have a single tree (without bagging), we use ___________________ to estimate the test error.

difficult
single tree
cross-validation

47

Bagging Properties:
1. Increasing b does not cause ________________.
2. Bagging reduces _____________________.
3. ___________________ is a valid estimate of test error (with bagging).

overfitting
variance
Out-of-bag error

48

Calculating the OOB Error for a Bagged Model:
1. For each bagged tree, predict the response for each ________________.
2. Summarize predictions and compute the OOB error as the ____________ for regression trees or the ___________________ for classification trees.
3. Graph the OOB error against the number of ________________.

out-of-bag observation
test MSE
test error rate
bagged trees (want the lowest OOB error)
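
A sketch of the OOB error with scikit-learn's BaggingRegressor (assumed tooling): setting oob_score=True stores, for each observation, the prediction averaged over only the trees whose bootstrap samples excluded it, so the OOB MSE estimates the test MSE without a separate validation set.

```python
# Out-of-bag (OOB) error for a bagged regression model.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10, random_state=0)

bag = BaggingRegressor(DecisionTreeRegressor(), n_estimators=200,
                       oob_score=True, random_state=0).fit(X, y)

# Each oob_prediction_ entry uses only the trees that did not see that observation.
oob_mse = np.mean((y - bag.oob_prediction_) ** 2)   # OOB estimate of the test MSE
print(oob_mse)
```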

49

Random Forests address a drawback of bagging: using similar bagged trees

increases correlation between predictions and diminishes the variance-reducing power of bagging.

50

Random Forests Steps:
1. Create _________________________ from the original training dataset.
2. Construct a __________________ for each bootstrap sample using ______________________. At each split, a random subset of ___________________ is considered.
3. Predict the response of a new observation by __________________________ (regression trees) or by _________________________________ (classification trees) across all b trees.

b bootstrap samples
decision tree
recursive binary splitting
k variables
averaging the predictions
using the most frequent category

51

Random Forests Properties:
1. __________________ is a special case of random forests.
2. Increasing b does not cause ____________________.
3. Decreasing k reduces the _________________________________.

Bagging (k=p)
overfitting
correlation between predictions

52

Random Forests k values:
Regression Trees
k =

= p/3

53

Random Forests k values:
Classification Trees
k =

= sqrt(p)
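
A sketch of random forests in scikit-learn (assumed tooling), where the max_features parameter plays the role of k; the sqrt(p) and p/3 choices above map onto it, and max_features = p recovers bagging as the special case.

```python
# Random forests: at each split only a random subset of k predictors is considered.
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

p = 9
Xc, yc = make_classification(n_samples=300, n_features=p, random_state=0)
Xr, yr = make_regression(n_samples=300, n_features=p, noise=10, random_state=0)

rf_clf = RandomForestClassifier(n_estimators=200, max_features="sqrt",  # k = sqrt(p)
                                oob_score=True, random_state=0).fit(Xc, yc)
rf_reg = RandomForestRegressor(n_estimators=200, max_features=p // 3,   # k = p/3
                               oob_score=True, random_state=0).fit(Xr, yr)

# Setting max_features = p would recover bagging as a special case.
print(rf_clf.oob_score_, rf_reg.oob_score_)
```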

54

OOB Error:
Bagging _______ Random Forests (not always though)

>

55

Performance
Bagging _______ Random Forests

<

56

Boosting does not involve _______________.

bootstrapping

57

Boosting grows trees ______________ using information from previous.

sequentially

58

Boosting Steps:

Let ๐‘ง_1 be the actual response variable, ๐‘ฆ.
1. For ๐‘˜ = 1, 2, ... , ๐‘:
โ€ข Use recursive binary splitting to fit a tree with ๐‘‘ splits to the data with ๐‘ง_k as the response.
โ€ข Update ๐‘ง_k by subtracting ๐œ† โ‹… ๐‘“^(hat)_(k) (๐ฑ),
i.e. let ๐‘ง_(k+1) = ๐‘ง_k โˆ’ ๐œ† โ‹… ๐‘“^(hat)_(k) (๐ฑ).
2. Calculate the boosted model prediction as
๐‘“^(hat) (๐ฑ) = โˆ‘ (k=1 to b) {๐œ† โ‹… ๐‘“^(hat)_(k) (๐ฑ) .
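
A direct Python sketch of these boosting steps for a regression response, assuming scikit-learn trees; max_leaf_nodes = d + 1 stands in for "a tree with d splits", and the data, b, d, and λ values are illustrative only.

```python
# Boosting sketch: fit each small tree to the current working response z_k,
# shrink it by lambda, and sum lambda * f_hat_k(x) for the boosted prediction.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10, random_state=0)
b, d, lam = 500, 2, 0.05

z = y.copy()                                                      # z_1 = y
trees = []
for _ in range(b):
    tree = DecisionTreeRegressor(max_leaf_nodes=d + 1).fit(X, z)  # tree with d splits
    trees.append(tree)
    z = z - lam * tree.predict(X)         # z_{k+1} = z_k - lambda * f_hat_k(x)

def boosted_predict(X_new):
    # f_hat(x) = sum over k of lambda * f_hat_k(x)
    return lam * np.sum([t.predict(X_new) for t in trees], axis=0)

print(np.mean((y - boosted_predict(X)) ** 2))   # training MSE of the boosted model
```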

59

Boosting Properties:
1. Increasing b can cause _________________.
2. Boosting reduces _________________.
3. d controls _________________ of the boosted model.
4. λ controls the ____________ at which boosting _________________.

overfitting
bias
complexity
rate
learns
