03 - Decision Trees

Decision Trees: a tree-structured model that repeatedly splits the training set into subsets, aiming for subsets in which all examples have the same class

Root Node: all examples placed here initially

Child Node: the root node is split into two or more child nodes, each containing a subset of the examples

Purity: a tree node is pure if all of its examples have the same class label

Good Features: divide the examples into categories that each contain a single class

Bad Features: produce categories of mixed classes

Feature Selection: find good features which divide examples into categories of a single class - use a measure of impurity to guide feature selection

Entropy: a measure of the uncertainty around a source of information; higher entropy means a more random, less predictable source
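As a minimal sketch (the function name `entropy` and the list-of-labels input are my own choices), Shannon entropy over a node's class labels can be computed like this:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    # Sum p * log2(p) over each class, then negate.
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

# A pure node has zero entropy; a 50/50 split of two classes
# has the maximum entropy for two classes: 1 bit.
print(entropy(["yes", "yes", "yes"]))  # pure -> zero
print(entropy(["yes", "no"]))          # maximally mixed -> 1 bit
```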

ID3 Algorithm: builds a decision tree from the top down, repeatedly selecting the feature that best splits the current node's examples and recursing on each resulting subset
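The top-down recursion can be sketched as below (a simplified illustration, not Quinlan's original code; examples are assumed to be dicts of categorical feature values, and the tree is returned as nested dicts). Choosing the feature with the lowest weighted child entropy is equivalent to choosing the one with the highest information gain, since the parent's entropy is the same for every candidate:

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def id3(examples, labels, features):
    """Grow a tree top-down: stop at a pure node or when no features
    remain; otherwise split on the feature whose subsets are purest."""
    if len(set(labels)) == 1:
        return labels[0]                              # pure leaf
    if not features:
        return Counter(labels).most_common(1)[0][0]   # majority-class leaf

    def weighted_entropy(feature):
        subsets = defaultdict(list)
        for ex, lab in zip(examples, labels):
            subsets[ex[feature]].append(lab)
        return sum(len(s) / len(labels) * entropy(s) for s in subsets.values())

    best = min(features, key=weighted_entropy)
    branches = {}
    for value in {ex[best] for ex in examples}:
        idx = [i for i, ex in enumerate(examples) if ex[best] == value]
        branches[value] = id3([examples[i] for i in idx],
                              [labels[i] for i in idx],
                              [f for f in features if f != best])
    return {best: branches}

tree = id3([{"windy": "yes"}, {"windy": "no"}],
           ["stay", "play"], ["windy"])
print(tree)  # {'windy': {'yes': 'stay', 'no': 'play'}}
```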

Information Gain: a feature selection criterion that measures how much a feature reduces entropy when used to split a set of training examples into two or more subsets
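Concretely, the gain is the parent node's entropy minus the size-weighted average entropy of the child subsets. A small sketch (function names and the dict-based example format are my own assumptions):

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(examples, labels, feature):
    """Parent entropy minus the size-weighted entropy of the
    subsets produced by splitting on `feature`."""
    subsets = defaultdict(list)
    for example, label in zip(examples, labels):
        subsets[example[feature]].append(label)
    weighted = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted

examples = [{"windy": "yes"}, {"windy": "yes"}, {"windy": "no"}, {"windy": "no"}]
labels = ["play", "play", "stay", "stay"]
# The split is perfectly pure, so all of the parent's 1 bit
# of entropy is removed: the gain is 1.0.
print(information_gain(examples, labels, "windy"))
```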

C4.5 Algorithm: an improved version of ID3 that handles continuous numeric data and training data with missing values, uses a better feature selection measure (gain ratio), and supports pruning the tree after creation
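One of C4.5's improvements, handling a continuous feature, works by testing candidate thresholds between adjacent sorted values and keeping the one whose two-way split is purest. A sketch of that idea (the helper name `best_threshold` and the impurity-minimising form are my own; C4.5 itself scores candidates with gain ratio):

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try the midpoint between each pair of adjacent distinct sorted
    values; return the threshold with the lowest weighted entropy."""
    pairs = sorted(zip(values, labels))
    best_impurity, best_t = float("inf"), None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no class boundary between equal values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        w = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if w < best_impurity:
            best_impurity, best_t = w, t
    return best_t

# The classes separate cleanly between 2 and 8, so the midpoint wins.
print(best_threshold([1, 2, 8, 9], ["a", "a", "b", "b"]))  # 5.0
```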