03 - Decision Trees
Decision Trees: a tree-structured model that repeatedly splits the training set into subsets, aiming for subsets in which all examples share the same class
Root Node: the node where all training examples are placed initially
Child Node: a node is split into two or more child nodes, each containing a subset of its parent's examples
Purity: tree node is pure if all examples have the same class label
Good Feature: one that divides the examples into categories dominated by a single class
Bad Feature: one that produces categories of mixed classes
Feature Selection: find good features that divide the examples into single-class categories; a measure of impurity is used to guide the choice
Entropy: a measure of the uncertainty of a source of information; the more random the source, the higher the entropy
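For a set of examples with class proportions p_i, entropy is H(S) = -Σ p_i log2(p_i). A minimal sketch in Python (the function name and sample labels are just illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels:
    H(S) = -sum(p_i * log2(p_i)) over the class proportions p_i."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

pure = entropy(["yes", "yes", "yes", "yes"])   # zero uncertainty
mixed = entropy(["yes", "yes", "no", "no"])    # maximum for two classes
```

A pure node (one class) has entropy 0; an even 50/50 split of two classes has the maximum, 1 bit.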
ID3 Algorithm: builds a decision tree from the top down, repeatedly selecting the best feature to split on
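A minimal recursive sketch of ID3 on categorical features, assuming each example is a dict mapping feature name to value (all names here are illustrative, not from the notes):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def id3(examples, labels, features):
    """Build a tree top-down: pick the best feature, split, recurse.
    Returns a nested dict {feature: {value: subtree}} or a class label."""
    if len(set(labels)) == 1:          # pure node: stop
        return labels[0]
    if not features:                   # no features left: majority class
        return Counter(labels).most_common(1)[0][0]

    def split(feature):
        groups = {}
        for ex, lab in zip(examples, labels):
            groups.setdefault(ex[feature], []).append(lab)
        return groups

    def gain(feature):                 # information gain of this split
        groups = split(feature)
        child = sum(len(g) / len(labels) * entropy(g)
                    for g in groups.values())
        return entropy(labels) - child

    best = max(features, key=gain)
    remaining = [f for f in features if f != best]
    tree = {best: {}}
    for value in split(best):
        sub = [(ex, lab) for ex, lab in zip(examples, labels)
               if ex[best] == value]
        tree[best][value] = id3([e for e, _ in sub],
                                [l for _, l in sub], remaining)
    return tree
```

The recursion stops at a pure node or when features run out, which mirrors the top-down construction described above.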
Information Gain: a feature selection criterion that measures the reduction in entropy achieved when a feature is used to split a set of training examples into two or more subsets
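As a sketch, gain is the parent set's entropy minus the weighted entropy of the subsets the feature produces (function and variable names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Gain(S, F) = H(S) - sum(|S_v|/|S| * H(S_v)) over feature values v."""
    groups = {}
    for lab, val in zip(labels, feature_values):
        groups.setdefault(val, []).append(lab)
    weighted = sum(len(g) / len(labels) * entropy(g)
                   for g in groups.values())
    return entropy(labels) - weighted

labels = ["yes", "yes", "no", "no"]
perfect = information_gain(labels, ["a", "a", "b", "b"])  # 1.0
useless = information_gain(labels, ["a", "b", "a", "b"])  # 0.0
```

A feature that splits the set into pure subsets recovers all of the entropy; a feature whose subsets are just as mixed as the parent gains nothing.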
C4.5 Algorithm: an improved version of ID3 that handles continuous numeric data and training data with missing values, uses gain ratio as its feature selection measure, and supports pruning the tree after creation
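One of those extensions, handling a continuous feature, can be sketched by testing candidate thresholds between sorted values and keeping the one with the highest information gain (a simplified illustration, not the full C4.5 procedure; all names are made up):

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try a binary split 'value <= t' at each midpoint between distinct
    sorted values; return the (threshold, gain) with the highest gain."""
    parent = entropy(labels)
    pairs = sorted(zip(values, labels))
    best = (None, 0.0)
    for i in range(len(pairs) - 1):
        lo, hi = pairs[i][0], pairs[i + 1][0]
        if lo == hi:
            continue
        t = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        gain = parent - (len(left) / len(labels) * entropy(left)
                         + len(right) / len(labels) * entropy(right))
        if gain > best[1]:
            best = (t, gain)
    return best

# Temperatures below ~20 are "cool", above are "warm":
t, g = best_threshold([10, 15, 25, 30], ["cool", "cool", "warm", "warm"])
```

Turning a numeric feature into a binary test like this is what lets the tree split on continuous data without discretizing it in advance.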