This set of flashcards covers key terms and concepts related to decision trees, Gini Index, and their implementation in Python.
Decision Tree
A machine learning model based on a tree structure used for decision-making or prediction.
Node
A point in a decision tree that tests a condition on an attribute and branches on the result.
Leaf
Contains the final predicted value in a decision tree.
Classification
Predicting the class (category) an object belongs to based on input data.
Regression
Predicting a numerical value based on input data.
Gini Index
A measure of dataset impurity (0 means every sample belongs to the same class) used to determine the best attribute to split on in a decision tree.
Gini Calculation Formula
Gini = 1 − Σ(p_i^2), where p_i is the proportion of samples belonging to class i.
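The formula can be sketched in a few lines of Python. The function name `gini` and the choice to treat an empty group as pure are assumptions made for illustration, not part of the flashcards.

```python
from collections import Counter

def gini(labels):
    """Gini index of a list of class labels: 1 - sum(p_i ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0  # assumption: an empty group counts as pure
    counts = Counter(labels)
    # p_i = count of class i / total number of samples
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

print(gini(["a", "a", "a", "a"]))  # 0.0 (one class only)
print(gini(["a", "a", "b", "b"]))  # 0.5 (two classes, evenly mixed)
```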
Dataset
A collection of data used for training decision trees.
Splitting Data
Dividing the dataset into smaller groups based on an attribute.
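A minimal sketch of a split on one numeric attribute. The name `split_dataset`, the row layout (feature value first, label last), and the "less than or equal goes left" convention are assumptions.

```python
def split_dataset(rows, feature_index, threshold):
    """Split rows into (left, right) by comparing one attribute to a threshold."""
    left = [r for r in rows if r[feature_index] <= threshold]
    right = [r for r in rows if r[feature_index] > threshold]
    return left, right

# Each row is (feature value, class label); the data is made up.
rows = [(2.0, "a"), (3.5, "a"), (6.0, "b"), (7.5, "b")]
left, right = split_dataset(rows, feature_index=0, threshold=4.0)
print(left)   # [(2.0, 'a'), (3.5, 'a')]
print(right)  # [(6.0, 'b'), (7.5, 'b')]
```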
Best Split
Selecting the attribute and threshold that minimize the weighted Gini Index of the resulting groups.
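One way to search for the best split is to try every observed value of every attribute as a threshold and keep the candidate with the lowest size-weighted Gini. This is a brute-force sketch; `best_split`, `gini`, and the sample data are hypothetical.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Return (feature_index, threshold, weighted_gini) of the lowest-Gini split.

    Returns (None, None, inf) when no split separates the samples.
    """
    best = (None, None, float("inf"))
    n = len(y)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            # Weight each side's Gini by its share of the samples.
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best

X = [[2.0, 1.0], [3.5, 0.0], [6.0, 1.0], [7.5, 0.0]]
y = ["a", "a", "b", "b"]
print(best_split(X, y))  # (0, 3.5, 0.0)
```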
TreeNode
A class representing a node in the decision tree.
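A possible shape for such a class is sketched below. The field names (`feature_index`, `threshold`, `left`, `right`, `value`) and the helper `predict_one` are assumptions, since the flashcards do not list them.

```python
class TreeNode:
    """One node of a decision tree (hypothetical field names)."""

    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index = feature_index  # attribute tested at this node
        self.threshold = threshold          # split value for that attribute
        self.left = left                    # subtree for values <= threshold
        self.right = right                  # subtree for values > threshold
        self.value = value                  # prediction; set only on leaves

    def is_leaf(self):
        return self.value is not None

def predict_one(node, sample):
    """Walk from the root to a leaf and return the leaf's value."""
    while not node.is_leaf():
        if sample[node.feature_index] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.value

root = TreeNode(feature_index=0, threshold=4.0,
                left=TreeNode(value="a"), right=TreeNode(value="b"))
print(predict_one(root, [3.0]))  # a
print(predict_one(root, [6.0]))  # b
```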
fit() method
Trains the decision tree by building the tree from the dataset.
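A compact sketch of how fit() might build the tree recursively, stopping when a group is pure or the maximum depth is reached. The class name `DecisionTree`, the `predict` method, and all private helpers are assumptions for illustration, not the flashcard author's implementation.

```python
from collections import Counter

def gini(labels):
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

class TreeNode:
    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index, self.threshold = feature_index, threshold
        self.left, self.right, self.value = left, right, value

class DecisionTree:
    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.root = None

    def fit(self, X, y):
        self.root = self._build(X, y, depth=0)
        return self

    def _best_split(self, X, y):
        best_j, best_t, best_score = None, None, float("inf")
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X}):
                left = [lab for row, lab in zip(X, y) if row[j] <= t]
                right = [lab for row, lab in zip(X, y) if row[j] > t]
                if not left or not right:
                    continue
                score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
                if score < best_score:
                    best_j, best_t, best_score = j, t, score
        return best_j, best_t

    def _build(self, X, y, depth):
        majority = Counter(y).most_common(1)[0][0]
        # Stop at max depth or when the group is already pure.
        if depth >= self.max_depth or len(set(y)) == 1:
            return TreeNode(value=majority)
        j, t = self._best_split(X, y)
        if j is None:  # no threshold separates the samples
            return TreeNode(value=majority)
        left = [(row, lab) for row, lab in zip(X, y) if row[j] <= t]
        right = [(row, lab) for row, lab in zip(X, y) if row[j] > t]
        return TreeNode(
            feature_index=j, threshold=t,
            left=self._build([r for r, _ in left], [l for _, l in left], depth + 1),
            right=self._build([r for r, _ in right], [l for _, l in right], depth + 1),
        )

    def predict(self, X):
        out = []
        for row in X:
            node = self.root
            while node.value is None:
                node = node.left if row[node.feature_index] <= node.threshold else node.right
            out.append(node.value)
        return out

X = [[2.0], [3.5], [6.0], [7.5]]
y = ["a", "a", "b", "b"]
tree = DecisionTree(max_depth=2).fit(X, y)
print(tree.predict([[3.0], [7.0]]))  # ['a', 'b']
```

With `max_depth=0` the same code returns a single leaf holding the majority class, which shows how the depth limit caps tree growth.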
print_tree() method
Displays the decision tree in a hierarchical manner.
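A hierarchical display can be produced by recursing through the tree and indenting one level per depth. In this sketch the `TreeNode` fields, the helper `format_tree`, and the if/else output layout are all assumptions.

```python
class TreeNode:
    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index, self.threshold = feature_index, threshold
        self.left, self.right, self.value = left, right, value

def format_tree(node, indent=""):
    """Return the tree as a list of indented text lines."""
    if node.value is not None:  # leaf: show the prediction
        return [f"{indent}predict {node.value}"]
    lines = [f"{indent}if x[{node.feature_index}] <= {node.threshold}:"]
    lines += format_tree(node.left, indent + "    ")
    lines.append(f"{indent}else:")
    lines += format_tree(node.right, indent + "    ")
    return lines

def print_tree(node):
    print("\n".join(format_tree(node)))

root = TreeNode(feature_index=0, threshold=4.0,
                left=TreeNode(value="a"), right=TreeNode(value="b"))
print_tree(root)
# if x[0] <= 4.0:
#     predict a
# else:
#     predict b
```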
Max Depth
The maximum allowed depth of the decision tree.
Python Implementation
Using Python to create functions and classes for building decision trees.
OOP (Object-Oriented Programming)
A programming paradigm that organizes code into classes and objects; used here to build the decision tree structure.
Purity of Dataset
A measure of how homogeneous a dataset is with respect to its class labels.
Proportion
The ratio of class observations to the total number of observations.
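As a small illustration, class proportions can be computed by counting labels and dividing by the total. The function name `class_proportions` is hypothetical.

```python
from collections import Counter

def class_proportions(labels):
    """Map each class to its share of the observations."""
    n = len(labels)
    return {cls: count / n for cls, count in Counter(labels).items()}

print(class_proportions(["a", "a", "a", "b"]))  # {'a': 0.75, 'b': 0.25}
```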
Leaf Node
A node that has no further children, representing a classification decision.
Threshold
A value used to divide the dataset for decision tree splits.