Decision Tree Concepts and Implementation

Description and Tags

This set of flashcards covers key terms and concepts related to decision trees, Gini Index, and their implementation in Python.

20 Terms

1

Decision Tree

A machine learning model based on a tree structure used for decision-making or prediction.

2

Node

An internal point in a decision tree that tests a condition on an attribute and routes each sample to one of its child branches.

3

Leaf

Contains the final predicted value in a decision tree.

4

Classification

Predicting the category (class) an object belongs to based on input data.

5

Regression

Predicting a numerical value based on input data.

6

Gini Index

A measure of dataset impurity (0 means every sample belongs to the same class) used to determine the best attribute to split on in a decision tree.

7

Gini Calculation Formula

Gini = 1 − Σ(p_i^2), where p_i is the proportion of samples belonging to class i.
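
The formula can be sketched in a few lines of Python. This is a minimal illustration, not the set's original code: the function name `gini` and the choice to treat an empty group as pure are assumptions.

```python
from collections import Counter


def gini(labels):
    """Return 1 - sum(p_i^2) for a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0  # assumption: an empty group counts as pure
    counts = Counter(labels)
    # p_i is the share of samples whose label is class i
    return 1.0 - sum((n / total) ** 2 for n in counts.values())
```

For example, a group holding two classes in equal numbers gives 1 − (0.5^2 + 0.5^2) = 0.5, while a group with a single class gives 0.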

8

Dataset

A collection of data used for training decision trees.

9

Splitting Data

Dividing the dataset into smaller groups based on an attribute.
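
A split on a numeric attribute might look like the sketch below. The row layout (feature values followed by the class label), the name `split_dataset`, and the convention that values less than or equal to the threshold go left are all assumptions made for illustration.

```python
def split_dataset(rows, feature_index, threshold):
    """Partition rows into (left, right) by comparing one feature to a threshold."""
    left = [row for row in rows if row[feature_index] <= threshold]
    right = [row for row in rows if row[feature_index] > threshold]
    return left, right


# Hypothetical toy data: one numeric feature, then the class label.
rows = [[2.0, "no"], [3.5, "no"], [5.0, "yes"], [7.5, "yes"]]
left, right = split_dataset(rows, 0, 4.0)
```

Here `left` holds the two "no" rows and `right` holds the two "yes" rows.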

10

Best Split

Selecting the attribute and threshold whose split yields the lowest weighted Gini Index across the resulting groups.
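
One way to search for the best split is to try every observed value of every attribute as a threshold and keep the one with the lowest size-weighted Gini. The sketch below assumes numeric features with the class label in the last column; the names `gini` and `best_split` are illustrative, not taken from the original code.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (empty list treated as pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows):
    """Return (feature_index, threshold, weighted_gini) for the best split.

    Returns (None, None, inf) when no threshold separates the rows.
    """
    best = (None, None, float("inf"))
    n_features = len(rows[0]) - 1  # last column is the label
    for feature_index in range(n_features):
        # Candidate thresholds: every distinct value seen for this feature.
        for threshold in sorted({row[feature_index] for row in rows}):
            left = [row[-1] for row in rows if row[feature_index] <= threshold]
            right = [row[-1] for row in rows if row[feature_index] > threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            # Weight each side's impurity by its share of the samples.
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best[2]:
                best = (feature_index, threshold, score)
    return best


rows = [[2.0, 1.0, "no"], [3.5, 0.0, "no"], [5.0, 1.0, "yes"], [7.5, 0.0, "yes"]]
result = best_split(rows)  # (0, 3.5, 0.0): feature 0 separates the classes cleanly
```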

11

TreeNode

A class representing a node in the decision tree.
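
A node class could be laid out as follows. The attribute names (`feature_index`, `threshold`, `left`, `right`, `value`) and the `is_leaf` helper are assumptions for this sketch; the original class may use different fields.

```python
class TreeNode:
    """One node of a decision tree.

    Internal nodes store a test (feature_index, threshold) and two children;
    leaves store only the predicted value.
    """

    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index = feature_index  # which attribute is tested
        self.threshold = threshold          # value the attribute is compared to
        self.left = left                    # subtree for attribute <= threshold
        self.right = right                  # subtree for attribute > threshold
        self.value = value                  # prediction, set only on leaves

    def is_leaf(self):
        return self.value is not None


leaf_no = TreeNode(value="no")
leaf_yes = TreeNode(value="yes")
root = TreeNode(feature_index=0, threshold=3.5, left=leaf_no, right=leaf_yes)
```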

12

fit() method

Trains the decision tree by building the tree from the dataset.
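
A `fit()` method typically builds the tree recursively: stop and create a leaf when the group is pure or the depth limit is reached, otherwise split and recurse. The sketch below is one possible shape under stated assumptions: the class name `DecisionTree`, the helpers `gini` and `best_split`, the row layout (numeric features, label last), and majority-vote leaves are all illustrative choices.

```python
from collections import Counter


class TreeNode:
    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index, self.threshold = feature_index, threshold
        self.left, self.right, self.value = left, right, value


def gini(labels):
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows):
    """Return (feature_index, threshold) with the lowest weighted Gini, or (None, None)."""
    best, best_score = (None, None), float("inf")
    for f in range(len(rows[0]) - 1):
        for t in sorted({row[f] for row in rows}):
            left = [r[-1] for r in rows if r[f] <= t]
            right = [r[-1] for r in rows if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


class DecisionTree:
    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.root = None

    def fit(self, rows):
        """Build the tree from rows of [feature values..., label]."""
        self.root = self._build(rows, depth=0)
        return self

    def _build(self, rows, depth):
        labels = [row[-1] for row in rows]
        majority = Counter(labels).most_common(1)[0][0]
        # Stop when the depth limit is hit or the group is already pure.
        if depth >= self.max_depth or gini(labels) == 0.0:
            return TreeNode(value=majority)
        f, t = best_split(rows)
        if f is None:  # no threshold separates the rows
            return TreeNode(value=majority)
        left = [r for r in rows if r[f] <= t]
        right = [r for r in rows if r[f] > t]
        return TreeNode(f, t, self._build(left, depth + 1),
                        self._build(right, depth + 1))


rows = [[2.0, "no"], [3.5, "no"], [5.0, "yes"], [7.5, "yes"]]
tree = DecisionTree(max_depth=2).fit(rows)
```

With this toy data the root tests feature 0 against 3.5 and both children are leaves. Setting `max_depth=0` would return a single leaf holding the majority label.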

13

print_tree() method

Displays the decision tree in a hierarchical manner.
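
Hierarchical display is usually done by recursing through the tree and indenting one level per depth. The sketch below writes it as a standalone function rather than a method, and the output format, the `format_tree` helper, and the node fields are assumptions for illustration.

```python
class TreeNode:
    def __init__(self, feature_index=None, threshold=None,
                 left=None, right=None, value=None):
        self.feature_index, self.threshold = feature_index, threshold
        self.left, self.right, self.value = left, right, value


def format_tree(node, indent=""):
    """Return the tree as a list of text lines, children indented under parents."""
    if node.value is not None:  # leaf: show the prediction
        return [f"{indent}Predict: {node.value}"]
    lines = [f"{indent}[feature {node.feature_index} <= {node.threshold}]"]
    lines += format_tree(node.left, indent + "  ")
    lines += format_tree(node.right, indent + "  ")
    return lines


def print_tree(node):
    print("\n".join(format_tree(node)))


root = TreeNode(0, 3.5, TreeNode(value="no"), TreeNode(value="yes"))
print_tree(root)
# [feature 0 <= 3.5]
#   Predict: no
#   Predict: yes
```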

14

Max Depth

The maximum allowed depth of the decision tree.

15

Python Implementation

Using Python to create functions and classes for building decision trees.

16

OOP (Object-Oriented Programming)

A programming paradigm that organizes code into classes and objects, used here to build the decision tree structure.

17

Purity of Dataset

A measure of how homogeneous a dataset is with respect to its class labels.

18

Proportion

The ratio of observations in a given class to the total number of observations.
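
Class proportions can be computed by counting labels and dividing by the total. The function name `class_proportions` is hypothetical.

```python
from collections import Counter


def class_proportions(labels):
    """Map each class label to its share of the observations."""
    total = len(labels)
    return {label: count / total for label, count in Counter(labels).items()}


proportions = class_proportions(["yes", "no", "no", "no"])
# proportions == {"yes": 0.25, "no": 0.75}
```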

19

Leaf Node

A node that has no further children, representing a classification decision.

20

Threshold

A value used to divide the dataset for decision tree splits.
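
The cards do not say how candidate thresholds are chosen. One common convention, assumed here, is to take the midpoints between consecutive distinct values of a numeric attribute; the function name `candidate_thresholds` is hypothetical.

```python
def candidate_thresholds(values):
    """Return midpoints between consecutive distinct sorted values."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


thresholds = candidate_thresholds([5.0, 2.0, 3.5, 7.5, 3.5])
# thresholds == [2.75, 4.25, 6.25]
```

Each candidate divides the samples into those at or below the threshold and those above it. An attribute with a single distinct value yields no candidates.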