Decision Trees: Entropy and Information Gain in Data Mining

25 Terms

1

Information Gain

The reduction in entropy achieved by splitting the data on a feature; used to choose the best split.

2

Pure Group

A group made up (almost) entirely of a single class, giving low entropy (0 when perfectly pure).

3

Impure Group

A group whose records are mixed across classes with no single dominant class, giving high entropy.

4

Entropy Formula

H = -Σ(p_i * log2(p_i)), where p_i is the proportion of records in class i.

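For a concrete sense of the formula, here is a minimal Python sketch of the entropy calculation (the function name and example proportions are illustrative, not taken from the deck):

```python
import math

def entropy(probabilities):
    # H = -sum(p_i * log2(p_i)); classes with p_i = 0 contribute nothing.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.6, 0.4]))  # ~0.971 bits for a 60/40 class mix
```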
5

Maximum Entropy

Occurs when all classes are equally probable.

6

Minimum Entropy

Occurs when one class is certain (p = 1), giving H = 0.
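A quick numeric check of these two boundary cases, done directly with the formula (my own illustration):

```python
import math

# Two equally probable classes -> maximum entropy of 1 bit.
print(-(0.5 * math.log2(0.5) + 0.5 * math.log2(0.5)))  # 1.0
# One certain class (p = 1) -> minimum entropy.
print(-(1.0 * math.log2(1.0)))  # -0.0, i.e. 0 bits
```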

7

Decision Tree

A supervised learning model that predicts by applying a sequence of feature-based splits from the root node down to a leaf.

8

Decision Node

Internal node where the data is split on a feature value.

9

Leaf Node

Terminal node that holds the prediction output (a class label or value).

10

Entropy

Measure of uncertainty or impurity in data.

11

Parent Entropy

Entropy of the data before a split in a decision tree.

12

Child Entropy

Weighted average entropy of the subsets produced by a split, with each subset weighted by its share of the records.

13

Balance Feature

Feature indicating a customer's financial balance status (from the loan-default example).

14

Residence Feature

Feature indicating living situation (OWN, RENT, OTHER).

15

Information Gain Calculation

IG = H(parent) - weighted average H(children), where each child's entropy is weighted by its share of the records.

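The weighting step is easy to miss, so here is a Python sketch of the full calculation from raw class counts (the function names and the example split are hypothetical):

```python
import math

def entropy_from_counts(counts):
    # Entropy of one group, given raw class counts.
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_count_groups):
    # IG = H(parent) - weighted average of the children's entropies,
    # each child weighted by its share of the parent's records.
    parent_total = sum(parent_counts)
    weighted_children = sum(
        (sum(group) / parent_total) * entropy_from_counts(group)
        for group in child_count_groups
    )
    return entropy_from_counts(parent_counts) - weighted_children

# 100 records (60 vs 40) split into two children of 50 records each:
print(information_gain([60, 40], [[45, 5], [15, 35]]))  # ~0.296
```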
16

Entropy Value Range

Entropy ranges from 0 to log2(n), where n is the number of classes.
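For example, with n = 4 equally likely classes the entropy reaches the upper bound log2(4) = 2 bits (a quick sanity check, not from the deck):

```python
import math

n = 4
probs = [1 / n] * n
print(-sum(p * math.log2(p) for p in probs))  # 2.0
print(math.log2(n))  # 2.0 -- the upper bound log2(n)
```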

17

Entropy Interpretation

A higher value indicates more uncertainty (impurity) in the data.

18

Sequential Decisions

Series of if-then rules for data classification.

19

Data Partitioning

Process of dividing data into subsets.

20

Weather Example

Entropy is higher with a 50% chance of rain (maximum uncertainty) than with a 100% chance of rain (no uncertainty).

21

Group A

70 smokers and 30 non-smokers.

22

Group B

85 smokers and 15 non-smokers.
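Computing the entropy of both groups (my own arithmetic, using the formula from card 4) confirms that Group B, being purer, has the lower entropy:

```python
import math

def entropy(probabilities):
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.70, 0.30]))  # ~0.881 bits -- Group A (70 smokers / 30 non-smokers)
print(entropy([0.85, 0.15]))  # ~0.610 bits -- Group B (85 smokers / 15 non-smokers)
```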

23

Feature Split

Dividing data based on feature values.

24

Information Gain Example

Worked example: calculating IG for candidate features such as Balance and Residence when predicting loan default.

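One possible worked version of this example, with made-up loan counts and a split on the Balance feature (all numbers are hypothetical):

```python
import math

def entropy_from_counts(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Hypothetical loan data: 30 applicants, 16 defaulted and 14 repaid.
parent = [16, 14]
# Split on the Balance feature (counts invented for illustration):
low_balance = [12, 1]    # [defaulted, repaid] -- mostly defaults
high_balance = [4, 13]   # mostly repaid

total = sum(parent)
weighted_children = sum(
    (sum(g) / total) * entropy_from_counts(g) for g in (low_balance, high_balance)
)
print(entropy_from_counts(parent) - weighted_children)  # IG of the Balance split, ~0.38
```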
25

Numerical Feature Binning

Dividing a numerical feature's range into discrete bins so it can be used for splits.
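A small Python sketch of one way to bin a numerical feature before splitting (the balance values and bin edges are invented for illustration):

```python
# Hypothetical account balances divided into three bins so a decision tree
# can treat the numerical feature like a categorical one.
balances = [120, 850, 2300, 4700, 9800, 15000]
edges = [0, 1000, 5000, float("inf")]  # bin boundaries chosen for illustration

def bin_label(value, edges):
    # Return the index of the bin whose half-open range [edges[i], edges[i+1]) contains value.
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    return len(edges) - 2  # clamp values outside the listed edges to the last bin

print([bin_label(b, edges) for b in balances])  # [0, 0, 1, 1, 2, 2]
```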
