Decision Trees: Entropy and Information Gain in Data Mining

25 Terms

1

Information Gain

Reduction in entropy after a data split.

2

Pure Group

Group whose members all belong to a single class (entropy = 0).

3

Impure Group

Group containing a mix of classes, with no single class dominating.

4

Entropy Formula

H = -Σ(p_i * log2(p_i)), where p_i is the proportion of class i in the group.

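A minimal Python sketch of this formula (the function name and example class counts are illustrative, not part of the original deck):

```python
import math

def entropy(counts):
    """Shannon entropy H = -sum(p_i * log2(p_i)) over the class counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:                  # skip empty classes: 0 * log2(0) is taken as 0
            p = c / total
            h -= p * math.log2(p)
    return h

# A 50/50 split gives maximum entropy for two classes (1 bit);
# a single-class (pure) group gives 0.
print(entropy([50, 50]))   # 1.0
print(entropy([100, 0]))   # 0.0
```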
5

Maximum Entropy

Occurs when all classes are equally probable.

6

Minimum Entropy

Occurs when one class is certain (p=1).

7

Decision Tree

Supervised learning algorithm that makes predictions by following a tree of feature-based decisions.

8

Decision Node

Internal node where the data is split based on a feature test.

9

Leaf Node

Terminal node representing prediction output.

10

Entropy

Measure of uncertainty or impurity in data.

11

Parent Entropy

Entropy of a node's data before it is split.

12

Child Entropy

Entropy of the subsets produced by a split, usually combined as a weighted average.

13

Balance Feature

Feature indicating financial balance status.

14

Residence Feature

Feature indicating living situation (OWN, RENT, OTHER).

15

Information Gain Calculation

IG = H(parent) - weighted average of H(children), with each child weighted by its share of the parent's samples.

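A minimal Python sketch of this calculation, with each child's entropy weighted by that child's share of the parent's samples (function names and the example split are illustrative):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts_list):
    """IG = H(parent) - sum over children of (n_child / n_parent) * H(child)."""
    n_parent = sum(parent_counts)
    weighted_child_h = sum(
        (sum(child) / n_parent) * entropy(child) for child in child_counts_list
    )
    return entropy(parent_counts) - weighted_child_h

# Example: 10 positives and 10 negatives split into two pure children.
print(information_gain([10, 10], [[10, 0], [0, 10]]))  # 1.0 (entropy drops from 1 bit to 0)
```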
16

Entropy Value Range

Entropy ranges from 0 to log2(n), where n is the number of classes (maximum of 1 bit for two classes).

17

Entropy Interpretation

Higher value indicates more uncertainty in data.

18

Sequential Decisions

Series of if-then rules for data classification.

19

Data Partitioning

Process of dividing data into subsets.

20

Weather Example

A 50% chance of rain gives maximum entropy (1 bit), while a 100% chance of rain gives zero entropy.

21

Group A

70 smokers and 30 non-smokers.

22

Group B

85 smokers and 15 non-smokers.

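As a worked check of Groups A and B above, assuming the entropy formula from card 4 (a sketch; printed values are rounded):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(entropy([70, 30]))  # ~0.881 bits: Group A is less pure
print(entropy([85, 15]))  # ~0.610 bits: Group B is purer (lower entropy)
```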
23

Feature Split

Dividing data based on feature values.

24

Information Gain Example

Calculating IG for loan default prediction.

25

Numerical Feature Binning

Dividing numerical ranges into bins for splits.

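A minimal sketch of one way to bin a numerical feature into groups for candidate splits (the thresholds and balance values are made up for illustration):

```python
def bin_values(values, thresholds):
    """Assign each numerical value to a bin index based on sorted thresholds."""
    bins = [[] for _ in range(len(thresholds) + 1)]
    for v in values:
        idx = sum(v > t for t in thresholds)   # number of thresholds the value exceeds
        bins[idx].append(v)
    return bins

# Example: bin account balances at 1000 and 5000, giving low / medium / high groups.
balances = [250, 900, 1500, 4200, 7000, 12000]
print(bin_values(balances, [1000, 5000]))
# [[250, 900], [1500, 4200], [7000, 12000]]
```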