Decision Trees: Entropy and Information Gain in Data Mining

25 Terms

1

Information Gain

Reduction in entropy after a data split.

2

Pure Group

Group whose members all belong to a single class (entropy = 0).

3

Impure Group

Group containing a mix of classes, with no single class dominating.

4

Entropy Formula

H = -Σ(p_i * log2(p_i)), where p_i is the proportion of class i in the group.

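A minimal Python sketch of this formula (the function name and example class counts are illustrative, not part of the original deck):

```python
import math

def entropy(counts):
    """Shannon entropy H = -sum(p_i * log2(p_i)) over the class counts."""
    total = sum(counts)
    h = 0.0
    for c in counts:
        if c > 0:                  # skip empty classes: 0 * log2(0) is taken as 0
            p = c / total
            h -= p * math.log2(p)
    return h

# A 50/50 split gives maximum entropy for two classes (1 bit);
# a single-class (pure) group gives 0.
print(entropy([50, 50]))   # 1.0
print(entropy([100, 0]))   # 0.0
```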
5

Maximum Entropy

Occurs when all classes are equally probable.

6

Minimum Entropy

Occurs when one class is certain (p=1).

7

Decision Tree

Supervised learning algorithm that makes predictions by following a tree of feature-based decisions.

8

Decision Node

Internal node where the data is split based on a feature test.

9

Leaf Node

Terminal node representing prediction output.

10

Entropy

Measure of uncertainty or impurity in data.

11

Parent Entropy

Entropy of a node's data before it is split.

12

Child Entropy

Entropy of the subsets produced by a split, usually combined as a weighted average.

13

Balance Feature

Feature indicating financial balance status.

14

Residence Feature

Feature indicating living situation (OWN, RENT, OTHER).

15

Information Gain Calculation

IG = H(parent) - weighted average of H(children), with each child weighted by its share of the parent's samples.

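A minimal Python sketch of this calculation, with each child's entropy weighted by that child's share of the parent's samples (function names and the example split are illustrative):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, child_counts_list):
    """IG = H(parent) - sum over children of (n_child / n_parent) * H(child)."""
    n_parent = sum(parent_counts)
    weighted_child_h = sum(
        (sum(child) / n_parent) * entropy(child) for child in child_counts_list
    )
    return entropy(parent_counts) - weighted_child_h

# Example: 10 positives and 10 negatives split into two pure children.
print(information_gain([10, 10], [[10, 0], [0, 10]]))  # 1.0 (entropy drops from 1 bit to 0)
```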
16

Entropy Value Range

Entropy ranges from 0 to log2(n), where n is the number of classes (maximum of 1 bit for two classes).

17

Entropy Interpretation

Higher value indicates more uncertainty in data.

18

Sequential Decisions

Series of if-then rules for data classification.

19

Data Partitioning

Process of dividing data into subsets.

20

Weather Example

A 50% chance of rain gives maximum entropy (1 bit), while a 100% chance of rain gives zero entropy.

21

Group A

70 smokers and 30 non-smokers.

22

Group B

85 smokers and 15 non-smokers.

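As a worked check of Groups A and B above, assuming the entropy formula from card 4 (a sketch; printed values are rounded):

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(entropy([70, 30]))  # ~0.881 bits: Group A is less pure
print(entropy([85, 15]))  # ~0.610 bits: Group B is purer (lower entropy)
```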
23

Feature Split

Dividing data based on feature values.

24

Information Gain Example

Calculating IG for loan default prediction.

25

Numerical Feature Binning

Dividing numerical ranges into bins for splits.

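A minimal sketch of one way to bin a numerical feature into groups for candidate splits (the thresholds and balance values are made up for illustration):

```python
def bin_values(values, thresholds):
    """Assign each numerical value to a bin index based on sorted thresholds."""
    bins = [[] for _ in range(len(thresholds) + 1)]
    for v in values:
        idx = sum(v > t for t in thresholds)   # number of thresholds the value exceeds
        bins[idx].append(v)
    return bins

# Example: bin account balances at 1000 and 5000, giving low / medium / high groups.
balances = [250, 900, 1500, 4200, 7000, 12000]
print(bin_values(balances, [1000, 5000]))
# [[250, 900], [1500, 4200], [7000, 12000]]
```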