Lecture on Naïve Bayes, Decision Tree, and Ensemble Classifiers
Lecture Overview
Topic: Machine Learning (SEIS 763, Fall 2025)
Subtopics: Naïve Bayes, Decision Tree, Ensemble Classifiers
Naïve Bayes
Joint Probabilities
Definition: The probability of two (or more) events happening simultaneously.
Notation: For events A and B, the joint probability is denoted as P(A ∩ B).
Independent Events:
If A and B are independent, then:
P(A ∩ B) = P(A) × P(B)
Dependent Events:
For dependent events, the formula is:
P(A ∩ B) = P(A) × P(B|A)
Examples of Joint Probabilities
Example of Independent Events:
Event A: Rolling a 3 on the first roll.
Event B: Rolling a 5 on the second roll.
Calculations:
P(A) = 1/6
P(B) = 1/6
P(A ∩ B) = 1/6 × 1/6 = 1/36
Example of Dependent Events (two cards drawn from a standard 52-card deck without replacement):
Event A: Drawing an Ace on the first draw.
Event B: Drawing a King on the second draw.
Calculations:
P(A) = 1/13
P(B|A) = 4/51
P(A ∩ B) = P(A) × P(B|A) = 1/13 × 4/51 = 4/663
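As a quick check, here is a minimal sketch that reproduces both calculations with Python's standard fractions module (the events and values are exactly those above):

```python
from fractions import Fraction

# Independent events: two fair die rolls
p_a = Fraction(1, 6)                 # P(3 on the first roll)
p_b = Fraction(1, 6)                 # P(5 on the second roll)
print(p_a * p_b)                     # 1/36

# Dependent events: two cards drawn without replacement
p_ace = Fraction(4, 52)              # P(Ace on the first draw) = 1/13
p_king_given_ace = Fraction(4, 51)   # P(King on the second draw | Ace first)
print(p_ace * p_king_given_ace)      # 4/663
```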
Bayes Theorem
Formula Derivation:
P(A|B) = P(A ∩ B) / P(B)
Substituting P(A ∩ B) = P(B|A) × P(A) gives:
P(A|B) = P(B|A) × P(A) / P(B)
Application of Bayes Theorem:
Allows P(A|B) to be computed from P(B|A), P(A), and P(B), and vice versa.
Definitions:
P(A|B): The probability of event A occurring given that event B has occurred.
P(B|A): The probability of event B occurring given that event A has occurred.
P(A): The probability of event A occurring independently of event B.
P(B): The probability of event B occurring independently of event A.
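As a minimal sketch of the rearranged formula (the numeric values below are illustrative placeholders, not figures from the lecture):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Compute P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Illustrative values only
print(bayes(p_b_given_a=0.8, p_a=0.1, p_b=0.2))  # 0.4
```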
Naïve Bayes Classifier
Formula:
P(y|X) = P(X|y) × P(y) / P(X)
Naïve Bayes Assumption:
All features are independent of each other.
P(X|y) = P(x1, x2, …, xn | y) = P(x1|y) × P(x2|y) × … × P(xn|y)
Calculation of Likelihood:
For each feature given class y:
P(xi|y) = (Number of instances with xi and class y) / (Number of instances with class y)
Calculation of Prior Probability:
P(y) = (Number of instances with class y) / (Total number of instances)
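Putting these pieces together, a class is scored by multiplying its prior by the per-feature likelihoods; since P(X) is the same for every class, it can be ignored when comparing classes. A minimal sketch (the probability values are illustrative placeholders):

```python
from math import prod

def naive_bayes_score(likelihoods, prior):
    # Score(y) = P(y) * product of P(x_i | y); the evidence P(X) is constant across classes
    return prior * prod(likelihoods)

# Illustrative per-feature likelihoods P(x_i | y) and prior P(y)
print(naive_bayes_score([0.2, 0.3, 0.5], prior=0.6))  # 0.018
```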
Example: Weather Dataset
Attributes:
Outlook, Temp, Humidity, Windy, Play.
| Outlook | Temp | Humidity | Windy | Play |
|---------|------|----------|-------|------|
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast| Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| … | … | … | … | … |
Probability Calculations:
Probabilities for Weather Data (Examples)
Outlook = Sunny:
Counts: Yes: 2, No: 3
Conditional probabilities: P(Sunny|Yes) = 2/9, P(Sunny|No) = 3/5
Outlook = Rainy:
Counts: Yes: 3, No: 2
Conditional probabilities: P(Rainy|Yes) = 3/9, P(Rainy|No) = 2/5
(The denominators are the class counts: 9 Yes instances and 5 No instances in the full dataset.)
Calculation for Conditional Probabilities (exercise):
Given class = Yes, what are P(Windy = True | Yes) and P(Windy = False | Yes)?
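These counting-based estimates are easy to compute directly. A minimal sketch, assuming the weather table is available as a list of (Outlook, Play) pairs; only a few rows are listed here, so the printed values match the lecture's only when the complete table (14 rows in the classic version of this dataset) is used:

```python
from collections import Counter

# (Outlook, Play) pairs; in practice this would be the full weather table
rows = [("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"),
        ("Rainy", "Yes"), ("Rainy", "Yes")]

class_counts = Counter(play for _, play in rows)   # counts of Yes / No
joint_counts = Counter(rows)                       # counts of (Outlook, Play) pairs

def prior(y):
    # P(y) = (instances with class y) / (total instances)
    return class_counts[y] / len(rows)

def likelihood(outlook, y):
    # P(outlook | y) = (instances with this outlook and class y) / (instances with class y)
    return joint_counts[(outlook, y)] / class_counts[y]

print(prior("Yes"), likelihood("Sunny", "No"))  # with the full table: 9/14 and 3/5
```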
Numeric Data
Assumption: each numeric feature is normally distributed within each class (a naive simplifying assumption).
Density Function:
f(x) = (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²))
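A minimal sketch of this density function; the μ and σ values below are illustrative placeholders, not numbers from the lecture (scikit-learn's GaussianNB applies the same per-class normality assumption automatically):

```python
import math

def gaussian_density(x, mu, sigma):
    # f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Illustrative: likelihood of a temperature of 66 under a class-conditional N(mu=73, sigma=6.2)
print(gaussian_density(66, mu=73, sigma=6.2))
```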
Decision Tree Classifier
Goal: at each split, select the attribute that most reduces impurity.
Information Gain: the reduction in entropy (impurity) achieved by splitting on an attribute; the attribute with the highest gain is chosen.
Entropy:
Measure of impurity or disorder in the data.
Entropy(p1, p2, …, pn) = −p1 log(p1) − p2 log(p2) − … − pn log(pn)
Example Calculation:
When splitting the weather data on an attribute, the entropy (impurity) of the resulting subsets is measured; the split that gives the largest reduction in impurity is chosen, as sketched below.
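A minimal sketch of the entropy computation; it assumes the full weather table has 9 Yes and 5 No instances, consistent with the denominators used above:

```python
import math

def entropy(counts):
    # Entropy(p1, ..., pn) = -sum(p_i * log2(p_i)), skipping empty classes
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Class distribution of Play before any split: 9 Yes, 5 No
print(entropy([9, 5]))  # ≈ 0.940 bits
```

The information gain of a candidate split is this value minus the weighted average entropy of the subsets the split produces.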
Ensemble Learning
Bagging and Boosting Techniques
Random Forest: Combines multiple decision trees to improve overall performance.
Process:
Randomly sample data instances from the training set with replacement (a bootstrap sample).
Build a decision tree from this sample (each tree typically also considers only a random subset of features at each split).
Repeat the above steps for the desired number of trees.
For classification, take the majority vote from all the trees.
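As a rough illustration of these steps (not the lecture's code), a manual bagging loop over scikit-learn decision trees might look like the sketch below; it assumes NumPy arrays and integer-encoded class labels. The RandomForestClassifier example that follows does all of this in a single call.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_trees_predict(X_train, y_train, X_test, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    all_preds = []
    for _ in range(n_trees):
        # 1. Sample training instances with replacement (bootstrap sample)
        idx = rng.integers(0, len(X_train), size=len(X_train))
        # 2. Build a decision tree on the bootstrap sample
        tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        # 3. Record this tree's predictions on the test set
        all_preds.append(tree.predict(X_test))
    # 4. Majority vote across trees for each test instance (labels must be non-negative ints)
    all_preds = np.array(all_preds)
    return np.array([np.bincount(col).argmax() for col in all_preds.T])
```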
Code Example for Random Forest
```python
from sklearn.ensemble import RandomForestClassifier

# Fit a random forest (an ensemble of bagged decision trees) to the training data
model = RandomForestClassifier()
model.fit(X_train, y_train)
```
Mock Exam Preparation
Key points:
Review key lecture points and examples discussed in class.
Prepare for practical applications of the concepts discussed, particularly in Naïve Bayes and Decision Trees.
Assignments
Project Proposal due: 10/31
Assignment 6 due