Lecture on Naïve Bayes, Decision Tree, and Ensemble Classifiers
Lecture Overview
Topic: Machine Learning (SEIS 763, Fall 2025)
Subtopics: Naïve Bayes, Decision Tree, Ensemble Classifiers
Naïve Bayes
Joint Probabilities
Definition: The probability of two (or more) events happening simultaneously.
Notation: For events A and B, the joint probability is denoted as P(A ∩ B).
Independent Events:
If A and B are independent, then:
P(A ∩ B) = P(A) × P(B)
Dependent Events:
For dependent events, the formula is:
P(A ∩ B) = P(A) × P(B|A)
Examples of Joint Probabilities
Example of Independent Events:
Event A: Rolling a 3 on the first roll.
Event B: Rolling a 5 on the second roll.
Calculations:
P(A) = 1/6
P(B) = 1/6
P(A ∩ B) = 1/6 × 1/6 = 1/36
Example of Dependent Events (two cards drawn from a standard 52-card deck without replacement):
Event A: Drawing an Ace on the first draw.
Event B: Drawing a King on the second draw.
Calculations:
P(A) = 1/13
P(B|A) = 4/51
P(A ∩ B) = P(A) × P(B|A) = 1/13 × 4/51 = 4/663
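As a quick check, here is a minimal sketch that reproduces both calculations with Python's standard fractions module (the events and values are exactly those above):

```python
from fractions import Fraction

# Independent events: two fair die rolls
p_a = Fraction(1, 6)                 # P(3 on the first roll)
p_b = Fraction(1, 6)                 # P(5 on the second roll)
print(p_a * p_b)                     # 1/36

# Dependent events: two cards drawn without replacement
p_ace = Fraction(4, 52)              # P(Ace on the first draw) = 1/13
p_king_given_ace = Fraction(4, 51)   # P(King on the second draw | Ace first)
print(p_ace * p_king_given_ace)      # 4/663
```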
Bayes Theorem
Formula Derivation:
P(A|B) = P(A ∩ B) / P(B)
Substituting P(A ∩ B) = P(B|A) × P(A) gives:
P(A|B) = P(B|A) × P(A) / P(B)
Application of Bayes Theorem:
Allows P(A|B) to be computed from P(B|A), P(A), and P(B), and vice versa.
Definitions:
P(A|B): The probability of event A occurring given that event B has occurred.
P(B|A): The probability of event B occurring given that event A has occurred.
P(A): The probability of event A occurring independently of event B.
P(B): The probability of event B occurring independently of event A.
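As a minimal sketch of the rearranged formula (the numeric values below are illustrative placeholders, not figures from the lecture):

```python
def bayes(p_b_given_a, p_a, p_b):
    """Compute P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Illustrative values only
print(bayes(p_b_given_a=0.8, p_a=0.1, p_b=0.2))  # 0.4
```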
Naïve Bayes Classifier
Formula:
P(y|X) = P(X|y) × P(y) / P(X)
Naïve Bayes Assumption:
All features are independent of each other.
P(X|y) = P(x1, x2, …, xn | y) = P(x1|y) × P(x2|y) × … × P(xn|y)
Calculation of Likelihood:
For each feature given class y:
P(xi|y) = (Number of instances with xi and class y) / (Number of instances with class y)
Calculation of Prior Probability:
P(y) = (Number of instances with class y) / (Total number of instances)
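Putting these pieces together, a class is scored by multiplying its prior by the per-feature likelihoods; since P(X) is the same for every class, it can be ignored when comparing classes. A minimal sketch (the probability values are illustrative placeholders):

```python
from math import prod

def naive_bayes_score(likelihoods, prior):
    # Score(y) = P(y) * product of P(x_i | y); the evidence P(X) is constant across classes
    return prior * prod(likelihoods)

# Illustrative per-feature likelihoods P(x_i | y) and prior P(y)
print(naive_bayes_score([0.2, 0.3, 0.5], prior=0.6))  # 0.018
```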
Example: Weather Dataset
Attributes:
Outlook, Temp, Humidity, Windy, Play.
| Outlook | Temp | Humidity | Windy | Play |
|---------|------|----------|-------|------|
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast| Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| … | … | … | … | … |
Probability Calculations:
Probabilities for Weather Data (Examples)
Outlook = Sunny:
Counts: Yes: 2, No: 3
Conditional probabilities: P(Sunny|Yes) = 2/9, P(Sunny|No) = 3/5
Outlook = Rainy:
Counts: Yes: 3, No: 2
Conditional probabilities: P(Rainy|Yes) = 3/9, P(Rainy|No) = 2/5
(The denominators are the class counts: 9 Yes instances and 5 No instances in the full dataset.)
Calculation for Conditional Probabilities (exercise):
Given class = Yes, what are P(Windy = True | Yes) and P(Windy = False | Yes)?
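These counting-based estimates are easy to compute directly. A minimal sketch, assuming the weather table is available as a list of (Outlook, Play) pairs; only a few rows are listed here, so the printed values match the lecture's only when the complete table (14 rows in the classic version of this dataset) is used:

```python
from collections import Counter

# (Outlook, Play) pairs; in practice this would be the full weather table
rows = [("Sunny", "No"), ("Sunny", "No"), ("Overcast", "Yes"),
        ("Rainy", "Yes"), ("Rainy", "Yes")]

class_counts = Counter(play for _, play in rows)   # counts of Yes / No
joint_counts = Counter(rows)                       # counts of (Outlook, Play) pairs

def prior(y):
    # P(y) = (instances with class y) / (total instances)
    return class_counts[y] / len(rows)

def likelihood(outlook, y):
    # P(outlook | y) = (instances with this outlook and class y) / (instances with class y)
    return joint_counts[(outlook, y)] / class_counts[y]

print(prior("Yes"), likelihood("Sunny", "No"))  # with the full table: 9/14 and 3/5
```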
Numeric Data
Assumption: each numeric feature is normally distributed within each class (a naive simplifying assumption).
Density Function:
f(x) = (1 / (σ√(2π))) e^(−(x − μ)² / (2σ²))
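A minimal sketch of this density function; the μ and σ values below are illustrative placeholders, not numbers from the lecture (scikit-learn's GaussianNB applies the same per-class normality assumption automatically):

```python
import math

def gaussian_density(x, mu, sigma):
    # f(x) = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Illustrative: likelihood of a temperature of 66 under a class-conditional N(mu=73, sigma=6.2)
print(gaussian_density(66, mu=73, sigma=6.2))
```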
Decision Tree Classifier
Goal: at each split, select the attribute that most reduces impurity.
Information Gain: the reduction in entropy (impurity) achieved by splitting on an attribute; the attribute with the highest gain is chosen.
Entropy:
Measure of impurity or disorder in the data.
Entropy(p1, p2, …, pn) = −p1 log(p1) − p2 log(p2) − … − pn log(pn)
Example Calculation:
When splitting the weather data on an attribute, the entropy (impurity) of the resulting subsets is measured; the split that gives the largest reduction in impurity is chosen, as sketched below.
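A minimal sketch of the entropy computation; it assumes the full weather table has 9 Yes and 5 No instances, consistent with the denominators used above:

```python
import math

def entropy(counts):
    # Entropy(p1, ..., pn) = -sum(p_i * log2(p_i)), skipping empty classes
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

# Class distribution of Play before any split: 9 Yes, 5 No
print(entropy([9, 5]))  # ≈ 0.940 bits
```

The information gain of a candidate split is this value minus the weighted average entropy of the subsets the split produces.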
Ensemble Learning
Bagging and Boosting Techniques
Random Forest: Combines multiple decision trees to improve overall performance.
Process:
Randomly sample data instances from the training set with replacement (a bootstrap sample).
Build a decision tree from this sample (each tree typically also considers only a random subset of features at each split).
Repeat the above steps for the desired number of trees.
For classification, take the majority vote from all the trees.
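As a rough illustration of these steps (not the lecture's code), a manual bagging loop over scikit-learn decision trees might look like the sketch below; it assumes NumPy arrays and integer-encoded class labels. The RandomForestClassifier example that follows does all of this in a single call.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagged_trees_predict(X_train, y_train, X_test, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    all_preds = []
    for _ in range(n_trees):
        # 1. Sample training instances with replacement (bootstrap sample)
        idx = rng.integers(0, len(X_train), size=len(X_train))
        # 2. Build a decision tree on the bootstrap sample
        tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
        # 3. Record this tree's predictions on the test set
        all_preds.append(tree.predict(X_test))
    # 4. Majority vote across trees for each test instance (labels must be non-negative ints)
    all_preds = np.array(all_preds)
    return np.array([np.bincount(col).argmax() for col in all_preds.T])
```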
Code Example for Random Forest
```python
from sklearn.ensemble import RandomForestClassifier

# Fit a random forest (an ensemble of bagged decision trees) to the training data
model = RandomForestClassifier()
model.fit(X_train, y_train)
```
Mock Exam Preparation
Key points:
Review key lecture points and examples discussed in class.
Prepare for practical applications of the concepts discussed, particularly in Naïve Bayes and Decision Trees.
Assignments
Project Proposal due: 10/31
Assignment 6 due