Class Notes - Naive Bayes and KNN Algorithm Discussion
Class Overview
Instructor: Ferdi Eruysal
Attendance: Students must sign an attendance sheet circulating in the class.
Reminder: Upcoming assignment due next week on Sunday.
Students are encouraged to start early, as it may take hours to complete.
Today's Class Agenda
Introduction to the Naive Bayes algorithm.
Explanation duration: 20-25 minutes.
Recap of the K-Nearest Neighbors (KNN) algorithm.
Important note on data normalization prior to using KNN.
In-class exercise: Work on a loan dataset to build different KNN models.
Allocate about 25 minutes for this exercise.
Team Projects
Students are expected to create their own groups for team projects (max 5 members).
Procedure to form teams:
Go to the "People" tab on the platform.
Select "Team Projects" and pick an empty group.
Note: Members must be in the same section (not mixing with students from different sections).
Any student not assigned a group will be randomly assigned after two weeks.
Probability Basics
Importance of understanding probability for the Naive Bayes algorithm.
Basic concepts of probability concerning dice:
Event A: rolling a 5 or a 6.
Event B: rolling a 5.
Event C: rolling a 3.
Probabilities are calculated as follows (a quick code check follows this section):
Probability of A: 2/6 = 1/3 for rolling either 5 or 6.
Probability of B: 1/6.
Probability of C: 1/6.
Application of probabilities: Predicting loan defaults based on specific conditions (e.g., age, income).
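A minimal Python sketch (illustrative, not from the lecture) that checks the dice probabilities above by enumerating the six equally likely outcomes:

    from fractions import Fraction

    outcomes = [1, 2, 3, 4, 5, 6]  # one fair die, equally likely faces

    def prob(event):
        # probability = favorable outcomes / total outcomes
        return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

    print(prob(lambda o: o in (5, 6)))  # P(A) = 1/3
    print(prob(lambda o: o == 5))       # P(B) = 1/6
    print(prob(lambda o: o == 3))       # P(C) = 1/6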
Conditional Probability
Definition: Probability of event B occurring given that event A has occurred.
Notation: P(B|A)
If A has already happened, what is the probability of B happening?
Example: given that A (rolling a 5 or 6) has occurred, find P(B|A), the probability that the roll was a 5 (here P(B|A) = 1/2; see the sketch below).
Use of conditional probabilities in predictions.
Colored outcome example with coins illustrating how observing one outcome affects predictions about another.
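A small sketch (illustrative, reusing the single-die events A and B defined above) showing that P(B|A) is just the share of A-outcomes in which B also occurs:

    from fractions import Fraction

    outcomes = [1, 2, 3, 4, 5, 6]  # one fair die
    A = {5, 6}                     # event A: roll a 5 or 6
    B = {5}                        # event B: roll a 5

    # P(B|A) = P(A and B) / P(A): restrict attention to outcomes where A occurred
    p_A = Fraction(len(A), len(outcomes))
    p_A_and_B = Fraction(len(A & B), len(outcomes))
    p_B_given_A = p_A_and_B / p_A

    print(p_B_given_A)  # 1/2: once we know the roll is 5 or 6, a 5 has probability 1/2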
Bayes Theorem Explanation
Formula representation: P(A|B) = (P(B|A) * P(A)) / P(B)
Bayes theorem aids in making predictions based on feature values from datasets.
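A quick check of the formula with the same dice events (again an illustrative sketch, not lecture code): with A = roll a 5 or 6 and B = roll a 5, Bayes' theorem recovers P(A|B) from P(B|A).

    from fractions import Fraction

    p_A = Fraction(1, 3)          # P(A): roll a 5 or 6
    p_B = Fraction(1, 6)          # P(B): roll a 5
    p_B_given_A = Fraction(1, 2)  # from the conditional-probability example above

    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    p_A_given_B = p_B_given_A * p_A / p_B
    print(p_A_given_B)  # 1: if we know the roll was a 5, it was certainly a 5 or 6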
Intuition Examples
Envelope Example:
Two envelopes: one with a dollar and one without.
Probability of picking the correct envelope before seeing its contents is 1/2 (50%).
Upon revealing information (e.g., opening one of the envelopes), the probabilities can change.
This illustrates how additional information can refine predictions in machine learning.
Spam Detection Using Naive Bayes
Example of filtering out spam emails:
Begin with a histogram of words from normal messages.
Calculate the probabilities for words like "dear" in the normal messages (e.g. P(dear|normal) = 0.47).
Do the same for spam messages.
Introduction of prior probabilities based on message classification (normal vs spam).
Normal messages prior probability: 0.67 based on training data.
Spam messages prior probability: 0.33, based on the same training data.
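A sketch of how the word probabilities and priors above could be computed from training counts; the specific counts here are made up for illustration, chosen only so that P(dear|normal) comes out near 0.47 and the priors near 0.67 / 0.33 quoted in class:

    from collections import Counter

    # Hypothetical training counts (not from the lecture).
    normal_word_counts = Counter({"dear": 8, "friend": 5, "lunch": 3, "money": 1})
    spam_word_counts = Counter({"dear": 2, "friend": 1, "money": 4})
    n_normal_messages, n_spam_messages = 8, 4

    def likelihoods(counts):
        # P(word | class) = count of word in class / total words in class
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()}

    p_word_given_normal = likelihoods(normal_word_counts)
    p_word_given_spam = likelihoods(spam_word_counts)

    prior_normal = n_normal_messages / (n_normal_messages + n_spam_messages)  # ~0.67
    prior_spam = 1 - prior_normal                                             # ~0.33

    print(round(p_word_given_normal["dear"], 2))  # 0.47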
Naive Bayes Calculation Example
Spam score calculation for the message "Dear Friend":
Normal message score: 0.09 and spam score: 0.01.
The message is classified as normal since the normal score exceeds the spam score (see the sketch below).
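A sketch of the "Dear Friend" scoring step. The priors (0.67 / 0.33) and P(dear|normal) = 0.47 are from the notes; the remaining word likelihoods are assumed values picked so the scores land near the quoted 0.09 and 0.01:

    # P(dear|normal) = 0.47 is from the notes; the other likelihoods are assumed
    # for illustration so the final scores match the quoted 0.09 / 0.01.
    p_word_given_normal = {"dear": 0.47, "friend": 0.29}
    p_word_given_spam = {"dear": 0.29, "friend": 0.14}
    prior_normal, prior_spam = 0.67, 0.33

    message = ["dear", "friend"]

    # Naive Bayes score: prior * product of per-word likelihoods
    score_normal, score_spam = prior_normal, prior_spam
    for word in message:
        score_normal *= p_word_given_normal[word]
        score_spam *= p_word_given_spam[word]

    print(round(score_normal, 2), round(score_spam, 2))  # ~0.09 vs ~0.01 -> normal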
Tips for Marketing Students
Importance of word choice in emails to avoid spam filters.
Be cautious about words like “money”, as they may trigger spam detection algorithms.
K-Nearest Neighbors (KNN) Implementation
Emphasis on normalizing data before KNN implementations to prevent features with larger scales from dominating the distance calculations.
Visualization of data before and after normalization to illustrate its effect.
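A minimal scikit-learn sketch of the normalization point (illustrative only; the toy data below is a placeholder, not the in-class dataset): scaling the features inside a pipeline so that no single large-scale feature dominates the distance computation.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MinMaxScaler

    # Toy data with very different feature scales (e.g., age in years vs income in dollars).
    X = np.array([[25, 30_000], [47, 52_000], [35, 150_000], [52, 41_000]])
    y = np.array([1, 0, 0, 1])  # e.g., 1 = default, 0 = no default

    # Without scaling, income would dominate the Euclidean distances;
    # MinMaxScaler rescales every feature to the [0, 1] range first.
    knn = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=3))
    knn.fit(X, y)

    print(knn.predict([[30, 45_000]]))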
Conclusion
Reminder: KNN requires normalization to ensure all features contribute equally.
Upcoming tasks include completing the assignment and optimizing model parameters.
Students are encouraged to submit the assignment early and to practice with the material covered in class.