Classification
Classification is a form of prediction, like regression
In classification, the response variable is a category rather than a number
Machine learning algorithm
builds a mathematical model
the model is calculated based on sample data
the sample data is known as training data
the model makes predictions or decisions without being explicitly programmed to perform the task
A nearest neighbor classifier categorizes an unknown point by taking the majority class among the classified points nearest to it
Use the Pythagorean theorem to find the distance between two points
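As a minimal sketch of the distance calculation, assuming each point is a tuple of numeric coordinates (the function name distance is made up for illustration, not taken from these notes):

```python
import math

def distance(point1, point2):
    """Straight-line (Euclidean) distance between two points.

    Assumes both points are sequences of numbers with the same length.
    """
    if len(point1) != len(point2):
        raise ValueError("points must have the same number of coordinates")
    # Pythagorean theorem: square the difference in each coordinate,
    # add the squares, then take the square root of the total.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))

print(distance((0, 0), (3, 4)))  # 5.0
```

The same formula works for more than two coordinates, which matters when each example has several attributes.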
To find the k nearest neighbors of an unknown example:
- Find the distance between the unknown example and each example in the training set
- Augment the training data table with a column containing all the distances
- Sort the augmented table in increasing order of the distances
- Take the top k rows of the sorted table
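The steps above can be sketched in plain Python. This is an illustration under assumptions, not the method from any particular library: the training set is assumed to be a list of (features, label) pairs, and the names distance, k_nearest, and classify are hypothetical. The final majority vote is the classification rule described earlier in these notes.

```python
import math
from collections import Counter

def distance(point1, point2):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))

def k_nearest(training, example, k):
    """Return the k training rows closest to example.

    training is assumed to be a list of (features, label) pairs.
    Each returned row is (distance, features, label).
    """
    # Steps 1 and 2: compute every distance and attach it to its row.
    augmented = [
        (distance(features, example), features, label)
        for features, label in training
    ]
    # Step 3: sort in increasing order of distance.
    augmented.sort(key=lambda row: row[0])
    # Step 4: keep the top k rows.
    return augmented[:k]

def classify(training, example, k):
    """Predict the majority label among the k nearest neighbors."""
    votes = Counter(label for _, _, label in k_nearest(training, example, k))
    return votes.most_common(1)[0][0]

# Made-up training data: two attributes per example, labels "A" and "B".
training = [
    ((1.0, 1.0), "A"),
    ((1.5, 2.0), "A"),
    ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"),
    ((7.0, 6.5), "B"),
]

print(classify(training, (1.2, 1.1), 3))  # A
```

With two classes, choosing an odd k avoids tied votes; this sketch does not handle ties in any special way.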