4.3 Nearest Neighbor Classification

0.0(0)
Studied by 0 people
call kaiCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/23

encourage image

There's no tags or description

Looks like no tags are added yet.

Last updated 6:09 PM on 6/22/26
Name
Mastery
Learn
Test
Matching
Spaced
Call with Kai

No analytics yet

Send a link to your students to track their progress

24 Terms

1
New cards

Why is the k-Nearest Neighbor algorithm considered a "Lazy Learner"?

Because no explicit model is learned and the classifier uses the training samples directly.

2
New cards

What does "learning by analogy" mean in the k-NN algorithm?

It means finding the most similar training samples (the "decision set") and classifying the new sample based on the majority of those samples.

3
New cards

What is used to determine the similarity between samples in the k-NN algorithm?

A distance function.

4
New cards

What does the hyperparameter 'k' represent in the k-nearest Neighbour algorithm?

It represents how many similar samples are being searched for.

5
New cards

What is the "decision set" in a k-Nearest Neighbor classifier?

The group of the most similar training samples found to classify a new sample.

6
New cards
What is the problem with choosing a "too small" value for the hyperparameter k in the k-NN algorithm?
It does not generalize enough, leading to a high sensitivity to outliers.
7
New cards
What happens when you choose a "too large" value for the hyperparameter k in the k-NN algorithm?
It generalizes too much, causing many objects from other clusters or classes to be included in the decision set.
8
New cards
What range for the hyperparameter k typically yields the highest classification quality?
An average k, often 1 << k < 10.
9
New cards
In a k-NN visual representation, what happens to the decision set boundary as k increases (e.g., from k=1 to k=17)?
The boundary expands outward to include more neighboring data points, increasing the likelihood of capturing points from different classes.
10
New cards
What is the first step in hyperparameter selection for a parameter like k?
Try different values for the hyperparameter (e.g., k = 1, 2, 3...).
11
New cards
How do you evaluate the effectiveness of each tested hyperparameter value?
By determining the classification error for each value.
12
New cards
What are two common methods used to determine the classification error for a specific hyperparameter?
1) Evaluating it on a validation set, or 2) Using m-fold cross-validation.
13
New cards
After calculating the errors for various hyperparameter values, which one is selected for the final model?
The value that yields the best classification results (the highest accuracy or lowest error).
14
New cards
When looking at a graph of "Number of Neighbors (k) vs Accuracy", what is the primary goal?
To identify the peak of the graph, which represents the optimal k value that maximizes classification accuracy.
15
New cards
What is the default decision rule in a k-NN classifier?
Select the majority class of the decision set.
16
New cards
How does a weighted decision rule differ from the default rule in k-NN?
Instead of a simple majority vote, it weighs the training samples in the decision set based on specific criteria.
17
New cards
What are three common criteria used to weight samples in a k-NN decision set?
By distance to the sample, by class distribution, or by a cost function.
18
New cards
Why might you weight k-NN samples by class distribution?
To account for highly unequal distributions, preventing a rare class from being automatically outvoted by a dominant majority class.
19
New cards
When is it appropriate to use a cost function to weight a k-NN decision rule?
When predicting one class incorrectly (e.g., predicting A when it's actually B) is considered "worse" or more costly than the reverse.
20
New cards
What is a major advantage of the k-NN algorithm regarding its general performance?
It typically achieves high classification accuracy in many applications.
21
New cards
Can the k-NN algorithm be used for continuous targets, or is it strictly for classification?
Variants of k-NN can also be used for the prediction of numerical values (regression).
22
New cards
Why is "incrementality" considered an advantage of the k-NN algorithm?
Because it is a lazy learner, the model does not need to be adapted or retrained when new training data is added.
23
New cards
Why is the k-NN algorithm considered inefficient during the actual classification phase?
Because it must calculate the distance against all existing training samples to classify just one new sample.
24
New cards
What is a major disadvantage of k-NN regarding interpretability and insights?
It returns no explicit knowledge or structural rules about the classes.