Ch8: Instance-based Learning


18 Terms

1

What is instance-based learning?

A:

A learning method where the model stores the training instances and delays generalization until a new query is received.

It classifies new instances based on their similarity to stored data.

2

How does instance-based learning differ from previous models like decision trees or neural nets?

A: Previous models are eager learners that generalize during training; instance-based learning is a lazy learner that postpones generalization until prediction time.

3

What is lazy learning?

A: A learning approach that stores training data and performs computation only when making predictions, instead of learning a general model in advance.

4

What are some practical challenges with lazy learning?

A: High storage requirements, slow prediction time, difficulty handling noise, and the need to classify new data without an exact match.

5

What is the Nearest Neighbour (NN) learning method?

A: The most basic instance-based method; it assigns to a new instance the target value of the closest training instance.

6

How is similarity typically measured in NN learning?

A: Using Euclidean distance between feature vectors:

$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
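
A minimal sketch of this distance and the resulting 1-NN prediction in plain Python; the helper names and the tiny dataset are illustrative, not from the chapter:

    import math

    def euclidean_distance(x, y):
        # d(x, y) = sqrt(sum_i (x_i - y_i)^2)
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    def nearest_neighbour_predict(train_X, train_y, query):
        # 1-NN: return the target value of the single closest stored instance.
        distances = [euclidean_distance(x, query) for x in train_X]
        return train_y[distances.index(min(distances))]

    # Two stored instances and one query point.
    train_X = [(1.0, 2.0), (4.0, 4.0)]
    train_y = ["A", "B"]
    print(nearest_neighbour_predict(train_X, train_y, (1.5, 2.5)))  # -> A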

7

What is a key drawback of 1-Nearest Neighbour?

A: It is highly sensitive to local structure and noise, which can lead to overfitting.

8

What is k-Nearest Neighbour (k-NN)?

A: An extension of NN that assigns to a new instance the most common target value among the k nearest instances. For regression, it takes the mean of their target values.
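
A sketch of k-NN under this definition, covering both the classification (majority vote) and regression (mean) cases; it uses only the standard library, and the names are illustrative:

    import math
    from collections import Counter

    def knn_predict(train_X, train_y, query, k=3, regression=False):
        # Rank stored instances by Euclidean distance and keep the k nearest.
        neighbours = sorted(zip(train_X, train_y),
                            key=lambda pair: math.dist(pair[0], query))[:k]
        values = [y for _, y in neighbours]
        if regression:
            return sum(values) / len(values)          # mean of the k targets
        return Counter(values).most_common(1)[0][0]   # most common target value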

9

What is the benefit of using k > 1 in k-NN?

A: Reduces sensitivity to noise and overfitting by smoothing the prediction across multiple nearby instances.

10

What is the trade-off when choosing the value of k in k-NN?

A: Small k values (e.g., k = 1) may overfit (sensitive to noise), while large k values (e.g., k = 20) may oversmooth (high bias).
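
The card does not prescribe a selection method, but a common way to navigate this trade-off is to pick k by validation performance. A sketch using scikit-learn, assuming X and y are an already-loaded feature matrix and label vector:

    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    def best_k(X, y, candidates=(1, 3, 5, 9, 15, 21)):
        # Score each candidate k with 5-fold cross-validation; keep the best.
        scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                     X, y, cv=5).mean()
                  for k in candidates}
        return max(scores, key=scores.get)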

11

What is Distance-Weighted k-NN?

A: A variant of k-NN in which closer neighbors have more influence on the prediction than farther ones, typically by weighting each neighbor inversely with its distance.
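
A sketch of one common weighting scheme, where each of the k nearest neighbors votes with weight 1/d²; other schemes exist, and the names below are illustrative:

    import math
    from collections import defaultdict

    def weighted_knn_predict(train_X, train_y, query, k=3):
        # Keep the k nearest stored instances, then vote with weight 1/d^2.
        neighbours = sorted(zip(train_X, train_y),
                            key=lambda pair: math.dist(pair[0], query))[:k]
        votes = defaultdict(float)
        for x, y in neighbours:
            d = math.dist(x, query)
            if d == 0.0:
                return y               # exact match: return its label directly
            votes[y] += 1.0 / d ** 2   # closer neighbours carry more weight
        return max(votes, key=votes.get)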

12

What is the decision boundary in k-NN learning?

A: The implicit border between regions where a new point would be classified differently, determined by proximity to training examples.

13

What is a Voronoi diagram in the context of NN?

A: A geometric representation where each training point "owns" a region of the feature space based on proximity; new points are classified by the region they fall in.

14

What are the advantages of instance-based learning?

A:

  • Simple to implement

  • No training time

  • Handles both discrete and continuous outputs

  • Can model complex target functions

  • Robust to noise (especially with distance weighting)

15

What are the disadvantages of instance-based learning?

A:

  • Expensive to predict (slow at classification time)

  • High memory usage

  • Sensitive to irrelevant features

  • Poor performance in high-dimensional spaces

16

Why does high dimensionality hurt k-NN?

A: Because distances become less meaningful, irrelevant features distort similarity, and the model requires many more training points to maintain coverage.

17

What are possible solutions to the curse of dimensionality in k-NN?

A:

  • Feature selection to remove irrelevant features

  • Attribute weighting

  • Dimensionality reduction (e.g., PCA); see the sketch after this list
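
A sketch combining these remedies with scikit-learn: standardize the features so none dominates the distance, project them with PCA, then classify with k-NN. The component count, k, and the X_train/y_train names are placeholder assumptions:

    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Scale, reduce dimensionality, then run k-NN on the projected data;
    # 10 components and k = 5 are illustrative choices.
    model = make_pipeline(StandardScaler(),
                          PCA(n_components=10),
                          KNeighborsClassifier(n_neighbors=5))
    # model.fit(X_train, y_train); model.predict(X_test)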

18

How do decision trees differ from k-NN in dealing with features?

A: Decision trees naturally focus on the most relevant features, while k-NN considers all features equally in distance calculations unless adjusted.