CS3001 - Clustering

Last updated 2:23 AM on 5/15/26

10 Terms

1
What are the steps in the K-Means Algorithm?

1. Choose K and pick K initial centroids (for example, K random data points).

2. Assign each data point to its nearest centroid.

3. Recalculate each centroid as the mean of the points assigned to it.

4. Repeat steps 2 and 3 until the assignments stop changing. Result: K clusters.
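As a study aid, here is a minimal sketch of K-Means (Lloyd's algorithm) in plain Python: pick K starting centroids, assign each point to the nearest one, move each centroid to the mean of its points, and repeat. The function name `kmeans`, its parameters, and the sample points are assumptions made up for illustration; they do not come from the course material.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Cluster a list of equal-length tuples into k groups (illustrative sketch)."""
    rng = random.Random(seed)
    # Initialise: use k distinct data points as the starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assign each point to the centroid with the smallest squared distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update each centroid to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical sample data: two small groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
labels, centroids = kmeans(points, k=2)
```

Changing `seed` changes the starting centroids, which is why different runs can end in different clusterings.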

2
Advantages of the K-Means Algorithm?

Fast and simple

Good for round (globular) clusters

Scales well to large datasets

3
Disadvantages of the K-Means Algorithm?

Must specify K in advance

Different random starts can give different results

Handles non-globular (oddly shaped) clusters poorly

4
Steps in Hierarchical Clustering

1. Each data point starts as its own cluster (N clusters total).

2. Find the 2 closest clusters and merge them into one.

3. Recalculate distances between the new cluster and all remaining clusters.

4. Repeat until only 1 cluster remains. Result: a DENDROGRAM (tree diagram).

5
How do you determine the number of clusters from a dendrogram?

Cut the dendrogram with a horizontal line; the number of vertical lines crossed equals the number of clusters.
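Cutting the dendrogram amounts to applying only the merges that happen at or below the cut height. The sketch below assumes a merge history of `(cluster_a, cluster_b, height)` tuples sorted by increasing height, with merged clusters numbered from N upward; that format, the function name `cut_dendrogram`, and the sample values are hypothetical.

```python
def cut_dendrogram(merges, n_points, cut_height):
    """Return one cluster label per point after cutting at cut_height."""
    clusters = {i: [i] for i in range(n_points)}
    next_id = n_points
    for a, b, height in merges:
        if height > cut_height:
            break  # assumes heights are sorted, so later merges are above the cut too
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    # Number the surviving clusters 0, 1, 2, ... and label their members.
    labels = [None] * n_points
    for label, members in enumerate(clusters.values()):
        for i in members:
            labels[i] = label
    return labels

# Hypothetical merge history for four points.
merges = [(0, 1, 1.0), (2, 3, 1.5), (4, 5, 5.0)]
labels = cut_dendrogram(merges, n_points=4, cut_height=2.0)
n_clusters = len(set(labels))
```

A cut at height 2.0 passes above the first two merges and below the last one, so it crosses two vertical lines and leaves two clusters.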

6
What does Single Linkage measure in Hierarchical Clustering?

MINIMUM distance: the closest pair of points, one from each cluster

Single = Smallest gap

7
What does Complete Linkage measure in Hierarchical Clustering?

MAXIMUM distance: the farthest pair of points, one from each cluster

Complete = Widest gap

8
What does Average Linkage measure in Hierarchical Clustering?

The MEAN distance over all pairs of points, one from each cluster.

Average = Middle ground
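The three linkage rules differ only in how they summarise the same set of point-to-point distances. A small sketch, with function names and sample clusters invented for illustration:

```python
import math
from statistics import mean

def pair_distances(A, B):
    """Euclidean distance for every pair (one point from A, one from B)."""
    return [math.dist(p, q) for p in A for q in B]

def single_linkage(A, B):
    return min(pair_distances(A, B))   # closest pair

def complete_linkage(A, B):
    return max(pair_distances(A, B))   # farthest pair

def average_linkage(A, B):
    return mean(pair_distances(A, B))  # mean over all pairs

# Hypothetical clusters on a line; the pairwise distances are 4, 6, 3 and 5.
A = [(0.0, 0.0), (1.0, 0.0)]
B = [(4.0, 0.0), (6.0, 0.0)]
d_single = single_linkage(A, B)      # 3.0
d_complete = complete_linkage(A, B)  # 6.0
d_average = average_linkage(A, B)    # 4.5
```

For the same two clusters, single linkage never reports a larger distance than average linkage, and average never a larger one than complete.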

9
Advantages of Hierarchical Clustering

No need to choose K in advance.

Dendrogram reveals full cluster structure

Can capture non-globular cluster shapes, depending on the linkage (single linkage in particular)

10
Disadvantages of Hierarchical Clustering

Slow on large datasets (the pairwise distance matrix alone needs O(N^2) time and memory)

Early merges cannot be undone (greedy)

Different linkage methods give different results