Clustering-Intro

10 Terms

1

What is clustering?

Clustering is the organization of unlabeled data into similarity groups called clusters.

2

What are the three key components needed for clustering?

A proximity measure (how similar two points are), a criterion function (how good a clustering is), and an algorithm to compute the clustering.
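As an illustration of the first two components, here is a minimal sketch. It assumes Euclidean distance as the proximity measure and the sum of squared errors (SSE) as the criterion function. Neither choice is stated in the notes, and the function names `euclidean` and `sse` are hypothetical.

```python
import math

def euclidean(p, q):
    """Proximity measure (assumed): straight-line distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def sse(clusters):
    """Criterion function (assumed): sum of squared distances to each cluster's mean.

    `clusters` is a list of non-empty lists of equal-length numeric tuples.
    """
    total = 0.0
    for points in clusters:
        # Mean of the cluster, computed coordinate by coordinate.
        centroid = [sum(col) / len(points) for col in zip(*points)]
        total += sum(euclidean(p, centroid) ** 2 for p in points)
    return total
```

A lower SSE means points sit closer to their cluster means, so an algorithm can use it to compare candidate clusterings.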

3

What historic application of clustering is mentioned in the notes?

John Snow's mapping of cholera deaths during the 1854 outbreak in London.

4

What does K-means clustering involve?

K-means clustering partitions the data into k clusters, each represented by a centroid (the mean of the points in that cluster).

5

How does the K-means algorithm compute clusters?

It chooses k initial centroids, assigns each point to its closest centroid, re-computes each centroid as the mean of its assigned points, and repeats the last two steps until convergence.
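The loop described in this card can be sketched in plain Python. This is a minimal sketch under stated assumptions, not a reference implementation: it assumes squared Euclidean distance, picks the initial centroids at random from the data, and stops when no point changes cluster. The function name `kmeans` and the parameters `max_iter` and `seed` are hypothetical.

```python
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Return (centroids, assignment) for a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Step 1: choose k initial centroids (assumed: k distinct data points at random).
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [None] * len(points)

    for _ in range(max_iter):
        # Step 2: assign each point to its closest centroid.
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Stop when no point was re-assigned.
        if new_assignment == assignment:
            break
        assignment = new_assignment

        # Step 3: re-compute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]

    return centroids, assignment
```

For example, `kmeans([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)` groups the first two points together and the last two together.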

6

What is the convergence criterion in K-means?

Convergence is reached when no points are re-assigned to a different cluster, or when the centroids change only minimally between iterations.
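Both stopping conditions can be written as a single check. The sketch below is illustrative only: the function name `converged` and the tolerance `tol` are hypothetical, and "minimal change" is assumed to mean that the largest Euclidean shift of any centroid falls below `tol`.

```python
def converged(old_assign, new_assign, old_centroids, new_centroids, tol=1e-4):
    """Return True if either convergence condition from the card holds."""
    # Condition 1: no point moved to a different cluster.
    if old_assign == new_assign:
        return True
    # Condition 2 (assumed form): every centroid moved by less than `tol`.
    largest_shift = max(
        sum((a - b) ** 2 for a, b in zip(old, new)) ** 0.5
        for old, new in zip(old_centroids, new_centroids)
    )
    return largest_shift < tol
```

In practice a cap on the number of iterations is often added as a third safeguard, so the loop ends even if neither condition is met.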

7

What are some strengths of K-means clustering?

It is simple and efficient: its time complexity is O(tkn), where n is the number of data points, k is the number of clusters, and t is the number of iterations. Because k and t are usually small compared with n, the running time is effectively linear in n.

8

What are weaknesses of the K-means algorithm?

It requires k to be specified in advance, is sensitive to outliers, and cannot be applied to categorical data without modification, because it needs a well-defined mean.
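The outlier sensitivity comes from the centroid being a mean. The small sketch below shows how one extreme value drags the mean, with the median included only for comparison. The median is not part of the notes, and the function name `center_shift` and the sample values are made up for illustration.

```python
from statistics import mean, median

def center_shift(values, outlier):
    """Return (mean_before, mean_after, median_before, median_after)."""
    with_outlier = values + [outlier]
    return (mean(values), mean(with_outlier),
            median(values), median(with_outlier))

# One-dimensional cluster with a single far-away point added.
m0, m1, d0, d1 = center_shift([1.0, 2.0, 3.0, 4.0, 5.0], 100.0)
# The mean moves from 3.0 to about 19.17; the median only moves from 3.0 to 3.5.
```

This is why a single outlier can pull a K-means centroid away from the bulk of its cluster.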

9

What are the two types of hierarchical clustering mentioned?

Divisive (top-down) and agglomerative (bottom-up) clustering.

10

How is agglomerative hierarchical clustering performed?

It starts with each data point as its own cluster, then repeatedly merges the two nearest clusters until all points are combined into a single cluster.
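A minimal sketch of this bottom-up procedure follows. The notes do not say how the distance between two clusters is measured, so the sketch assumes single linkage (the distance between the closest pair of points) with Euclidean distance. The function names `single_link` and `agglomerative` are hypothetical, and the brute-force pair search is for clarity, not speed.

```python
import math

def single_link(c1, c2):
    """Cluster distance (assumed): distance between the closest pair of points."""
    return min(math.dist(p, q) for p in c1 for q in c2)

def agglomerative(points):
    """Merge the two nearest clusters until one remains; return the merge history."""
    clusters = [[p] for p in points]  # start: every point is its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((a, b) for a in range(len(clusters))
                    for b in range(a + 1, len(clusters))),
            key=lambda ab: single_link(clusters[ab[0]], clusters[ab[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        # Replace the two merged clusters with their union.
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges
```

The returned merge history is the information a dendrogram would display: n points always produce n - 1 merges.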