Clustering Algorithms - Key Vocabulary

Vocabulary flashcards covering key clustering concepts and algorithms from the notes.

14 Terms

K-Means Clustering

An unsupervised partitioning algorithm that divides data into K non-overlapping clusters, assigning each point to its nearest centroid so as to minimize the within-cluster sum of squares (inertia).
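
The definition can be illustrated with a minimal sketch of the standard assign-then-update iteration (Lloyd's algorithm), written in plain Python. The function names, the toy `points`, and the random initialization from the data are assumptions made for illustration, not something the cards specify.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, iterations=100, seed=0):
    """Hypothetical helper: returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = mean_point(members)
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia

# Toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids, inertia = kmeans(points, k=2)
```

On this toy data the first three points share one label and the last three share the other; which group receives label 0 depends on the initialization.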

Inertia (Within-cluster Sum of Squares)

The sum of squared distances between data points and their cluster centroids; K-Means aims to minimize this value.
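
Written as a formula, this is the quantity below. The notation is an assumption introduced here, not taken from the cards: $C_k$ is the set of points assigned to cluster $k$ and $\mu_k$ is the centroid (mean) of that cluster.

```latex
\text{Inertia} = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{|C_k|} \sum_{x_i \in C_k} x_i
```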

Hierarchical Clustering

Builds a hierarchy of clusters by recursively merging or splitting clusters; can be agglomerative (bottom-up) or divisive (top-down).

Agglomerative Clustering

A hierarchical approach that starts with each data point as its own cluster and merges clusters based on a linkage criterion (e.g., single, complete, average).
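
The merge loop described above can be sketched in plain Python as follows. This is a naive illustration that recomputes every pairwise linkage at each step; the function names and toy data are hypothetical, and stopping at a requested number of clusters is one assumed way to cut the hierarchy.

```python
def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def linkage_distance(cluster_a, cluster_b, linkage="single"):
    """Distance between two clusters under the chosen linkage criterion."""
    distances = [euclidean(p, q) for p in cluster_a for q in cluster_b]
    if linkage == "single":      # closest pair of points
        return min(distances)
    if linkage == "complete":    # farthest pair of points
        return max(distances)
    if linkage == "average":     # mean over all cross-cluster pairs
        return sum(distances) / len(distances)
    raise ValueError(f"unknown linkage: {linkage}")

def agglomerative(points, n_clusters, linkage="single"):
    """Hypothetical helper: merge bottom-up until n_clusters remain."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage_distance(clusters[i], clusters[j], linkage)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair; j > i, so deleting j leaves index i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
clusters = agglomerative(points, n_clusters=2, linkage="average")
```

Swapping the `linkage` argument changes only how the distance between two clusters is measured; the merge loop itself stays the same.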

DBSCAN

Density-Based Spatial Clustering of Applications with Noise; groups densely packed points using two parameters, ε (epsilon) and MinPts. It handles noise and discovers clusters of arbitrary shape without requiring the number of clusters in advance.

ε (epsilon) in DBSCAN

Maximum distance between two points for them to be considered neighbors.

MinPts

Minimum number of points required in a point's ε-neighborhood to form a dense region.

Core Point

A point with at least MinPts points within its ε-neighborhood (including itself).

Border Point

A point within the ε-neighborhood of a core point but not itself a core point.

Noise Point (Outlier)

A point that is neither a core point nor a border point and is not assigned to a cluster.
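
The core, border, and noise definitions can be tied together in a small plain-Python sketch of DBSCAN. It uses a brute-force neighbor search; the function names, the `-1` noise label, and the toy data are assumptions chosen for illustration.

```python
def euclidean(a, b):
    """Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def region_query(points, i, eps):
    """Indices of all points within eps of point i (point i itself included)."""
    return [j for j, q in enumerate(points) if euclidean(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Hypothetical helper: returns (labels, kinds); label -1 marks noise."""
    NOISE = -1
    neighbors = [region_query(points, i, eps) for i in range(len(points))]
    # Core point: at least min_pts points in its eps-neighborhood.
    is_core = [len(n) >= min_pts for n in neighbors]
    labels = [NOISE] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if not is_core[i] or labels[i] != NOISE:
            continue  # only unlabeled core points start a new cluster
        labels[i] = cluster_id
        frontier = [i]
        while frontier:
            current = frontier.pop()
            for j in neighbors[current]:
                if labels[j] == NOISE:
                    labels[j] = cluster_id
                    if is_core[j]:
                        frontier.append(j)  # clusters grow only through core points
        cluster_id += 1
    kinds = [
        "core" if is_core[i] else ("border" if labels[i] != NOISE else "noise")
        for i in range(len(points))
    ]
    return labels, kinds

# Toy data: a unit square, one point hanging off its corner, one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.2, 1.0), (8.0, 8.0)]
labels, kinds = dbscan(points, eps=1.5, min_pts=4)
```

Here the four corners of the square are core points, `(2.2, 1.0)` is a border point (it lies within ε of the core point `(1.0, 1.0)` but has too few neighbors itself), and `(8.0, 8.0)` is noise.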

OPTICS

Ordering Points To Identify the Clustering Structure; an extension of DBSCAN that orders points by reachability distance, yielding a hierarchical view of the cluster structure that is more robust to clusters of varying density.

Mean Shift Clustering

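Non-parametric algorithm that iteratively shifts candidate centroids toward the mean of the points in their neighborhood, moving them toward areas of higher data density; the resulting density modes identify the clusters, so the number of clusters is not set in advance.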
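
A minimal one-dimensional sketch of the shifting step is shown below, assuming a flat (uniform) kernel, which is only one possible kernel choice. The `bandwidth` parameter, the function names, and the rule for merging nearby modes are assumptions made for illustration.

```python
def shift_to_mode(x, data, bandwidth, tol=1e-6, max_iter=200):
    """Repeatedly move x to the mean of the data points within `bandwidth` of it."""
    for _ in range(max_iter):
        window = [v for v in data if abs(v - x) <= bandwidth]
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            return new_x  # converged on a local density mode
        x = new_x
    return x

def mean_shift(data, bandwidth):
    """Hypothetical helper: returns (modes, labels) for 1-D data."""
    modes = []
    labels = []
    for v in data:
        m = shift_to_mode(v, data, bandwidth)
        # Assumption: modes closer than half a bandwidth count as the same cluster.
        for idx, existing in enumerate(modes):
            if abs(existing - m) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels

data = [1.0, 1.2, 0.8, 1.1, 5.0, 5.3, 4.9]
modes, labels = mean_shift(data, bandwidth=1.0)
```

The number of clusters is never passed in; it falls out of how many distinct modes the points converge to, which in turn depends on the bandwidth.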

Gaussian Mixture Models (GMM)

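Model-based clustering that assumes the data are generated from a mixture of Gaussian distributions; it estimates each component's weight, mean, and covariance (typically with the expectation-maximization algorithm) and assigns points to clusters probabilistically.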
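
To make the probabilistic assignment concrete, the sketch below computes the responsibilities (posterior component probabilities) for a single one-dimensional point, which is the expectation step of EM. The parameter values are hypothetical and fixed by hand; a full fit would alternate this step with a parameter update, which is not shown.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)

def responsibilities(x, weights, means, variances):
    """Posterior probability of each mixture component given the point x."""
    weighted = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(weighted)
    return [value / total for value in weighted]

# Hypothetical two-component mixture with equal weights and unit variances.
weights = [0.5, 0.5]
means = [0.0, 4.0]
variances = [1.0, 1.0]

r_near_first = responsibilities(0.5, weights, means, variances)  # close to mean 0.0
r_midpoint = responsibilities(2.0, weights, means, variances)    # halfway between means
```

A point near the first mean is assigned almost entirely to the first component, while the midpoint between the two means is split evenly; this soft assignment is what distinguishes a GMM from the hard assignment in K-Means.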

Spectral Clustering

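Graph-based clustering technique that uses the eigenvectors of a graph Laplacian built from a similarity matrix to embed the data in a low-dimensional space, where the points are then partitioned into clusters (commonly with K-Means).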
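
One common formulation, given here as a sketch, uses the unnormalized graph Laplacian. The symbols are assumptions introduced for illustration: $W$ is the similarity matrix, $D$ the diagonal degree matrix, and $L$ the Laplacian. Normalized variants of $L$ are also widely used.

```latex
D_{ii} = \sum_{j} W_{ij}, \qquad L = D - W
```

The eigenvectors of $L$ associated with its smallest eigenvalues supply the low-dimensional coordinates in which the points are clustered.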