Clustering

17 Terms

1

PCA and Clustering similarities

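both are unsupervised methods that simplify the data: PCA reduces the number of dimensions, clustering reduces the observations to a small number of groups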

2

classification

assigns data to predefined categories using a labeled dataset (supervised learning)

3

clustering

groups similar data instances together using an unlabeled dataset; the resulting clusters do not necessarily align with any class labels

4

PCA vs clustering

PCA maximizes the variance of the PC scores, whereas clustering minimizes within-cluster variance

5

clustering

minimizes within-cluster variance

6

PCA

maximizes the variance of the PC scores
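
A rough illustration of the "maximize the variance of the PC scores" idea, assuming made-up 2-D data: the sketch scans projection directions in 1-degree steps and keeps the one whose scores have the largest sample variance. This brute-force scan is only for intuition; it is not how PCA is normally computed, and the names `projected_variance`, `points`, and `best_angle` are hypothetical.

```python
import math

def projected_variance(points, theta):
    """Sample variance of the scores from projecting 2-D points onto angle theta."""
    scores = [x * math.cos(theta) + y * math.sin(theta) for x, y in points]
    mean = sum(scores) / len(scores)
    return sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)

# Hypothetical data lying roughly along the line y = x.
points = [(2.0, 1.9), (0.5, 0.7), (1.5, 1.4), (3.0, 3.2), (1.0, 0.8)]

# Candidate directions: 0 to 179 degrees in 1-degree steps.
angles = [math.pi * k / 180 for k in range(180)]

# The first PC direction is (approximately) the angle with the largest score variance.
best_angle = max(angles, key=lambda t: projected_variance(points, t))
best_degrees = math.degrees(best_angle)
```

For data spread along the diagonal, the best direction should land near 45 degrees, and the variance of the scores along it should exceed the variance along either original axis.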

7

Euclidean distance

straight-line distance between two points; used for continuous (numeric) data

8

Mahalanobis distance

distance measure that adjusts for correlations between variables by using the inverse of their covariance matrix
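
A minimal sketch comparing the two distances, assuming 2-D points and a hand-written 2x2 matrix inverse so that only the standard library is needed. The function names and the covariance values are made up for illustration.

```python
import math

def euclidean(x, y):
    """Straight-line distance between two numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between 2-D points for a given 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of a 2x2 matrix, written out by hand.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = x[0] - y[0], x[1] - y[1]
    quad = (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return math.sqrt(quad)

p1, p2 = (0.0, 0.0), (1.0, 1.0)
identity = [[1.0, 0.0], [0.0, 1.0]]    # uncorrelated, unit variance
correlated = [[1.0, 0.9], [0.9, 1.0]]  # hypothetical strong positive correlation

d_euclid = euclidean(p1, p2)
d_identity = mahalanobis_2d(p1, p2, identity)
d_corr = mahalanobis_2d(p1, p2, correlated)
```

With the identity covariance the two distances coincide. With strong positive correlation, a step along the direction in which the variables move together counts as a shorter distance than the Euclidean one.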

9

k-means clustering

partitions the data into k clusters so as to minimize within-cluster variation

10

centroid

the vector of the p feature means for the observations in a cluster
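
A small sketch of a centroid and of the quantity k-means tries to minimize, assuming squared Euclidean distance to the centroid as the measure of within-cluster variation and a fixed, made-up assignment of points to two clusters. It does not run the k-means iterations themselves.

```python
def centroid(points):
    """Vector of the p feature means for the observations in one cluster."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def within_cluster_ss(points):
    """Sum of squared Euclidean distances from each point to the cluster centroid."""
    c = centroid(points)
    return sum(sum((v - m) ** 2 for v, m in zip(pt, c)) for pt in points)

# Hypothetical clusters with p = 2 features.
cluster_a = [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
cluster_b = [(8.0, 9.0), (10.0, 9.0)]

center_a = centroid(cluster_a)
# Total within-cluster variation for this particular assignment.
total_wss = within_cluster_ss(cluster_a) + within_cluster_ss(cluster_b)
```

k-means alternates between assigning each point to its nearest centroid and recomputing the centroids, and each of those steps can only keep this total the same or lower it.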

11

k-medoids

chooses actual data points as the cluster centers (medoids)

12

medoid

the most centrally located point in the cluster
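
One way to make "most centrally located" concrete, assuming Euclidean distance and a made-up cluster: take the data point whose total distance to all other points in the cluster is smallest.

```python
import math

def medoid(points):
    """Return the data point with the smallest total distance to all points."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))

# Hypothetical cluster with one outlier at (10, 10).
cluster = [(1.0, 1.0), (2.0, 2.0), (3.0, 1.0), (10.0, 10.0)]
center = medoid(cluster)
```

Here the medoid is (2.0, 2.0), which is an actual observation. The centroid of the same points would be (4.0, 3.5), which is not in the data and is pulled toward the outlier.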

13

k-modes clustering: optimal k

plot the total dissimilarity against k in a scree plot and choose the elbow point

14

k-modes procedure

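count dissimilarities: the distance between two observations is the number of categorical features on which they differ, and each cluster center is the mode of every feature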
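
A sketch of the dissimilarity count, assuming simple matching (one point per mismatched feature) and a single made-up cluster of categorical rows. The helper names are hypothetical, and the full k-modes loop is left out.

```python
from collections import Counter

def matching_dissimilarity(a, b):
    """Number of categorical features on which two observations differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster_mode(rows):
    """Most frequent category of each feature across the rows of one cluster."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*rows))

# Hypothetical cluster with three categorical features.
rows = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]

mode = cluster_mode(rows)
# Total dissimilarity of the cluster: mismatches between each row and the mode.
cost = sum(matching_dissimilarity(r, mode) for r in rows)
```

Summing this cost over all clusters gives the total dissimilarity for a given k, which is the quantity plotted against k when looking for the elbow.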

15

k-prototypes

designed to handle mixed datasets (both numerical and categorical); combines k-means and k-modes
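
A sketch of how the two parts can be combined into one dissimilarity, assuming squared Euclidean distance on the numerical features plus a weighted mismatch count on the categorical ones. The weight `gamma` and its value here are assumptions for illustration; in practice it has to be chosen for the data.

```python
def prototype_dissimilarity(num_a, cat_a, num_b, cat_b, gamma=1.0):
    """Squared Euclidean part (k-means style) plus weighted mismatches (k-modes style)."""
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(x != y for x, y in zip(cat_a, cat_b))
    return numeric_part + gamma * categorical_part

# Hypothetical observations: two numerical and two categorical features each.
d = prototype_dissimilarity(
    (1.0, 2.0), ("red", "small"),
    (2.0, 4.0), ("red", "large"),
    gamma=0.5,
)
```

The numerical part contributes 1 + 4 = 5 and the single categorical mismatch contributes 0.5, so `d` is 5.5. A larger `gamma` gives the categorical features more influence on the clustering.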

16

Silhouette

measures how similar an object is to its own cluster compared to other clusters.

17

silhouette range

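-1 to +1; values near +1 mean the object fits its own cluster well, values near 0 mean it lies between clusters, and negative values suggest it may belong to a neighboring cluster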
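
A minimal sketch of the silhouette value for one point, assuming Euclidean distance and clusters with at least two points. Here a is the mean distance to the other points in the point's own cluster, b is the smallest mean distance to the points of any other cluster, and the value is (b - a) / max(a, b). The data and names are made up.

```python
import math

def silhouette(i, own, other_clusters):
    """Silhouette value of own[i], given its own cluster and the other clusters."""
    point = own[i]
    rest = own[:i] + own[i + 1:]
    # a: mean distance to the other members of the point's own cluster.
    a = sum(math.dist(point, q) for q in rest) / len(rest)
    # b: mean distance to the nearest other cluster.
    b = min(sum(math.dist(point, q) for q in c) / len(c) for c in other_clusters)
    return (b - a) / max(a, b)

far_cluster = [(5.0, 0.0), (5.0, 1.0)]

# A well-placed point: close to its own cluster, far from the other one.
s_good = silhouette(0, [(0.0, 0.0), (0.0, 1.0)], [far_cluster])

# A badly placed point: (5.0, 0.5) sits inside the other cluster.
s_bad = silhouette(1, [(0.0, 0.0), (5.0, 0.5)], [far_cluster])
```

The well-placed point scores close to +1 and the misassigned one scores close to -1, matching the interpretation of the range.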