Cluster analysis


Last updated 3:48 PM on 5/19/26

9 Terms

1

Cluster analysis

A statistical method used to group similar cases (e.g., people or countries) together based on multiple variables. The goal is to identify naturally occurring patterns in the data without predefined categories.

Instead of testing relationships (like regression does), cluster analysis asks: “Which cases are similar enough to belong together?”

Because clusters are formed from distances between cases, skewed data or extreme values can distort the clusters.

2

Variable-oriented analysis

Focuses on the relationships between variables across the whole sample.

3

Person-oriented analysis

Focuses on patterns within individuals, grouping them based on how variables combine. Gives a broader, more holistic understanding of individuals.

4

Hierarchical clustering

A step-by-step (agglomerative) method in which each case starts as its own cluster and the two most similar clusters are merged at each step, continuing until all cases are in one cluster.
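
The merging procedure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the card does not say how similarity between clusters is measured, so the sketch assumes squared Euclidean distance and single linkage (distance between the closest members). The function names and the four example cases are hypothetical.

```python
def squared_euclidean(a, b):
    # Sum of squared differences between corresponding values
    return sum((x - y) ** 2 for x, y in zip(a, b))

def agglomerate(cases):
    """Merge the two closest clusters repeatedly until one cluster remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of case indices.
    """
    clusters = [(i,) for i in range(len(cases))]  # every case starts alone
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage (an assumption): closest pair of members
                d = min(squared_euclidean(cases[p], cases[q])
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical data: two tight pairs that are far apart from each other
cases = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0), (8.0, 9.0)]
history = agglomerate(cases)
for a, b, d in history:
    print(a, "+", b, "at distance", d)
```

With these cases the two tight pairs merge first, and the final merge joins them at a much larger distance.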

5

Mahalanobis distance

Measures how far a case is from the center of all variables (the multivariate mean, or centroid), while taking the variances of and correlations between the variables into account.
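
A small sketch may make the role of the correlations concrete. It assumes only two variables so that the covariance matrix can be inverted by hand; the function name and the data are hypothetical, and a real analysis would normally rely on a statistics package.

```python
import math

def mahalanobis_2d(case, data):
    """Mahalanobis distance of `case` from the centroid of two-variable `data`."""
    n = len(data)
    mean_x = sum(p[0] for p in data) / n
    mean_y = sum(p[1] for p in data) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((p[0] - mean_x) ** 2 for p in data) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in data) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in data) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = case[0] - mean_x, case[1] - mean_y
    # d^2 = [dx dy] * inverse(S) * [dx dy]^T, written out for the 2x2 case
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Hypothetical, positively correlated data with centroid (3, 4)
data = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]
with_trend = mahalanobis_2d((4, 5), data)     # follows the correlation
against_trend = mahalanobis_2d((4, 3), data)  # goes against the correlation
print(with_trend, against_trend)
```

Both test cases are equally far from the centroid in ordinary Euclidean terms, but the one that goes against the correlation gets the larger Mahalanobis distance.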

6

Silhouette score

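Measures how well each case fits into its assigned cluster compared with the nearest other cluster. Values range from -1 to +1:

  • Close to +1 - very well matched

  • Around 0 - unclear (the case lies between two clusters)

  • Negative - likely in the wrong cluster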
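
The interpretation above follows from the usual silhouette formula s = (b - a) / max(a, b), where a is the mean distance from a case to the other members of its own cluster and b is the lowest mean distance to the members of any other cluster. A minimal sketch, assuming ordinary Euclidean distance and using hypothetical data and names:

```python
import math

def silhouette(i, cases, labels):
    """Silhouette value of case i: (b - a) / max(a, b)."""
    def mean_dist(cluster):
        members = [j for j, lab in enumerate(labels) if lab == cluster and j != i]
        return sum(math.dist(cases[i], cases[j]) for j in members) / len(members)

    own = labels[i]
    if labels.count(own) == 1:
        return 0.0  # common convention for a cluster with a single case
    a = mean_dist(own)                                      # cohesion
    b = min(mean_dist(c) for c in set(labels) if c != own)  # separation
    if max(a, b) == 0:
        return 0.0  # all relevant cases coincide
    return (b - a) / max(a, b)

# Hypothetical data: two clear groups plus one case halfway between them
cases = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 0.5)]
labels = [0, 0, 1, 1, 0]
scores = [silhouette(i, cases, labels) for i in range(len(cases))]
print(scores)
```

The last case sits exactly between the two groups, so its score is 0, while the cases inside the second group score close to +1.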

7

Ward's method

A hierarchical clustering method that groups data step by step, always choosing the merge that produces the smallest increase in within-cluster variance, so that clusters stay as homogeneous as possible.
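
One step of this rule can be sketched as follows. The sketch measures variance as the within-cluster sum of squared deviations from the centroid and picks the pair of clusters whose merge increases it least; the function names and the three example clusters are hypothetical.

```python
def centroid(points):
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def within_ss(points):
    """Within-cluster sum of squared deviations from the centroid."""
    c = centroid(points)
    return sum((p[k] - c[k]) ** 2 for p in points for k in range(len(c)))

def ward_cost(a, b):
    """Increase in total within-cluster sum of squares if a and b are merged."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)

def best_ward_merge(clusters):
    """Return the index pair (i, j) whose merge has the lowest Ward cost."""
    pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
    return min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))

# Hypothetical clusters: a pair, a nearby single case, and a distant single case
clusters = [[(1.0, 1.0), (2.0, 1.0)], [(3.0, 1.0)], [(10.0, 10.0)]]
i, j = best_ward_merge(clusters)
print("merge clusters", i, "and", j, "with cost", ward_cost(clusters[i], clusters[j]))
```

Here the nearby single case joins the pair, because that merge adds far less within-cluster variation than either merge involving the distant case.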

8

Squared Euclidean distance

Measures how far apart two points are by summing the squared differences between their corresponding values. It is commonly used as a distance measure in hierarchical cluster analysis to assess similarity between cases.
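
As a quick worked example, the calculation can be written directly from the definition. The function name and the two cases are hypothetical.

```python
def squared_euclidean(a, b):
    """Sum of squared differences between corresponding values of two cases."""
    if len(a) != len(b):
        raise ValueError("cases must have the same number of variables")
    return sum((x - y) ** 2 for x, y in zip(a, b))

case_a = [1.0, 2.0, 3.0]
case_b = [4.0, 6.0, 3.0]
# (1 - 4)^2 + (2 - 6)^2 + (3 - 3)^2 = 9 + 16 + 0 = 25
d = squared_euclidean(case_a, case_b)
print(d)
```

Because the differences are squared, one large difference on a single variable contributes more than several small ones.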

9

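Agglomeration coefficient

Used to determine the optimal number of clusters by evaluating the increase in dissimilarity within the clusters at each merge step. A sudden large increase suggests that two dissimilar clusters were merged, so the solution just before that step is usually preferred.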
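
One common way to read the coefficients is to look for the largest jump between consecutive merge steps and stop just before it. The sketch below assumes that heuristic; the function name and the coefficient values are hypothetical, and in practice the result is a suggestion to weigh against interpretability rather than a fixed answer.

```python
def suggest_cluster_count(coefficients, n_cases):
    """Suggest a cluster count from the largest jump in the coefficients.

    coefficients[k] is the agglomeration coefficient recorded at merge
    step k + 1; each merge step reduces the number of clusters by one.
    """
    jumps = [later - earlier
             for earlier, later in zip(coefficients, coefficients[1:])]
    step_before_jump = jumps.index(max(jumps)) + 1  # 1-based merge step
    return n_cases - step_before_jump

# Hypothetical coefficients for 6 cases (5 merge steps)
coefficients = [0.5, 0.8, 1.1, 9.6, 11.0]
k = suggest_cluster_count(coefficients, n_cases=6)
print(k)
```

The coefficient rises sharply between steps 3 and 4, so the sketch stops after step 3, which leaves 6 - 3 = 3 clusters.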