PCA (Principal Component Analysis) - Video Notes Vocabulary Flashcards

Vocabulary flashcards covering key PCA concepts from the lecture notes, including dimensionality reduction, PCA components, eigenvalues/eigenvectors, standardization, covariance, scores, and practical examples.

15 Terms

1

Principal Component Analysis (PCA)

A dimensionality reduction technique that transforms a set of possibly correlated variables into a smaller set of uncorrelated variables called principal components, capturing most of the data's variance.
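
The definition above can be sketched end to end with NumPy. This is a minimal illustration, not the lecture's own code: the function name `pca` and the toy data are assumptions made for the example, and sample standard deviation (`ddof=1`) is one possible choice for the z-scores.

```python
import numpy as np

def pca(X, n_components):
    """Return (scores, components, explained_variance_ratio) for data matrix X."""
    X = np.asarray(X, dtype=float)
    # Standardize each column to zero mean and unit variance (z-scores).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance matrix of the standardized variables (variables are columns).
    C = np.cov(Z, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]          # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :n_components]     # one eigenvector per column
    scores = Z @ components                    # observations in PC space
    ratio = eigvals[:n_components] / eigvals.sum()
    return scores, components, ratio

# Made-up toy data: 6 observations, 3 variables (the first two strongly correlated).
X = [[2.0, 4.1, 1.0],
     [3.0, 6.2, 0.5],
     [4.0, 7.9, 1.5],
     [5.0, 10.1, 0.7],
     [6.0, 12.0, 1.2],
     [7.0, 13.8, 0.9]]
scores, components, ratio = pca(X, n_components=2)
```

Here three correlated variables are reduced to two uncorrelated columns in `scores`, and `ratio` reports the share of total variance each kept component carries.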

2

Dimensionality reduction

The process of reducing the number of random variables under consideration, typically by obtaining a smaller set of principal variables that retain most of the information.

3

Unsupervised data mining technique

A data mining method that does not use labeled outcomes; PCA is unsupervised and focuses on capturing structure/variance in the data.

4

Principal Component (PC)

A linear combination of original variables that explains a portion of the total variance; PCs are ordered by explained variance (PC1, PC2, …), and are uncorrelated.

5

Eigenvalue

A scalar indicating how much variance is captured by its corresponding eigenvector in PCA; used to rank principal components.

6

Eigenvector

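A unit-length weight vector that defines the direction of a principal component; the first eigenvector points along the direction of maximum variance, and each later one along the direction of maximum remaining variance orthogonal to the earlier ones. Placed side by side as columns, the eigenvectors form the eigenvector matrix.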
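
A small numeric check may help tie the eigenvalue and eigenvector cards together. The 2x2 covariance matrix below is hypothetical, chosen only because its eigenvalues (3 and 1) are easy to verify by hand.

```python
import numpy as np

# Hypothetical 2x2 covariance matrix for two variables.
C = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(C)   # ascending eigenvalues, eigenvectors as columns
order = np.argsort(eigvals)[::-1]      # re-sort so PC1 comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

v1 = eigvecs[:, 0]                     # direction of PC1
# Defining property: multiplying by C only rescales the eigenvector.
assert np.allclose(C @ v1, eigvals[0] * v1)
# Variance of the data projected onto v1 equals its eigenvalue: v1' C v1.
variance_along_v1 = v1 @ C @ v1
```

The sign of `v1` is arbitrary, so a library may return the same direction flipped.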

7

Loadings

The elements of a principal component's eigenvector; each element is the weight of one original variable in that component, so a larger absolute value means a larger contribution.

8

Variance explained (explained_variance_ratio_)

The proportion of total variance explained by a given principal component (e.g., PC1 explains 46.62%).
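
As a rough sketch of the arithmetic, each ratio is one eigenvalue divided by the sum of all eigenvalues. The eigenvalues below are hypothetical, not the lecture's data; scikit-learn's `PCA` reports the same kind of quantity in its `explained_variance_ratio_` attribute.

```python
# Hypothetical eigenvalues of a covariance matrix, already sorted largest first.
eigenvalues = [2.5, 1.5, 0.7, 0.3]

total_variance = sum(eigenvalues)
# Share of the total variance carried by each principal component.
explained_variance_ratio = [ev / total_variance for ev in eigenvalues]
```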

9

Cumulative variance

The running total of explained variance across principal components; indicates how many components are needed to reach a desired information threshold.
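
The running total can be computed with a cumulative sum. The ratios below are hypothetical values used only to show the mechanics.

```python
from itertools import accumulate

# Hypothetical explained-variance ratios for PC1..PC4 (they sum to 1).
explained_variance_ratio = [0.5, 0.3, 0.14, 0.06]

# Running total: entry k is the share of variance kept by the first k+1 PCs.
cumulative_variance = list(accumulate(explained_variance_ratio))
```

With these numbers, the first two components together keep about 80% of the variance and the first three about 94%.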

10

Uncorrelated (orthogonal) PCs

Principal components are constructed to be uncorrelated with each other, meaning their pairwise covariances are zero.
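
One way to see this property is to compute the covariance matrix of the PC scores and confirm that everything off the diagonal is numerically zero. The data here are synthetic (random numbers with a fixed seed), built so the original variables are clearly correlated.

```python
import numpy as np

rng = np.random.default_rng(0)                 # fixed seed, synthetic data
base = rng.normal(size=(200, 1))
# Three correlated variables built from a shared factor plus noise.
X = np.hstack([base + 0.3 * rng.normal(size=(200, 1)) for _ in range(3)])

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
scores = Z @ eigvecs                           # scores on all three PCs

score_cov = np.cov(scores, rowvar=False)
# Keep only the pairwise covariances between different PCs.
off_diagonal = score_cov - np.diag(np.diag(score_cov))
```

The diagonal of `score_cov` holds the variance of each PC, which matches the eigenvalues.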

11

Standardization (Z-score) before PCA

Scaling variables to zero mean and unit variance because PCA is sensitive to the scale of variables.
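
A minimal sketch of z-scoring, assuming the sample standard deviation is used. The two variables are hypothetical and were picked because their raw scales differ by several orders of magnitude, which is the situation where an unscaled PCA would be dominated by the larger one.

```python
from statistics import mean, stdev

def z_scores(values):
    """Rescale values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical variables on very different scales.
income_usd = [30000.0, 45000.0, 60000.0, 75000.0]
height_m = [1.60, 1.70, 1.75, 1.85]

income_z = z_scores(income_usd)
height_z = z_scores(height_m)
```

After scaling, both variables have the same spread, so neither dominates the components just because of its units.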

12

Covariance (co-variation) matrix

Matrix of covariances between pairs of variables; its eigenvalues/eigenvectors are used to compute principal components.
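
To make the matrix concrete, here is a hand-rolled sketch of the sample covariance, which for two variables is the sum of products of their deviations from the mean divided by n - 1. The helper name `covariance_matrix` and the data are illustrative assumptions.

```python
def covariance_matrix(columns):
    """Sample covariance matrix for a list of equally long variable columns."""
    n = len(columns[0])
    means = [sum(col) / n for col in columns]
    # Subtract each variable's mean from its values.
    centered = [[v - m for v in col] for col, m in zip(columns, means)]
    # Entry (i, j) is the covariance between variable i and variable j.
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # y = 2x, so the two variables move together
C = covariance_matrix([x, y])
```

The diagonal entries are the variances of `x` and `y`, and the matrix is symmetric.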

13

PC scores

The coordinates of observations in the PC space; computed as a weighted sum of standardized variables using PC weights.
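
The weighted sum for a single observation can be written out directly. The weights and z-scores below are hypothetical; the weights were chosen to have unit length, as an eigenvector would.

```python
def pc_score(z_values, weights):
    """Score of one observation on one PC: sum of weight * standardized value."""
    return sum(w * z for w, z in zip(weights, z_values))

# Hypothetical PC1 weights for three standardized variables (unit length).
pc1_weights = [2 / 3, 2 / 3, 1 / 3]
# Hypothetical z-scores of one observation on those three variables.
observation_z = [1.5, -0.75, 0.6]

score = pc_score(observation_z, pc1_weights)   # 1.0 - 0.5 + 0.2
```

Repeating this for every observation and every kept component gives the full table of PC scores.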

14

World Bank health data PCA example

An illustrative application where PCA is applied to health indicators across countries to reduce variables and identify top principal components.

15

How to decide number of PCs to keep

Use explained variance and cumulative variance to choose how many PCs explain a desired portion of information (e.g., 80–95%).
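
One possible way to apply this rule is to keep the smallest number of leading PCs whose cumulative ratio reaches the chosen threshold. The function name `n_components_for` and the ratios are assumptions for illustration; the threshold itself remains a judgment call.

```python
def n_components_for(explained_variance_ratio, threshold):
    """Smallest number of leading PCs whose cumulative ratio reaches threshold."""
    running_total = 0.0
    for k, ratio in enumerate(explained_variance_ratio, start=1):
        running_total += ratio
        if running_total >= threshold:
            return k
    return len(explained_variance_ratio)  # threshold never reached: keep all

# Hypothetical explained-variance ratios, sorted from PC1 downward.
ratios = [0.5, 0.3, 0.14, 0.06]
k_90 = n_components_for(ratios, 0.90)
```

With these ratios, reaching 90% takes three components, since the first two together hold only about 80%.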