PCA and Unsupervised Learning Flashcards


Description and Tags

Mark lecture 1


38 Terms

1

What is unsupervised learning?

Learning from data without labels, where algorithms find patterns, structures, or relationships on their own.

2

What are the main types of unsupervised learning?

Clustering (grouping similar items), dimensionality reduction (simplifying data), and anomaly detection (finding outliers).

3

How does unsupervised learning differ from supervised learning?

Supervised learning uses labeled data to make predictions, while unsupervised learning discovers patterns without any predefined answers.

4

What is Principal Component Analysis (PCA)?

A technique that reduces data dimensions by finding new variables (principal components) that capture the most variation in the data.

5

What problem does PCA solve?

It simplifies complex data with many variables by creating fewer new variables that still preserve most of the important information.

6

What are principal components?

New variables created by combining the original variables, ordered by how much data variation they capture.

7

What is variance in statistics?

A measure of how spread out data points are from their average value.

8

What is covariance?

A measure of how two variables change together: positive when they move in the same direction, negative when they move in opposite directions.

9

What is correlation?

A scaled version of covariance that ranges from -1 to +1, making it easier to interpret how strongly variables are related.

10

How are correlation and covariance related?

Correlation equals covariance divided by the product of standard deviations, putting the relationship on a standard scale (-1 to +1).
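The relationship between covariance and correlation on the cards above can be checked directly with numpy; the two variables here are small hypothetical samples chosen for illustration.

```python
import numpy as np

# Two variables that tend to move together (hypothetical sample data).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 6.0])

cov_xy = np.cov(x, y)[0, 1]                           # sample covariance
corr_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))  # cov / (sx * sy)
corr_numpy = np.corrcoef(x, y)[0, 1]                  # numpy's built-in correlation

print(corr_manual, corr_numpy)  # the two values agree
```

Dividing by both standard deviations strips out the units, which is why correlation always lands on the same -1 to +1 scale no matter how the variables are measured.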

11

What's the first step in performing PCA?

Standardize the data: center each variable at zero and scale it to unit variance.
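This standardization step is a one-liner in numpy; the height/weight-style numbers below are hypothetical and stand in for any features on different scales.

```python
import numpy as np

# Toy data: 5 samples, 2 features on very different scales (hypothetical).
X = np.array([[170.0, 65.0],
              [160.0, 70.0],
              [180.0, 80.0],
              [175.0, 75.0],
              [165.0, 60.0]])

# Center each column at zero and scale it to unit variance.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_std.mean(axis=0))  # each column's mean is now ~0
print(X_std.std(axis=0))   # each column's standard deviation is now 1
```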

12

How are principal components ordered?

By the amount of variance they explain: the first component explains the most, the second the next most, and so on.

13

What do the weights in a principal component tell us?

They show how much each original variable contributes to that component, with larger values (positive or negative) indicating stronger influence.

14

How many principal components should you keep?

Keep enough components to explain a chosen share of the variance (often 80-90%), or look for an "elbow" in the scree plot.
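The "explain 90% of the variance" rule from this card can be sketched with a plain eigendecomposition of the covariance matrix; the data here are synthetic, with the third feature deliberately made nearly redundant with the first.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 3-feature data: feature 2 is mostly a copy of feature 0 plus noise.
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)

X_c = X - X.mean(axis=0)                     # center the data
cov = np.cov(X_c, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending order
eigvals = eigvals[::-1]                      # sort descending, largest first

explained_ratio = eigvals / eigvals.sum()    # variance explained per component
cumulative = np.cumsum(explained_ratio)

# Smallest number of components explaining at least 90% of the variance.
k = int(np.searchsorted(cumulative, 0.90)) + 1
print(explained_ratio, k)
```

Because one feature is nearly redundant here, two components already cross the 90% threshold; plotting `explained_ratio` against the component index is exactly the scree plot mentioned on a later card.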

15

How can PCA be used for data visualization?

By reducing data to 2 or 3 dimensions, allowing us to plot and visually explore relationships in originally high-dimensional data.

16

How does PCA help with data compression?

It represents data using fewer variables (components) while preserving most of the important information.

17

What was revealed in the drug use study example using PCA?

It found that legal substances had positive weights while illegal substances had negative weights, revealing two distinct patterns of student behavior.

18

How can PCA improve machine learning models?

By removing noise, reducing overfitting, speeding up training, and breaking down correlations between features.

19

What is Hebbian learning?

A neural learning rule stating "neurons that fire together, wire together," which provides an algorithmic approach to find correlational patterns.

20

What is the basic Hebbian update rule?

Change in weight equals learning rate times input activation times output activation (Δw = η × x × y).
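The update rule on this card translates to a few lines of numpy; the weight and input values below are arbitrary toy numbers for a single linear unit.

```python
import numpy as np

def hebbian_step(w, x, eta=0.01):
    """One basic Hebbian update: dw = eta * x * y, with linear output y = w.x."""
    y = w @ x                  # linear activation of the output unit
    return w + eta * x * y     # "neurons that fire together, wire together"

w = np.array([0.5, -0.2])      # arbitrary starting weights
x = np.array([1.0, 0.5])       # one input pattern
w_new = hebbian_step(w, x)
print(w_new)
```

Note that with this rule alone the weights grow without bound over repeated updates; the normalization mentioned on the next card (as in Oja's rule) is what keeps the weight vector bounded so it can settle on the first principal direction.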

21

How is Hebbian learning related to PCA?

With linear activation and proper normalization, Hebbian learning can implement PCA, finding the same principal components.

22

What's the main limitation of basic Hebbian learning?

It can only find the first principal component (direction of maximum variance) without additional modifications.

23

What is Sequential PCA (SPCA)?

A technique that uses multiple output units to learn multiple principal components in sequence, from most to least important.

24

How does Sanger's rule differ from basic Hebbian learning?

It includes an extra term that subtracts the influence of previously learned components, allowing each unit to learn a new component.
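One way to write Sanger's rule (the generalized Hebbian algorithm) in numpy is sketched below; the subtraction of previously learned components shows up as the lower-triangular term, and the weight and input values in the demo are arbitrary toy numbers.

```python
import numpy as np

def sanger_step(W, x, eta=0.01):
    """One update of Sanger's rule (generalized Hebbian algorithm).
    W has one row of weights per output unit; over many updates on centered
    data, row i tends toward the i-th principal component."""
    y = W @ x                            # linear outputs, one per unit
    # np.tril keeps only the k <= i terms: each unit subtracts the influence
    # of itself and all earlier (more important) units.
    L = np.tril(np.outer(y, y))          # L[i, k] = y_i * y_k for k <= i
    return W + eta * (np.outer(y, x) - L @ W)

# A single hand-checkable step on toy values.
W = np.array([[0.5, 0.0],
              [0.0, 0.5]])
x = np.array([1.0, 2.0])
W_new = sanger_step(W, x, eta=0.1)
print(W_new)
```

Setting the extra term to zero for the first row recovers Oja-style learning of the first component; each later row sees the input with the earlier components' contributions removed, which is why the units learn successive components rather than all converging to the first.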

25

Why would you need multiple principal components?

One component usually isn't enough to capture all important variation in the data; multiple components provide a more complete picture.

26

How can multiple principal components be used in image compression?

By representing images using just the top components, significantly reducing file size while maintaining most visual information.
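The compression idea on this card can be sketched with an SVD-based PCA; the random matrix below is a stand-in for a real image dataset (each row playing the role of a flattened image).

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an image dataset: 100 "images" of 64 pixels each (hypothetical).
X = rng.normal(size=(100, 64))

k = 10                                    # keep only the top 10 components
mu = X.mean(axis=0)
Xc = X - mu
# SVD of the centered data: rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
components = Vt[:k]                       # top-k principal components

codes = Xc @ components.T                 # compressed codes: 100 x 10
X_approx = codes @ components + mu        # reconstruction from 10 numbers/image

# Storage: 100*10 codes + 10*64 shared components, instead of 100*64 pixels.
print(X_approx.shape)
```

With real images the savings are much larger than this toy case suggests, because neighboring pixels are strongly correlated and the top components capture most of the visual structure.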

27

What is t-SNE?

A more advanced technique for visualizing high-dimensional data that preserves local structure better than PCA.

28

How does t-SNE differ from PCA?

t-SNE is non-linear and focuses on keeping similar points close together, while PCA is linear and focuses on maximum variance.

29

When would you use t-SNE instead of PCA?

When you care more about seeing clear clusters and local relationships than preserving global structure or exact distances.
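A minimal usage sketch of t-SNE, assuming scikit-learn is installed; the two well-separated Gaussian blobs are hypothetical data standing in for, say, two classes of images.

```python
import numpy as np
from sklearn.manifold import TSNE  # assumes scikit-learn is available

rng = np.random.default_rng(0)
# Two well-separated 10-dimensional blobs standing in for real data.
X = np.vstack([rng.normal(0, 1, size=(50, 10)),
               rng.normal(8, 1, size=(50, 10))])

# perplexity roughly controls the neighborhood size t-SNE tries to preserve.
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(X_2d.shape)  # (100, 2): ready for a scatter plot, colored by group
```

Because t-SNE only tries to keep neighbors close, distances between far-apart clusters in the 2D map should not be read as meaningful, which is exactly the global-structure caveat on this card.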

30

How can t-SNE help visualize image data?

It can place similar images near each other in a 2D map, making it easy to see patterns and relationships in large image collections.

31

How can PCA help with clustering?

By reducing dimensions, removing noise, and making distance calculations more meaningful, which often leads to better cluster separation.

32

How can PCA provide insight into cluster formation?

It can reveal the underlying factors that explain why certain data points cluster together.

33

Why use both PCA and clustering together?

PCA simplifies the data while preserving important information, and clustering then groups similar data points based on these simplified features.
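The PCA-then-cluster pipeline described here is a few lines with scikit-learn (assumed installed); the two 20-dimensional blobs are hypothetical data chosen so the clusters are easy to recover.

```python
import numpy as np
from sklearn.decomposition import PCA   # assumes scikit-learn is available
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical data: two well-separated groups in 20 dimensions.
X = np.vstack([rng.normal(0, 1, size=(50, 20)),
               rng.normal(5, 1, size=(50, 20))])

X_reduced = PCA(n_components=2).fit_transform(X)            # simplify first
labels = KMeans(n_clusters=2, n_init=10,
                random_state=0).fit_predict(X_reduced)      # then cluster

print(np.bincount(labels))  # each cluster recovers one group of 50 points
```

Running k-means in the 2-component space rather than the raw 20 dimensions is the point of the card: distances are computed over the directions that carry most of the variation, with much of the noise already discarded.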

34

What can you learn by coloring PCA plots by known categories?

You can see if the principal components naturally separate the categories, indicating they've captured meaningful variations.

35

Why is data standardization important before PCA?

Without standardization, variables with larger scales will dominate the principal components regardless of their actual importance.

36

What is a scree plot in PCA?

A graph showing the variance explained by each principal component, used to decide how many components to keep.

37

What are PCA's limitations?

It only captures linear relationships, can be difficult to interpret, and is sensitive to outliers.

38

How can you determine if PCA is capturing useful information?

By examining how much variance is explained, if components reveal meaningful patterns, and if they improve downstream tasks like classification.