Unsupervised Machine Learning

A set of flashcards covering key terms and concepts related to Unsupervised Machine Learning.

20 Terms

Unsupervised Learning

A type of machine learning where the model works without labeled data to discover patterns or structures in the data.

Clustering

The process of grouping a set of objects such that objects in the same group are more similar to each other than to those in other groups.

Dimensionality Reduction

The process of reducing the number of features (variables) in a dataset while retaining as much of the relevant information as possible, for example by deriving a smaller set of principal variables.

Association Rule Learning

A method used to discover interesting relations between variables in large databases, commonly used in market basket analysis.
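
A minimal sketch of the two measures most often used to score a candidate rule in market basket analysis, support and confidence. The transactions and the function names `support` and `confidence` are hypothetical, chosen only for illustration.

```python
# Hypothetical toy transactions (each set is one shopping basket).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

print(support({"bread"}, transactions))               # share of baskets with bread
print(confidence({"bread"}, {"milk"}, transactions))  # how often bread buyers also buy milk
```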

Anomaly Detection

The identification of rare items or events which raise suspicions by differing significantly from the majority of the data.
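
One simple way to flag points that differ significantly from the majority is a z-score threshold. This is only one of many possible approaches; the function name `zscore_outliers`, the threshold of 2.0, and the sample readings are assumptions made for this sketch.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(zscore_outliers(readings))
```

Because the mean and standard deviation are themselves pulled by extreme values, this sketch can miss outliers in small or heavily contaminated samples.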

K-Means Clustering

A clustering method that partitions data into K clusters by assigning each point to its nearest centroid and minimizing the sum of squared distances between points and their cluster centroids.
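
The usual iterative procedure (often called Lloyd's algorithm) alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below assumes the initial centroids are supplied by the caller; `sq_dist`, `assign`, `kmeans`, and the sample points are hypothetical names and data.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Group points by the index of their nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, initial_centroids, max_iters=100):
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iters):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        updated = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # assignments can no longer change
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical 2-D data with two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```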

PCA (Principal Component Analysis)

A technique to reduce the dimensionality of data while preserving as much variance as possible.

Silhouette Score

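A measure of how well each object fits within its cluster, calculated per point as (b - a) / max(a, b), where a is the mean distance to the other points in its own cluster and b is the mean distance to the points in the nearest other cluster. Values range from -1 to 1, with higher values indicating a better fit.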
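
A small sketch of the per-point silhouette value s = (b - a) / max(a, b), where a is the mean distance to the other members of the point's own cluster and b is the smallest mean distance to any other cluster. The function name `silhouette` and its argument layout are assumptions for illustration; returning 0.0 for a singleton cluster is a common convention.

```python
from math import dist  # Euclidean distance, Python 3.8+

def silhouette(point, same_cluster_others, other_clusters):
    """Silhouette value of `point`.

    same_cluster_others: members of the point's cluster, excluding the point itself.
    other_clusters: list of the remaining clusters (each a non-empty list of points).
    """
    if not same_cluster_others:
        return 0.0  # convention for a cluster with a single member
    a = sum(dist(point, q) for q in same_cluster_others) / len(same_cluster_others)
    b = min(sum(dist(point, q) for q in c) / len(c) for c in other_clusters)
    if max(a, b) == 0:
        return 0.0
    return (b - a) / max(a, b)

# Hypothetical example: a = 2, b = 11, so s = 9 / 11.
print(silhouette((0, 0), [(0, 2)], [[(0, 10), (0, 12)]]))
```

The score for a whole clustering is typically the mean of these per-point values.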

DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

A clustering algorithm that groups closely packed points together and marks points in low-density regions as outliers.
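
A compact, unoptimized sketch of the idea: a point with at least `min_pts` neighbors within radius `eps` (counting itself) is a core point, clusters grow outward from core points, and anything never reached is labeled noise. The parameter names and the O(n^2) neighbor search are simplifications assumed for this example.

```python
from math import dist

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_nbrs)
    return labels

# Hypothetical data: two dense groups and one isolated point.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(pts, eps=1.5, min_pts=3))
```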

Eigenvectors and Eigenvalues

Concepts used in PCA: the eigenvectors of the covariance matrix give the principal component directions, and each eigenvalue indicates the variance explained along its eigenvector.
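
To illustrate how a leading eigenvector and eigenvalue can be found, here is a sketch using power iteration, which is one simple method and not necessarily what a PCA library uses internally. It assumes a symmetric matrix with a single dominant eigenvalue and a start vector not orthogonal to the leading eigenvector; `power_iteration` and the sample matrix are hypothetical.

```python
from math import sqrt

def mat_vec(matrix, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in matrix]

def power_iteration(matrix, iters=200):
    """Approximate the largest eigenvalue and its unit eigenvector."""
    v = [1.0] * len(matrix)
    for _ in range(iters):
        w = mat_vec(matrix, v)
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v

# Hypothetical 2x2 covariance matrix.
cov = [[3.0, 1.0], [1.0, 2.0]]
eigenvalue, direction = power_iteration(cov)
total_variance = cov[0][0] + cov[1][1]  # trace equals the sum of eigenvalues
print(eigenvalue, direction, eigenvalue / total_variance)
```

The last printed value is the share of total variance explained by the first principal component.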

Covariance Matrix

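A square matrix whose (i, j) entry is the covariance between variables i and j, with each variable's variance on the diagonal; it captures how the variables vary together and is the matrix whose eigenvectors and eigenvalues PCA computes.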
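
A short sketch of computing a sample covariance matrix from rows of observations, dividing by n - 1 (the usual unbiased estimate; dividing by n is also seen). The function name `covariance_matrix` and the data are illustrative assumptions.

```python
def covariance_matrix(rows):
    """Sample covariance matrix of a dataset given as a list of equal-length rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]

# Hypothetical data where the second feature is exactly twice the first.
data = [(1, 2), (2, 4), (3, 6)]
print(covariance_matrix(data))
```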

Elbow Method

A technique to choose the number of clusters (K) in K-Means by plotting inertia against K and picking the point (the "elbow") beyond which increasing K yields diminishing returns.
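
The elbow is usually judged by eye from a plot, but a rough numeric stand-in is to pick the K where the drop in inertia slows down the most. The heuristic below (largest difference between consecutive drops) is an assumption of this sketch, not a standard definition, and the inertia values are made up for illustration.

```python
def elbow_k(inertia_by_k):
    """Pick the K with the sharpest bend, given inertia for consecutive K values."""
    ks = sorted(inertia_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        drop_before = inertia_by_k[prev_k] - inertia_by_k[k]
        drop_after = inertia_by_k[k] - inertia_by_k[next_k]
        bend = drop_before - drop_after  # large when improvement suddenly flattens
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k

# Hypothetical inertia values for K = 1..5.
inertias = {1: 100.0, 2: 45.0, 3: 12.0, 4: 10.0, 5: 9.0}
print(elbow_k(inertias))
```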

Mean Vector

The vector containing the mean of each dimension (feature) in a dataset, calculated by averaging the data points coordinate by coordinate.
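
As a quick illustration, the mean vector can be computed by averaging each column of the dataset. The function name `mean_vector` and the sample rows are hypothetical.

```python
def mean_vector(rows):
    """Per-dimension mean of a dataset given as a list of equal-length rows."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]

# Hypothetical three points in two dimensions.
print(mean_vector([(1, 2), (3, 4), (5, 9)]))
```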

Inertia

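Also known as the Within-Cluster Sum of Squares (WCSS), it is the sum of squared distances from each point to its assigned cluster centroid and measures how tightly grouped the members of each cluster are; lower values indicate tighter clusters.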
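
A minimal sketch of the WCSS calculation: sum the squared distance from every point to the centroid of the cluster it belongs to. The function name `inertia` and the argument layout (parallel lists of clusters and centroids) are assumptions for this example.

```python
def inertia(clusters, centroids):
    """Within-cluster sum of squares for clusters paired with their centroids."""
    return sum(
        sum((p_i - c_i) ** 2 for p_i, c_i in zip(point, centroid))
        for cluster, centroid in zip(clusters, centroids)
        for point in cluster
    )

# Hypothetical clustering: two points around (1, 0) and one point at its own centroid.
print(inertia([[(0, 0), (2, 0)], [(10, 10)]], [(1, 0), (10, 10)]))
```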

Feature Scaling

The process of rescaling features to a comparable range (for example by standardization or min-max normalization) so that each feature contributes comparably to distance calculations.
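
One common form of feature scaling is standardization, which rescales each feature to zero mean and unit standard deviation. The sketch below assumes the population standard deviation and leaves constant columns unscaled; `standardize` and the sample rows are hypothetical.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to zero mean and unit (population) standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid division by zero.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [[(x - mu) / s for x, mu, s in zip(row, mus, sigmas)] for row in rows]

# Hypothetical features on very different scales.
scaled = standardize([(1, 100), (2, 200), (3, 300)])
print(scaled)
```

After scaling, both columns hold the same values, so neither dominates a Euclidean distance merely because of its units.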

K-Means++

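An improved initialization technique for K-Means that spreads the initial centroids apart, choosing each new centroid with probability proportional to its squared distance from the nearest centroid already chosen, which reduces the risk of poor clusterings and tends to speed up convergence.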
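
A sketch of the seeding step, assuming the standard scheme in which the first centroid is drawn uniformly and each later one is drawn with probability proportional to its squared distance from the nearest centroid chosen so far. The function name `kmeans_pp_init`, the fixed seed, and the sample points are illustrative; the sketch also assumes at least k distinct points, otherwise all weights become zero and `random.choices` raises an error.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, rng=None):
    """Choose k initial centroids from `points` using distance-weighted sampling."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        # Weight of each point: squared distance to its nearest chosen centroid.
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        centroids.append(rng.choices(points, weights=weights, k=1)[0])
    return centroids

# Hypothetical data with two groups.
seeds = kmeans_pp_init([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
print(seeds)
```

Points already chosen have weight zero, so the same point is not selected twice.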

Curse of Dimensionality

A phenomenon where the feature space becomes increasingly sparse due to the exponential increase in volume associated with adding dimensions.
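
One way to see the effect numerically: in a unit hypercube, the fraction of volume lying outside a centered inner cube of side 0.9 is 1 - 0.9^d, which approaches 1 as the dimension d grows, so almost all of the volume ends up near the boundary. The function name `outer_shell_fraction` and the choice of 0.9 are assumptions for this sketch.

```python
def outer_shell_fraction(dim, inner_side=0.9):
    """Fraction of a unit hypercube's volume outside a centered inner cube."""
    return 1 - inner_side ** dim

for d in (1, 2, 10, 50, 100):
    print(d, outer_shell_fraction(d))
```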

t-SNE (t-Distributed Stochastic Neighbor Embedding)

A nonlinear dimensionality reduction technique particularly suited for visualizing high-dimensional data.

UMAP (Uniform Manifold Approximation and Projection)

A nonlinear technique for dimensionality reduction and visualization that aims to preserve local neighborhood structure and, to a degree, the global structure of the data.

Gaussian Mixture Models (GMM)

A probabilistic model that assumes data points are generated from a mixture of several Gaussian distributions.
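
To make the "mixture" idea concrete, the sketch below computes, for a one-dimensional point, the probability that each Gaussian component generated it (often called the responsibilities, as used in the E-step of EM fitting). The component parameters are supplied by hand here, not learned; the function names and values are hypothetical.

```python
from math import exp, pi, sqrt

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    return exp(-0.5 * ((x - mean) / std) ** 2) / (std * sqrt(2 * pi))

def responsibilities(x, weights, means, stds):
    """Posterior probability of each mixture component given the point x."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)  # mixture density at x
    return [j / total for j in joint]

# Hypothetical two-component mixture centered at -1 and 1.
print(responsibilities(0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]))
print(responsibilities(1.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]))
```

Unlike K-Means, which assigns each point to exactly one cluster, these soft assignments sum to 1 across components.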