Flashcards covering key vocabulary and concepts from the lecture on machine learning and clustering.
Unsupervised learning
A type of machine learning where the model learns patterns from unlabelled data without target variables.
Clustering
An unsupervised learning technique that involves partitioning data into distinct groups based on similarity.
Latent variables
Unobserved or hidden variables that can be inferred from observed data and are used to identify structures in a dataset.
k-means algorithm
A clustering method that assigns each data point to one of k clusters by minimizing the sum of squared distances from points to their assigned cluster centroids.
Euclidean distance
A commonly used distance metric that measures the straight-line distance between two points in Euclidean space: the square root of the sum of squared coordinate differences.
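As a minimal sketch of this definition, the function below computes the distance directly from the coordinate differences. The name `euclidean_distance` is chosen here for illustration; Python's standard library offers the same computation as `math.dist`.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points given as equal-length sequences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```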
Centroid
The mean of all points in a cluster, representing the center of that cluster.
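The k-means, Euclidean distance, and centroid cards fit together as one loop: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below is one simple way to write that loop, not a reference implementation. The function names, the random initialization from the data, the fixed seed, and the rule of keeping an empty cluster's old centroid are all assumptions made for illustration.

```python
import math
import random

def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def nearest(point, centroids):
    """Index of the centroid closest to `point` in Euclidean distance."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iters=100, seed=0):
    """Hard clustering: returns (centroids, labels), one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from k random data points
    for _ in range(iters):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid (illustrative choice).
        updated = [centroid(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:  # assignments have stopped changing
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
```

Because each point receives exactly one label, this is hard clustering. The result can depend on the initial centroids, so in practice the algorithm is often run from several starting points.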
Hard clustering
A type of clustering where each data point is assigned to exactly one cluster.
Soft clustering
A type of clustering where a data point can belong to multiple clusters with varying membership degrees.
Expectation-Maximization (EM) algorithm
An iterative method for finding maximum likelihood estimates of the parameters of models with latent variables, alternating between an expectation (E) step and a maximization (M) step.
Gaussian mixture model
A probabilistic model that assumes all data points are generated from a mixture of several Gaussian distributions.
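To connect the soft clustering, EM, and Gaussian mixture cards, here is a minimal sketch of EM for a one-dimensional Gaussian mixture. The function names, the variance floor, and the fixed iteration count are illustrative assumptions. The code also omits the log-space safeguards a robust implementation would need, and it assumes no component ends up with zero total responsibility.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean `mu` and variance `var`."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, mus, variances, weights, iters=50, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM, starting from the given parameters."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    k = len(mus)
    resp = []
    for _ in range(iters):
        # E-step: responsibility resp[n][j] = P(component j | x_n),
        # a soft assignment of each point to every component.
        resp = []
        for x in xs:
            joint = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate parameters from the responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)  # effective number of points in component j
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, var_floor)  # assumption: floor avoids collapse to zero
            weights[j] = nj / len(xs)
    return mus, variances, weights, resp

xs = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
mus, variances, weights, resp = em_gmm_1d(xs, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

Each row of `resp` sums to 1 and gives a point's membership degree in every component, which is what makes this soft rather than hard clustering.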
Marginal probability
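The probability of a single random variable taking a value regardless of the values of other random variables, obtained by summing (or integrating) the joint probability over those other variables.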
Joint probability
The probability that two or more random variables take particular values simultaneously.
Conditional probability
The probability of one event occurring given that another event has occurred.
Bayes' theorem
A formula that updates the probability of an event using prior knowledge and the likelihood of observed evidence: P(A|B) = P(B|A) P(A) / P(B).
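The marginal, joint, conditional, and Bayes' theorem cards can be checked against one small table. The joint distribution below is made up for illustration (the weather and traffic values are hypothetical), and `marginal` is a helper name introduced here.

```python
# Hypothetical joint distribution P(weather, traffic); the values sum to 1.
joint = {
    ("rain", "jam"): 0.24,
    ("rain", "clear"): 0.06,
    ("sun", "jam"): 0.14,
    ("sun", "clear"): 0.56,
}

def marginal(joint, index, value):
    """Sum the joint over every outcome whose `index`-th variable equals `value`."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)

p_rain = marginal(joint, 0, "rain")  # marginal P(rain)
p_jam = marginal(joint, 1, "jam")    # marginal P(jam)

# Conditional probability: P(jam | rain) = P(rain, jam) / P(rain)
p_jam_given_rain = joint[("rain", "jam")] / p_rain

# Bayes' theorem: P(rain | jam) = P(jam | rain) * P(rain) / P(jam)
p_rain_given_jam = p_jam_given_rain * p_rain / p_jam

print(p_rain, p_jam, p_jam_given_rain, p_rain_given_jam)
```

The Bayes result agrees with computing P(rain | jam) directly as P(rain, jam) / P(jam), which is a quick consistency check on the table.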
Image segmentation
The process of partitioning an image into multiple segments or regions, often using clustering techniques.
Log-likelihood
The logarithm of the likelihood function: a measure of how well a statistical model with given parameters explains the observed data, where higher values indicate a better fit.
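As an illustration, the sketch below evaluates the log-likelihood of a sample under a single one-dimensional Gaussian, assuming independent observations. The function name and the sample values are chosen here for the example.

```python
import math

def gaussian_log_likelihood(xs, mu, var):
    """Sum of log N(x | mu, var) over independent observations xs."""
    n = len(xs)
    squared_error = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi * var) - squared_error / (2 * var)

xs = [4.8, 5.1, 5.0, 4.9, 5.2]
good_fit = gaussian_log_likelihood(xs, mu=5.0, var=0.02)
poor_fit = gaussian_log_likelihood(xs, mu=3.0, var=0.02)
print(good_fit > poor_fit)  # True: the mean near the data explains it better
```

Working with the logarithm turns a product of many small densities into a sum, which avoids numerical underflow and is the quantity EM increases at each iteration.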