1/27
Explain the usage of the algorithm and the process
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
Nearest Centroid Classifier
Definition
The Nearest Centroid Classifier assigns labels to data points based on the label of the closest centroid, where each centroid represents the mean of all points in a given class.
Related Equations:
\mathbf{c}_k = \frac{1}{N_k} \sum_{i=1}^{N_k} \mathbf{x}_i
where c_k is the centroid of class k , N_k is the number of samples in class K, and x_i are the data points in class k .
d(\mathbf{x}, \mathbf{c}_k) = \|\mathbf{x} - \mathbf{c}_k\|
d(\mathbf{x}, \mathbf{c}_k) is the distance between a data point x and the centroid {c}_k and \|\mathbf{x} - \mathbf{c}_k\| denotes the Euclidean distance
Nearest Centroid Classifier - Complexity
Time
Training: O(n \cdot d), n is the number of training samples and d is the number of features
Prediction: O(k \cdot d), where k is the number of classes
Space
O(k \cdot d)
kNN Classifier
Definition
The k-Nearest Neighbors (kNN) classifier assigns labels to data points based on the majority label of the k closest training points.
Related Equations
d(\mathbf{x}, \mathbf{c}_k) = \|\mathbf{x} - \mathbf{c}_k\|
d(\mathbf{x}, \mathbf{c}_k) is the distance between a data point x and the centroid {c}_k and \|\mathbf{x} - \mathbf{c}_k\| denotes the Euclidean distance
kNN Classifier - Complexity
Time
Prediction: O(n \cdot d), where k is the number of classes
Space
O(n \cdot d + 1)
Perceptron Classifier
Definition
Type of linear classifier that separates data points into two classes using a hyperplane.
It updates its weight vector iteratively by adjusting the weights in the direction that reduces classification errors, based on the difference between the predicted and actual labels.
Related Equations
Loss Functions
MSE L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - f(x_i \mid \theta)\right)^2
Gradient Descent
Perceptron Classifier - Complexity
Time
Training: O(t \cdot n \cdot d)
t is number of iterations
n is number of training samples
d is number of features
Prediction: O(d)
Space
O(d)for storing weight vector and bias term
Logistic Classifier
Definition
Linear model used for binary classification that models the probability of a data point belonging to a particular class using the logistic function.
Related Equations:
Sigmoid Function
\sigma(z) = \frac{1}{1 + e^{-z}}
Gradient Descent
Logistic Classifier - Complexity
Time
Training: O(t \cdot n \cdot d)
t is number of iterations
n is number of training samples
d is number of features
Prediction: O(d)
Space
O(d)for storing weight vector and bias term
Neural Network w/ single hidden layer
Definition
A neural network with a single hidden layer consists of an input layer, one hidden layer of neurons, and an output layer, where each layer performs a weighted sum followed by a non-linear activation function to model complex relationships in the data.
Related Equations
Hidden Layer Activation
\mathbf{z}^{(1)} = \mathbf{W}^{(1)} \mathbf{x} + \mathbf{b}^{(1)}
W = weights
x = input vector
b = bias
Neural Network w/ single hidden layer - Complexity
Time
Training O(t \cdot n \cdot d \cdot h)
where t is the number of iterations,
n is the number of training samples,
d is the number of input features,
h is the number of neurons in the hidden layer.
O(d⋅h+h⋅o)
o is output neurons
Space
O(d⋅h+h⋅o)
Decision Trees
Definition
Tree-like model of decisions, splitting data points at each node based on feature values to classify them into different categories.
Related Equations
Gini Index \text{Gini}(D) = 1 - \sum_{i=1}^{c} p_i^2
Decision Trees - Complexity
Space: O(n⋅m)
Bagging
Definition
Ensemble method that improves the stability and accuracy of machine learning algorithms by training multiple models on different subsets of the training data and averaging their predictions.
Bagging - Complexity
Space O(B⋅s)
B is the number of models
s is the space complexity of storing a single model.
Random Forest
Definition
Ensemble learning method that constructs multiple decision trees during training and outputs the mode of the classes (classification) or mean prediction (regression) of the individual trees.
Random Forest - Complexity
Space O(B⋅s)
B is the number of trees
s is the space complexity of storing a single tree
Boosting
Definition
Ensemble method that sequentially trains weak learners, each focusing on the errors of the previous ones, and combines their predictions to create a strong overall model.
Boosting - Complexity
Space O(T⋅s)
T number of iterations
s space complexity of storing learner
Linear Regression
Definition
Machine learning model that predicts a continuous target variable by fitting a linear relationship between the input features and the target.
Related Equations
Linear model \hat{y} = \mathbf{w} \cdot \mathbf{x} + b
MSE L(\mathbf{w}, b) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
Gradient Descent
Linear Regression - Complexity
Time
Training: O(t \cdot n \cdot d)
t is number of iterations
n is number of training samples
d is number of features
Prediction: O(d)
Space
O(d)for storing weight vector and bias term
Polynomial Regression
Definition
Type of linear regression that models the relationship between the independent variable and the dependent variable as an n-th degree polynomial.
Related Equations
Polynomial Model: \hat{y} = w_0 + w_1 x + w_2 x^2 + \ldots + w_n x^n
MSE L(\mathbf{w}, b) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
Gradient Descent
Polynomial Regression - Complexity
Time
Training: O(n \cdot d \cdot p)
pis polynomial degree
n is number of training samples
d is number of features
Prediction: O(d \cdot p)
Space
O(d \cdot p)for storing weight vector and bias term
K-Means Clustering
Definition
unsupervised learning algorithm that partitions a dataset into k clusters, where each data point belongs to the cluster with the nearest mean.
Related Equations
Cluster assignment\mathbf{\mu}_i = \frac{1}{|C_i|} \sum_{\mathbf{x}_j \in C_i} \mathbf{x}_j
K-Means Clustering - Complexity
Time
Training O(t⋅k⋅n⋅d)
t is the number of iterations
k is the number of clusters
n is the number of samples
d number of features
Prediction O(k⋅n⋅d)
Space
O(k⋅d+n⋅d)
Hierarchical Clustering
Definition
unsupervised learning algorithm that builds a tree of clusters by recursively merging or splitting existing clusters based on their similarity.
Related Equations
Euclidean
Manhattan d(\mathbf{x}_i, \mathbf{x}_j) = \sum_{k=1}^{d} |x_{ik} - x_{jk}|
Agglomerative Clustering
Definition
bottom-up hierarchical clustering method that starts with each data point as its own cluster and iteratively merges the closest pairs of clusters until only one cluster remains.
Related Equations
Euclidean, manhattan
Single Linkage, Complete Linkage, Average Linkage
Hierarchical Agglomerative Clustering
Definition
bottom-up clustering method that starts with each data point as its own cluster and iteratively merges the closest clusters until only one cluster remains or a stopping criterion is met.