COMPSCI 178 - Classifiers

Description

Explains the usage and process of each classification, regression, and clustering algorithm.


27 Terms

1

Nearest Centroid Classifier

  • Definition

    • The Nearest Centroid Classifier assigns labels to data points based on the label of the closest centroid, where each centroid represents the mean of all points in a given class.

  • Related Equations:

    • \mathbf{c}_k = \frac{1}{N_k} \sum_{i=1}^{N_k} \mathbf{x}_i

      • where c_k is the centroid of class k, N_k is the number of samples in class k, and x_i are the data points in class k

    • d(\mathbf{x}, \mathbf{c}_k) = \|\mathbf{x} - \mathbf{c}_k\|

      • d(\mathbf{x}, \mathbf{c}_k) is the distance between a data point x and the centroid c_k, and \|\mathbf{x} - \mathbf{c}_k\| denotes the Euclidean distance
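
  • Example (code sketch)

    • A minimal NumPy sketch of the two equations above; the function names and toy data are illustrative, not from the course:

      import numpy as np

      def fit_centroids(X, y):
          # c_k = (1/N_k) * sum of the points in class k
          classes = np.unique(y)
          return classes, np.array([X[y == k].mean(axis=0) for k in classes])

      def predict(X, classes, centroids):
          # label each point by its nearest centroid (Euclidean distance)
          dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
          return classes[np.argmin(dists, axis=1)]

      X = np.array([[0.0, 0.0], [1.0, 1.0], [9.0, 9.0], [10.0, 10.0]])
      y = np.array([0, 0, 1, 1])
      classes, centroids = fit_centroids(X, y)
      print(predict(np.array([[0.5, 0.5], [9.5, 9.5]]), classes, centroids))  # [0 1]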

2

Nearest Centroid Classifier - Complexity

  • Time

    • Training: O(n \cdot d), n is the number of training samples and d is the number of features

    • Prediction: O(k \cdot d), where k is the number of classes

  • Space

    • O(k \cdot d) for storing the k class centroids

3

kNN Classifier

  • Definition

    • The k-Nearest Neighbors (kNN) classifier assigns labels to data points based on the majority label of the k closest training points.

  • Related Equations

    • d(\mathbf{x}, \mathbf{x}_i) = \|\mathbf{x} - \mathbf{x}_i\|

      • d(\mathbf{x}, \mathbf{x}_i) is the distance between a query point x and a training point x_i, and \|\mathbf{x} - \mathbf{x}_i\| denotes the Euclidean distance
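
  • Example (code sketch)

    • A minimal NumPy sketch of prediction by majority vote over the k nearest training points; names and toy data are illustrative:

      import numpy as np

      def knn_predict(X_train, y_train, x, k=3):
          # distances from the query point to every training point
          dists = np.linalg.norm(X_train - x, axis=1)
          # majority vote among the labels of the k nearest neighbors
          nearest = y_train[np.argsort(dists)[:k]]
          values, counts = np.unique(nearest, return_counts=True)
          return values[np.argmax(counts)]

      X_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
      y_train = np.array([0, 0, 0, 1, 1])
      print(knn_predict(X_train, y_train, np.array([1.5])))  # 0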

4

kNN Classifier - Complexity

  • Time

    • Training: O(1), since kNN is a lazy learner that simply stores the training data

    • Prediction: O(n \cdot d), where n is the number of training samples and d is the number of features

  • Space

    • O(n \cdot d) for storing the entire training set

5

Perceptron Classifier

  • Definition

    • Type of linear classifier that separates data points into two classes using a hyperplane.

    • It updates its weight vector iteratively by adjusting the weights in the direction that reduces classification errors, based on the difference between the predicted and actual labels.

  • Related Equations

    • Loss Functions

      • MSE L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - f(x_i \mid \theta)\right)^2

    • Gradient Descent

      • \theta \leftarrow \theta - \eta \nabla_\theta L(\theta), where \eta is the learning rate
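
  • Example (code sketch)

    • A minimal sketch of the perceptron update loop; the learning rate, epoch count, and {0, 1} labels are illustrative assumptions:

      import numpy as np

      def train_perceptron(X, y, lr=0.1, epochs=100):
          w, b = np.zeros(X.shape[1]), 0.0
          for _ in range(epochs):
              for xi, yi in zip(X, y):
                  # predict 1 on the positive side of the hyperplane, else 0
                  pred = 1 if xi @ w + b > 0 else 0
                  # adjust the weights only when the prediction is wrong
                  w += lr * (yi - pred) * xi
                  b += lr * (yi - pred)
          return w, b

      X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]])
      y = np.array([1, 1, 0, 0])
      print(train_perceptron(X, y))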

6

Perceptron Classifier - Complexity

  • Time

    • Training: O(t \cdot n \cdot d)

      • t is number of iterations

      • n is number of training samples

      • d is number of features

    • Prediction: O(d)

  • Space

    • O(d) for storing the weight vector and bias term

7

Logistic Classifier

  • Definition

    • Linear model used for binary classification that models the probability of a data point belonging to a particular class using the logistic function.

  • Related Equations:

    • Sigmoid Function

      • \sigma(z) = \frac{1}{1 + e^{-z}}

    • Gradient Descent

      • \theta \leftarrow \theta - \eta \nabla_\theta L(\theta), where \eta is the learning rate
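
  • Example (code sketch)

    • A minimal sketch of logistic regression trained by gradient descent on the cross-entropy loss; hyperparameters and toy data are illustrative:

      import numpy as np

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def train_logistic(X, y, lr=0.1, epochs=1000):
          w, b = np.zeros(X.shape[1]), 0.0
          for _ in range(epochs):
              p = sigmoid(X @ w + b)             # predicted probabilities
              # gradient of the average cross-entropy loss
              w -= lr * X.T @ (p - y) / len(y)
              b -= lr * np.mean(p - y)
          return w, b

      X = np.array([[0.0], [1.0], [2.0], [3.0]])
      y = np.array([0, 0, 1, 1])
      w, b = train_logistic(X, y)
      print(sigmoid(X @ w + b).round(2))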

8

Logistic Classifier - Complexity

  • Time

    • Training: O(t \cdot n \cdot d)

      • t is number of iterations

      • n is number of training samples

      • d is number of features

    • Prediction: O(d)

  • Space

    • O(d) for storing the weight vector and bias term

9

Neural Network w/ single hidden layer

  • Definition

    • A neural network with a single hidden layer consists of an input layer, one hidden layer of neurons, and an output layer, where each layer performs a weighted sum followed by a non-linear activation function to model complex relationships in the data.

  • Related Equations

    • Hidden Layer Pre-activation

      • \mathbf{z}^{(1)} = \mathbf{W}^{(1)} \mathbf{x} + \mathbf{b}^{(1)}

        • W^{(1)} = weight matrix

        • x = input vector

        • b^{(1)} = bias vector

    • Hidden Layer Activation

      • \mathbf{a}^{(1)} = g(\mathbf{z}^{(1)}), where g is a non-linear activation function (e.g. sigmoid, tanh, or ReLU)
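
  • Example (code sketch)

    • A minimal forward pass for one hidden layer; the layer sizes, random weights, and choice of tanh as the activation are illustrative assumptions:

      import numpy as np

      rng = np.random.default_rng(0)
      d, h, o = 3, 4, 2                # input features, hidden units, outputs

      W1, b1 = rng.normal(size=(h, d)), np.zeros(h)
      W2, b2 = rng.normal(size=(o, h)), np.zeros(o)

      def forward(x):
          z1 = W1 @ x + b1             # pre-activation: z1 = W1 x + b1
          a1 = np.tanh(z1)             # non-linear activation
          return W2 @ a1 + b2          # output layer

      print(forward(rng.normal(size=d)))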

10

Neural Network w/ single hidden layer - Complexity

  • Time

    • Training: O(t \cdot n \cdot d \cdot h)

      • where t is the number of iterations,

      • n is the number of training samples,

      • d is the number of input features,

      • h is the number of neurons in the hidden layer.

    • Prediction: O(d \cdot h + h \cdot o)

      • o is the number of output neurons

  • Space

    • O(d \cdot h + h \cdot o) for storing the weight matrices and bias vectors

11

Decision Trees

  • Definition

    • Tree-like model of decisions, splitting data points at each node based on feature values to classify them into different categories.

  • Related Equations

    • Gini Index \text{Gini}(D) = 1 - \sum_{i=1}^{c} p_i^2

      • where c is the number of classes and p_i is the proportion of samples in D belonging to class i
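
  • Example (code sketch)

    • A minimal sketch of the Gini index for the labels reaching a node:

      import numpy as np

      def gini(labels):
          # Gini(D) = 1 - sum_i p_i^2, where p_i is the fraction of class i
          _, counts = np.unique(labels, return_counts=True)
          p = counts / counts.sum()
          return 1.0 - np.sum(p ** 2)

      print(gini([0, 0, 1, 1]))   # 0.5 (maximally impure for two classes)
      print(gini([0, 0, 0, 0]))   # 0.0 (pure node)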

12

Decision Trees - Complexity

  • Space: O(n \cdot m), where n is the number of training samples and m is the number of features

13

Bagging

  • Definition

    • Ensemble method (short for "bootstrap aggregating") that improves the stability and accuracy of machine learning algorithms by training multiple models on bootstrap samples (random subsets of the training data drawn with replacement) and averaging or voting on their predictions.
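
  • Example (code sketch)

    • A minimal bagging sketch using decision trees as the base model (an illustrative choice; any learner works) and scikit-learn's toy data generator:

      import numpy as np
      from sklearn.datasets import make_classification
      from sklearn.tree import DecisionTreeClassifier

      X, y = make_classification(n_samples=200, random_state=0)
      rng = np.random.default_rng(0)

      models = []
      for _ in range(25):                        # B = 25 bootstrap models
          idx = rng.integers(0, len(X), len(X))  # sample with replacement
          models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

      # majority vote over the B models' predictions
      votes = np.array([m.predict(X) for m in models])
      y_hat = (votes.mean(axis=0) > 0.5).astype(int)
      print((y_hat == y).mean())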

14

Bagging - Complexity

  • Space: O(B \cdot s)

    • B is the number of models

    • s is the space complexity of storing a single model.

15

Random Forest

  • Definition

    • Ensemble learning method that constructs multiple decision trees during training and outputs the mode of the classes (classification) or mean prediction (regression) of the individual trees.
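
  • Example (code sketch)

    • A short usage example with scikit-learn's RandomForestClassifier (the dataset is illustrative):

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier

      X, y = make_classification(n_samples=200, random_state=0)

      # 100 trees, each fit on a bootstrap sample with random feature subsets;
      # prediction is the mode of the individual trees' votes
      clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
      print(clf.predict(X[:5]))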

16

Random Forest - Complexity

  • Space: O(B \cdot s)

    • B is the number of trees

    • s is the space complexity of storing a single tree

17

Boosting

  • Definition

    • Ensemble method that sequentially trains weak learners, each focusing on the errors of the previous ones, and combines their predictions to create a strong overall model.
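
  • Example (code sketch)

    • A short usage example with scikit-learn's AdaBoostClassifier, one common boosting algorithm (the dataset is illustrative):

      from sklearn.datasets import make_classification
      from sklearn.ensemble import AdaBoostClassifier

      X, y = make_classification(n_samples=200, random_state=0)

      # each successive weak learner (a depth-1 tree by default) is trained
      # with more weight on the examples the previous learners got wrong
      clf = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
      print(clf.score(X, y))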

18

Boosting - Complexity

  • Space: O(T \cdot s)

    • T is the number of boosting iterations

    • s is the space complexity of storing a single weak learner

19

Linear Regression

  • Definition

    • Machine learning model that predicts a continuous target variable by fitting a linear relationship between the input features and the target.

  • Related Equations

    • Linear model \hat{y} = \mathbf{w} \cdot \mathbf{x} + b

    • MSE L(\mathbf{w}, b) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

    • Gradient Descent

      • \mathbf{w} \leftarrow \mathbf{w} - \eta \nabla_{\mathbf{w}} L, \quad b \leftarrow b - \eta \frac{\partial L}{\partial b}, where \eta is the learning rate
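
  • Example (code sketch)

    • A minimal sketch of linear regression trained by gradient descent on the MSE loss; hyperparameters and toy data are illustrative:

      import numpy as np

      def train_linear(X, y, lr=0.01, epochs=2000):
          w, b = np.zeros(X.shape[1]), 0.0
          for _ in range(epochs):
              y_hat = X @ w + b
              # gradients of the MSE loss with respect to w and b
              w -= lr * 2 * X.T @ (y_hat - y) / len(y)
              b -= lr * 2 * np.mean(y_hat - y)
          return w, b

      X = np.array([[1.0], [2.0], [3.0], [4.0]])
      y = np.array([3.0, 5.0, 7.0, 9.0])   # generated from y = 2x + 1
      print(train_linear(X, y))            # approx [2.], 1.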

20

Linear Regression - Complexity

  • Time

    • Training: O(t \cdot n \cdot d)

      • t is number of iterations

      • n is number of training samples

      • d is number of features

    • Prediction: O(d)

  • Space

    • O(d) for storing the weight vector and bias term

21

Polynomial Regression

  • Definition

    • Type of linear regression that models the relationship between the independent variable and the dependent variable as an n-th degree polynomial.

  • Related Equations

    • Polynomial Model: \hat{y} = w_0 + w_1 x + w_2 x^2 + \ldots + w_n x^n

    • MSE L(\mathbf{w}, b) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

    • Gradient Descent

      • \mathbf{w} \leftarrow \mathbf{w} - \eta \nabla_{\mathbf{w}} L, \quad b \leftarrow b - \eta \frac{\partial L}{\partial b}, where \eta is the learning rate
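
  • Example (code sketch)

    • A minimal sketch using NumPy's polyfit, which solves the least-squares fit over the expanded polynomial features (the quadratic toy data is illustrative):

      import numpy as np

      x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
      y = x ** 2 - 2 * x + 1                # a known quadratic

      coeffs = np.polyfit(x, y, deg=2)      # highest-degree coefficient first
      print(coeffs)                         # approx [ 1. -2.  1.]
      print(np.polyval(coeffs, 5.0))        # predict at x = 5 -> 16.0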

22

Polynomial Regression - Complexity

  • Time

    • Training: O(n \cdot d \cdot p)

      • p is the polynomial degree

      • n is number of training samples

      • d is number of features

    • Prediction: O(d \cdot p)

  • Space

    • O(d \cdot p) for storing the weight vector and bias term

23

K-Means Clustering

  • Definition

    • Unsupervised learning algorithm that partitions a dataset into k clusters, where each data point belongs to the cluster with the nearest mean.

  • Related Equations

    • Cluster assignment: each point is assigned to its nearest centroid, c_i = \arg\min_k \|\mathbf{x}_i - \mathbf{\mu}_k\|^2

    • Centroid update: \mathbf{\mu}_k = \frac{1}{|C_k|} \sum_{\mathbf{x}_j \in C_k} \mathbf{x}_j
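
  • Example (code sketch)

    • A minimal sketch of Lloyd's algorithm alternating the two steps above; the toy data is illustrative and the empty-cluster edge case is ignored:

      import numpy as np

      def kmeans(X, k, iters=100, seed=0):
          rng = np.random.default_rng(seed)
          centroids = X[rng.choice(len(X), k, replace=False)]
          for _ in range(iters):
              # assignment step: nearest centroid for each point
              dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
              labels = np.argmin(dists, axis=1)
              # update step: each centroid becomes the mean of its cluster
              centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
          return labels, centroids

      X = np.array([[0.0, 0.0], [0.5, 0.5], [9.0, 9.0], [9.5, 9.5]])
      print(kmeans(X, k=2))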

24

K-Means Clustering - Complexity

  • Time

    • Training: O(t \cdot k \cdot n \cdot d)

      • t is the number of iterations

      • k is the number of clusters

      • n is the number of samples

      • d is the number of features

    • Prediction: O(k \cdot n \cdot d) for assigning n points to their nearest centroids

  • Space

    • O(k \cdot d + n \cdot d) for storing the k centroids and the dataset

25

Hierarchical Clustering

  • Definition

    • Unsupervised learning algorithm that builds a tree of clusters by recursively merging or splitting existing clusters based on their similarity.

  • Related Equations

    • Euclidean d(\mathbf{x}_i, \mathbf{x}_j) = \sqrt{\sum_{k=1}^{d} (x_{ik} - x_{jk})^2}

    • Manhattan d(\mathbf{x}_i, \mathbf{x}_j) = \sum_{k=1}^{d} |x_{ik} - x_{jk}|
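
  • Example (code sketch)

    • A short usage example with SciPy's hierarchical clustering routines (average linkage and the toy data are illustrative choices):

      import numpy as np
      from scipy.cluster.hierarchy import fcluster, linkage

      X = np.array([[0.0, 0.0], [0.5, 0.5], [9.0, 9.0], [9.5, 9.5]])

      # build the merge tree bottom-up, average linkage on Euclidean distances
      Z = linkage(X, method='average', metric='euclidean')
      # cut the tree into 2 flat clusters
      print(fcluster(Z, t=2, criterion='maxclust'))   # e.g. [1 1 2 2]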

26

Agglomerative Clustering

  • Definition

    • Bottom-up hierarchical clustering method that starts with each data point as its own cluster and iteratively merges the closest pairs of clusters until only one cluster remains.

  • Related Equations

    • Euclidean and Manhattan distances

    • Single Linkage, Complete Linkage, Average Linkage

27

Hierarchical Agglomerative Clustering

  • Definition

    • Bottom-up clustering method that starts with each data point as its own cluster and iteratively merges the closest clusters until only one cluster remains or a stopping criterion is met.
