Support Vector Machine (SVM)
A supervised learning model that finds a decision boundary maximizing the margin between classes
Margin
Distance between decision boundary and closest data points
Support vectors
Points closest to the decision boundary that determine the margin
Linear separability
Data can be separated by a linear boundary
Convex optimization
Optimization where any local minimum is global
Hard margin SVM
SVM with no tolerance for misclassification
Soft margin SVM
SVM allowing violations using slack variables
Slack variable (ξᵢ)
Measures margin violation for a data point
Regularization parameter (C)
Controls tradeoff between margin size and violations
Large C
Low tolerance for errors, smaller margin
Small C
Higher tolerance for errors, larger margin
Hard margin objective
min (1/2)||w||²
Hard margin constraint
yᵢ(wᵀxᵢ) ≥ 1
Soft margin objective
min (1/2)||w||² + (C/N)Σξᵢ
Soft margin constraint
yᵢ(wᵀxᵢ) ≥ 1 − ξᵢ and ξᵢ ≥ 0
Hinge loss
max(0, 1 − yᵢwᵀxᵢ), penalizes points inside margin or misclassified
SVM loss
(1/2)||w||² + (C/N)Σmax(0, 1 − yᵢwᵀxᵢ)
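A minimal NumPy sketch of the hinge loss and soft-margin SVM objective from the cards above; the data, weight vector, and C value are made up for illustration (bias absorbed into w):

```python
import numpy as np

# Toy linearly-scored dataset; labels in {-1, +1}. One point ([0.3, 0.2])
# sits inside the margin so the hinge loss is nonzero there.
X = np.array([[2.0, 1.0], [1.0, 2.0], [0.3, 0.2], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([1, 1, 1, -1, -1])
w = np.array([1.0, 1.0])   # illustrative weight vector (bias absorbed)
C = 1.0
N = len(y)

margins = y * (X @ w)                      # y_i * w^T x_i
hinge = np.maximum(0.0, 1.0 - margins)     # max(0, 1 - y_i w^T x_i)
objective = 0.5 * (w @ w) + (C / N) * hinge.sum()
```

Only the point inside the margin contributes to the hinge term; the rest of the objective is the L2 penalty (1/2)||w||².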
L2 regularization
Penalty discouraging large weights
Logistic loss
log(1 + e^{−y wᵀx})
Hinge vs logistic loss
Hinge is piecewise linear, logistic is smooth
Feature mapping (ϕ(x))
Transforms data to higher dimension
Kernel function
K(x,z) = ϕ(x)ᵀϕ(z), computes inner product in feature space without explicit mapping
Kernel trick
Computes inner products in feature space via K(x,z), without ever computing ϕ(x) explicitly
Gaussian kernel
exp(−||x−y||² / (2σ²))
Polynomial kernel
(xᵀy + c)^d
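A small sketch of the two kernels above, plus a check of the kernel trick: for the degree-2 polynomial kernel with c = 0 in 2D, the explicit feature map ϕ(x) = [x₁², √2·x₁x₂, x₂²] satisfies ϕ(x)ᵀϕ(z) = (xᵀz)². The vectors and σ are illustrative:

```python
import numpy as np

x = np.array([1.0, 2.0])
z = np.array([3.0, 1.0])
sigma = 1.0

# Gaussian (RBF) kernel: exp(-||x - z||^2 / (2 sigma^2))
gauss = np.exp(-np.sum((x - z) ** 2) / (2 * sigma ** 2))

# Polynomial kernel, degree 2, c = 0
poly = (x @ z) ** 2

def phi(v):
    # Explicit feature map whose inner product equals the degree-2 kernel.
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

# Kernel trick check: same number, without the explicit mapping.
assert np.isclose(phi(x) @ phi(z), poly)
```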
Dual form SVM
Optimization in terms of α variables
Dual solution
w = Σαᵢ yᵢ xᵢ (for linear SVM after solving dual problem)
Why kernels matter
Allow nonlinear boundaries efficiently
Parametric model
Model with fixed number of parameters independent of dataset size
Nonparametric model
Model whose complexity grows with dataset size
Examples of parametric models
Linear regression, logistic regression, linear SVM
Examples of nonparametric models
KNN, decision trees
K-Nearest Neighbors (KNN)
Model that predicts using nearby data points
KNN training
Stores all training data
KNN prediction
Uses majority vote (classification) or average (regression)
KNN classification formula
ŷ(x) = sign(Σ_{i ∈ N_k(x)} yᵢ)
KNN regression formula
f(x) = (1/k)Σyᵢ
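The two prediction formulas above can be sketched directly in NumPy; the toy 1-D training set and k = 3 are illustrative:

```python
import numpy as np

# Toy training set: KNN "training" is just storing the data.
X_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0]])
y_train = np.array([1, 1, 1, -1, -1])   # labels in {-1, +1}

def knn_classify(x, k=3):
    # Majority vote among the k nearest training points.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return np.sign(y_train[nearest].sum())

def knn_regress(x, k=3):
    # Average of the k nearest targets.
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()
```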
KNN hyperparameter
K (number of neighbors)
Small K
Flexible, high variance, overfitting
Large K
Smooth, high bias, underfitting
Distance metric
Measure of similarity between points
Euclidean distance
||x − x′||₂ = √(Σ (xᵢ − x′ᵢ)²)
Hamming distance
Counts mismatched entries
Jaccard distance
1 − (|A ∩ B| / |A ∪ B|)
Edit distance
Minimum edits to transform one string to another
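The distance metrics above on small made-up examples (Euclidean on vectors, Hamming on equal-length strings, Jaccard on sets):

```python
import numpy as np

# Euclidean: ||x - z||_2
x = np.array([3.0, 4.0])
z = np.array([0.0, 0.0])
euclid = np.sqrt(np.sum((x - z) ** 2))

# Hamming: count of mismatched positions in equal-length sequences
a, b = "karol", "carol"
hamming = sum(c1 != c2 for c1, c2 in zip(a, b))

# Jaccard distance: 1 - |A ∩ B| / |A ∪ B|
A = {1, 2, 3}
B = {2, 3, 4}
jaccard = 1 - len(A & B) / len(A | B)
```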
Weighted KNN
Neighbors weighted by distance
Kernel regression
Generalization of weighted KNN
Decision tree
Model that splits data using feature-based rules
Internal node
Test on a feature
Branch
Outcome of a test
Leaf node
Final prediction
Decision tree property
Interpretable and nonlinear
CART
Classification and Regression Trees algorithm
Tree building strategy
Greedy top-down splitting
Node impurity
Measure of how mixed labels are
Gini impurity
1 − Σ pᵢ², measures probability of misclassification in a node
Gini range
0 (pure) to 0.5 (max impurity in binary)
Best split
Minimizes the weighted average Gini impurity of the child nodes
Tree pruning
Reducing size to prevent overfitting
Stopping conditions
Pure node, no features, or too few samples
Decision tree hyperparameters
Depth, min samples, etc.
Discriminative model
Learns P(y|x)
Generative model
Learns P(x,y)
Key difference
Generative models the joint P(x,y), so it can sample new data; discriminative only models P(y|x)
Examples discriminative
Logistic regression, SVM
Examples generative
Naive Bayes, GMM
Naive Bayes
Generative classifier using Bayes rule
Bayes rule
P(A|B) = P(B|A)P(A) / P(B)
Naive assumption
Features are conditionally independent
Naive Bayes formula
P(y|x) ∝ P(y) Π P(xᵢ|y), denominator P(x) is ignored since it is constant
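The formula above, as a toy spam example with made-up probability tables (all numbers are illustrative assumptions):

```python
# Priors P(y) and per-word likelihoods P(word | y); the naive assumption
# is that words are conditionally independent given the class.
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham":  {"free": 0.2, "meeting": 0.7},
}

def nb_score(words, cls):
    # Unnormalized P(y) * prod_i P(x_i | y); P(x) is dropped (constant in y).
    score = prior[cls]
    for w in words:
        score *= likelihood[cls][w]
    return score

words = ["free"]
scores = {c: nb_score(words, c) for c in prior}
prediction = max(scores, key=scores.get)
```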
Advantage of NB
Simple and efficient
Limitation of NB
Independence assumption unrealistic
Supervised learning
Uses labeled data (x,y)
Unsupervised learning
Uses unlabeled data (x only)
Clustering
Grouping similar data points
K-means clustering
Partitions data into K clusters
K-means objective
Minimize Σ min_k ||xᵢ − μ_k||² (assign each point to nearest cluster center)
Cluster center (μₖ)
Mean of points in cluster
Assignment step
Assign to nearest center
Update step
Recompute centers
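The assignment and update steps above can be sketched as one Lloyd-style loop; the 1-D data, K = 2, and initial centers are illustrative:

```python
import numpy as np

X = np.array([[0.0], [1.0], [9.0], [10.0]])
mu = np.array([[0.5], [8.0]])   # illustrative initial cluster centers

for _ in range(10):
    # Assignment step: each point goes to its nearest center.
    d = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)
    assign = d.argmin(axis=1)
    # Update step: each center becomes the mean of its assigned points.
    mu = np.array([X[assign == k].mean(axis=0) for k in range(len(mu))])
```

On this toy data the loop converges to centers 0.5 and 9.5 (this sketch omits the empty-cluster check a robust implementation would need).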
K-means limitation
Objective is non-convex, so the algorithm may converge to a local minimum
Initialization sensitivity
Different starts give different results
K-means++
Better initialization strategy
Choosing K
Use elbow method
Elbow method
Find point where error reduction slows
Kernel K-means
Uses kernel trick for nonlinear clustering
Non-spherical clusters
Problem for standard K-means
Dimensionality reduction
Reduce number of features
Principal Component Analysis (PCA)
Find directions of max variance
Principal component
Direction capturing most variance
Covariance matrix
Measures feature relationships
First principal component
Eigenvector with largest eigenvalue
PCA objective
Maximize variance of projections subject to ||v|| = 1
Projection
Mapping data onto lower dimension
Orthogonality in PCA
Components are perpendicular
Eigenvalue meaning
Amount of variance captured
Eigenvector meaning
Direction of variance
Top-K PCA
Use top K eigenvectors
Purpose of PCA
Compression and noise reduction
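The PCA cards above can be sketched as an eigendecomposition of the covariance matrix; the synthetic data (points along one direction plus small noise) is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1)) @ np.array([[3.0, 1.0]])  # data along [3, 1]
X += 0.1 * rng.normal(size=X.shape)                     # small noise
Xc = X - X.mean(axis=0)                                 # center the data

cov = Xc.T @ Xc / (len(Xc) - 1)         # covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                    # eigenvector with largest eigenvalue

projection = Xc @ pc1                   # 1-D projection onto the first PC
```

The first principal component recovers (up to sign) the direction [3, 1] the data was generated along, and the top eigenvalue carries almost all the variance.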
w (weight vector)
A vector of parameters that defines the model decision boundary