hard margin objective
Minimize one half of the squared L2 norm of w; since the margin width is two divided by the norm of w, keeping w small maximizes the margin.
hard margin constraint
For every training example i, y sub i times w transpose x sub i is at least one; this enforces correct classification with a margin.
soft margin objective
Minimize one half of the squared L2 norm of w plus C divided by N times the sum over i of xi sub i; this trades off margin size and violations.
soft margin constraint
For every training example i, y sub i times w transpose x sub i is at least one minus xi sub i, and xi sub i is at least zero; this allows controlled margin violations.
hinge loss
Max of zero and one minus y sub i times w transpose x sub i; penalizes points inside the margin or misclassified.
SVM loss
Minimize one half of the squared L2 norm of w plus C divided by N times the sum over i of max of zero and one minus y sub i times w transpose x sub i; this is regularized hinge-loss minimization.
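The hinge loss and the regularized SVM objective above can be evaluated directly; a minimal NumPy sketch (function names and the toy data are illustrative, not from the source):

```python
import numpy as np

def hinge_losses(w, X, y):
    """Per-example hinge loss: max(0, 1 - y_i * w^T x_i)."""
    margins = y * (X @ w)
    return np.maximum(0.0, 1.0 - margins)

def svm_objective(w, X, y, C):
    """Regularized hinge loss: (1/2)||w||^2 + (C/N) * sum_i hinge_i."""
    n = len(y)
    return 0.5 * np.dot(w, w) + (C / n) * hinge_losses(w, X, y).sum()
```

With w = (1, 0), a point at x = (0.5, 0) with label +1 has margin 0.5 and therefore hinge loss 0.5, while points at margin at least one contribute nothing.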
logistic loss
Log of one plus e to the negative y times w transpose x; a smooth alternative loss commonly used for probabilistic classification.
kernel function
Kernel of x and z equals the inner product of phi of x and phi of z; computes similarity in feature space without explicitly computing phi.
Gaussian kernel
Kernel of x and z equals e to the negative squared Euclidean distance between x and z divided by two sigma squared; emphasizes nearby points (radial basis behavior).
polynomial kernel
Kernel of x and z equals gamma times x transpose z plus constant c, all raised to the degree d; captures interactions up to the chosen degree.
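Both kernels above are one-liners in NumPy; a sketch with the standard default parameters (the defaults sigma = 1, gamma = 1, c = 1, d = 2 are assumptions for illustration):

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    """RBF kernel: exp(-||x - z||^2 / (2 sigma^2))."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma**2))

def polynomial_kernel(x, z, gamma=1.0, c=1.0, d=2):
    """Polynomial kernel: (gamma * x^T z + c)^d."""
    return (gamma * np.dot(x, z) + c) ** d
```

Note the Gaussian kernel of a point with itself is always one, and decays toward zero as the points move apart, which is the radial-basis behavior the card describes.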
dual solution
w equals the sum over i of alpha sub i times y sub i times x sub i (for linear SVM after solving the dual problem); only points with nonzero alpha contribute.
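Recovering the primal w from dual variables is a single weighted sum; a minimal sketch (the toy alphas below are illustrative):

```python
import numpy as np

def primal_from_dual(alphas, y, X):
    """w = sum_i alpha_i * y_i * x_i; only points with alpha_i > 0 (support vectors) contribute."""
    return (alphas * y) @ X
```

In the example below the middle point has alpha = 0, so it drops out of w entirely.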
KNN classification formula
Prediction at x equals the sign of the sum over i in N sub k of x of y sub i; with labels in plus or minus one, this is a majority vote among the k nearest neighbors.
KNN regression formula
Prediction at x equals one over k times the sum over i in N sub k of x of y sub i; average of the k nearest neighbor outputs.
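The two k-NN formulas differ only in how the neighbor labels are combined; a minimal NumPy sketch using Euclidean distance (the function name and data are illustrative):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k, regression=False):
    """k-NN: sign of the neighbor label sum (labels in {-1, +1}) or the neighbor average."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distance to every training point
    nearest = np.argsort(dists)[:k]              # indices of the k closest points
    if regression:
        return y_train[nearest].mean()
    return np.sign(y_train[nearest].sum())
```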
Euclidean distance
Distance between x and z equals the square root of the sum over coordinate j of (x sub j minus z sub j) squared; straight-line distance in d dimensions.
Jaccard distance
Distance between sets A and B equals one minus (size of A intersection B divided by size of A union B); measures dissimilarity based on overlap.
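Jaccard distance maps directly onto Python's set operations; a short sketch (treating two empty sets as identical is an assumed convention):

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|: 0 for identical sets, 1 for disjoint sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # assumed convention: two empty sets have distance zero
    return 1.0 - len(a & b) / len(a | b)
```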
Gini impurity
One minus the sum over classes c of p sub c squared; higher values mean the node’s labels are more mixed.
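Gini impurity is easy to check on small label lists; a minimal sketch:

```python
import numpy as np

def gini_impurity(labels):
    """1 - sum_c p_c^2, where p_c is the proportion of class c among `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p**2)
```

A pure node scores zero; a perfectly mixed two-class node scores one half, matching the card's "more mixed means higher" reading.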
Bayes rule
Probability of B given A equals probability of A given B times probability of B divided by probability of A (assuming probability of A is not zero); connects inverse conditionals.
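A worked numeric instance of the rule, using hypothetical diagnostic-test numbers (1% prevalence, 90% sensitivity, 5% false-positive rate; all illustrative):

```python
def bayes_posterior(p_a_given_b, p_b, p_a):
    """P(B|A) = P(A|B) * P(B) / P(A), assuming P(A) > 0."""
    return p_a_given_b * p_b / p_a

# hypothetical numbers: P(disease) = 0.01, P(+|disease) = 0.9, P(+|healthy) = 0.05
p_pos = 0.9 * 0.01 + 0.05 * 0.99   # law of total probability: P(+)
posterior = bayes_posterior(0.9, 0.01, p_pos)
```

Despite the 90% sensitivity, the posterior P(disease | +) comes out near 0.15, because the prior is small.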
Naive Bayes formula
Probability of class y given features x is proportional to probability of y times the product over features i of probability of x sub i given y; for classification, the denominator probability of x is omitted because it is constant across y.
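The proportionality in the card becomes a per-class score; a minimal sketch for categorical features, working in log space to avoid underflow (the data structures and spam/ham numbers are illustrative assumptions):

```python
import numpy as np

def naive_bayes_scores(priors, cond_probs, x):
    """Unnormalized log-posterior per class: log P(y) + sum_i log P(x_i | y).

    priors:     {class: P(y)}
    cond_probs: {class: [dict per feature position mapping value -> P(x_i | y)]}
    """
    scores = {}
    for c, prior in priors.items():
        log_score = np.log(prior)
        for i, value in enumerate(x):
            log_score += np.log(cond_probs[c][i][value])
        scores[c] = log_score
    return scores

def naive_bayes_predict(priors, cond_probs, x):
    """Pick the class with the highest unnormalized score; P(x) never needs computing."""
    scores = naive_bayes_scores(priors, cond_probs, x)
    return max(scores, key=scores.get)
```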
K-means objective
Minimize the sum over data points i of the minimum over clusters k of the squared Euclidean distance between x sub i and mu sub k; this minimizes within-cluster sum of squares.
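The objective above can be evaluated for any candidate set of centroids without running the full algorithm; a minimal NumPy sketch:

```python
import numpy as np

def kmeans_objective(X, centroids):
    """Within-cluster sum of squares: sum_i min_k ||x_i - mu_k||^2."""
    # squared distance from every point to every centroid, shape (n_points, n_clusters)
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()
```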
L2 norm
The L2 norm of w equals the square root of the sum over components j of w sub j squared; it measures vector length.
margin violation
A point violates the margin if y sub i times w transpose x sub i is less than one; equivalently, its hinge loss is positive.
eigenvector
A vector v such that A times v equals lambda times v; it is a direction preserved by the linear transformation up to scaling.
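The defining equation A v = lambda v can be verified numerically; a sketch using NumPy's eigendecomposition (the 2x2 matrix is an arbitrary example):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)  # columns of `eigenvectors` are the v's
v = eigenvectors[:, 0]
lam = eigenvalues[0]
# A v equals lambda v: the direction is preserved up to scaling by lambda
assert np.allclose(A @ v, lam * v)
```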
logistic function
One over one plus e to the negative input; maps any real number to a value between zero and one (often interpreted as a probability).
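A one-line sketch of the logistic function, confirming the squashing behavior the card describes:

```python
import numpy as np

def logistic(t):
    """Sigmoid: 1 / (1 + e^{-t}), mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))
```

It is exactly one half at zero, approaches one for large positive inputs, and approaches zero for large negative inputs.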