wk3 - minimising risk and linear classifiers

40 Terms

1
New cards

decision boundary

the line (or lines) separating classes

the boundary is a property of the classifier

i.e. different types of classifier will have different boundaries even when trained on the same data

2
New cards

what is the decision boundary in an N-D feature space formed of?

N - 1 dimensional 'hypersurfaces'

3
New cards

finding the decision boundary

in many cases the decision boundary is where P(ω1 | x) = P(ω2 | x)

and if the priors are equal they will cancel out leaving

p(x = x0 | ω1) = p(x = x0 | ω2)

4
New cards

how to find the probability of an error

an error occurs where the class pdfs overlap. we can find the probability of an error by:

calculating the area of each pdf on the wrong side of the decision boundary (weighted by the class priors) and adding them together
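
a minimal numerical sketch of this, assuming two illustrative univariate Gaussian class-conditional densities with equal priors (the means, variances and the use of scipy are assumptions, not from the notes):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

# illustrative class-conditional densities, equal priors assumed
p1 = norm(loc=0.0, scale=1.0)   # p(x | w1)
p2 = norm(loc=2.0, scale=1.0)   # p(x | w2)

# decision boundary x0: where the pdfs are equal (equal priors cancel)
x0 = brentq(lambda x: p1.pdf(x) - p2.pdf(x), 0.0, 2.0)

# P(error) = 0.5 * (area of p(x|w1) past x0 + area of p(x|w2) before x0)
# the 0.5 weights are the equal priors
p_error = 0.5 * (p1.sf(x0) + p2.cdf(x0))
print(x0, p_error)   # x0 = 1.0, P(error) ≈ 0.159
```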

5
New cards

what is loss

this is the cost of misclassifying something from class i as belonging to class j

6
New cards

how is average risk defined in a two class problem?

r = λ₂₁ P(x ∈ R₁, ω₂) + λ₁₂ P(x ∈ R₂, ω₁)

note: when the loss is equal, the risk is the same as the probability of an error. so in default case, minimising error also minimises the risk

7
New cards

how is average risk minimised

by selecting the partitioning regions Ri so that each x is assigned to the class with the lowest loss:

for all x in Ri,

li(x) < lj(x) for all j ≠ i

8
New cards

in a two class classification problem, what are the two types of errors?

1. classifying an object from class w2 as belonging to w1

2. classifying an object from w1 as w2

9
New cards

what does λ₁₂ represent in classification

the loss (cost) of misclassifying something from class ω₁ as belonging to class ω₂

10
New cards

for the case of λ₁₁ = λ₂₂ = 0, λ₁₂ = λ₂₁ = 1

what does the risk minimisation rule simplify to

choose class ω₁ if

p(x | ω₂) P(ω₂) < p(x | ω₁) P(ω₁)
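
a sketch of the general two-class minimum-risk rule behind this simplification, using an asymmetric loss matrix; the densities, priors and λ values are made up for illustration:

```python
import numpy as np
from scipy.stats import norm

# hypothetical class-conditional densities and priors
p = [norm(0.0, 1.0), norm(2.0, 1.0)]       # p(x | w1), p(x | w2)
prior = np.array([0.5, 0.5])

# loss matrix: lam[i, j] = cost of deciding w_j when the true class is w_i
lam = np.array([[0.0, 1.0],
                [5.0, 0.0]])               # missing class w2 costs 5x more

def decide(x):
    post = np.array([prior[i] * p[i].pdf(x) for i in range(2)])  # unnormalised posteriors
    risk = lam.T @ post    # risk[j] = sum_i lam[i, j] * P(w_i) p(x | w_i)
    return np.argmin(risk) + 1             # choose the class with the lowest risk

print(decide(0.1))  # -> 1: well inside w1's region even with the skewed loss
print(decide(0.9))  # -> 2: under 0/1 loss this would be w1, but the high cost
                    #    of missing w2 shifts the boundary toward w1's mean
```

with λ₁₁ = λ₂₂ = 0 and λ₁₂ = λ₂₁ = 1 this reduces to the rule on the card: pick whichever class has the larger p(x | ωᵢ) P(ωᵢ).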

11
New cards

what is a feature vector and where does it exist?

a feature vector contains all features for a sample

it exists in an L-dimensional feature space where each sample is a point

12
New cards

what two components define a vector

magnitude (length) and direction

13
New cards

what is the inner/dot product of two vectors?

measures the magnitude of the projection of one vector onto another

multiply corresponding components and add them all together

the result is a single scalar number

14
New cards

what does the dot product measure geometrically

how much one vector points in the direction of another

15
New cards

how do you find the projection of vector x onto vector y

divide the dot product of x and y by the dot product of y with itself, then multiply by y
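
a small numpy check of the dot product and this projection formula (the vectors are arbitrary examples):

```python
import numpy as np

x = np.array([3.0, 1.0])
y = np.array([2.0, 0.0])

dot = x @ y                      # inner product: sum of elementwise products -> scalar
proj = (x @ y) / (y @ y) * y     # projection of x onto y, as in the card

print(dot)    # 6.0
print(proj)   # [3. 0.] -- the component of x along y
```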

16
New cards

what is the outer product of two vectors

multiply each component of the first vector by each component of the second vector. this produces a matrix

17
New cards

if A is MxL, and B is Lx1, what is the size of the result of AB

Mx1 vector (M rows, 1 column)
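
a quick numpy check of the shapes described in the last two cards (the vectors and matrix are arbitrary examples):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])        # length-L vector (L = 3)
b = np.array([4.0, 5.0])             # length-M vector (M = 2)

outer = np.outer(b, a)               # M x L matrix: every component of b times every component of a
print(outer.shape)                   # (2, 3)

A = np.arange(6.0).reshape(2, 3)     # M x L matrix (2 x 3)
x = np.array([1.0, 0.0, -1.0])       # L x 1 vector
print((A @ x).shape)                 # (2,) -- an M x 1 result
```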

18
New cards

what is the lp-norm distance between two vectors x and y?

sum of absolute differences raised to the power p, then take the p-th root of the sum

19
New cards

what is Euclidean distance?

this is the l2-norm distance, where p=2.

it is the square root of the sum of the squared differences.

20
New cards

what is Manhattan distance?

this is the l1-norm distance,

where p=1

it is the sum of the absolute differences
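
a short sketch of the lp-norm distance from these cards for p = 1 (Manhattan) and p = 2 (Euclidean); the vectors are arbitrary examples:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])

def lp_distance(x, y, p):
    # sum of absolute differences raised to the power p, then the p-th root
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

print(lp_distance(x, y, 1))     # Manhattan: 3 + 2 + 0 = 5
print(lp_distance(x, y, 2))     # Euclidean: sqrt(9 + 4) ≈ 3.606

# same results via numpy's built-in norms
print(np.linalg.norm(x - y, ord=1), np.linalg.norm(x - y))
```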

21
New cards

if you see ||x - y||, without a subscript, which norm is usually meant?

l2-norm, p=2

euclidean distance

22
New cards

what is the key difference between manhattan and euclidean distance visually

manhattan measures 'grid' distance, euclidean measures straight line distance

23
New cards

what is cosine similarity between two vectors

cosine of angle between them = (dot product) / (product of lengths)

24
New cards

how is the cosine distance different from cosine similarity?

cosine distance = 1 - cosine similarity

increases as vectors become less similar

25
New cards

why use cosine distance instead of euclidean?

cosine distance is scale-invariant. it only considers angle, not magnitude.
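
a sketch of cosine similarity and cosine distance, including a check of the scale-invariance point above (vectors are arbitrary examples):

```python
import numpy as np

def cosine_similarity(x, y):
    # (dot product) / (product of lengths)
    return (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

x = np.array([1.0, 1.0])
y = np.array([2.0, 0.0])

sim = cosine_similarity(x, y)        # cos(45 degrees) ≈ 0.707
dist = 1.0 - sim                     # cosine distance

print(sim, dist)
print(cosine_similarity(10 * x, y))  # same angle -> same similarity (scale-invariant)
```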

26
New cards

what are two types of proximity measures

1. dissimilarity measures - larger value = further apart

2. similarity measures - larger value = closer

27
New cards

when is a function a valid dissimilarity measure

if

1. it returns the same value d0 when measuring the dissimilarity between a point and itself

d(x, x) = d0, ∀x ∈ X

2. dissimilarity between two points is never less than d0

d(x, y) ≥ d0, ∀x, y ∈ X

28
New cards

when is a function a valid similarity measure

if

1. it returns the same s0 value when measuring similarity between a point and itself

s(x, x) = s0 ∀x ∈ X

2. similarity between two points is never greater than s0

s(x, y) ≤ s0 ∀x, y ∈ X

29
New cards

core idea of one-dimensional gaussian probability

the probability density decays exponentially with the squared distance from the mean, (x − μ)²
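
a tiny sketch of the 1-D Gaussian density written explicitly, checked against scipy's norm.pdf; the mean and standard deviation are arbitrary examples:

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 1.0, 2.0          # illustrative mean and standard deviation
x = 2.5

# the density decays exponentially in the squared distance (x - mu)^2
p_manual = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
print(p_manual, norm(mu, sigma).pdf(x))   # both ≈ 0.1506
```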

30
New cards

to extend the gaussian to L dimensions, what distance measure is naturally used?

the euclidean distance

31
New cards

main limitation of using simple euclidean distance for a multivariate gaussian

ignores any covariance (relationships) between features

it treats all dimensions as independent and equally scaled

32
New cards

what distance measure is used to account for feature scaling and correlation

the mahalanobis distance

33
New cards

mahalanobis distance

the distance between vectors in units of the covariance

it measures how many standard deviations apart the points are, accounting for correlations between features
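
a sketch of the Mahalanobis distance d = sqrt((x − μ)ᵀ Σ⁻¹ (x − μ)) compared with the plain Euclidean distance; the mean and covariance here are illustrative assumptions:

```python
import numpy as np

mu = np.array([0.0, 0.0])                    # class mean (illustrative)
Sigma = np.array([[2.0, 0.8],                # class covariance (illustrative)
                  [0.8, 1.0]])

x = np.array([1.0, 1.0])

diff = x - mu
d_mahal = np.sqrt(diff @ np.linalg.inv(Sigma) @ diff)   # scaled by the covariance
d_eucl = np.linalg.norm(diff)                           # ignores the covariance

print(d_mahal, d_eucl)
```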

34
New cards

what are the two parameters of a univariate gaussian

mean

variance

35
New cards

linear classifier

simply a classifier that can only generate linear decision boundaries

36
New cards

for a gaussian classifier, what must we estimate for each class from the training data

the class mean vector and covariance matrix

37
New cards

in a discriminative classifier, what can we directly use as a discriminant function gi(x)?

the posterior probability

P(ωi | x) itself

38
New cards

how can we find the decision boundary between two classes wi and wj using discriminant functions gi(x)?

solve the equation

gi(x) - gj(x) = 0

where

gi(x) ≡ f(P(wi | x))

39
New cards

for bayesian classifiers, what specific form of the discriminant function gi(x) do we normally use

gi(x) = ln P(wi | x)

40
New cards

under what condition would a gaussian classifier produce a linear decision boundary

when all classes share the same covariance matrix

then the quadratic terms cancel, leaving a linear function
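
a sketch of this case: fitting a two-class Gaussian classifier with a shared (pooled) covariance on synthetic data, where the discriminant gi(x) = ln p(x | ωi) reduces to a linear function of x (equal priors assumed; all data and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic 2-D data: two classes with the same covariance, different means
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
X1 = rng.multivariate_normal([0.0, 0.0], Sigma, size=200)
X2 = rng.multivariate_normal([2.0, 2.0], Sigma, size=200)

# estimate per-class mean vectors and a pooled (shared) covariance matrix
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
S = (np.cov(X1.T) + np.cov(X2.T)) / 2.0
S_inv = np.linalg.inv(S)

# with a shared covariance (and equal priors) the quadratic terms cancel:
# g_i(x) = mu_i^T S^-1 x - 0.5 * mu_i^T S^-1 mu_i, a linear function of x
def g(x, mu):
    return mu @ S_inv @ x - 0.5 * mu @ S_inv @ mu

x_test = np.array([1.0, 0.5])
print(1 if g(x_test, mu1) > g(x_test, mu2) else 2)   # predicted class for x_test
```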
