Week 8 - Bayes Classifier, Logistic Regression, ROC and Confusion Tables

Last updated 2:22 PM on 5/26/26

9 Terms

1

Bayes Classifier

Find a function 𝑓ₗ(𝑥) that approximates the conditional probability of each label:

𝑓ₗ(𝑥) ≈ ℙ(𝑦 = 𝑙 | 𝑥).

Each 𝑓ₗ(𝑥) ∈ [0, 1] estimates the probability that the data sample 𝑥 has label 𝑦 = 𝑙. The classifier then predicts the label with the largest estimated probability.
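As a rough illustration of the decision rule, the sketch below picks the label with the largest estimated probability. The function name `bayes_classify` and the example labels and numbers are hypothetical, chosen only for illustration.

```python
def bayes_classify(prob_estimates):
    """Return the label l whose estimate f_l(x) of P(y = l | x) is largest."""
    return max(prob_estimates, key=prob_estimates.get)


# Hypothetical estimates f_l(x) for a single sample x, one entry per label l.
estimates = {"cat": 0.2, "dog": 0.7, "bird": 0.1}
predicted = bayes_classify(estimates)
print(predicted)  # dog
```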

2

Logistic Regression

A popular choice for the probability function is the logistic (sigmoid) function, which maps any real number to a value in (0, 1):

f(x) = 1 / (1 + e^(-x))
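A minimal sketch of the logistic function follows. The helper `predict_proba` assumes a one-feature model whose input to the sigmoid is the linear term a0 + a1·x; that parametrisation and both function names are assumptions for illustration, not something stated on this card.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z < 0, so e is in (0, 1)
    return e / (1.0 + e)


def predict_proba(x, a0, a1):
    """Assumed model: estimated P(y = 1 | x) = sigmoid(a0 + a1 * x)."""
    return sigmoid(a0 + a1 * x)


print(sigmoid(0.0))  # 0.5
```

The output always lies between 0 and 1, which is what lets it be read as a probability estimate.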

3

Likelihood Function

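Choose the model parameters (here a0 and a1) to maximise the likelihood l(a0, a1): the probability of observing the training data, assuming our ML model is correct.

Since l(a0, a1) is a probability, it lies in [0, 1]. Parameters that fit the training data well give l(a0, a1) close to 1, and parameters that fit poorly give l(a0, a1) close to 0.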
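To make the idea concrete, here is a small sketch that evaluates the likelihood for two parameter choices. It assumes binary labels in {0, 1}, independent samples, and the one-feature logistic model P(y = 1 | x) = sigmoid(a0 + a1·x); the data set is made up for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def likelihood(a0, a1, xs, ys):
    """Product over samples of the probability the model gives to the observed label."""
    value = 1.0
    for x, y in zip(xs, ys):
        p = sigmoid(a0 + a1 * x)  # assumed model for P(y = 1 | x)
        value *= p if y == 1 else 1.0 - p
    return value


# Made-up training data: negative x is labelled 0, positive x is labelled 1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

good = likelihood(0.0, 5.0, xs, ys)   # slope matches the data
bad = likelihood(0.0, -5.0, xs, ys)   # slope has the wrong sign
print(good > bad)  # True
```

With many samples the product becomes very small, so in practice the logarithm of the likelihood is usually maximised instead; it has the same maximiser.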

4

Advantages of K-Nearest Neighbours

  • Does not assume that our model, 𝑓(𝑥), takes a certain parametric structure.

  • Since there are no parameters, no need to optimize.

  • Can represent complex decision boundaries.

5

Disadvantages of K-Nearest Neighbours

  • Arbitrarily chooses a metric for closeness (the Euclidean norm ‖·‖₂). Depending on the units and scale of the data, this may not be the best metric.

  • Arbitrarily chooses 𝑘 ∈ ℕ, the size of the neighbourhood.

  • Only considers what is occurring locally around the data point, even though global properties may be important.

  • A lot of online computation is required to make each prediction: for every new point 𝑥₀ we must compute 𝑁ₖ(𝑥₀), its set of 𝑘 nearest neighbours.
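The sketch below shows where that per-prediction cost comes from: each call measures the distance from 𝑥₀ to every training point before voting. It assumes Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x0, k):
    """Predict the label of x0 by majority vote among its k nearest training points."""
    # Distance to every training point is computed for each prediction.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x0))
    neighbours = order[:k]  # indices forming N_k(x0)
    votes = Counter(train_y[i] for i in neighbours)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_x, train_y, (0.5, 0.5), k=3))  # a
print(knn_predict(train_x, train_y, (5.5, 5.5), k=3))  # b
```

Rescaling one feature (for example, changing its units) changes the distances and can change the vote, which is the concern raised in the first bullet.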

6

Confusion Matrix

  • TPR (true positive rate) = number of correctly predicted positive labels / number of truly positive samples

  • FNR (false negative rate) = number of incorrectly predicted negative labels / number of truly positive samples

  • FPR (false positive rate) = number of incorrectly predicted positive labels / number of truly negative samples

  • TNR (true negative rate) = number of correctly predicted negative labels / number of truly negative samples
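The four rates can be computed directly from the counts in the confusion matrix. The sketch below assumes binary labels encoded as 1 (positive) and 0 (negative) and that both classes appear in the true labels; the function name and data are made up for illustration.

```python
def confusion_rates(y_true, y_pred):
    """Return TPR, FNR, FPR and TNR for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted positive
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # incorrectly predicted negative
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # incorrectly predicted positive
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly predicted negative
    positives = tp + fn  # truly positive samples
    negatives = fp + tn  # truly negative samples
    return {
        "TPR": tp / positives,
        "FNR": fn / positives,
        "FPR": fp / negatives,
        "TNR": tn / negatives,
    }


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
rates = confusion_rates(y_true, y_pred)
print(rates)  # {'TPR': 0.75, 'FNR': 0.25, 'FPR': 0.25, 'TNR': 0.75}
```

Note that TPR + FNR = 1 and FPR + TNR = 1, since each pair shares a denominator.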

7

Sensitivity and Specificity

Unlike regression, classification calls for measuring performance separately for each class.

  • Sensitivity = TPR = proportion of positive cases correctly identified

  • Specificity = TNR = proportion of negative cases correctly identified

8

ROC Curves

The performance of a classification model can be assessed by plotting a Receiver Operating Characteristic (ROC) curve.

  • For a given classification model, compute the TPR and FPR at a range of threshold values and plot TPR against FPR to get the ROC curve

  • The ideal classifier has TPR = 1 and FPR = 0, the top-left corner of the plot
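One way to trace out the curve is sketched below: sweep the threshold over the observed scores and record an (FPR, TPR) pair at each one. It assumes binary labels in {0, 1}, that a sample is predicted positive when its score is at least the threshold, and that both classes are present; the function name and data are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from strictest to loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Start above every score (nothing predicted positive), then lower the threshold.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for th in thresholds:
        tp = sum(1 for t, s in zip(y_true, scores) if s >= th and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= th and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y_true, scores)
print(curve)
# [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

The curve always runs from (0, 0) to (1, 1); the closer it passes to (0, 1), the better the classifier separates the classes.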

9

AUC (Area Under the ROC Curve)

  • Provides a single metric for the performance of a classifier over all thresholds

  • The best possible classifier has an AUC of 1, while random guessing gives an AUC of about 0.5
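As a sketch of how the area might be computed, the function below applies the trapezoidal rule to a list of (FPR, TPR) points. The trapezoidal rule is one common choice, not necessarily the method intended here, and the example points are made up.

```python
def auc_trapezoid(points):
    """Approximate the area under a ROC curve given as (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0  # area of one trapezoid
    return area


example_curve = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
perfect_curve = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
diagonal = [(0.0, 0.0), (1.0, 1.0)]

print(auc_trapezoid(example_curve))  # 0.75
print(auc_trapezoid(perfect_curve))  # 1.0
print(auc_trapezoid(diagonal))       # 0.5
```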
