L5 - Discriminant Analysis

51 Terms

1

What is discriminant analysis?

Prediction of group membership when the groups are known a priori (supervised classification, as opposed to clustering)

2

How do we denote the g populations/groups/categories?

knowt flashcard image
3

What would the different populations be and what would X be in case we look at diseases?

Populations can be different diseases, and predictors can be the symptoms of a patient (X)

4

What would the different populations be and what would X be in the Swiss bank notes example?

Populations are whether banknotes are forged or genuine, and predictors are measurements of the banknotes

5

What is a discriminant rule d?

knowt flashcard image
6

When is discrimination more accurate?

When Πj has a high concentration of its probability in Rj

7

What is the idea of the maximum likelihood discriminant rule?

Allocate X to a group which gives the largest likelihood to X
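
The idea can be illustrated with a minimal sketch. It assumes univariate normal populations with known parameters; the function names and the example values are hypothetical and not taken from the lecture.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def ml_rule(x, groups):
    """Return the index of the group whose likelihood at x is largest.

    `groups` is a list of (mean, sd) pairs, one per population
    (illustrative values, assumed known).
    """
    likelihoods = [normal_pdf(x, mean, sd) for mean, sd in groups]
    return max(range(len(groups)), key=lambda j: likelihoods[j])

# Two illustrative populations: N(0, 1) and N(3, 1)
groups = [(0.0, 1.0), (3.0, 1.0)]
print(ml_rule(0.5, groups))  # 0.5 lies closer to mean 0, so group index 0
print(ml_rule(2.5, groups))  # 2.5 lies closer to mean 3, so group index 1
```

With equal standard deviations the rule reduces to picking the nearest mean, which is why the boundary here sits halfway between the two means.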

8

The idea of the maximum likelihood discriminant rule is to allocate X to a group which gives the largest likelihood to X. Write this down mathematically.

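
One standard way to write the rule is sketched below. The notation is an assumption: L_i(x) = f_i(x) denotes the likelihood (density) of population Π_i at x, and the lecture's own symbols may differ.

```latex
\text{allocate } x \text{ to } \Pi_j
\quad \text{if} \quad
L_j(x) = \max_{i \in \{1,\dots,g\}} L_i(x),
\qquad \text{i.e. (ignoring ties)} \qquad
R_j = \{\, x : L_j(x) > L_i(x) \ \text{for all } i \neq j \,\}.
```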
9

What do we do if X = 0? What if X = 1? Explain.

knowt flashcard image
10

What does LDA stand for?

Linear discriminant analysis

11

What are the LDA assumptions?

knowt flashcard image
12

Give the theorem about linear discriminant analysis.

knowt flashcard image
13

How is this calculated again?

knowt flashcard image
14
term image
knowt flashcard image
15

Prove this

knowt flashcard image
16

What to do when this assumption is not correct?

Quadratic discriminant analysis

17

What is the likelihood for group i?

knowt flashcard image
18

What is the discriminant function (including prior information)?

knowt flashcard image
19

Why is it called quadratic discriminant analysis?

knowt flashcard image
20

What is the discriminant (classification) rule for quadratic discriminant analysis?

Assign X to the population Πj whose discriminant function δj(X) is the largest among δ1(X), …, δg(X)
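
A minimal sketch of this rule in the univariate case, where the standard quadratic score reduces to -0.5·log(σ²) - (x - µ)²/(2σ²) + log(π). The univariate restriction, the function names, and the parameter values are assumptions made for illustration.

```python
import math

def qda_score(x, mean, var, prior):
    """Univariate quadratic discriminant score:
    -0.5*log(var) - (x - mean)^2 / (2*var) + log(prior)."""
    return -0.5 * math.log(var) - (x - mean) ** 2 / (2.0 * var) + math.log(prior)

def qda_classify(x, params):
    """Return the index j with the largest score.

    `params` holds one (mean, variance, prior) triple per population.
    """
    scores = [qda_score(x, m, v, p) for m, v, p in params]
    return max(range(len(params)), key=lambda j: scores[j])

# Hypothetical populations with equal means but different variances
params = [(0.0, 1.0, 0.5), (0.0, 9.0, 0.5)]
print(qda_classify(0.5, params))  # near the common mean: the narrow group (index 0)
print(qda_classify(4.0, params))  # far in the tail: the wide group (index 1)
```

Because the variances differ, the score is quadratic in x and the narrow group wins only on an interval around the mean, something a single linear cutoff could not express.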

21

What is the practical issue with QDA?

knowt flashcard image
22

What is a solution to the practical issues we face when applying QDA?

knowt flashcard image
23

What is Bayes discriminant rule?

knowt flashcard image
24

If no prior probabilities are available then Bayes discriminant rule …

is the same as ML discriminant rule.
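
A short justification, assuming the Bayes rule is taken in its usual form (allocate x to the population maximizing π_j f_j(x)) and that "no prior probabilities" is read as equal priors π_j = 1/g. Both readings are assumptions, since the card defining the rule is an image.

```latex
\arg\max_{j} \; \pi_j f_j(x)
\;=\; \arg\max_{j} \; \tfrac{1}{g}\, f_j(x)
\;=\; \arg\max_{j} \; f_j(x)
```

The constant factor 1/g does not change which j attains the maximum, so the allocation is the one made by the ML rule.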

25

If g =2 then the discriminant function (1) is shifted by …

knowt flashcard image
26

A randomized discriminant rule d involves allocating observation X to a population j with probability … [symbol]

knowt flashcard image
27

What can we say about these? 

knowt flashcard image
28

What is the deterministic allocation rule? φj(X) = …

knowt flashcard image
29

The probability of allocating X to population Πi when it comes from Πj is pij = …

knowt flashcard image
30

When is a discriminant rule admissible?

knowt flashcard image
31

Which theorem do you know about the admissibility of discriminant rules?

All Bayes discriminant rules (including the ML rule) are admissible

32

What if we do not want or cannot use a parametric form of the distribution of the populations? Can we find a reasonable discrimination rule in this case?

Yes, with Fisher’s idea!  

33

How do we calculate B (between-group sum of squares)?

knowt flashcard image
34

How do we calculate W (within-group sum of squares)?

knowt flashcard image
35

What does this H matrix look like?

knowt flashcard image
36

What is the maximization we do with Fisher’s linear discriminant function?

knowt flashcard image
37

What is the solution to this maximization problem (Fisher)?

knowt flashcard image
38

What is the discriminant score of Fisher’s linear discriminant function?

knowt flashcard image
39

What is the general form of Fisher’s rule?

knowt flashcard image
40

What is Fisher’s rule in the case of two groups?

knowt flashcard image
41

Compare Fisher’s discriminant rule to the Maximum Likelihood discriminant rule.

  • For g = 2: Fisher’s rule coincides with the ML rule for normal populations with equal covariance matrices, but it is derived without assuming multivariate normality, so the mathematical justification is different.

  • For g ≥ 3: Fisher’s rule is, in general, different from the ML rule.
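
For the two-group case, a minimal sketch of Fisher's rule in its common textbook form: compute a = W⁻¹(x̄₁ - x̄₂) and allocate x to group 1 when a′(x - (x̄₁ + x̄₂)/2) > 0. The restriction to two variables, the helper names, and the toy data are assumptions for illustration; the lecture's notation may differ.

```python
def mean_vec(rows):
    """Component-wise mean of a list of 2-dimensional points."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]

def scatter(rows, m):
    """2x2 sum-of-squares-and-products matrix of rows about the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_two_groups(g1, g2):
    """Return (a, mid) with a = W^{-1}(mean1 - mean2) and mid the midpoint of the means."""
    m1, m2 = mean_vec(g1), mean_vec(g2)
    s1, s2 = scatter(g1, m1), scatter(g2, m2)
    # W: within-group sum of squares, pooled over both groups
    w = [[s1[i][j] + s2[i][j] for j in range(2)] for i in range(2)]
    det = w[0][0] * w[1][1] - w[0][1] * w[1][0]
    w_inv = [[w[1][1] / det, -w[0][1] / det],
             [-w[1][0] / det, w[0][0] / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    a = [w_inv[0][0] * diff[0] + w_inv[0][1] * diff[1],
         w_inv[1][0] * diff[0] + w_inv[1][1] * diff[1]]
    mid = [(m1[0] + m2[0]) / 2.0, (m1[1] + m2[1]) / 2.0]
    return a, mid

def allocate(x, a, mid):
    """Allocate to group 1 if a'(x - mid) > 0, otherwise to group 2."""
    score = a[0] * (x[0] - mid[0]) + a[1] * (x[1] - mid[1])
    return 1 if score > 0 else 2

# Toy data (hypothetical): two well-separated groups
g1 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
g2 = [(6.0, 7.0), (7.0, 6.0), (7.0, 8.0), (8.0, 7.0)]
a, mid = fisher_two_groups(g1, g2)
print(allocate((2.0, 2.0), a, mid))  # 1
print(allocate((7.0, 7.0), a, mid))  # 2
```

No density is evaluated anywhere: the rule uses only group means and W, which is the sense in which it does not rely on a parametric model.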

42

Let us incorporate the misclassification costs into the discriminant rule. How do we define the loss function in this course?

knowt flashcard image
43

Suppose that d is an allocation rule then the risk function is defined by …

knowt flashcard image
44

Give an interpretation of the risk function.

It is an expected loss, given that the observation comes from Πj

45

How do we compute the Bayes risk?

knowt flashcard image
46

What is the interpretation of the Bayes risk?

The posterior expected loss
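
A small numeric sketch of both quantities, under assumptions that are not confirmed by the cards (their definitions are images): the loss is given by a cost matrix with zero cost for correct allocation, the risk for Πj is the sum over i of cost(i, j)·pij, and the Bayes risk weights these risks by prior probabilities. All names and numbers are hypothetical.

```python
def risk(cost, p, j):
    """Expected loss for observations that come from population j.

    cost[i][j]: loss of allocating to i when the truth is j (assumed cost matrix).
    p[i][j]:    probability that the rule allocates to i when the truth is j.
    """
    return sum(cost[i][j] * p[i][j] for i in range(len(cost)))

def bayes_risk(cost, p, priors):
    """Prior-weighted average of the per-population risks."""
    return sum(priors[j] * risk(cost, p, j) for j in range(len(priors)))

cost = [[0.0, 1.0],
        [1.0, 0.0]]          # 0-1 loss: every misclassification costs 1
p = [[0.9, 0.2],
     [0.1, 0.8]]             # each column sums to 1
priors = [0.5, 0.5]

print(risk(cost, p, 0))             # misclassification probability for population 0
print(risk(cost, p, 1))             # misclassification probability for population 1
print(bayes_risk(cost, p, priors))  # overall expected loss under the priors
```

With 0-1 loss the risk for Πj is simply the probability of misclassifying an observation from Πj, which makes the "expected loss" reading concrete.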

47

What is the Bayes rule?

knowt flashcard image
48

The classical approach for discriminant analysis is to use sample mean and sample (co)variance estimators of µ and Σ. We know that they are very sensitive to outliers. What to do?

Replace them with robust estimators of µ and Σ
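
The sensitivity to outliers can be seen in a minimal univariate sketch. The median and the MAD are used here only as simple stand-ins for robust location and scale; they are not necessarily the estimators the lecture recommends, and the data are made up.

```python
import statistics

def mad(values):
    """Median absolute deviation, scaled by 1.4826 so that it estimates
    the standard deviation for normally distributed data."""
    med = statistics.median(values)
    return 1.4826 * statistics.median([abs(v - med) for v in values])

clean = [9.8, 10.1, 10.0, 9.9, 10.2]
contaminated = clean + [100.0]   # one gross outlier

# Classical estimators are pulled strongly towards the outlier
print(statistics.mean(clean), statistics.mean(contaminated))
print(statistics.stdev(clean), statistics.stdev(contaminated))

# Robust estimators barely move
print(statistics.median(clean), statistics.median(contaminated))
print(mad(clean), mad(contaminated))
```

A discriminant rule built on the contaminated mean and standard deviation would shift its boundary towards the outlier, whereas one built on the robust estimates would stay close to the clean-data rule.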

49

Which robust estimators should we use for discriminant analysis?

knowt flashcard image
50

Croux et al. (2008) showed that the first-order influence function of the classification error rate vanishes. What does this mean?

Roughly, the loss of efficiency caused by using robust estimators is not transferred to the performance measure (the classification error rate)

→ The use of robust methods for classification is worthwhile

51

What to take away from this lecture?

knowt flashcard image