L2 - Robust estimation of multivariate location and scatter


52 Terms

1

How do practitioners often check for outliers?

2

What is the problem with this approach?

Multivariate outliers may not be extreme in any variable. Such outliers are called correlation outliers.
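
To make the idea of a correlation outlier concrete, here is a minimal sketch under assumed numbers (zero mean, unit variances, correlation 0.9; none of these values come from the lecture). The point (1.5, -1.5) looks unremarkable in each coordinate on its own, yet it contradicts the positive correlation, so its squared Mahalanobis distance lands far beyond the usual chi-square cutoff. The example is kept bivariate so that it needs only the standard library.

```python
import math

# Assumed illustrative parameters (not from the lecture): zero mean,
# unit variances, correlation 0.9.
mu = (0.0, 0.0)
sigma = ((1.0, 0.9), (0.9, 1.0))
x = (1.5, -1.5)


def squared_mahalanobis(x, mu, sigma):
    """Squared Mahalanobis distance of a 2-D point, via the explicit 2x2 inverse."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # (dx, dy) inv(sigma) (dx, dy)^T with inv(sigma) = [[d, -b], [-c, a]] / det
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


# Marginal z-scores: each coordinate is only 1.5 standard deviations from its mean.
z_scores = [abs(x[j] - mu[j]) / math.sqrt(sigma[j][j]) for j in range(2)]

d2 = squared_mahalanobis(x, mu, sigma)

# For 2 degrees of freedom the chi-square quantile has the closed form -2 * ln(1 - p).
cutoff = -2.0 * math.log(1.0 - 0.975)

print("marginal z-scores:", z_scores)
print("squared Mahalanobis distance:", d2)
print("chi-square(2) 0.975 cutoff:", cutoff)
```

A coordinate-wise check would pass this point, while the distance that accounts for the correlation flags it clearly.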

3

What is the idea for detecting correlation outliers?

4

What kind of problem does this lead to?

5

What are those methods?

Minimum covariance determinant (MCD) estimator

One-step reweighted estimator

6

What is the idea of the MCD estimator?

7

What is the geometric meaning of the determinant of the covariance matrix?

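Geometrically, the square root of the determinant of the covariance matrix is proportional to the (hyper)volume of the ellipsoid defined by that matrix, so a smaller determinant means a more tightly concentrated set of points.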
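
The relation between determinant and volume can be written out explicitly. In assumed notation (center µ, covariance matrix Σ, dimension p, radius c > 0; these symbols are not taken from the card), the ellipsoid of points within Mahalanobis distance c of the center has a volume that grows with the square root of the determinant:

```latex
\operatorname{Vol}\Bigl\{ x \in \mathbb{R}^p :
    (x - \mu)^\top \Sigma^{-1} (x - \mu) \le c^2 \Bigr\}
  = \frac{\pi^{p/2}}{\Gamma\!\left(\tfrac{p}{2} + 1\right)} \, c^{p} \, \sqrt{\det \Sigma}
```

Minimizing the determinant over subsets therefore amounts to looking for the subset whose ellipsoid is smallest.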

8

How do we denote the sample mean that uses observations from subset H?

9

How do we denote the sample covariance matrix that uses observations from subset H?

10

What is the objective function for finding H_MCD?

11

What is the functional for the mean over a set A?

12

What is the functional for the covariance matrix over a set A?

13

What are the functionals of the MCD estimator of location and scatter?

17

Are the MCD estimators of location and scatter Fisher consistent under a normal distribution N(µ, Σ)?

18

What, then, is the Fisher-consistent robust estimator of the covariance matrix?

19

The subset size h can be seen as a conservative initial guess of the number of good data points. What is the problem with this?

20

A conservative choice of h sacrifices efficiency to ensure robustness. How can we regain some efficiency?

We use the MCD estimates to detect outliers and exclude only those points when estimating the mean and covariance matrix.
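
The reweighting idea can be sketched as follows. This is a minimal illustration, not the lecture's exact definition: the raw estimates `mu_raw` and `sigma_raw` are hypothetical stand-ins for the output of a raw MCD fit, the 0.975 chi-square cutoff and the denominator (number of kept points minus one) are assumptions, and any consistency factor is omitted. The code is restricted to the bivariate case, where the chi-square quantile has a closed form, so that it needs only the standard library.

```python
import math


def squared_mahalanobis(x, mu, sigma):
    """Squared Mahalanobis distance of a 2-D point, via the explicit 2x2 inverse."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def reweight(data, mu_raw, sigma_raw, quantile=0.975):
    """One reweighting step: 0/1 weights from robust distances, then mean and covariance."""
    # Chi-square quantile with 2 degrees of freedom (closed form, valid for p = 2 only).
    cutoff = -2.0 * math.log(1.0 - quantile)
    weights = [
        1 if squared_mahalanobis(x, mu_raw, sigma_raw) <= cutoff else 0
        for x in data
    ]
    kept = [x for x, w in zip(data, weights) if w == 1]
    m = len(kept)
    mu = tuple(sum(x[j] for x in kept) / m for j in range(2))
    sigma = tuple(
        tuple(
            sum((x[j] - mu[j]) * (x[k] - mu[k]) for x in kept) / (m - 1)
            for k in range(2)
        )
        for j in range(2)
    )
    return weights, mu, sigma


# Hypothetical toy data: six regular points and one gross outlier.
data = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (10, -10)]

# Hypothetical raw robust estimates (stand-ins for a raw MCD fit).
mu_raw = (0.0, 0.0)
sigma_raw = ((1.0, 0.0), (0.0, 1.0))

weights, mu_rw, sigma_rw = reweight(data, mu_raw, sigma_raw)
print("weights:", weights)
print("reweighted mean:", mu_rw)
print("reweighted covariance:", sigma_rw)
```

Only the flagged point receives weight zero; all other observations contribute to the final estimates, which is where the efficiency gain over the raw h-subset comes from.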

21

What is Mahalanobis distance? Give the formula and explain what it means geometrically.
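
A standard way to write the definition, in assumed notation with location µ and scatter matrix Σ (the symbols are a convention, not necessarily the lecture's):

```latex
d(x; \mu, \Sigma) = \sqrt{(x - \mu)^\top \Sigma^{-1} (x - \mu)}
```

Geometrically, all points with the same distance d lie on an ellipsoid centered at µ whose shape and orientation are determined by Σ, so the distance measures how far x is from the center relative to the spread of the data in that direction.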

23

What are the weights from outlier detection via the MCD estimator?

24

How do we compute the reweighted MCD location estimate, given the weights?

25

How do we compute the reweighted MCD covariance matrix estimate, given the weights?

26

What are µ̂_rwgt and Σ̂_rwgt?

27

When people talk about the MCD, they typically refer to …

the reweighted estimators µ̂_rwgt and Σ̂_rwgt

28

the raw MCD estimators

29

What does affine equivariance imply for the data?

Data may be rotated, translated or rescaled without affecting the properties of T(X) and S(X)
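
As a small numerical illustration of what affine equivariance requires, the sketch below checks the two standard conditions, T(AX + b) = A T(X) + b and S(AX + b) = A S(X) Aᵀ, for the classical sample mean and sample covariance matrix. The data, the matrix `A`, and the shift `b` are hypothetical values chosen for illustration, and the bivariate restriction is only there to avoid external libraries.

```python
def mean_cov(points):
    """Sample mean and sample covariance matrix (denominator n - 1) of 2-D points."""
    n = len(points)
    mu = tuple(sum(x[j] for x in points) / n for j in range(2))
    sigma = tuple(
        tuple(
            sum((x[j] - mu[j]) * (x[k] - mu[k]) for x in points) / (n - 1)
            for k in range(2)
        )
        for j in range(2)
    )
    return mu, sigma


def affine(A, b, x):
    """Return A x + b for a 2x2 matrix A and 2-vectors b and x."""
    return tuple(A[j][0] * x[0] + A[j][1] * x[1] + b[j] for j in range(2))


def congruence(A, S):
    """Return A S A^T for 2x2 matrices."""
    AS = [[sum(A[j][l] * S[l][k] for l in range(2)) for k in range(2)]
          for j in range(2)]
    return tuple(
        tuple(sum(AS[j][l] * A[k][l] for l in range(2)) for k in range(2))
        for j in range(2)
    )


# Hypothetical data and transformation, chosen only for illustration.
data = [(1.0, 0.0), (-1.0, 0.5), (0.0, 2.0), (3.0, -1.0), (2.0, 2.0)]
A = ((2.0, 1.0), (0.0, 3.0))  # nonsingular: determinant is 6
b = (5.0, -1.0)

mu, sigma = mean_cov(data)
mu_t, sigma_t = mean_cov([affine(A, b, x) for x in data])

# What equivariance predicts for the transformed data.
expected_mu = affine(A, b, mu)
expected_sigma = congruence(A, sigma)

print("location:", mu_t, "vs", expected_mu)
print("scatter: ", sigma_t, "vs", expected_sigma)
```

The classical estimators satisfy both conditions but are not robust; the point of the property is that a robust estimator should satisfy them as well.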

30

When is an estimator of location T(X) affine equivariant?

31

When is an estimator of scatter S(X) affine equivariant?

32

Are the reweighted MCD estimators affine equivariant? Give your reasoning.

33

Give the finite sample breakdown point of location estimator Tn in the multivariate setting.

34

Give the finite sample breakdown point of scatter estimator Sn in the multivariate setting.

35

What is the upper bound on the breakdown point of affine equivariant estimates of scatter?

36

What is the upper bound on the breakdown point of affine equivariant estimates of location?

37

What is the optimal subset size h, i.e., the one that gives the maximum possible breakdown point?
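
For orientation, the following sketch uses the expressions commonly stated in the MCD literature, assuming n observations in general position in p dimensions: the breakdown point for subset size h is min(n - h + 1, h - p) / n, which is maximized at h = floor((n + p + 1) / 2). The function names are hypothetical, and these formulas should be checked against the lecture's own answer.

```python
def optimal_subset_size(n, p):
    """Subset size h = floor((n + p + 1) / 2) that maximizes the breakdown point."""
    return (n + p + 1) // 2


def mcd_breakdown_point(n, p, h):
    """Finite-sample breakdown point of the MCD for subset size h (general position assumed)."""
    return min(n - h + 1, h - p) / n


n, p = 100, 2
h_opt = optimal_subset_size(n, p)
print("optimal h:", h_opt)
print("breakdown point at optimal h:", mcd_breakdown_point(n, p, h_opt))
print("breakdown point at h = 75:", mcd_breakdown_point(n, p, 75))
```

Larger values of h use more observations and so gain efficiency, at the price of a lower breakdown point.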

38

What is the distribution of the MCD estimator and its convergence rate?

The MCD estimator is asymptotically normally distributed with convergence rate √n.

39

What is the convergence rate of the reweighted estimator?

A reweighting step does not improve the rate of convergence of the initial estimator

40

What is the really big problem with the MCD estimator in practice? And what is the solution?

41

What is the basic C-step algorithm? What is the problem with it?

If an elemental subset contains an outlier, it will influence all further iterations. In this case fully iterating C-steps until convergence is a waste of computation time.
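
A single C-step (concentration step) can be sketched as follows: compute the mean and covariance matrix of the current subset, compute the Mahalanobis distances of all observations with respect to them, and keep the h observations with the smallest distances. The toy data and starting subsets are hypothetical. As a simplification, the sketch starts directly from subsets of size h rather than from elemental subsets, and it is restricted to the bivariate case so that it needs only the standard library.

```python
def mean_cov(points):
    """Sample mean and sample covariance matrix (denominator n - 1) of 2-D points."""
    n = len(points)
    mu = tuple(sum(x[j] for x in points) / n for j in range(2))
    sigma = tuple(
        tuple(
            sum((x[j] - mu[j]) * (x[k] - mu[k]) for x in points) / (n - 1)
            for k in range(2)
        )
        for j in range(2)
    )
    return mu, sigma


def det2(S):
    """Determinant of a 2x2 matrix."""
    return S[0][0] * S[1][1] - S[0][1] * S[1][0]


def squared_mahalanobis(x, mu, sigma):
    """Squared Mahalanobis distance of a 2-D point, via the explicit 2x2 inverse."""
    (a, b), (c, d) = sigma
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det2(sigma)


def c_step(data, subset):
    """One C-step: refit on the subset, then keep the h points closest to that fit."""
    h = len(subset)
    mu, sigma = mean_cov([data[i] for i in subset])
    d2 = [squared_mahalanobis(x, mu, sigma) for x in data]
    nearest = sorted(range(len(data)), key=lambda i: d2[i])[:h]
    return sorted(nearest)


def subset_det(data, subset):
    """Determinant of the covariance matrix computed from the given subset."""
    return det2(mean_cov([data[i] for i in subset])[1])


# Hypothetical toy data: six regular points (indices 0-5) and two outliers (6, 7).
data = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, 1), (-1, -1), (10, -10), (11, -9)]

for start in ([0, 1, 2, 3, 6], [0, 1, 2, 3, 4]):
    new = c_step(data, start)
    print(start, "->", new, "determinant:", subset_det(data, new))
```

On this toy data both starting subsets are already fixed points of the C-step: the contaminated start keeps its outlier and ends with a much larger determinant than the clean start. This illustrates why a start that contains an outlier is not worth iterating to convergence, and why several starts are compared by their determinants.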

42

What is the FAST-MCD Algorithm?

43

What is the idea of the Deterministic MCD (DetMCD) algorithm?

Instead of using many random starting values, use only a few good ones

44

Give the DetMCD Algorithm.

45

What is an alternative to MCD and all methods derived from it?

46

What to take away from this lecture?

47

If an estimator is non-parametric, does that make it robust?

No, not necessarily. The sample mean is non-parametric yet it is not robust

48

For a symmetric distribution, is the sample mean Fisher consistent for the true mean mu? 

Yes

49

For a symmetric distribution, is the sample median Fisher consistent for the true mean mu? 

Yes

50

For an asymmetric distribution, is the sample mean Fisher consistent for the true mean mu? 

Yes

51

For an asymmetric distribution, is the sample median Fisher consistent for the true mean mu? 

No (so we would need a correction)

52

For an asymmetric distribution, is the sample mean Fisher consistent for the median?

No (so we would need a correction)
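
The asymmetric case in the last few cards can be made concrete with an assumed example: the exponential distribution with rate 1 (chosen here for illustration, not taken from the lecture). Its mean is 1 / rate and its median is ln(2) / rate, so the median functional does not equal the mean, and within this particular family a multiplicative correction of 1 / ln(2) makes the median Fisher consistent for the mean.

```python
import math

# Assumed example distribution: exponential with rate 1.
rate = 1.0


def exp_cdf(t, rate):
    """CDF of the exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)


# Population values of the two functionals for this distribution.
mean_functional = 1.0 / rate
median_functional = math.log(2.0) / rate

# Correction factor that maps the median functional onto the mean (this family only).
correction = mean_functional / median_functional

print("mean:", mean_functional)
print("median:", median_functional)
print("CDF at the median:", exp_cdf(median_functional, rate))
print("correction factor:", correction)
```

The correction depends on the assumed model: a different asymmetric family would need a different factor.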