08 Spotting Culprits in Epidemics: How many and Which ones?

0.0(0)
studied byStudied by 0 people
GameKnowt Play
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/29

flashcard set

Earn XP

Description and Tags

Vocabulary flashcards summarizing the key terms, models, metrics, and algorithmic concepts introduced in the NETSLEUTH lecture.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

30 Terms

1
New cards

Culprit (Seed Node)

A node from which an infection begins spreading through the network.

2
New cards

NETSLEUTH

An algorithm that detects the number and identity of seed nodes in an epidemic snapshot using the MDL principle.

3
New cards

Susceptible-Infected (SI) Model

A two-state epidemic model where susceptible nodes become permanently infected with probability β whenever contacted by an infected neighbor.

4
New cards

Attack Probability (β)

The per-time-step chance that an infected node transmits infection to each susceptible neighbor in the SI model.

5
New cards

Propagation Ripple

The time-ordered list of node sets describing how infection spreads from seeds to form the observed snapshot.

6
New cards

Frontier Set (F)

The set of currently uninfected nodes that have at least one infected neighbor at a given time step.

7
New cards

Attack Degree

The number of infected neighbors currently trying to infect a given susceptible node.

8
New cards

Exoneration (in NETSLEUTH)

The idea that surrounding uninfected nodes reduce the likelihood that their infected neighbors are true seeds.

9
New cards

Minimum Description Length (MDL) Principle

A model-selection framework that chooses the explanation giving the shortest combined code length for model plus data.

10
New cards

Model Cost (L(S))

The MDL code length required to specify the seed set, including its size and the identities of the seed nodes.

11
New cards

Data Cost (L(R | S))

The MDL code length needed to describe the infection ripple given a specific seed set.

12
New cards

Universal Code for Integers (LN)

An MDL optimal encoding that represents smaller integers with fewer bits by using a log* based scheme.

13
New cards

Data-to-Model Code

A code that indexes directly into an ordered list of all data instances consistent with the model, used to encode seed node identities.

14
New cards

Adjacency Matrix (A)

A square matrix where entry (i, j) equals 1 if nodes i and j are connected by an edge, 0 otherwise.

15
New cards

Degree Matrix (D)

A diagonal matrix whose entries list the degree (number of neighbors) of each node.

16
New cards

Graph Laplacian (L(G))

The matrix D − A that captures connectivity; its eigenvectors reveal structural properties.

17
New cards

Submatrix-Laplacian (LA)

The principal submatrix of the full Laplacian corresponding only to infected nodes, central to NETSLEUTH’s seed scoring.

18
New cards

Smallest Eigenvector of LA

The eigenvector used by NETSLEUTH whose largest entries point to the most likely single seed node.

19
New cards

Perron–Frobenius Theorem

A result ensuring the largest eigenvalue of a positive matrix is real and has a positive eigenvector, supporting NETSLEUTH’s ranking.

20
New cards

Maximum Likelihood Estimation (MLE) Ripple

The infection sequence that maximizes the probability of producing the snapshot under the SI model.

21
New cards

Jaccard Distance (JD)

A similarity metric defined as 1 − |A ∩ B| / |A ∪ B|, used to evaluate how well a recovered footprint matches the true infected set.

22
New cards

QMDL Score

The ratio of the MDL length of recovered seeds to that of the true seeds; values near 1 indicate high quality.

23
New cards

QJD Score

The ratio of average Jaccard distance for recovered seeds to that for true seeds; values near 1 indicate high quality.

24
New cards

Rumor Centrality

A method that identifies a single infection source on trees by ranking nodes according to how many infection orders they can generate.

25
New cards

Effectors Problem

Finding k seed nodes under the Independent Cascade (IC) model that could have produced a steady-state activation snapshot.

26
New cards

Independent Cascade (IC) Model

An epidemic model where each infected node gets one chance to infect each neighbor independently with a fixed probability.

27
New cards

Linear Scalability (of NETSLEUTH)

The algorithm’s total running time grows proportionally with the number of edges and infected nodes.

28
New cards

Eigenvalue Gap

The difference between the largest and second-largest eigenvalues; a large gap justifies focusing on the leading eigenvector.

29
New cards

Influence Maximization

The task of selecting nodes to maximize future spread; conceptually opposite to tracing back culprits.

30
New cards

Immunization Problem

Choosing nodes to remove or vaccinate to halt or slow an epidemic in a network.