Data mining MCQ 2

40 Terms

1

In graph mining, a graph is best defined as:

A set of nodes connected by edges

2

What does the degree of a node represent?

The number of edges connected to the node
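
As a minimal sketch of the two cards above, assuming an undirected graph stored as a plain edge list (the node names are made up for illustration), the degree of each node can be counted directly from the edges:

```python
from collections import defaultdict

# Hypothetical undirected graph: a triangle a-b-c plus a pendant node d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

def degrees(edge_list):
    """Return a dict mapping each node to the number of edges touching it."""
    deg = defaultdict(int)
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

node_degrees = degrees(edges)
print(node_degrees)  # {'a': 2, 'b': 2, 'c': 3, 'd': 1}
```

Every edge contributes to two degrees, so the degrees always sum to twice the number of edges.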

3

In an Erdős–Rényi random graph, what does the parameter p represent?

The probability that an edge exists between two nodes

4

What typically happens when n · p ≈ 1 in an Erdős–Rényi graph?

A giant connected component starts to emerge
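
The threshold at n · p ≈ 1 can be observed with a small simulation. This is a rough sketch using only the standard library; the graph size, the seed, and the three values of n · p are arbitrary choices for illustration:

```python
import random
from collections import deque

def erdos_renyi(n, p, rng):
    """Sample G(n, p): each of the n*(n-1)/2 node pairs gets an edge with probability p."""
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best

n = 1000
rng = random.Random(0)  # fixed seed, chosen arbitrarily
fractions = {}
for np_value in (0.2, 1.0, 4.0):  # n * p, roughly the mean degree
    graph = erdos_renyi(n, np_value / n, rng)
    fractions[np_value] = largest_component_size(graph) / n
    print(f"n*p = {np_value}: largest component covers {fractions[np_value]:.1%} of nodes")
```

Well below the threshold the largest component should contain only a handful of nodes, while well above it a single component should cover most of the graph.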

5

Why is the largest connected component often studied in random graphs?

It reveals connectivity and phase transition properties

6

What information does a degree distribution provide?

How node degrees are distributed across the graph
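
A degree distribution is just a histogram of node degrees. The sketch below assumes the degrees are already known and uses a made-up star graph (one hub, five leaves) as input:

```python
from collections import Counter

# Hypothetical degrees of a star graph: one hub of degree 5 and five leaves of degree 1.
star_degrees = {"hub": 5, "l1": 1, "l2": 1, "l3": 1, "l4": 1, "l5": 1}

def degree_distribution(node_degrees):
    """Map each degree k to the fraction of nodes that have degree k."""
    counts = Counter(node_degrees.values())
    total = len(node_degrees)
    return {k: counts[k] / total for k in sorted(counts)}

distribution = degree_distribution(star_degrees)
print(distribution)
```

Here five of the six nodes have degree 1 and a single node has degree 5, a tiny caricature of the hub-dominated shape that heavy-tailed distributions show at scale.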

7

Which characteristic is typical of an LFR benchmark graph?

Power-law degree distribution with known communities

8

Why are LFR graphs commonly used in community detection studies?

They provide ground-truth communities

9

What is a community in a graph?

A set of nodes densely connected internally and sparsely connected externally

10

What is the main principle behind the Louvain community detection algorithm?

Maximizing modularity
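
To make the objective concrete, here is a sketch of the standard (Newman) modularity score for a given partition. It only evaluates the quantity Louvain tries to maximize; it is not the Louvain algorithm itself, which also moves nodes greedily and merges communities. The example graph and the community labels are made up:

```python
def modularity(edges, community_of):
    """Newman modularity Q = sum over communities of L_c/m - (d_c / 2m)^2.

    L_c: number of edges with both endpoints in community c
    d_c: sum of the degrees of the nodes in community c
    m:   total number of edges
    """
    m = len(edges)
    internal = {}    # L_c per community
    degree_sum = {}  # d_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical graph: two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

natural_split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
single_block = {node: "A" for node in range(6)}

q_split = modularity(edges, natural_split)
q_single = modularity(edges, single_block)
print(q_split, q_single)
```

Splitting the graph into its two triangles gives Q = 5/14 (about 0.357), whereas putting every node in one community gives Q = 0, so a modularity maximizer prefers the split.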

11

Why might the Louvain algorithm fail to recover the ground-truth communities of an LFR graph?

Maximizing modularity does not always match true communities

12

What does Normalized Mutual Information (NMI) measure?

Similarity between two partitions of nodes
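
The sketch below computes NMI from scratch, assuming the common arithmetic-mean normalization 2 · I(A;B) / (H(A) + H(B)); other normalizations exist, and library implementations may differ in their defaults. The label lists are invented for illustration:

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (in nats) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(labels_a, labels_b):
    """NMI with arithmetic-mean normalization: 2 * I(A;B) / (H(A) + H(B))."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual_info = sum(
        (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    denom = entropy(labels_a) + entropy(labels_b)
    # Convention: two single-cluster partitions are treated as identical.
    return 1.0 if denom == 0 else 2 * mutual_info / denom

truth = [0, 0, 1, 1]
swapped = [1, 1, 0, 0]    # same grouping, labels renamed
unrelated = [0, 1, 0, 1]  # grouping independent of the truth

accuracy = sum(a == b for a, b in zip(truth, swapped)) / len(truth)
nmi_swapped = nmi(truth, swapped)
nmi_unrelated = nmi(truth, unrelated)
print(accuracy, nmi_swapped, nmi_unrelated)
```

The renamed partition scores an accuracy of 0 even though it groups the nodes exactly like the truth, while its NMI is 1; the unrelated partition gets an NMI of 0.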

13

What is the key idea of the Girvan–Newman algorithm?

Removing edges with high betweenness

14

Compared to Louvain, the Girvan–Newman algorithm is often:

More computationally expensive but sometimes more accurate

15

What is the goal of graph (node) embeddings such as Node2Vec?

To convert nodes into low-dimensional vector representations
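
Node2Vec builds its vectors from random walks over the graph, which are then fed to a skip-gram model. The sketch below covers only the walk-generation step, and only with uniform walks (the special case where Node2Vec's return parameter p and in-out parameter q are both 1); the biased walks and the skip-gram training are left out. The graph, walk length, and seed are arbitrary:

```python
import random

# Hypothetical graph as an adjacency list: two triangles joined by the edge 2-3.
adjacency = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

def random_walk(adj, start, length, rng):
    """Uniform random walk visiting `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbours = adj[walk[-1]]
        if not neighbours:  # dead end: node without neighbours
            break
        walk.append(rng.choice(neighbours))
    return walk

rng = random.Random(42)  # arbitrary seed for reproducibility
walks = [random_walk(adjacency, node, 6, rng) for node in adjacency for _ in range(3)]
print(walks[0])
```

Nodes that often appear close together in these walks end up with similar vectors once an embedding model is trained on them.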

16

What is an Erdős–Rényi random graph?

A graph where each pair of nodes is connected with probability p

17

What is an LFR graph mainly used for?

Benchmarking community detection algorithms

18

What is community detection?

Finding groups of densely connected nodes

19

What is the Louvain algorithm based on?

Modularity maximization

20

What is the core principle of the Girvan–Newman algorithm?

Removing edges with high betweenness

21

What does Normalized Mutual Information (NMI) measure?

Similarity between two community partitions

22

What is the goal of node embeddings?

To map nodes into low-dimensional vector spaces

23

What are LFR graphs?

Benchmark graphs with a priori known communities, used to compare different community detection methods

24

Which statement best compares Louvain and Girvan–Newman?

Louvain maximizes modularity, Girvan–Newman removes high-betweenness edges

25

Why can Girvan–Newman outperform Louvain on LFR graphs?

It explicitly separates communities by removing the high-betweenness edges between them

26

What is a key difference between LFR and Erdős–Rényi graphs?

LFR graphs have realistic degree distributions and known communities

27

What is the main difference between node embeddings and community detection?

Community detection finds groups; embeddings create vector representations

28

Why is NMI preferred over raw accuracy for community detection?

Community labels are arbitrary, and NMI is invariant to label permutations

29

Access the ground-truth communities of the graph with nx.get_node_attributes(lfr, 'community')

The returned ground truth is a dictionary whose keys are nodes and whose values are the set of nodes forming that node's community. The communities are disjoint, meaning that each node belongs to exactly one community
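
A short sketch of this call in context, assuming NetworkX is installed. The generator parameters below are illustrative values, not ones given in the cards; other settings may fail to converge:

```python
import networkx as nx

# Illustrative LFR parameters (assumed, not from the cards).
lfr = nx.LFR_benchmark_graph(
    n=250, tau1=3, tau2=1.5, mu=0.1,
    average_degree=5, min_community=20, seed=10,
)

# Dictionary: node -> set of nodes in that node's community.
ground_truth = nx.get_node_attributes(lfr, "community")

# Collapse the per-node sets into the distinct communities.
communities = {frozenset(members) for members in ground_truth.values()}
print(len(communities), "communities covering", sum(len(c) for c in communities), "nodes")
```

Because the communities are disjoint, their sizes add up to the number of nodes in the graph.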

30

Divisive clustering on Edge-Betweenness

  • You start with the entire network as a single cluster.

  • Then, you recursively split it into smaller communities until meaningful groups emerge
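
The two steps above can be sketched in plain Python. This is a simplified illustration of one divisive step in the Girvan–Newman style, assuming an unweighted, undirected graph without self-loops stored as a dict of neighbour sets; the example graph is made up:

```python
from collections import deque

def edge_betweenness(adj):
    """Brandes-style edge betweenness: shortest paths passing through each edge."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for source in adj:
        dist, n_paths, preds = {source: 0}, {source: 1}, {source: []}
        order, queue = [], deque([source])
        while queue:  # BFS that also counts shortest paths
            node = queue.popleft()
            order.append(node)
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb], n_paths[nb], preds[nb] = dist[node] + 1, 0, []
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    n_paths[nb] += n_paths[node]
                    preds[nb].append(node)
        dependency = dict.fromkeys(order, 0.0)
        for node in reversed(order):  # accumulate from the farthest nodes back
            for pred in preds[node]:
                share = n_paths[pred] / n_paths[node] * (1 + dependency[node])
                score[frozenset((pred, node))] += share
                dependency[pred] += share
    # Each unordered pair of endpoints was counted from both ends.
    return {edge: value / 2 for edge, value in score.items()}

def components(adj):
    """Connected components as a list of node sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = {start}, [start]
        while stack:
            for nb in adj[stack.pop()]:
                if nb not in group:
                    group.add(nb)
                    stack.append(nb)
        seen |= group
        result.append(group)
    return result

def split_once(adj):
    """Remove highest-betweenness edges until the graph gains one component."""
    adj = {node: set(nbs) for node, nbs in adj.items()}  # work on a copy
    target = len(components(adj)) + 1
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)
        if not scores:  # no edges left to remove
            break
        u, v = max(scores, key=scores.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Hypothetical graph: two triangles joined by the bridge 2-3.
adjacency = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
bridge_score = edge_betweenness(adjacency)[frozenset((2, 3))]
communities = split_once(adjacency)
print(bridge_score, communities)
```

All nine shortest paths between the two triangles cross the bridge, so it has the highest betweenness and is removed first, leaving the two triangles as communities. Recomputing betweenness after every removal is what makes this approach expensive on large graphs.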

31

Why are Erdős–Rényi graphs often considered unrealistic models of social networks?

They assume uniform edge probability between all node pairs

32

What is the degree distribution of an LFR graph?

A heavy-tailed (power-law-like) distribution

33

Why is the Karate Club graph commonly used in graph mining?

It has a well-known real community split

34

Why can the Louvain algorithm fail to recover the true communities in an LFR graph?

Modularity maximization may not align with planted communities

35

Which statement about scalability is correct?

Louvain is generally more scalable than Girvan–Newman

36

What is the main conceptual difference between edge betweenness and modularity?

Edge betweenness identifies bridges; modularity evaluates partition quality

37

How does community detection differ from finding connected components?

Communities can still be linked by a few edges, whereas connected components share no edges at all

38

Why is Normalized Mutual Information (NMI) preferred over accuracy for evaluating communities?

Community labels are arbitrary and unordered

39

How do node-embedding-based methods differ from classical community detection?

They transform nodes into vectors before clustering

40

Which comparison is correct?

Degree counts neighbors, betweenness counts shortest-path participation