Grounding and Word Sense Disambiguation in NLP

21 Terms

1

Grounding in NLP

Connecting textual mentions in a corpus to real-world entities or concepts.

2

Entity Linking

Grounding named entities by linking them to real-world references (e.g., linking 'Einstein' to a Wikipedia page).

3

Word Sense Disambiguation (WSD)

Resolving ambiguity in common words based on context.

4

Key challenges in grounding and WSD

Polysemy (one word with several related senses, e.g., 'mouth' of a person or of a river) and homonymy (words that share the same form but have unrelated meanings, e.g., 'pen' as a writing tool or as an enclosure).

5

WordNet

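A lexical database that groups synonyms into synsets and encodes relations between them: hypernymy (the more general term in an is-a relation, e.g., 'animal' for 'dog'), hyponymy (the more specific term, e.g., 'dog' for 'animal'), and meronymy (part-of, e.g., 'wheel' for 'car').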
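
The is-a relation can be pictured as a chain of links from a specific term up to more general ones. The sketch below uses a small hand-made table (`HYPERNYM_OF` is hypothetical and much simpler than real WordNet data) to show how such a chain could be followed.

```python
# Hypothetical, hand-made hypernym table; real WordNet data is far richer.
HYPERNYM_OF = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "mammal",
    "mammal": "animal",
}

def hypernym_chain(word, hypernym_of=HYPERNYM_OF):
    """Return the list of increasingly general terms above `word`."""
    chain = []
    seen = {word}
    while word in hypernym_of:
        word = hypernym_of[word]
        if word in seen:  # guard against accidental cycles in the table
            break
        seen.add(word)
        chain.append(word)
    return chain

def is_a(word, ancestor, hypernym_of=HYPERNYM_OF):
    """True if `ancestor` is reachable from `word` via hypernym links."""
    return ancestor in hypernym_chain(word, hypernym_of)

dog_chain = hypernym_chain("dog")
```

Here `is_a("dog", "animal")` holds because 'animal' appears in the chain above 'dog', while `is_a("animal", "dog")` does not.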

6

BabelNet

A multilingual semantic network combining WordNet and Wikipedia.

7

Methods for WSD

Dictionary-based (matches a word's context against sense definitions in lexical resources), supervised learning (trains on sense-annotated corpora), and unsupervised learning (clusters similar word contexts without labeled data).

8

WSD Evaluation

Intrinsic evaluation (accuracy on test datasets) and extrinsic evaluation (impact on downstream tasks like information retrieval).
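
As a rough illustration of intrinsic evaluation, accuracy is the fraction of test instances whose predicted sense matches the gold annotation. The sense labels and lists below are made up for the example.

```python
def accuracy(predicted, gold):
    """Fraction of instances where the predicted sense equals the gold sense."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Hypothetical gold annotations and system output for four uses of 'pen'
gold = ["tool", "enclosure", "tool", "enclosure"]
predicted = ["tool", "tool", "tool", "enclosure"]

score = accuracy(predicted, gold)  # 3 of 4 correct -> 0.75
```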

9

Bag-of-Words (BOW) representation

Represents text as an unordered collection of its words (tokens).

10

Variants of BOW

Baseline BOW (the simple token list) and filtered BOW (the token list after preprocessing choices such as case handling, lemmatization, or POS tagging).
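
A minimal sketch of the two variants, assuming a regex tokenizer and a tiny hand-written lemma table (`LEMMAS` is hypothetical; a real pipeline would use a proper lemmatizer). POS tagging is left out to keep the example self-contained.

```python
import re
from collections import Counter

# Hypothetical lemma table standing in for a real lemmatizer
LEMMAS = {"pens": "pen", "holds": "hold", "pigs": "pig"}

def tokenize(text):
    """Split text into alphabetic tokens, keeping the original case."""
    return re.findall(r"[A-Za-z]+", text)

def baseline_bow(text):
    """Baseline BOW: count the raw tokens exactly as they appear."""
    return Counter(tokenize(text))

def filtered_bow(text, lemmas=LEMMAS):
    """Filtered BOW: lowercase each token, then map it to its lemma if known."""
    tokens = (t.lower() for t in tokenize(text))
    return Counter(lemmas.get(t, t) for t in tokens)

text = "Pens hold pigs and a pen holds ink"
raw = baseline_bow(text)
filtered = filtered_bow(text)
```

In the baseline bag, 'Pens' and 'pen' are separate entries; in the filtered bag they collapse into a single 'pen' entry with count 2.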

11

Applications of BOW

Text classification and sense comparison by calculating overlaps between BOWs derived from WordNet synsets and target text.
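
The overlap-based sense comparison can be sketched as follows. This is a simplified, Lesk-style illustration rather than a reference implementation: the glosses in `SENSES` are hand-written stand-ins for WordNet definitions, and the sense names and stopword list are assumptions.

```python
import re

# Hypothetical glosses standing in for WordNet synset definitions of 'pen'
SENSES = {
    "pen.writing_tool": "a writing instrument that applies ink to paper",
    "pen.enclosure": "an enclosure for confining farm animals such as pigs or sheep",
}

# Small illustrative stopword list (an assumption, not from any library)
STOPWORDS = {"a", "an", "the", "to", "for", "of", "in", "or", "as",
             "that", "such", "with"}

def bow(text):
    """Lowercased set of content tokens in `text`."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}

def best_sense(context, senses=SENSES):
    """Score each sense by the size of the gloss/context overlap."""
    context_bow = bow(context)
    scores = {name: len(context_bow & bow(gloss)) for name, gloss in senses.items()}
    return max(scores, key=scores.get), scores

farm_sense, farm_scores = best_sense(
    "The farmer moved the sheep into the pen with the other animals")
ink_sense, ink_scores = best_sense(
    "She filled the pen with ink and wrote on paper")
```

The farm sentence shares 'sheep' and 'animals' with the enclosure gloss, while the second sentence shares 'ink' and 'paper' with the writing-tool gloss, so each gets the expected sense.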

12

Entity Linking

Matches mentions to specific real-world instances (e.g., linking 'Pen Tennyson' to a person in Wikipedia).

13

Focus of Disambiguation

Determining the intended sense of a word in context, without linking it to a specific real-world instance.

14

Fine-Grained Representation

Differentiates closely related senses (e.g., WordNet synsets).

15

Coarse-Grained Representation

Groups related senses into broader categories (e.g., 'vehicle' for 'car, truck, bus').
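
One simple way to picture coarse graining is a lookup that maps each fine label to its broader category. The table below reuses the 'vehicle' example; the function name and the fallback behavior (unmapped labels stay as they are) are assumptions for illustration.

```python
# Fine-to-coarse mapping based on the 'vehicle' example above
COARSE_OF = {"car": "vehicle", "truck": "vehicle", "bus": "vehicle"}

def to_coarse(label, mapping=COARSE_OF):
    """Map a fine-grained label to its coarse category, if one is defined."""
    return mapping.get(label, label)

fine_labels = ["car", "bus", "pen", "truck"]
coarse_labels = [to_coarse(label) for label in fine_labels]

# Four distinct fine labels collapse to two coarse ones
n_fine = len(set(fine_labels))
n_coarse = len(set(coarse_labels))
```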

16

Advantages of Fine-Grained Representations

More precise but require more computational effort.

17

Advantages of Coarse-Grained Representations

Less precise but faster for large-scale tasks.

18

Case Sensitivity in Grounding

Differentiates proper nouns (e.g., 'Pen' as a name) from common nouns (e.g., 'pen' as a tool) and handles acronyms (e.g., 'NASA').

19

Case Handling in BOWs

Normalize case (e.g., lowercase all tokens) for general contexts; preserve case for entity-specific tasks.
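
A small sketch of the two case-handling options, assuming whitespace tokenization and a hypothetical `preserve_case` switch.

```python
from collections import Counter

def bow(text, preserve_case=False):
    """Build a BOW, either keeping the original case or lowercasing tokens."""
    tokens = text.split()
    if not preserve_case:
        tokens = [t.lower() for t in tokens]
    return Counter(tokens)

text = "Pen Tennyson dropped his pen"
normalized = bow(text)                      # 'Pen' and 'pen' are merged
preserved = bow(text, preserve_case=True)   # 'Pen' (name) stays distinct
```

With normalization the name 'Pen' and the noun 'pen' are counted together; with preservation they remain separate entries, which is what an entity-specific task would rely on.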

20

WordNet Browser

A web-based interface for exploring synsets and their relationships.

21

Wikipedia and BabelNet in Grounding

Large-scale grounding spaces integrating definitions with entity data.