Grounding in NLP
Connecting textual mentions in a corpus to real-world entities or concepts.
Entity Linking
Grounding named entities by linking them to real-world references (e.g., linking 'Einstein' to a Wikipedia page).
Word Sense Disambiguation (WSD)
Resolving ambiguity in common words based on context.
Key challenges in grounding and WSD
Polysemy (one word with several related senses, e.g., 'mouth' of a person or of a river) and homonymy (unrelated words that share the same spelling or pronunciation, e.g., 'pen' as a writing tool or as an enclosure).
WordNet
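A lexical database that groups synonyms into synsets and encodes relations between them, such as hypernymy (a more general term, e.g., 'vehicle' for 'car'), hyponymy (a more specific term, the is-a or type-of direction, e.g., 'car' for 'vehicle'), and meronymy (part-of).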
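To make the relations concrete, here is a minimal sketch that stores a few is-a and part-of links in plain dictionaries. The entries are hand-written for illustration and are not taken from WordNet itself; the names `HYPERNYM`, `MERONYMS`, `hypernym_chain`, and `hyponyms` are hypothetical.

```python
# Toy, hand-written fragment of a WordNet-style hierarchy (illustrative only).
HYPERNYM = {  # child -> parent, read as "child is-a parent"
    "car": "vehicle",
    "truck": "vehicle",
    "vehicle": "artifact",
}
MERONYMS = {  # whole -> parts, read as "part is part-of whole"
    "car": ["wheel", "engine"],
}

def hypernym_chain(term):
    """Follow is-a links upward from term until no parent is recorded."""
    chain = []
    while term in HYPERNYM:
        term = HYPERNYM[term]
        chain.append(term)
    return chain

def hyponyms(term):
    """Return the direct, more specific terms recorded under term."""
    return sorted(child for child, parent in HYPERNYM.items() if parent == term)

print(hypernym_chain("car"))  # ['vehicle', 'artifact']
print(hyponyms("vehicle"))    # ['car', 'truck']
print(MERONYMS["car"])        # ['wheel', 'engine']
```

Hypernymy and hyponymy are the same link read in opposite directions, which is why one dictionary serves both lookups here.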
BabelNet
A multilingual semantic network combining WordNet and Wikipedia.
Methods for WSD
Dictionary-based (compares a word's context with sense definitions in a lexical resource), supervised learning (trains a classifier on sense-annotated corpora), and unsupervised learning (clusters similar word contexts without labeled data).
WSD Evaluation
Intrinsic evaluation (accuracy on test datasets) and extrinsic evaluation (impact on downstream tasks like information retrieval).
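Intrinsic evaluation can be sketched as a plain accuracy computation over gold-labeled instances. The sense labels below (`pen.tool`, `pen.enclosure`) and the function name `sense_accuracy` are made up for illustration and do not come from any real test dataset.

```python
def sense_accuracy(predicted, gold):
    """Fraction of instances whose predicted sense equals the gold sense."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)

# Hypothetical gold annotations and system output for four occurrences of 'pen'.
gold = ["pen.tool", "pen.enclosure", "pen.tool", "pen.tool"]
predicted = ["pen.tool", "pen.tool", "pen.tool", "pen.tool"]

print(sense_accuracy(predicted, gold))  # 0.75
```

Extrinsic evaluation has no equally compact form, since it depends on measuring the downstream task with and without the WSD component.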
Bag-of-Words (BOW) representation
Represents text as an unordered collection of its words (tokens).
Variants of BOW
Baseline BOW (simple token list) and filtered BOW (tokens processed with case handling, lemmatization, or POS tagging).
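A rough sketch of the two variants, assuming tokens are runs of ASCII letters and using a tiny hand-written lemma table in place of a real lemmatizer. The names `LEMMAS`, `baseline_bow`, and `filtered_bow` are hypothetical, and POS tagging is left out because it needs a trained tagger.

```python
import re

# Tiny hand-written lemma table (an assumption standing in for a real lemmatizer).
LEMMAS = {"pens": "pen", "wrote": "write"}

def baseline_bow(text):
    """Split on non-letter characters and keep tokens exactly as written."""
    return [tok for tok in re.split(r"[^A-Za-z]+", text) if tok]

def filtered_bow(text, lowercase=True):
    """Baseline tokens with optional lowercasing, then a lemma lookup."""
    tokens = baseline_bow(text)
    if lowercase:
        tokens = [tok.lower() for tok in tokens]
    return [LEMMAS.get(tok, tok) for tok in tokens]

text = "The Pen wrote pens."
print(baseline_bow(text))  # ['The', 'Pen', 'wrote', 'pens']
print(filtered_bow(text))  # ['the', 'pen', 'write', 'pen']
```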
Applications of BOW
Text classification and sense comparison by calculating overlaps between BOWs derived from WordNet synsets and target text.
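Sense comparison by overlap can be sketched as follows, in the spirit of the simplified Lesk algorithm. The glosses are hand-written stand-ins, not actual WordNet definitions, and the sense labels and the helper names `bow` and `best_sense` are hypothetical.

```python
def bow(text):
    """Lowercased set of whitespace-separated tokens."""
    return set(text.lower().split())

# Hand-written glosses standing in for synset definitions (illustrative only).
SENSES = {
    "pen.tool": "a writing instrument that applies ink to paper",
    "pen.enclosure": "a fenced enclosure for keeping farm animals",
}

def best_sense(context, senses=SENSES):
    """Pick the sense whose gloss shares the most tokens with the context."""
    context_bow = bow(context)
    overlaps = {name: len(context_bow & bow(gloss)) for name, gloss in senses.items()}
    return max(overlaps, key=overlaps.get), overlaps

sense, overlaps = best_sense("the farm animals stayed in the pen")
print(sense)     # pen.enclosure
print(overlaps)  # {'pen.tool': 0, 'pen.enclosure': 2}
```

With raw tokens, frequent words such as 'a' or 'the' can inflate the overlap, which is one reason a filtered BOW may be preferred for this comparison.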
Entity Linking
Matches mentions to specific real-world instances (e.g., linking 'Pen Tennyson' to a person in Wikipedia).
Focus of Disambiguation
Determining which sense of a word is intended in context, rather than linking the mention to a specific real-world instance.
Fine-Grained Representation
Differentiates closely related senses (e.g., WordNet synsets).
Coarse-Grained Representation
Groups related senses into broader categories (e.g., 'vehicle' for 'car, truck, bus').
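The effect of granularity on matching can be illustrated with a small sketch that reuses the card's 'vehicle' example. The mapping `COARSE_OF` and the helpers `to_coarse` and `matches` are hypothetical names introduced only for this illustration.

```python
# Hypothetical mapping from fine-grained labels to a coarse category.
COARSE_OF = {"car": "vehicle", "truck": "vehicle", "bus": "vehicle"}

def to_coarse(label):
    """Map a fine-grained label to its coarse category, or keep it unchanged."""
    return COARSE_OF.get(label, label)

def matches(predicted, gold, coarse=False):
    """Compare two labels at fine or coarse granularity."""
    if coarse:
        return to_coarse(predicted) == to_coarse(gold)
    return predicted == gold

print(matches("truck", "car"))               # False
print(matches("truck", "car", coarse=True))  # True
```

A prediction that is wrong at the fine-grained level can still count as correct once both labels are collapsed into the same coarse category.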
Advantages of Fine-Grained Representations
More precise but require more computational effort.
Advantages of Coarse-Grained Representations
Less precise but faster for large-scale tasks.
Case Sensitivity in Grounding
Differentiates proper nouns (e.g., 'Pen' as a name) from common nouns (e.g., 'pen' as a tool) and handles acronyms (e.g., 'NASA').
Case Handling in BOWs
Normalization for general contexts and preservation for entity-specific tasks.
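The difference between normalizing and preserving case can be seen in a short sketch, assuming simple whitespace tokenization. The function name `tokens` and the example sentence are made up for illustration.

```python
from collections import Counter

def tokens(text, preserve_case):
    """Whitespace tokens, lowercased unless preserve_case is True."""
    toks = text.split()
    return toks if preserve_case else [tok.lower() for tok in toks]

text = "Pen Tennyson used a pen at NASA"

normalized = Counter(tokens(text, preserve_case=False))
preserved = Counter(tokens(text, preserve_case=True))

print(normalized["pen"])                 # 2 (name and tool are merged)
print(preserved["Pen"], preserved["pen"])  # 1 1 (name and tool stay distinct)
print("NASA" in preserved, "NASA" in normalized)  # True False
```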
WordNet Browser
A web-based interface for exploring synsets and their relationships.
Wikipedia and BabelNet in Grounding
Large-scale grounding spaces integrating definitions with entity data.