Event
Linguistic: verb
Information Retrieval: a relation of entities
Named Entity
A unique entity that is identified by its name. It can be one of the following types:
Persons
Locations
Organizations
Dates
Times
Numeric expressions
Or domain-specific types
Information Extraction Systems Goals
Identify and understand relevant parts of texts
Gather, collate, and link information within and between documents in the corpus
Produce a structured representation of relevant information
Organize information so that it is useful to people
Store information in a semantically precise format that is usable by algorithms
IE Pipeline
Recognition of named entities
Extraction of relations between entities
Knowledge base population
Named Entity Recognition
The goal: identify named entities in a document and tag them with a type.

Encoding Classes for NER
Inside-Outside (IO) encoding is less precise: it cannot mark the boundary between two adjacent entities of the same type
Inside-Outside-Beginning (IOB) encoding is more precise but requires a larger tagset
The benefit is limited in practice, so IO is often used
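The difference between the two encodings can be illustrated with a minimal sketch. The `encode` helper, the example sentence, and the span format `(start, end_exclusive, type)` are assumptions made for illustration, and the IOB variant shown starts every entity with a `B-` tag. With c entity types, IO needs c + 1 tags while IOB needs 2c + 1.

```python
def encode(tokens, spans, scheme="IOB"):
    """Tag tokens given entity spans of the form (start, end_exclusive, type).

    Hypothetical helper for illustration only.
    """
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        for i in range(start, end):
            if scheme == "IOB" and i == start:
                tags[i] = f"B-{etype}"  # first token of an entity
            else:
                tags[i] = f"I-{etype}"  # any other token inside an entity
    return tags


# Two adjacent person names with nothing between them.
tokens = ["Alice", "Smith", "Bob", "Jones", "met"]
spans = [(0, 2, "PER"), (2, 4, "PER")]

io_tags = encode(tokens, spans, scheme="IO")
iob_tags = encode(tokens, spans, scheme="IOB")

print(io_tags)   # the two names merge into one run of I-PER tags
print(iob_tags)  # the B-PER tag on "Bob" marks the second entity
```

Under IO the two names are indistinguishable from a single four-token entity, which is the loss of precision noted above. Such adjacent same-type entities are rare in practice, which is why IO is often considered sufficient.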
Features for NER
Token level: current/previous/next token
Tag level: inferred linguistic classification, such as part-of-speech tags
Label level: previous (and perhaps next) named entity label in the current sequence
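As a rough sketch of how these three feature levels might be combined for one token, the function below builds a feature dictionary. The function name, the feature keys, the boundary markers `<s>` and `</s>`, and the use of part-of-speech tags as the tag-level feature are all assumptions for illustration, not a prescribed feature set.

```python
def token_features(tokens, pos_tags, prev_label, i):
    """Build a feature dict for the token at position i (illustrative only)."""
    return {
        # Token level: current, previous, and next token
        "word": tokens[i].lower(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        # Tag level: an inferred linguistic class (here, a POS tag)
        "pos": pos_tags[i],
        # Label level: the entity label assigned to the previous token
        "prev_label": prev_label,
    }


tokens = ["Angela", "Merkel", "visited", "Paris"]
pos_tags = ["NNP", "NNP", "VBD", "NNP"]

features = token_features(tokens, pos_tags, prev_label="I-PER", i=1)
print(features)
```

A sequence model would compute one such dictionary per token, feeding in the label it predicted for the preceding token.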
Substrings in NER
We can detect patterns of substrings in certain domains; however, there are exceptions.
Token Shape
Some named entity names tend to follow patterns that can be mapped to a simplified representation based on attributes such as:
Token length,
Capitalization,
Numerals,
Greek letters,
Internal punctuation,
etc.
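One possible way to compute such a simplified representation is sketched below. The symbol choices (`X` for uppercase, `x` for lowercase, `d` for digits, `g` for Greek letters) and both function names are assumptions for illustration; other mappings work equally well.

```python
import re


def token_shape(token):
    """Map each character to a class symbol, keeping punctuation as-is."""
    shape = []
    for ch in token:
        if "\u0370" <= ch <= "\u03ff":  # Greek and Coptic Unicode block
            shape.append("g")
        elif ch.isdigit():
            shape.append("d")
        elif ch.isupper():
            shape.append("X")
        elif ch.islower():
            shape.append("x")
        else:
            shape.append(ch)  # internal punctuation such as "-" is preserved
    return "".join(shape)


def short_shape(token):
    """Collapse runs of the same symbol so the shape ignores token length."""
    return re.sub(r"(.)\1+", r"\1", token_shape(token))


print(token_shape("IL-2"))
print(token_shape("TNF-α"))
print(short_shape("CPA1"))
```

The full shape keeps token length, while the collapsed shape generalizes across lengths; which one helps more depends on the domain.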
NER models
Markov Models
Conditional Markov Models make a single decision at a time,
conditioned on evidence from observations and previous decisions
Conditional Random Fields
A whole-sequence conditional model, rather than a chain of local models
Deep learning models
Bidirectional Long Short-Term Memory models (BiLSTMs)
Transformers
Handling Ambiguity: Normalization
Reducing or rewriting something to a common (normal) form.
NE disambiguation
The task of deciding whether two entity mentions refer to the same entity.
NE linking
The task of linking an entity mention to a unique identifier.
Linking with knowledge graph
For persons, organizations, and locations, we typically link to an entry in a knowledge graph. The most commonly used resource is Wikidata.
Entities have
A unique Q-identifier
Properties that connect them to other entities
Validity times for properties
A ranking mechanism for statements

Drawbacks of knowledge graph
Engineering the structure of a knowledge graph (an ontology) is difficult and subjective.
Normalizing Temporal Entities
It is a rule-based approach.
Absolute temporal expressions
Relative temporal expressions
Domain Dependence for Relative Temporal Expression
For relative temporal expressions, a reference time is necessary, which requires domain knowledge to retrieve.
News-style texts: use publication metadata
Narrative texts: use preceding information in the paragraph
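The role of the reference time can be shown with a minimal rule-based sketch. The rule table `OFFSETS`, the function name, and the example publication date are assumptions for illustration; a real normalizer would cover far more expressions.

```python
from datetime import date, timedelta

# Hypothetical rule table: relative expression -> offset in days
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def normalize_relative(expression, reference):
    """Resolve a relative expression against a reference date.

    Returns an ISO 8601 date string, or None if no rule matches.
    """
    key = expression.strip().lower()
    if key not in OFFSETS:
        return None
    return (reference + timedelta(days=OFFSETS[key])).isoformat()


# For a news-style text, the reference date would come from publication metadata.
publication_date = date(2024, 5, 10)

print(normalize_relative("yesterday", publication_date))
print(normalize_relative("Tomorrow", publication_date))
```

The same expression resolves to a different date for every reference time, which is why the reference has to be recovered from the document or its metadata before normalization.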
Implicit Networks
Relations to entities have scores associated with them
Networks are aggregated

NER Applications
