Named Entity Extraction and Linking

Last updated 6:29 PM on 6/4/26

19 Terms

1

Event

In linguistics: an occurrence that is typically expressed by a verb

In information retrieval: a relation between entities

2

Named Entity

A unique entity that is identified by its name. Common types:

  • Persons

  • Locations

  • Organizations

  • Dates

  • Times

  • Numeric expressions

  • Domain-specific types

3

Information Extraction Systems Goals

  • Identify and understand relevant parts of texts

  • Gather, collate, and link information within and between documents in the corpus

  • Produce a structured representation of relevant information

  • Organize information so that it is useful to people

  • Store information in a semantically precise format that is usable by algorithms

4

IE Pipeline

  • Recognition of named entities

  • Extraction of relations between entities

  • Knowledge base population

5

Named Entity Recognition

Goal: identify named entities in a document and tag each one with a type

6

Encoding Classes for NER

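Inside-Outside (IO) encoding is less precise: two adjacent entities of the same type cannot be told apart

Inside-Outside-Beginning (IOB) encoding is more precise but requires a larger tagset (2c + 1 tags instead of c + 1 for c entity types)

  • The benefit is limited in practice, so IO is often used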
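The difference between the two encodings can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `encode`, the span format, and the example sentence are assumptions, and the IOB variant shown puts a B- tag at the start of every entity (often called IOB2).

```python
from typing import List, Tuple

# An entity span is (start, end_exclusive, type); spans are assumed not to overlap.
Span = Tuple[int, int, str]

def encode(tokens: List[str], spans: List[Span], scheme: str = "IOB") -> List[str]:
    """Turn entity spans into one tag per token, using IO or IOB encoding."""
    tags = ["O"] * len(tokens)
    for start, end, etype in spans:
        for i in range(start, end):
            if scheme == "IO":
                tags[i] = f"I-{etype}"
            else:
                # IOB: the first token of each entity gets B-, the rest I-
                tags[i] = f"B-{etype}" if i == start else f"I-{etype}"
    return tags

tokens = ["Fred", "showed", "Sue", "Mengqiu", "Huang", "'s", "new", "painting"]
# Three persons: "Fred", "Sue", and "Mengqiu Huang"
spans = [(0, 1, "PER"), (2, 3, "PER"), (3, 5, "PER")]

io_tags = encode(tokens, spans, "IO")
iob_tags = encode(tokens, spans, "IOB")
print(io_tags)   # "Sue Mengqiu Huang" looks like a single entity
print(iob_tags)  # the B- tag on "Mengqiu" marks the boundary
```

With IO, the adjacent persons "Sue" and "Mengqiu Huang" collapse into one run of I-PER tags; IOB keeps them apart at the cost of one extra tag per entity type.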

7

Features for NER

Token level: the current, previous, and next tokens

Tag level: inferred linguistic classification (e.g., part-of-speech tags)

Label level: the previous (and perhaps next) named entity label in the current sequence
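The three feature levels can be combined into one feature dictionary per token. The sketch below is only an illustration: the function name `token_features`, the feature keys, and the `<START>`/`<END>` padding symbols are assumptions, and the part-of-speech tags are supplied by hand rather than by a tagger.

```python
from typing import Dict, List

def token_features(tokens: List[str], pos_tags: List[str],
                   prev_label: str, i: int) -> Dict[str, str]:
    """Collect token-, tag-, and label-level features for position i."""
    return {
        # Token level: current, previous, and next token
        "word": tokens[i].lower(),
        "prev_word": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_word": tokens[i + 1].lower() if i + 1 < len(tokens) else "<END>",
        # Tag level: inferred linguistic classification (here a POS tag)
        "pos": pos_tags[i],
        # Label level: the entity label already assigned to the previous token
        "prev_label": prev_label,
    }

tokens = ["Ada", "Lovelace", "visited", "London"]
pos_tags = ["NNP", "NNP", "VBD", "NNP"]  # hand-assigned for the example
first = token_features(tokens, pos_tags, "<START>", 0)
last = token_features(tokens, pos_tags, "O", 3)
print(first)
print(last)
```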

8

Substrings in NER

In certain domains, substrings of a token signal its entity type (for example, a characteristic prefix or suffix); however, there are exceptions

9

Token Shape

The names of some entity types tend to follow patterns that can be mapped to a simplified representation based on attributes such as:

  • Token length,

  • Capitalization,

  • Numerals,

  • Greek letters,

  • Internal punctuation,

  • etc.
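One way to compute such a simplified representation is to map every character to a class symbol. The mapping below (X, x, d, g, punctuation kept as is) and the option to collapse repeated symbols are assumptions chosen for this sketch; real systems use many variants.

```python
import re

def token_shape(token: str, collapse: bool = True) -> str:
    """Map a token to a shape string: X upper, x lower, d digit, g Greek letter."""
    symbols = []
    for ch in token:
        if ch.isdigit():
            symbols.append("d")
        elif "α" <= ch.lower() <= "ω":   # Greek letters, checked before case
            symbols.append("g")
        elif ch.isupper():
            symbols.append("X")
        elif ch.islower():
            symbols.append("x")
        else:
            symbols.append(ch)           # keep internal punctuation as is
    shape = "".join(symbols)
    if collapse:
        # Collapse runs of the same symbol; this discards token length
        shape = re.sub(r"(.)\1+", r"\1", shape)
    return shape

print(token_shape("Varicella-zoster"))      # Xx-x
print(token_shape("mRNA", collapse=False))  # xXXX
print(token_shape("CPA1"))                  # Xd
print(token_shape("TGF-β"))                 # X-g
```

Setting `collapse=False` keeps the token length in the shape, which matters when length itself is a useful signal.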

10

NER models

  • Markov Models

    • Conditional Markov Models make a single decision at a time,
      conditioned on evidence from observations and previous decisions

  • Conditional Random Fields

    • A whole-sequence conditional model, rather than a chain of local models

  • Deep learning models

    • Bidirectional Long Short-Term Memory models (BiLSTMs)

    • Transformers
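The "single decision at a time" idea behind Conditional Markov Models can be sketched with greedy left-to-right decoding. Everything here is illustrative: `toy_score` is a hypothetical hand-written scorer standing in for a trained local classifier, and the label set and weights are made up for the example.

```python
from typing import Callable, List

LABELS = ["O", "PER", "LOC"]

def toy_score(tokens: List[str], i: int, prev_label: str, label: str) -> float:
    """Hypothetical local score for assigning `label` at position i."""
    capitalized = tokens[i][:1].isupper()
    score = 0.0
    if label == "O":
        score += 0.5                      # default bias toward "no entity"
    if capitalized and label == "PER":
        score += 1.0
    if capitalized and label == "LOC":
        score += 0.8
    if capitalized and label == "LOC" and i > 0 and tokens[i - 1].lower() == "in":
        score += 1.0                      # evidence from the observations
    if capitalized and label != "O" and label == prev_label:
        score += 0.5                      # evidence from the previous decision
    return score

def greedy_decode(tokens: List[str],
                  score: Callable[[List[str], int, str, str], float] = toy_score) -> List[str]:
    """Pick the best label at each position, conditioned on the previous choice."""
    labels: List[str] = []
    prev = "<START>"
    for i in range(len(tokens)):
        best = max(LABELS, key=lambda lab: score(tokens, i, prev, lab))
        labels.append(best)
        prev = best
    return labels

decoded = greedy_decode(["Ada", "Lovelace", "lived", "in", "London"])
print(decoded)  # ['PER', 'PER', 'O', 'O', 'LOC']
```

A Conditional Random Field would instead score whole label sequences jointly, so an early local mistake cannot propagate in the same way.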

11

Handling Ambiguity: Normalization

Reducing or rewriting an expression to a common (normal) form, so that variants of the same expression compare equal.
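For entity mentions, a normal form can be as simple as a cleaned-up string. The rules below (Unicode NFKC, case folding, dropping punctuation, collapsing whitespace) are assumptions picked for this sketch; the right rules depend on the domain.

```python
import re
import unicodedata

def normalize_mention(mention: str) -> str:
    """Rewrite a mention to a common form so that spelling variants match."""
    text = unicodedata.normalize("NFKC", mention)  # unify compatibility characters
    text = text.casefold()                          # ignore case differences
    text = re.sub(r"[^\w\s]", "", text)             # drop punctuation
    text = re.sub(r"\s+", " ", text).strip()        # collapse whitespace
    return text

print(normalize_mention("  IBM  Corp. "))  # ibm corp
print(normalize_mention("U.S.A."))         # usa
```

Dropping punctuation outright merges "U.S.A." with "USA" but also glues hyphenated names together, which is one reason normalization rules are usually tuned per entity type.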

12

NE disambiguation

The task of deciding whether two entity mentions refer to the same entity.

13

NE linking

The task of linking an entity mention to a unique identifier.
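Linking and disambiguation are closely related: once two mentions are linked to the same identifier, they are known to refer to the same entity. The sketch below uses a tiny alias table; the table contents and the identifiers `E1` and `E2` are made up for illustration and do not come from any real knowledge base.

```python
from typing import Dict, Optional

# Hypothetical alias table: normalized surface form -> made-up identifier
ALIASES: Dict[str, str] = {
    "international business machines": "E1",
    "ibm": "E1",
    "big blue": "E1",
    "paris": "E2",
}

def link(mention: str) -> Optional[str]:
    """Link a mention to a unique identifier, or None if it is unknown."""
    key = " ".join(mention.lower().split())
    return ALIASES.get(key)

def same_entity(mention_a: str, mention_b: str) -> bool:
    """Disambiguation via linking: both mentions resolve to the same identifier."""
    id_a, id_b = link(mention_a), link(mention_b)
    return id_a is not None and id_a == id_b

print(link("IBM"))                     # E1
print(same_entity("IBM", "Big Blue"))  # True
print(same_entity("IBM", "Paris"))     # False
```

A plain lookup cannot handle a surface form that names several entities; real linkers also score the context around the mention to choose among candidates.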

14

Linking with knowledge graph

For persons, organizations, and locations, we typically link to an entry in a knowledge graph. The most commonly used resource is Wikidata.

Entities have

  • A unique Q-identifier

  • Properties that connect them to other entities

  • Validity times for properties

  • A ranking mechanism for statements
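The four ingredients listed above can be modelled with two small data classes. This is a simplified sketch, not the Wikidata data model or API: the class and field names are assumptions, and the identifiers (`Q_CITY`, `P_MAYOR`, `Q_PERSON_A`, ...) are placeholders. The rank names preferred, normal, and deprecated do follow Wikidata's terminology.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class Statement:
    value: str                    # target entity identifier or a literal
    start: Optional[int] = None   # validity start (year), None if open
    end: Optional[int] = None     # validity end (year), None if open
    rank: str = "normal"          # "preferred", "normal", or "deprecated"

@dataclass
class Entity:
    qid: str                      # unique identifier
    label: str
    claims: Dict[str, List[Statement]] = field(default_factory=dict)

def best_statements(entity: Entity, prop: str) -> List[Statement]:
    """Preferred statements if any exist, otherwise normal ones; never deprecated."""
    usable = [s for s in entity.claims.get(prop, []) if s.rank != "deprecated"]
    preferred = [s for s in usable if s.rank == "preferred"]
    return preferred or usable

def valid_in(entity: Entity, prop: str, year: int) -> List[Statement]:
    """Non-deprecated statements whose validity interval contains `year`."""
    return [s for s in entity.claims.get(prop, [])
            if s.rank != "deprecated"
            and (s.start is None or s.start <= year)
            and (s.end is None or year <= s.end)]

city = Entity("Q_CITY", "Example City", {
    "P_MAYOR": [
        Statement("Q_PERSON_A", start=2001, end=2014),
        Statement("Q_PERSON_B", start=2015, rank="preferred"),
        Statement("Q_PERSON_C", start=1990, end=1995, rank="deprecated"),
    ]
})
print([s.value for s in best_statements(city, "P_MAYOR")])  # ['Q_PERSON_B']
print([s.value for s in valid_in(city, "P_MAYOR", 2005)])   # ['Q_PERSON_A']
```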

15

Drawbacks of knowledge graph

Engineering the structure of a knowledge graph (an ontology) is difficult and subjective.

16

Normalizing Temporal Entities

Temporal entities are normalized with a rule-based approach. Two kinds of expressions are distinguished:

  • Absolute temporal expressions

  • Relative temporal expressions

17

Domain Dependence for Relative Temporal Expressions

For relative temporal expressions, a reference time is necessary, which requires domain knowledge to retrieve.

  • News-style texts: use the publication metadata

  • Narrative texts: use preceding information in the paragraph
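A rule-based normalizer makes the role of the reference time concrete. The sketch below is a small assumption-laden example: the function name `normalize_temporal` and the handful of rules are made up for illustration, absolute expressions are limited to ISO dates, and the reference date is assumed to come from something like publication metadata.

```python
import re
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def normalize_temporal(expr: str, reference: date) -> Optional[date]:
    """Map a temporal expression to a calendar date, or None if no rule applies."""
    text = expr.strip().lower()
    # Absolute expression: no reference time needed (only ISO dates here)
    try:
        return date.fromisoformat(text)
    except ValueError:
        pass
    # Relative expressions: resolved against the reference date
    offsets = {"today": 0, "yesterday": -1, "tomorrow": 1}
    if text in offsets:
        return reference + timedelta(days=offsets[text])
    m = re.fullmatch(r"(\d+) days? ago", text)
    if m:
        return reference - timedelta(days=int(m.group(1)))
    m = re.fullmatch(r"(last|next) (\w+)", text)
    if m and m.group(2) in WEEKDAYS:
        target = WEEKDAYS.index(m.group(2))
        if m.group(1) == "last":
            delta = (reference.weekday() - target) % 7 or 7  # 1..7 days back
            return reference - timedelta(days=delta)
        delta = (target - reference.weekday()) % 7 or 7      # 1..7 days ahead
        return reference + timedelta(days=delta)
    return None

published = date(2026, 6, 4)  # e.g. taken from a news article's metadata
print(normalize_temporal("yesterday", published))    # 2026-06-03
print(normalize_temporal("3 days ago", published))   # 2026-06-01
print(normalize_temporal("2026-01-15", published))   # 2026-01-15
```

For narrative text, `reference` would instead be the most recent date resolved earlier in the paragraph, which is exactly where the domain dependence comes in.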


18

Implicit Networks

Relations between entities have scores associated with them

The networks extracted from individual documents are aggregated
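The card does not say how the scores are computed or how the aggregation works, so the following is only one possible reading. It assumes that two entities are related when they co-occur in a document, that the score decays exponentially with the sentence distance between their mentions, and that networks are aggregated by summing edge scores across documents. All names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Tuple

Mention = Tuple[str, int]   # (entity identifier, sentence index)
Edge = Tuple[str, str]      # entity pair, stored in sorted order

def document_edges(mentions: List[Mention]) -> Dict[Edge, float]:
    """Score entity pairs in one document by co-occurrence distance (assumed rule)."""
    edges: Dict[Edge, float] = defaultdict(float)
    for (e1, s1), (e2, s2) in combinations(mentions, 2):
        if e1 == e2:
            continue
        a, b = sorted((e1, e2))
        edges[(a, b)] += math.exp(-abs(s1 - s2))  # closer mentions score higher
    return dict(edges)

def aggregate(documents: List[List[Mention]]) -> Dict[Edge, float]:
    """Aggregate per-document networks by summing the scores of each edge."""
    total: Dict[Edge, float] = defaultdict(float)
    for mentions in documents:
        for edge, weight in document_edges(mentions).items():
            total[edge] += weight
    return dict(total)

docs = [
    [("A", 0), ("B", 0)],             # same sentence
    [("A", 0), ("B", 1), ("C", 1)],   # A and B one sentence apart
]
network = aggregate(docs)
print(network)
```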

19

NER Applications

[Image; content not recoverable]