L3 - Sequence Labelling and POS tagging

16 Terms

1

What is sequence labeling in NLP?

  • The task of assigning a label to each token in a sequence.

  • E.g.:

    • Part-of-Speech tagging (POS tagging)

    • Named Entity Recognition (NER)

    • Information extraction

2

Usefulness and challenges of sequence labelling

  • Useful: generalizes information retrieval

  • Challenges:

    • Difficult to extract meaning from raw text

    • Word ambiguity

    • Context sensitivity

3

BIO tagging

  • Tagging format for sequences.

  • B: Begin

  • I: Inside

  • O: Outside

  • E.g.:

    The New York Times reported

    B-org: The

    I-org: New, York, Times

    O: reported
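
A minimal Python sketch of the same example as (token, tag) pairs; the extract_spans helper and the exact tag name "ORG" are illustrative, not from the card:

    # BIO tags for "The New York Times reported": the organization
    # name spans the first four tokens, "reported" is outside it.
    tokens = ["The", "New", "York", "Times", "reported"]
    tags = ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "O"]

    def extract_spans(tokens, tags):
        """Recover entity spans from a BIO-tagged sequence."""
        spans, current = [], []
        for tok, tag in zip(tokens, tags):
            if tag.startswith("B-"):          # B opens a new span
                if current:
                    spans.append(" ".join(current))
                current = [tok]
            elif tag.startswith("I-") and current:
                current.append(tok)           # I continues the open span
            else:                             # O closes any open span
                if current:
                    spans.append(" ".join(current))
                current = []
        if current:
            spans.append(" ".join(current))
        return spans

    print(extract_spans(tokens, tags))  # ['The New York Times']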

4

Sequence labelling as classification

  • Classify each token independently; the classifier uses the surrounding words as features

  • E.g.:

    seems like it

    to classify “like”, look at its neighbours “seems” and “it” (see the sketch below)
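
A minimal sketch of the feature-extraction step this implies, assuming a window of one word on each side; the feature names are made up for illustration:

    # Features for classifying the token at position i: the token itself
    # plus its left and right neighbours (padded at sentence boundaries).
    def window_features(tokens, i):
        return {
            "word": tokens[i],
            "prev": tokens[i - 1] if i > 0 else "<s>",
            "next": tokens[i + 1] if i < len(tokens) - 1 else "</s>",
        }

    # For "like" in "seems like it", the classifier sees "seems" and "it".
    print(window_features(["seems", "like", "it"], 1))
    # {'word': 'like', 'prev': 'seems', 'next': 'it'}

These feature dicts could then feed any off-the-shelf classifier.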

5

Cons of sequence labelling as classification

  • Cannot determine the most likely label sequence over all tokens

  • because it does not model the dependencies between labels

  • Cannot revise earlier decisions once they are made

6

What is a Hidden Markov Model (HMM)?

  • Probabilistic model

  • Predicts tag sequences based on state transitions.

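A toy HMM written out as numpy arrays; the two tags, three words, and all probabilities are made up for illustration (the Forward, Viterbi, and beam search sketches below reuse these parameters):

    import numpy as np

    states = ["DET", "NOUN"]           # hidden states (tags), indexed 0..N-1
    vocab = ["the", "dog", "park"]     # observable words, indexed 0..V-1

    pi = np.array([0.8, 0.2])          # initial state distribution
    A = np.array([[0.1, 0.9],          # A[i, j] = P(qt+1 = sj | qt = si)
                  [0.4, 0.6]])
    B = np.array([[0.9, 0.05, 0.05],   # B[j, k] = P(ot = vk | qt = sj)
                  [0.05, 0.5, 0.45]])

    print(A[0, 1])  # P(NOUN after DET) = 0.9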

7

Markov assumption

Each next step depends only on the current state: P(qt+1 | q1, …, qt) = P(qt+1 | qt)

8

What are the three main problems solved in HMMs?

Given a model λ and a sequence of observations O = O1 O2 … OT

Evaluation

What is the probability that the observations are generated by the model? P(O|λ)

Decoding

What is the most likely state sequence in the model that produced the observations?

Learning

How should we adjust the model’s parameters in order to maximize P(O|λ)?

9

HMM Evaluation (Likelihood)

Solved with the Forward algorithm: dynamic programming over the forward probabilities αt(j) = P(o1 … ot, qt = sj | λ); then P(O|λ) = Σj αT(j)
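
A minimal numpy sketch of the Forward algorithm, reusing the toy parameters from the HMM sketch above; the observation sequence is illustrative:

    import numpy as np

    def forward(pi, A, B, obs):
        """P(O | lambda) by dynamic programming, O(T * N^2)."""
        alpha = pi * B[:, obs[0]]          # alpha_1(j) = pi_j * b_j(o1)
        for o in obs[1:]:
            # Sum alpha over previous states, then weight by the emission.
            alpha = (alpha @ A) * B[:, o]
        return alpha.sum()                 # P(O | lambda) = sum_j alpha_T(j)

    pi = np.array([0.8, 0.2])
    A = np.array([[0.1, 0.9], [0.4, 0.6]])
    B = np.array([[0.9, 0.05, 0.05], [0.05, 0.5, 0.45]])
    print(forward(pi, A, B, [0, 1]))  # likelihood of "the dog"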

10

Forward Algorithm Complexity

O(T·N²), where T is the sequence length and N the number of states

11

What algorithm is used for decoding in HMMs?

Viterbi Algorithm

12

Viterbi Algorithm

  1. Find the best path of length t−1 to each state.

  2. Extend each of those paths by one step to state sj.

  3. Keep the best option (max) and save a backpointer to recover the best path.

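A minimal numpy sketch of Viterbi decoding on the same toy model as above; delta holds the best path scores and back the backpointers:

    import numpy as np

    def viterbi(pi, A, B, obs):
        """Most likely state sequence, O(T * N^2)."""
        N, T = A.shape[0], len(obs)
        delta = np.zeros((T, N))               # best score ending in each state
        back = np.zeros((T, N), dtype=int)     # best predecessor of each state
        delta[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            scores = delta[t - 1][:, None] * A   # extend every best path one step
            back[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) * B[:, obs[t]]
        path = [int(delta[-1].argmax())]         # best final state
        for t in range(T - 1, 0, -1):
            path.append(int(back[t][path[-1]]))  # follow backpointers
        return path[::-1]

    pi = np.array([0.8, 0.2])
    A = np.array([[0.1, 0.9], [0.4, 0.6]])
    B = np.array([[0.9, 0.05, 0.05], [0.05, 0.5, 0.45]])
    print(viterbi(pi, A, B, [0, 1]))  # [0, 1], i.e. DET NOUN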

13

Beam Search

  • Inexact

  • Keeps only the best k hypotheses at each step

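A minimal sketch of beam search decoding on the same toy model; the beam width k and the tuple-based hypotheses are implementation choices, not from the card:

    import numpy as np

    def beam_search(pi, A, B, obs, k=2):
        """Keep only the k highest-scoring hypotheses at each step."""
        N = A.shape[0]
        beam = [((s,), pi[s] * B[s, obs[0]]) for s in range(N)]
        beam = sorted(beam, key=lambda h: h[1], reverse=True)[:k]
        for o in obs[1:]:
            # Extend every surviving hypothesis by every state, then prune.
            candidates = [(path + (s,), score * A[path[-1], s] * B[s, o])
                          for path, score in beam for s in range(N)]
            beam = sorted(candidates, key=lambda h: h[1], reverse=True)[:k]
        return beam[0][0]

    pi = np.array([0.8, 0.2])
    A = np.array([[0.1, 0.9], [0.4, 0.6]])
    B = np.array([[0.9, 0.05, 0.05], [0.05, 0.5, 0.45]])
    print(beam_search(pi, A, B, [0, 1], k=1))  # k=1 is greedy decoding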

14

Difference between Viterbi and Beam Search

  • Viterbi: exact search (under the Markov assumption); evaluates all options.

  • Beam Search: faster but inexact; it skips evaluating some label sequences, so it may miss the best one.

15

POS tagging

States = POS tags; observations = words. Decoding the most likely state sequence assigns a tag to each word.

16

HMM Learning - Supervised

  • Training instances with labeled tags

  • Learning with Maximum Likelihood Estimation (MLE)

  • Transition probabilities (aij):

    aij = Count(qt = si, qt+1 = sj) / Count(qt = si)

  • Observation likelihoods (bj(k)):

    bj(k) = Count(qt = sj, ot = vk) / Count(qt = sj)
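
A minimal sketch of these MLE counts on a made-up two-sentence corpus; the helper names a and b mirror aij and bj(k):

    from collections import Counter

    # Each training instance is a list of (word, tag) pairs.
    corpus = [[("the", "DET"), ("dog", "NOUN")],
              [("the", "DET"), ("park", "NOUN")]]

    trans, trans_tot = Counter(), Counter()  # Count(qt=si, qt+1=sj), Count(qt=si)
    emit, emit_tot = Counter(), Counter()    # Count(qt=sj, ot=vk), Count(qt=sj)

    for sent in corpus:
        tags = [tag for _, tag in sent]
        for word, tag in sent:
            emit[(tag, word)] += 1
            emit_tot[tag] += 1
        for prev, nxt in zip(tags, tags[1:]):
            trans[(prev, nxt)] += 1
            trans_tot[prev] += 1

    def a(i, j):                 # transition probability aij
        return trans[(i, j)] / trans_tot[i]

    def b(j, k):                 # observation likelihood bj(k)
        return emit[(j, k)] / emit_tot[j]

    print(a("DET", "NOUN"))  # 1.0 in this toy corpus
    print(b("NOUN", "dog"))  # 0.5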