transformers/gen ai

11 Terms

1

sequence modeling

  • decoder-only, GPT-style

  • learn to predict the next token in a sequence

  • ex: language modeling, music generation, etc.

  • formula: P(x) = P(x1) * P(x2 | x1) * ... * P(xT | x<T), i.e. the product over t of P(xt | x<t)
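
a minimal sketch of this chain-rule factorization in Python; cond_prob is a hypothetical stand-in for a real model's softmax output:

```python
import math

VOCAB_SIZE = 3

# hypothetical stand-in for a trained decoder-only model: a real
# GPT-style network would produce P(x_t | x_<t) from a softmax over
# the vocabulary at each step; here every next token is equally likely
def cond_prob(prefix, token):
    return 1.0 / VOCAB_SIZE

def sequence_log_prob(tokens):
    # chain rule: log P(x) = sum over t of log P(x_t | x_<t)
    return sum(math.log(cond_prob(tokens[:t], tokens[t]))
               for t in range(len(tokens)))

print(sequence_log_prob([0, 2, 1]))  # 3 * log(1/3) ≈ -3.296
```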

2

sequence-to-sequence (seq2seq)

  • encoder-decoder (original transformer), translation-style

  • input sequence (e.g., an English sentence) → output sequence (its German translation)

  • ex: translation, Q&A, text-to-speech

  • formula: P(x|z) = P(x1 | z) * P(x2 | x1, z) * ... * P(xT | x<T, z)

3

classification

  • encoder-only, BERT-style

  • input = sequence of tokens → output = label/class

  • ex: sentiment analysis, spam detection

  • formula: learn P(c|x)
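
a rough sketch of the P(c|x) head, assuming a pooled encoder output (like BERT's [CLS] vector) is already available; the weights here are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, num_classes = 8, 2

pooled = rng.standard_normal(d_model)            # stand-in for the encoder's pooled output
W = rng.standard_normal((num_classes, d_model))  # classification head weights (random here)
b = np.zeros(num_classes)

logits = W @ pooled + b
probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax -> P(c | x)
print(probs)                                     # e.g. [P(class 0), P(class 1)]
```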

4

transformers

  • deep feed-forward neural networks that rely on attention mechanisms

  • general purpose sequence models with 3 main use cases:

    • sequence modeling

    • seq2seq

    • classification

5

tokenization

  • process of representing text as a sequence of tokens

  • subword tokenization (e.g., BPE, WordPiece) is most common

  • each token is mapped to a unique integer ID (see the sketch below)
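
a toy greedy (WordPiece-style) subword tokenizer; real vocabularies are learned from a corpus, this one is hard-coded for illustration:

```python
# toy subword vocabulary; real tokenizers (BPE, WordPiece) learn
# these pieces from a corpus rather than hard-coding them
vocab = {"trans": 0, "##form": 1, "##ers": 2, "un": 3, "##able": 4}

def tokenize(word):
    # greedy longest-match-first split, WordPiece-style
    pieces, start = [], 0
    while start < len(word):
        for end in range(len(word), start, -1):
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                pieces.append(piece)
                start = end
                break
        else:
            raise ValueError("no matching subword")
    return pieces

tokens = tokenize("transformers")
ids = [vocab[t] for t in tokens]   # each token -> unique integer ID
print(tokens, ids)                 # ['trans', '##form', '##ers'] [0, 1, 2]
```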

6

token embedding

  • converts token ID → vector

  • like a dictionary lookup into a learned matrix
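
a minimal sketch of that lookup; in a trained model the matrix is learned, here it is random:

```python
import numpy as np

VOCAB_SIZE, D_MODEL = 1000, 16
rng = np.random.default_rng(0)

# in a trained model this matrix is learned; random here for illustration
embedding_matrix = rng.standard_normal((VOCAB_SIZE, D_MODEL))

token_ids = [17, 4, 256]
vectors = embedding_matrix[token_ids]  # the "dictionary lookup": one row per ID
print(vectors.shape)                   # (3, 16)
```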

7

positional embedding

  • adds information about word order

  • without it, model sees text as ‘bag of words’

  • can be learned (fixed maximum length) or sinusoidal (generalizes to any sequence length)
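
a sketch of the sinusoidal variant, following the formula from the original transformer paper; assumes an even d_model:

```python
import numpy as np

def sinusoidal_positions(seq_len, d_model):
    # PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

print(sinusoidal_positions(seq_len=4, d_model=8).shape)  # (4, 8)
```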

8

final/vector embedding

token embedding + positional embedding
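
a tiny sketch of the sum; both inputs are random stand-ins for the real embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8

tok_emb = rng.standard_normal((seq_len, d_model))  # rows looked up from the embedding matrix
pos_emb = rng.standard_normal((seq_len, d_model))  # learned or sinusoidal positions

x = tok_emb + pos_emb   # the final embedding fed into the first transformer layer
print(x.shape)          # (4, 8)
```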

9

attention

decides which parts of the sequence to focus on by letting each position weigh all the others

10

attention score

similarity between a query and a key, typically computed as a scaled dot product; softmaxing the scores gives the attention weights
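
a minimal scaled dot-product attention sketch, i.e. softmax(QKᵀ / √d_k) · V; Q, K, V are random stand-ins for the learned projections:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query with each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                # weighted sum of the values

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 positions, d_k = 8
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```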

11

multi-head attention

  • multiple attention mechanisms run in parallel, each capturing different relationships

  • outputs are concatenated and linearly combined
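
a sketch of multi-head attention with random matrices standing in for the learned projections; each head runs the same scaled dot-product attention, then the outputs are concatenated and mixed by a final linear layer:

```python
import numpy as np

def multi_head_attention(X, num_heads, rng):
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # per-head projections (learned in a real model; random here)
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(d_head)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)   # softmax over keys
        heads.append(w @ V)                  # each head attends independently
    concat = np.concatenate(heads, axis=-1)  # (seq_len, d_model)
    Wo = rng.standard_normal((d_model, d_model))
    return concat @ Wo                       # final linear combination

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 16))
print(multi_head_attention(X, num_heads=4, rng=rng).shape)  # (4, 16)
```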
