Quiz 7 - Sequential Neural Nets

20 Terms

1

Who introduced the Transformer model in 2017?

Vaswani et al.

2

What is BERT used for?

Masked language modeling and classification tasks.

3

Name two families of large language models.

GPT and Llama families.

4

What is tokenization?

Breaking input text into subword units for efficient processing.
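
A toy illustration of the idea (greedy longest-match over a made-up vocabulary; real subword tokenizers such as BPE or WordPiece learn their vocabularies from data):

    # Toy greedy longest-match subword tokenizer over a hypothetical vocabulary.
    VOCAB = {"un", "break", "able", "token", "ization"}

    def tokenize(word):
        pieces, i = [], 0
        while i < len(word):
            # Take the longest vocabulary entry that matches at position i.
            for j in range(len(word), i, -1):
                if word[i:j] in VOCAB:
                    pieces.append(word[i:j])
                    i = j
                    break
            else:
                pieces.append(word[i])  # fall back to a single character
                i += 1
        return pieces

    print(tokenize("unbreakable"))   # ['un', 'break', 'able']
    print(tokenize("tokenization"))  # ['token', 'ization']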

5

How are tokens converted into vectors?

Using learned embedding algorithms such as word2vec, which map each token to a dense vector.
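
A minimal numpy sketch of the lookup step: each token id indexes a row of an embedding matrix (random numbers here stand in for vectors trained with word2vec or learned end-to-end):

    import numpy as np

    vocab_size, embed_dim = 1000, 8
    rng = np.random.default_rng(0)
    embedding_table = rng.normal(size=(vocab_size, embed_dim))  # learned in practice

    token_ids = np.array([12, 7, 412])          # output of the tokenizer
    token_vectors = embedding_table[token_ids]  # one dense vector per token
    print(token_vectors.shape)                  # (3, 8)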

6

What is positional encoding?

A method to incorporate the order of tokens into a sequence model.
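
One common choice is the sinusoidal encoding from the original Transformer paper, sketched here in numpy and added to the token embeddings:

    import numpy as np

    def sinusoidal_positional_encoding(seq_len, d_model):
        positions = np.arange(seq_len)[:, None]    # (seq_len, 1)
        dims = np.arange(0, d_model, 2)[None, :]   # even embedding dimensions
        angles = positions / np.power(10000.0, dims / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)  # even indices: sine
        pe[:, 1::2] = np.cos(angles)  # odd indices: cosine
        return pe

    # Added elementwise to the token embeddings so the model can tell positions apart.
    print(sinusoidal_positional_encoding(seq_len=3, d_model=8).shape)  # (3, 8)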

7

How do RNNs handle sequential data?

Through recurrent (looping) connections that carry a hidden state forward, processing the input one timestep at a time.
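
A minimal vanilla-RNN loop in numpy (randomly initialized weights stand in for learned ones):

    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_hidden, seq_len = 8, 16, 5
    W_xh = rng.normal(scale=0.1, size=(d_in, d_hidden))
    W_hh = rng.normal(scale=0.1, size=(d_hidden, d_hidden))
    b_h = np.zeros(d_hidden)

    x = rng.normal(size=(seq_len, d_in))  # one input vector per timestep
    h = np.zeros(d_hidden)                # initial hidden state
    for t in range(seq_len):
        # The same weights are reused at every timestep; h carries the memory.
        h = np.tanh(x[t] @ W_xh + h @ W_hh + b_h)
    print(h.shape)  # final hidden state summarizes the whole sequence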

8

When do you compute loss only from the last unit of an RNN?

For tasks requiring a summary output, like classification.
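
Continuing the sketch above: only the final hidden state feeds the classifier and the loss (toy numbers, hypothetical class count):

    import numpy as np

    rng = np.random.default_rng(1)
    d_hidden, n_classes = 16, 3
    h_last = rng.normal(size=d_hidden)          # final RNN hidden state
    W_out = rng.normal(scale=0.1, size=(d_hidden, n_classes))

    logits = h_last @ W_out
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                        # softmax over the classes
    loss = -np.log(probs[2])                    # cross-entropy from the last unit only
    print(loss)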

9

What is a bidirectional RNN?

An RNN that processes input in both forward and backward directions.
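
A sketch of the idea: run one RNN left-to-right and another right-to-left, then concatenate their hidden states at each timestep (toy tanh cells with random weights):

    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_in, d_hidden = 5, 8, 16
    x = rng.normal(size=(seq_len, d_in))
    W_f = rng.normal(scale=0.1, size=(d_in + d_hidden, d_hidden))  # forward weights
    W_b = rng.normal(scale=0.1, size=(d_in + d_hidden, d_hidden))  # backward weights

    def run(inputs, W):
        h, states = np.zeros(d_hidden), []
        for x_t in inputs:
            h = np.tanh(np.concatenate([x_t, h]) @ W)
            states.append(h)
        return states

    forward = run(x, W_f)                # t = 0 .. T-1
    backward = run(x[::-1], W_b)[::-1]   # processed in reverse, then re-aligned
    h_bi = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
    print(h_bi[0].shape)  # (32,) -- both directions available at every position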

10

How does a gated RNN differ from a standard one?

It uses gates to control which information is stored or discarded, enhancing memory capabilities.
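
A sketch of one GRU-style step: sigmoid gates (values between 0 and 1) decide how much of the old state to keep and how much new information to write (random weights, simplified update):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    rng = np.random.default_rng(0)
    d_in, d_hidden = 8, 16
    x_t = rng.normal(size=d_in)
    h_prev = rng.normal(size=d_hidden)
    # One weight matrix per gate, acting on [x_t, h_prev].
    W_z, W_r, W_h = (rng.normal(scale=0.1, size=(d_in + d_hidden, d_hidden)) for _ in range(3))

    xh = np.concatenate([x_t, h_prev])
    z = sigmoid(xh @ W_z)   # update gate: keep old state vs. overwrite it
    r = sigmoid(xh @ W_r)   # reset gate: how much of the past to use
    h_cand = np.tanh(np.concatenate([x_t, r * h_prev]) @ W_h)  # candidate new state
    h_t = (1 - z) * h_prev + z * h_cand                        # gated blend of old and new
    print(h_t.shape)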

11

What connects the encoder and decoder in an encoder-decoder architecture?

A fixed-length context vector summarizing the input sequence.
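
A toy sketch of that bottleneck: the encoder compresses the input into one vector, which then initializes the decoder (plain tanh updates stand in for real RNN cells):

    import numpy as np

    rng = np.random.default_rng(0)
    d = 16
    W_enc = rng.normal(scale=0.1, size=(d, d))
    W_dec = rng.normal(scale=0.1, size=(d, d))
    source = rng.normal(size=(6, d))   # embedded source-language tokens

    # Encoder: fold the whole input sequence into one hidden state.
    h = np.zeros(d)
    for x_t in source:
        h = np.tanh(x_t @ W_enc + h)
    context = h                        # fixed-length summary of the input

    # Decoder: start from the context vector and unroll output steps.
    s = context
    for _ in range(4):
        s = np.tanh(s @ W_dec)         # each state would feed an output-token softmax
    print(context.shape, s.shape)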

12

Name a task where encoder-decoder architecture is used.

Language translation.

13

What does the attention mechanism calculate?

A weighted sum of information from previous tokens based on similarity.
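
The core computation for a single query, sketched in numpy: similarity scores against every key become softmax weights over the corresponding values:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 8
    query = rng.normal(size=d)          # the current token's query
    keys = rng.normal(size=(5, d))      # one key per previous token
    values = rng.normal(size=(5, d))    # one value per previous token

    scores = keys @ query / np.sqrt(d)  # similarity of the query to each key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax -> attention weights
    output = weights @ values           # weighted sum of the values
    print(weights.round(2), output.shape)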

14

What are query, key, and value vectors in attention?

Transformed representations of token vectors for computing relevance.
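
Sketch of where Q, K, and V come from: the same token vectors X are multiplied by three different learned matrices (random here), then combined as above:

    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_model, d_head = 5, 16, 8
    X = rng.normal(size=(seq_len, d_model))  # token vectors (embeddings + positions)
    W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d_model, d_head)) for _ in range(3))

    Q, K, V = X @ W_q, X @ W_k, X @ W_v      # three views of the same tokens
    scores = Q @ K.T / np.sqrt(d_head)       # relevance of every token to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    print((weights @ V).shape)               # (seq_len, d_head)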

15

What is an attention head?

A unit that computes attention for a specific aspect of input.
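
Multi-head attention runs several such heads in parallel, each with its own projections, and concatenates their outputs (a rough sketch with random weights):

    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_model, n_heads = 5, 16, 4
    d_head = d_model // n_heads
    X = rng.normal(size=(seq_len, d_model))

    def softmax(a):
        e = np.exp(a - a.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    heads = []
    for _ in range(n_heads):
        # Each head has its own Q/K/V projections, so it can focus on a different aspect.
        W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d_model, d_head)) for _ in range(3))
        Q, K, V = X @ W_q, X @ W_k, X @ W_v
        heads.append(softmax(Q @ K.T / np.sqrt(d_head)) @ V)

    print(np.concatenate(heads, axis=-1).shape)  # (seq_len, d_model)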

16

What components are inside a transformer block?

Attention layers, feed-forward networks, and residual connections.
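
A simplified sketch of how those pieces stack inside one block (single-head attention, layer normalization omitted, random weights):

    import numpy as np

    rng = np.random.default_rng(0)
    seq_len, d_model, d_ff = 5, 16, 32
    X = rng.normal(size=(seq_len, d_model))

    def softmax(a):
        e = np.exp(a - a.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def self_attention(x):
        W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))
        Q, K, V = x @ W_q, x @ W_k, x @ W_v
        return softmax(Q @ K.T / np.sqrt(d_model)) @ V

    def feed_forward(x):
        W1 = rng.normal(scale=0.1, size=(d_model, d_ff))
        W2 = rng.normal(scale=0.1, size=(d_ff, d_model))
        return np.maximum(0, x @ W1) @ W2   # two linear layers with a ReLU in between

    # Each sub-layer's output is added back onto its input (residual connections).
    X = X + self_attention(X)
    X = X + feed_forward(X)
    print(X.shape)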

17

Why are residual connections used in transformer blocks?

To preserve the original input information alongside processed data.

18

What is the difference between masked and auto-regressive models?

Masked models (like BERT) attend to all input tokens at once and predict masked-out tokens, while auto-regressive models (like GPT) predict each token using only the tokens that come before it.
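
One way to see the difference is in the attention mask: an auto-regressive (GPT-style) model applies a causal mask so each position only attends to earlier positions, while a masked (BERT-style) model attends everywhere and instead predicts tokens hidden behind [MASK]:

    import numpy as np

    seq_len = 4
    scores = np.ones((seq_len, seq_len))    # stand-in attention scores

    # Auto-regressive: block attention to future positions with a causal mask.
    causal_mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    print(np.where(causal_mask, -np.inf, scores))  # -inf above the diagonal -> zero weight

    # Masked LM: no causal mask; some input tokens are replaced and must be predicted.
    print(["the", "[MASK]", "sat", "down"])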

19

What is the purpose of fine-tuning an LLM?

To adapt the model for specific tasks using a pre-trained base.

20

What is prompt engineering?

Crafting inputs to guide LLMs toward desired outputs.
