Who introduced the Transformer model in 2017?
Vaswani et al., in the paper "Attention Is All You Need".
What is BERT used for?
Masked language modeling and classification tasks.
Name two families of large language models.
GPT and Llama families.
What is tokenization?
Breaking input text into subword units for efficient processing.
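A minimal sketch of the idea, using greedy longest-match over a hand-picked toy vocabulary; real subword tokenizers such as BPE or WordPiece learn their vocabularies from a corpus:

```python
# Greedy longest-match subword tokenization over a toy vocabulary.
# The vocabulary is illustrative; real tokenizers learn it from data.
VOCAB = {"un", "believ", "able", "token", "ization"}

def tokenize(word: str) -> list[str]:
    """Split a word into the longest vocabulary pieces, left to right."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest match first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            pieces.append("<unk>")          # no vocabulary piece matched
            i += 1
    return pieces

print(tokenize("unbelievable"))   # ['un', 'believ', 'able']
print(tokenize("tokenization"))   # ['token', 'ization']
```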
How are tokens converted into vectors?
Via a learned embedding table that maps each token ID to a dense vector; standalone algorithms such as word2vec learn similar vectors for whole words.
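A sketch of the lookup itself, with randomly initialized weights standing in for trained ones and purely illustrative sizes:

```python
import numpy as np

# Embedding lookup: each token ID indexes one row of a weight matrix.
# In a trained model these rows are learned; here they are random.
vocab_size, d_model = 8, 4
rng = np.random.default_rng(0)
embedding = rng.normal(size=(vocab_size, d_model))  # one row per token ID

token_ids = np.array([3, 1, 5])     # e.g. IDs produced by a tokenizer
vectors = embedding[token_ids]      # shape (3, 4): one vector per token
print(vectors.shape)
```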
What is positional encoding?
A method to incorporate the order of tokens into a sequence model.
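One concrete scheme is the sinusoidal encoding from the original Transformer paper; the sketch below assumes an even d_model:

```python
import numpy as np

# Sinusoidal positional encoding:
#   PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
#   PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
# These vectors are added to token embeddings so the model sees order.
def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]      # (1, d_model / 2)
    angles = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions
    pe[:, 1::2] = np.cos(angles)               # odd dimensions
    return pe

print(positional_encoding(seq_len=10, d_model=8).shape)  # (10, 8)
```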
How do RNNs handle sequential data?
Through recurrent (looped) connections that process the input one timestep at a time, carrying a hidden state forward.
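A minimal vanilla RNN cell unrolled over time; weights are randomly initialized here purely for illustration:

```python
import numpy as np

# The hidden state h is updated once per timestep, carrying information
# forward through the recurrence.
rng = np.random.default_rng(0)
d_in, d_h, T = 3, 5, 4                  # input size, hidden size, timesteps
W_xh = rng.normal(size=(d_in, d_h)) * 0.1
W_hh = rng.normal(size=(d_h, d_h)) * 0.1
b_h = np.zeros(d_h)

x = rng.normal(size=(T, d_in))          # one input vector per timestep
h = np.zeros(d_h)                       # initial hidden state
for t in range(T):
    h = np.tanh(x[t] @ W_xh + h @ W_hh + b_h)   # recurrence
print(h.shape)                          # (5,): final hidden state
```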
When do you compute loss only from the last unit of an RNN?
For tasks requiring a single summary output, such as sequence classification, where the final hidden state stands in for the whole input.
What is a bidirectional RNN?
An RNN that processes input in both forward and backward directions.
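A sketch of the idea: one pass reads left to right, a second pass reads right to left, and the two hidden states are concatenated per timestep. Weights are illustrative:

```python
import numpy as np

def rnn_pass(x, W_xh, W_hh):
    """Run a simple RNN over x, returning the hidden state at every step."""
    h, hs = np.zeros(W_hh.shape[0]), []
    for x_t in x:
        h = np.tanh(x_t @ W_xh + h @ W_hh)
        hs.append(h)
    return np.stack(hs)                           # (T, d_h)

rng = np.random.default_rng(1)
T, d_in, d_h = 4, 3, 5
x = rng.normal(size=(T, d_in))
Wf_xh, Wf_hh = rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h))
Wb_xh, Wb_hh = rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h))

h_fwd = rnn_pass(x, Wf_xh, Wf_hh)                 # left-to-right pass
h_bwd = rnn_pass(x[::-1], Wb_xh, Wb_hh)[::-1]     # right-to-left, re-aligned
h_bi = np.concatenate([h_fwd, h_bwd], axis=-1)    # both directions per step
print(h_bi.shape)                                 # (4, 10)
```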
How does a gated RNN differ from a standard one?
It uses gates to control which information is stored or discarded, enhancing memory capabilities.
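A minimal GRU cell, one common flavor of gated RNN: an update gate z decides how much old state to keep, and a reset gate r decides how much old state feeds the candidate. Weights below are illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h, W):
    z = sigmoid(x_t @ W["xz"] + h @ W["hz"])           # update gate
    r = sigmoid(x_t @ W["xr"] + h @ W["hr"])           # reset gate
    h_cand = np.tanh(x_t @ W["xh"] + (r * h) @ W["hh"])  # candidate state
    return (1 - z) * h + z * h_cand                    # gated blend

rng = np.random.default_rng(2)
d_in, d_h = 3, 5
W = {k: rng.normal(size=(d_in if k[0] == "x" else d_h, d_h)) * 0.1
     for k in ["xz", "hz", "xr", "hr", "xh", "hh"]}

h = np.zeros(d_h)
for x_t in rng.normal(size=(4, d_in)):                 # 4 timesteps
    h = gru_step(x_t, h, W)
print(h.shape)
```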
What connects the encoder and decoder in a basic encoder-decoder (seq2seq) architecture?
A fixed-length context vector summarizing the input sequence.
Name a task where encoder-decoder architecture is used.
Language translation.
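A sketch of the idea behind the two cards above: the encoder's final hidden state becomes the fixed-length context vector that initializes the decoder. Both RNNs here are toy single-layer cells with illustrative random weights:

```python
import numpy as np

def rnn(x, h, W_xh, W_hh):
    """Run a simple RNN over x from initial state h; return the final state."""
    for x_t in x:
        h = np.tanh(x_t @ W_xh + h @ W_hh)
    return h

rng = np.random.default_rng(3)
d_in, d_h = 3, 5
We_xh, We_hh = rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h))
Wd_xh, Wd_hh = rng.normal(size=(d_in, d_h)), rng.normal(size=(d_h, d_h))

src = rng.normal(size=(6, d_in))                   # source sentence, 6 tokens
context = rnn(src, np.zeros(d_h), We_xh, We_hh)    # fixed-length summary

tgt = rng.normal(size=(4, d_in))                   # target tokens fed so far
h_dec = rnn(tgt, context, Wd_xh, Wd_hh)            # decoder starts from context
print(context.shape, h_dec.shape)
```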
What does the attention mechanism calculate?
A weighted sum of information from other tokens, weighted by similarity (restricted to previous tokens in auto-regressive models).
What are query, key, and value vectors in attention?
Linear projections of each token vector: queries are matched against keys to score relevance, and values carry the content that gets summed.
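The two cards above fit together in a few lines of scaled dot-product attention; shapes and weights below are illustrative:

```python
import numpy as np

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(X, W_q, W_k, W_v):
    Q, K, V = X @ W_q, X @ W_k, X @ W_v       # project tokens to q/k/v
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # query-key similarity, scaled
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # weighted sum of values

rng = np.random.default_rng(4)
T, d_model, d_k = 5, 8, 4                     # 5 tokens
X = rng.normal(size=(T, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))
print(attention(X, W_q, W_k, W_v).shape)      # (5, 4)
```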
What is an attention head?
One of several parallel attention units, each with its own query/key/value projections, that can specialize in a different aspect of the input; the heads' outputs are concatenated.
What components are inside a transformer block?
Self-attention layers, feed-forward networks, residual connections, and layer normalization.
Why are residual connections used in transformer blocks?
To preserve the original input information alongside processed data.
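A minimal single-head sketch of such a block, with simplified layer norm and illustrative weights (a real block uses multiple heads and learned norm parameters):

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def layer_norm(x, eps=1e-5):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def block(X, W_q, W_k, W_v, W_o, W1, W2):
    # Self-attention sublayer with residual connection.
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V
    X = layer_norm(X + A @ W_o)           # residual keeps original input info
    # Feed-forward sublayer with residual connection.
    F = np.maximum(0, X @ W1) @ W2        # two-layer MLP with ReLU
    return layer_norm(X + F)

rng = np.random.default_rng(5)
T, d = 5, 8
X = rng.normal(size=(T, d))
W_q, W_k, W_v, W_o = (rng.normal(size=(d, d)) * 0.1 for _ in range(4))
W1, W2 = rng.normal(size=(d, 4 * d)) * 0.1, rng.normal(size=(4 * d, d)) * 0.1
print(block(X, W_q, W_k, W_v, W_o, W1, W2).shape)   # (5, 8)
```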
What is the difference between masked and auto-regressive models?
Masked models (like BERT) attend to the full sequence at once and reconstruct hidden tokens, while auto-regressive models (like GPT) predict each token from only the tokens before it.
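A sketch of the causal mask used by GPT-style models: position i may only attend to positions at or before i, while BERT-style models attend over the whole sequence and hide a subset of tokens instead:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

T = 5
scores = np.zeros((T, T))                           # stand-in attention scores
causal = np.triu(np.ones((T, T), dtype=bool), k=1)  # True above the diagonal
scores[causal] = -np.inf                            # block attention to future

print(np.round(softmax(scores), 2))                 # lower-triangular weights
```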
What is the purpose of fine-tuning an LLM?
To adapt the model for specific tasks using a pre-trained base.
What is prompt engineering?
Crafting inputs to guide LLMs toward desired outputs.
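An illustrative few-shot prompt (the wording is made up): examples placed in the prompt steer the model toward the desired format without any weight updates:

```python
# A few-shot classification prompt; the model is expected to continue
# the pattern after the final "Sentiment:" line.
prompt = """Classify the sentiment of each review as positive or negative.

Review: The battery lasts all day.
Sentiment: positive

Review: The screen cracked within a week.
Sentiment: negative

Review: Setup was quick and painless.
Sentiment:"""
print(prompt)
```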