Flashcards about Large Language Models
What is the basic function of a Large Language Model (LLM)?
A very smart guesser that predicts the next word in a sequence.
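The "smart guesser" idea on this card can be sketched in a few lines of Python. The words and probabilities below are made-up illustrations, not real model output:

```python
import random

# Hypothetical probabilities for what follows the prompt
# "the cat sat on the" — illustrative numbers only.
next_word_probs = {"mat": 0.55, "sofa": 0.25, "roof": 0.15, "moon": 0.05}

def predict_next(probs):
    # Sample one word, weighted by its probability — the same
    # kind of weighted guess an LLM makes at every step.
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

print(predict_next(next_word_probs))  # usually "mat", sometimes another word
```

A real LLM does exactly this, except the distribution covers tens of thousands of tokens and is computed fresh from the whole prompt at each step.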
How are LLMs trained?
By reading billions of sentences from books, websites, and code to find patterns in how words follow each other.
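The simplest possible version of "finding patterns in how words follow each other" is counting word pairs. This toy bigram counter (a stand-in, not how transformers actually train) shows the core idea:

```python
from collections import Counter, defaultdict

# A tiny corpus standing in for "billions of sentences".
corpus = "the cat sat on the mat and the cat ran".split()

# Count which word follows which — the rawest form of
# "patterns in how words follow each other".
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

# After "the", the best guess is whatever followed it most often.
print(follows["the"].most_common(1))  # [('cat', 2)]
```

LLMs replace these raw counts with learned weights, but the objective is the same: given the words so far, predict what comes next.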
How do models learn using weights and tokens?
Words are turned into tokens (numbers), which then go through layers of math (transformers) that tweak values based on context and meaning; these tweaks are controlled by weights adjusted during training based on prediction errors.
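The token-and-weight pipeline on this card can be sketched with NumPy. The vocabulary, sizes, and random values here are all toy assumptions:

```python
import numpy as np

# Hypothetical tiny vocabulary: each word maps to a token id.
vocab = {"the": 0, "cat": 1, "sat": 2}
token_ids = [vocab[w] for w in "the cat sat".split()]  # [0, 1, 2]

# Embedding table: one weight vector per token. Random here; in a
# real model these values are adjusted during training whenever the
# model's next-word prediction is wrong.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), 4))  # 3 tokens x 4 dims

# "Layers of math": one linear layer mixing the vectors via weights.
W = rng.normal(size=(4, 4))          # layer weights (also learned)
hidden = embeddings[token_ids] @ W   # tweaked values for each token
print(hidden.shape)                  # (3, 4)
```

A transformer stacks dozens of such layers (plus attention), but every one of them is still just weighted arithmetic on token vectors.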
What is the attention mechanism in LLMs?
It lets the model weigh every word in the prompt against every other word at once and focus on the most relevant ones; this parallel, whole-context view is what lets transformers outperform older sequential models like RNNs and LSTMs.
Why do LLMs feel smart?
They mimic style, structure, and facts from training data through pattern-matching, enabling them to generate various types of text.
What do LLMs use to predict the next word?
Statistical patterns from massive datasets, tokenized inputs, transformer layers, and attention mechanisms.
What is the basis of an LLM's excellence?
Deep pattern recognition that produces excellent surface-level prediction, not reasoning or genuine understanding.
Where does the power of LLMs lie?
Scale, not sentience.