Tokenizers
A fundamental tool in Natural Language Processing (NLP) that breaks raw text into smaller, manageable units called tokens. It determines what LLMs "see" as input.
How are tokenizers made?
Byte pair encoding (BPE)
Make every character its own token
Find the most frequent adjacent pair of tokens and merge it into a new token.
Repeat until the vocabulary reaches a predetermined size.
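The three BPE steps above can be sketched in a few lines of Python. This is a minimal illustration (whitespace pre-tokenization, character-level start), not any library's actual implementation:

```python
from collections import Counter

def bpe_train(corpus, vocab_size):
    # Step 1: make every character its own token.
    words = [list(w) for w in corpus.split()]
    vocab = set(t for w in words for t in w)
    while len(vocab) < vocab_size:
        # Step 2: count adjacent token pairs across the corpus.
        pairs = Counter()
        for w in words:
            for a, b in zip(w, w[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        # Merge the most frequent pair into a single new token.
        (a, b), _ = pairs.most_common(1)[0]
        new_token = a + b
        vocab.add(new_token)
        merged = []
        for w in words:
            out, i = [], 0
            while i < len(w):
                if i + 1 < len(w) and w[i] == a and w[i + 1] == b:
                    out.append(new_token)
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            merged.append(out)
        words = merged  # Step 3: repeat on the re-tokenized corpus.
    return vocab, words
```

Real tokenizers (e.g. GPT's) operate on bytes rather than characters and record the merge order so new text can be tokenized consistently.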
Arguments against concepts being in a metric space
How to determine if something is in a metric space
Distance is symmetric: d(x, y) = d(y, x)
But human similarity judgments are asymmetric: people rate "Motels are like hotels" differently from "Hotels are like motels," so sim(motel, hotel) − sim(hotel, motel) ≠ 0.
Likewise, sim(dog, wolf) − sim(wolf, dog) ≠ 0 ("Dogs are like wolves" vs. "Wolves are like dogs").
Triangle inequality: for any three points, the direct distance between two of them can never exceed the path through the third: d(x, z) ≤ d(x, y) + d(y, z)
d(king, woman) ≤ d(king, man) + d(man, woman) does not always hold for judged conceptual similarity
↳ But, maybe concepts are vectors.
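The axioms above hold automatically for any genuine metric on vectors, which is the tension the notes point at: vector representations are symmetric and obey the triangle inequality by construction, while human similarity judgments apparently do not. A minimal check, using Euclidean distance and made-up 2-D "concept vectors" purely for illustration:

```python
import math

def euclidean(x, y):
    # Standard Euclidean distance: symmetric by construction.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical toy coordinates, not real embeddings.
hotel, motel = (1.0, 2.0), (1.2, 1.8)
assert euclidean(hotel, motel) == euclidean(motel, hotel)  # symmetry holds

# Triangle inequality: d(x, z) <= d(x, y) + d(y, z) for any three points.
king, man, woman = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
assert euclidean(king, woman) <= euclidean(king, man) + euclidean(man, woman)
```

So if concepts are points in a metric space, asymmetric judgments like the hotel/motel example cannot arise; the escape route is that concepts might be vectors whose comparison is something richer than a distance.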
loss functions
the pareidolia argument against LLMs (17)
idea of intelligence as innovation and recombination of knowledge on rapid timescales (17)
What is ARC and why do people like it? (17)
Be able to articulate an opinion on AI as stochastic parrots! (17)
language models as agent models
foundation models of behavior (Centaur)
transformer architecture
tokens
embeddings
keys (what I know)
queries (what I need)
values (my information)
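The keys/queries/values mnemonic above can be made concrete as scaled dot-product attention: each query scores every key, softmax turns the scores into weights, and the weights mix the values. A dependency-free single-head sketch (toy dimensions, not a full transformer layer):

```python
import math

def attention(queries, keys, values):
    # queries/keys: lists of d_k-dim vectors; values: list of vectors.
    d_k = len(keys[0])
    out = []
    for q in queries:
        # Score "what I need" (query) against "what I know" (keys).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        # Softmax (shifted by max for numerical stability).
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted mix of "my information" (values).
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out
```

With a query aligned to the first key, the output is pulled toward the first value, which is the whole mechanism: attention routes information from positions whose keys match the query.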
compositionality in transformers
word learning in transformers (and limitations)
digital twin studies (including object and social behavior + criticisms)
AGI through deep learning is intractable
examples of deep learning models learning shortcuts
What makes a model good?
Idea that good model fits don’t imply that DNNs are like the brain