Flashcards covering key vocabulary related to Artificial Intelligence, Large Language Models, and their implications for human language and cognitive science, based on the LING 1010 lecture notes.
Chatbot
A computer program designed to simulate human conversation, typically by taking a text prompt from a human user as input and returning a text response. Early chatbots often relied on scripted rules and pattern matching.
ELIZA
A historical chatbot created in the 1960s by Joseph Weizenbaum at MIT. It was designed to simulate a conversation with a Rogerian therapist by identifying keywords and rephrasing user input as questions, demonstrating early principles of human-computer interaction.
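The keyword-and-rephrase mechanism can be sketched in a few lines. The rules below are invented for illustration (Weizenbaum's actual DOCTOR script was far richer), using only Python's standard `re` module:

```python
# Illustrative sketch of ELIZA-style keyword matching and rephrasing.
# These rules are invented for demonstration, not Weizenbaum's originals.
import re

RULES = [
    (r"I am (.*)", "Why do you say you are {0}?"),
    (r"I feel (.*)", "How long have you felt {0}?"),
    (r".*\bmother\b.*", "Tell me more about your family."),
]

def respond(user_input: str) -> str:
    for pattern, template in RULES:
        match = re.match(pattern, user_input, re.IGNORECASE)
        if match:
            # Rephrase the user's own words back as a question
            return template.format(*match.groups())
    return "Please go on."  # default when no keyword matches

print(respond("I am sad"))  # -> Why do you say you are sad?
```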
Large Language Models (LLMs)
A recent and significant advance in Artificial Intelligence (AI) that represents a new approach to language production. Unlike traditional methods that use symbolic rules or grammars, LLMs are trained on vast datasets of text and code, allowing them to learn complex patterns and generate human-like text without explicit symbolic programming.
Artificial Intelligence (AI)
A field of computer science and an engineering project focused on building computers and machines capable of exhibiting intelligence. This typically involves developing systems that can carry out tasks traditionally associated with human cognitive abilities, such as reasoning, problem-solving, learning, perception, and language understanding.
Cognitive Science
An interdisciplinary scientific field that investigates information processing in the human brain and mind. It draws on theories and methods from psychology, linguistics, computer science, neuroscience, philosophy, and anthropology to study processes like perception, reasoning, problem-solving, language, and memory.
Linguistics
The branch of cognitive science specifically dedicated to the scientific study of language. It examines the structure, meaning, acquisition, and use of language in humans, exploring its universal properties and diversity across different cultures.
Classical Computational Theory of Mind
A prominent school of thought in Cognitive Science that posits the human mind functions like a digital computer. It suggests that mental processes involve carrying out rule-governed computations on symbolic representations, much like a computer program processes data.
Connectionism
A school of thought in Cognitive Science that models the human mind as a product of the human brain's neural architecture. Inspired by the wiring of neurons, connectionist models (like neural networks) process information through distributed patterns of activity across nodes, without relying on explicit symbols or rules.
Natural Language Processing (NLP)
The branch of Artificial Intelligence (AI) that specifically focuses on the interactions between computers and human (natural) language. Its goal is to enable computers to understand, interpret, and generate human language in a valuable and meaningful way.
Language Model
Any NLP program designed to predict the next word in a sequence, given the preceding words as input. These models operate by calculating probabilities for different words based on statistical patterns observed in their training data.
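As a minimal sketch of this idea, a bigram model estimates next-word probabilities from raw counts; the toy corpus below is invented for illustration:

```python
# Minimal sketch of a bigram language model: predict the next word
# from counts over a toy corpus (illustrative data only).
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat slept".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # count each observed word transition

def next_word_probs(prev: str) -> dict:
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

print(next_word_probs("the"))  # {'cat': 0.667, 'mat': 0.333}
```

Modern language models condition on much longer contexts and use neural networks rather than counts, but the predict-the-next-word objective is the same.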
Generative Pre-trained Transformer (GPT)
The acronym behind a family of large language models (e.g., GPT-4) built on the Transformer architecture. These models are 'generative' because they can produce novel text, 'pre-trained' because they first learn from vast text corpora before any task-specific tuning, and 'Transformers' because they use neural networks with self-attention mechanisms to process language sequences efficiently.
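A minimal sketch of the scaled dot-product self-attention at the core of the Transformer, with toy dimensions and random weights purely for illustration:

```python
# Sketch of scaled dot-product self-attention (toy sizes, random weights).
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # how much each token attends to each other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V  # each output is a weighted mix of value vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))  # 4 tokens, 8-dimensional embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```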
Artificial Neural Networks (ANNs)
Computational models, like those used by large language models, inspired by the structure and function of biological neural networks in the brain. They are not explicitly programmed with rules but learn to solve problems by being 'trained' on large datasets, adjusting connections between artificial neurons.
Parameters (ANNs)
The numerical weights and biases within an Artificial Neural Network that determine the strength and influence of connections between neurons. These parameters are adjusted incrementally during the training process (e.g., using backpropagation) to optimize the network's ability to produce desired outputs.
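At the smallest scale, the parameters are just the weights and bias of a single artificial neuron; the toy values below are illustrative:

```python
# Sketch of what "parameters" means: one artificial neuron whose output
# is determined by its weights and bias (toy values for illustration).
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias, passed through a sigmoid activation
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

weights = [0.4, -0.6]  # parameters adjusted during training
bias = 0.1             # also a trainable parameter
print(neuron([1.0, 2.0], weights, bias))  # ~0.33; changes as parameters change
```

Training repeatedly nudges values like these, via backpropagation, across billions of such connections in a large model.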
Benchmarking
A standardized method used to systematically evaluate the capabilities and limitations of large language models across various linguistic tasks. It often involves testing models against curated datasets like BLiMP to quantitatively score their performance on specific aspects of language understanding and generation.
BLiMP (The Benchmark of Linguistic Minimal Pairs)
A large-scale resource widely used for scoring and evaluating the grammatical understanding of language models. It consists of 67 sub-datasets, each containing 1,000 carefully constructed 'minimal pairs': sentences that differ in only a single linguistic feature, designed to test a specific grammatical contrast.
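An illustrative minimal pair in the BLiMP format (this specific pair is invented for demonstration, not quoted from the benchmark):

```python
# Invented example in the style of a BLiMP minimal pair:
# the two sentences differ only in subject-verb agreement.
minimal_pair = {
    "phenomenon": "subject-verb agreement",
    "sentence_good": "The cats near the door sleep.",
    "sentence_bad":  "The cats near the door sleeps.",
}
# A model passes the pair if it assigns higher probability
# (lower surprisal) to sentence_good than to sentence_bad.
```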
Surprisal
A quantitative measure used in evaluating language models that reflects how unexpected a word or sequence of words is in a given context, defined as the negative log probability the model assigns to that word given the preceding context. A lower surprisal value indicates that the word or sentence is more probable, and therefore more grammatically plausible, according to the model's predictions.
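A minimal sketch of the formula, surprisal(w) = -log2 P(w | context), using invented toy probabilities rather than real model outputs:

```python
# Sketch of the surprisal formula: surprisal(w) = -log2 P(w | context).
# Probabilities below are invented toy values, not model outputs.
import math

def surprisal(prob: float) -> float:
    return -math.log2(prob)  # in bits; rarer words carry more surprisal

print(surprisal(0.5))   # 1.0 bit   (fairly expected word)
print(surprisal(0.01))  # ~6.64 bits (very unexpected word)
```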
Neural Network Probing
A developing field of research that employs various techniques to interpret what is happening inside the internal layers of a neural network. This work matters because experts currently cannot fully explain how these models arrive at their outputs, leaving them largely opaque.
Black Boxes (LLMs)
A term commonly used to describe Large Language Models due to the significant difficulty in interpreting their inner workings. Their complex, non-linear architectures and vast number of parameters make it challenging to understand precisely how they arrive at their outputs or make specific decisions.
Empiricists (language acquisition)
Proponents of the view that humans learn language primarily through environmental input and general cognitive learning mechanisms. On this view, no innate, language-specific learning biases are required: a general-purpose learning device, exposed to sufficient data, would suffice for language acquisition.
Stochastic Parrot
A skeptical characterization suggesting that an LLM is merely a sophisticated system for 'stochastically' (randomly, based on probabilities) stitching together sequences of linguistic forms observed in its massive training data. This view argues that LLMs reproduce patterns without genuine comprehension, understanding, or reference to meaning, essentially mimicking language without true intelligence.