Members of the Dartmouth summer research project on artificial intelligence in 1956
Nathaniel Rochester, Ray Solomonoff, Marvin Minsky, John McCarthy, Claude Shannon
algorithm
a set of instructions or actions followed in sequence to solve a problem or perform a task
perceptron
combines weighted inputs and applies an activation function to produce an output
adds up its weighted inputs; if the resulting sum is equal to or greater than the perceptron’s threshold, the perceptron fires (=1), otherwise it doesn’t fire (=0) (see the sketch below)
the foundation for neural networks, modeled on the human brain
mimics the neuron
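a minimal Python sketch of a single perceptron (illustrative only: the weights and threshold are made up, chosen so the unit behaves like a logical AND gate):

```python
# A single perceptron: weighted sum of inputs compared against a threshold.
# Illustrative values only; real perceptrons learn their weights from data.

def perceptron(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of inputs meets the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Made-up weights and threshold that make the unit act as a logical AND.
weights, threshold = [1.0, 1.0], 1.5
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", perceptron([a, b], weights, threshold))
# Fires (prints 1) only for input (1, 1).
```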
hallucination
when AI “perceives patterns or objects that do not exist or are imperceptible to humans,” creating outputs that are inaccurate or nonsensical but presented confidently
who coined the term “artificial intelligence”?
John McCarthy
cybernetics
control and communication in animals and machines with focus on feedback and system dynamics
who led cybernetics
Norbert Wiener
scholars’ goal in 1956
to establish a research program in artificial intelligence
to develop a genuine thinking machine that mirrored human thinking, i.e., artificial general intelligence
Marvin Minsky
co-founded MIT’s AI lab, which focused on symbolic approaches to AI
according to minsky, intelligence is
a suitcase term with many different meanings
Minsky’s focus was on
human symbolic manipulation and ways to mimic that via computers
minsky’s central features of intelligence
search
pattern recognition
learning
planning
inductive reasoning
who invented the perceptron
psychologist Frank Rosenblatt, in the late 1950s
symbolic AI
uses symbols (words/phrases) together with rules; the program combines and processes the symbols according to the rules to perform an assigned task
rule-based
expert systems
who invented symbolic AI
Allen Newell and Herbert Simon
first symbolic AI systems
general problem solver
a program that coded rules for solving logic problems based on human reasoning
foundation of “expert systems” rule-based programming
NSS chess program
used algorithms that searched for good moves plus “heuristics” drawn from known chess strategies
advanced early “decision tree” algorithms in which improbable options are “pruned” (see the sketch below)
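a toy sketch of game-tree search with pruning, in the spirit of those early programs (this is plain alpha-beta pruning on a made-up tree, not the actual NSS algorithm):

```python
# Alpha-beta pruning on a toy game tree: branches that cannot change the
# result are skipped ("pruned"). The nested-list tree below is made up;
# leaf numbers stand in for static evaluation scores.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):          # leaf: return its score
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        score = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:                       # remaining siblings pruned
            break
    return best

tree = [[3, 5], [2, [9, 1]], [0, 4]]            # made-up game tree
print(alphabeta(tree, maximizing=True))         # best score for the maximizer: 3
```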
what is a hidden layer in a multilayer neural network
perceptrons that are neither input nor output units
a connectionist network is
a network whose knowledge resides in the weighted connections between units
backpropagation
when a neural network pushes errors at the output back through the network to determine how much each layer (and weight) contributed to the errors
explainable AI helps AI to be more
fair, accountable, and transparent (FAT)
example of sub-symbolic system
multilayer neural networks
sub-symbolic AI systems
neural networks
reinforcement learning
AI winter
mid-1970s to the 1990s, because research in symbolic systems did not achieve the promises of general AI
who invented connectionism
David Rumelhart and James McClelland at UCSD
connectionist network
the key to generative AI lies in the weighted connections between units
the key to general AI
a computational architecture that’s like the brain and can learn from data
multilayer networks = deep neural networks
networks with more than 1 hidden layer
threshold = bias
a measure of how easy it is to get a unit to output a 1; the lower the bias, the harder it is for the unit to fire
classification
a neural network’s prediction
forward propagation
each unit takes the weighted sum of its inputs and passes the result on, from the input layer through each hidden layer to the output (see the sketch below)
“learning”
modifying the weights to reduce error and increase prediction accuracy
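a minimal numpy sketch tying the forward propagation, backpropagation, and “learning” cards together: one tiny network, repeated forward passes, and weight updates that shrink the error (a hand-rolled illustration with made-up values, not a production training loop):

```python
import numpy as np

# Tiny 2-layer network (2 inputs -> 2 hidden units -> 1 output) with
# made-up data. Forward propagation computes a prediction; the error is
# pushed back to find each weight's contribution (backpropagation); and
# "learning" nudges the weights to reduce the error.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)   # input -> hidden
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x, target = np.array([1.0, 0.0]), np.array([1.0])

for step in range(1000):
    h = sigmoid(x @ W1 + b1)                 # forward propagation
    y = sigmoid(h @ W2 + b2)                 # the prediction
    d_out = (y - target) * y * (1 - y)       # error signal at the output
    d_hid = (d_out @ W2.T) * h * (1 - h)     # error pushed back to hidden layer
    W2 -= 0.5 * np.outer(h, d_out)           # "learning": adjust the weights
    b2 -= 0.5 * d_out
    W1 -= 0.5 * np.outer(x, d_hid)
    b1 -= 0.5 * d_hid

print(y.item())   # close to the target of 1.0 after training
```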
black box problem
when machines “learn,” we usually don’t know why they produce the outputs they do
explainable AI
in the 1980s there was a push for scientifically grounded principles to make AI systems “self-explaining”
core ethical principles of AI
fairness
accountability
transparency
interpretability
trustworthiness
Turing Test
a challenge that measures whether a computer can chat with a human judge well enough that the judge thinks they are talking with another human
strong AI = general AI = artificial general intelligence = superintelligence
an AI system that truly understands and has actual human-like intelligence and can do a variety of tasks
exponential growth as it relates to computers
computers get faster and faster every year with advances in computer chips and processing
the singularity
by 2045, computers will surpass human intelligence
humans and AI will merge to form an immortal superintelligence
Alan Turing
WWII codebreaker who wrote a 1950 paper on machine intelligence arguing that a machine could be considered intelligent if it could convince human judges that they were talking with another human
weak AI = narrow AI
can only do a specific task (e.g., draw images or drive a car)
John Searle
philosopher who introduced the concepts of strong and weak AI
weak AI: AI that simulates a human mind but does not have one
strong AI: AI that actually has a mind
data centers
where compute happens
they need electricity to run the computers that perform the calculations to train models and then to generate responses to prompts
prompt engineering
the careful design of instructions to generative AI to get the desired output
a metacognitive process: you need to think about what you want as output, relevant factors (audience, tone), complexity, etc.
semantic space
a word’s meaning is understood through its co-occurrence with other words
natural language processing
getting computers to deal with human language
for computers to process human language, they first have to
convert words to numbers
1 key advance of BERT
has an “attention mechanism” that allows the model to more accurately represent the meaning of a word in a sentence
word2vec
a neural network model that represents words as vectors
word embedding, i.e., helps determine meaning of words via context
captures word meanings, similarity to other words, relationships with surrounding text
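a minimal sketch of training word vectors, assuming the gensim library (4.x API) is available; the toy corpus is made up and far too small to produce meaningful vectors:

```python
from gensim.models import Word2Vec  # assumes gensim 4.x is installed

# Toy corpus: each sentence is a list of tokens. A real corpus would
# contain millions of sentences; this only shows the mechanics.
sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "dog", "chases", "the", "ball"],
]

model = Word2Vec(sentences, vector_size=50, window=2, min_count=1,
                 workers=1, seed=1)

print(model.wv["king"][:5])                   # first 5 dimensions of "king"
print(model.wv.similarity("king", "queen"))   # cosine similarity of two words
```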
information retrieval = question-answering systems are thanks to
the presence of large amounts of writing on the internet
current AI revolution of chatbots and LLMs is thanks to
neural network computing, especially recurrent neural networks
Paul Werbos’s backpropagation through time (1980s); Long Short-Term Memory (LSTM), Hochreiter & Schmidhuber, 1997
the presence of large amounts of writing on the internet
word2vec
recurrent neural networks (RNN)
multilayer neural networks with recurrent (feedback) connections that let them “remember” earlier inputs; language is sequenced, so there needs to be memory (see the sketch after these cards)
feedforward neural networks (FFNN)
don’t “remember” because they have no recurrent connections; information flows only forward
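a minimal numpy sketch of the recurrent idea from the two cards above: a hidden state carried from token to token is what gives an RNN its “memory” (sizes and weights are made up):

```python
import numpy as np

# One recurrent layer processing a sequence token by token. The hidden
# state h carries information forward, so later outputs depend on earlier
# inputs: the "memory" a feedforward network lacks.
rng = np.random.default_rng(0)
W_x = rng.normal(size=(3, 4))    # input -> hidden weights (made-up sizes)
W_h = rng.normal(size=(4, 4))    # hidden -> hidden (the recurrent loop)
b = np.zeros(4)

sequence = rng.normal(size=(5, 3))   # 5 tokens, each a 3-dim vector
h = np.zeros(4)                      # initial hidden state

for x_t in sequence:
    h = np.tanh(x_t @ W_x + h @ W_h + b)   # new state mixes input and memory

print(h)   # final state summarizes the whole sequence
```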
who created Word2Vec (word to vector)
a Google research team led by Tomas Mikolov (with Ilya Sutskever as a co-author) in 2013
tokenization
splitting the words in a corpus into tokens and mapping them to numbers; part of word embedding
vectorizing
a multi-dimensional numeric representation of words; synonyms and associated words are mapped nearby in vector space (see the sketch below)
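a toy sketch of tokenization plus embedding lookup (the vocabulary and vectors are made up; real systems use subword tokenizers and learned embeddings):

```python
# Toy tokenization: split text into tokens, map each to an integer ID,
# then swap IDs for vectors from an embedding table. All values made up.
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3, "on": 4}
embeddings = [
    [0.1, 0.3], [0.9, 0.8], [0.4, 0.1], [0.8, 0.7], [0.2, 0.2],
]  # one small made-up vector per vocabulary word

text = "the cat sat on the mat"
tokens = text.split()                       # crude word-level tokenizer
ids = [vocab[t] for t in tokens]            # words -> numbers
vectors = [embeddings[i] for i in ids]      # numbers -> vectors

print(ids)         # [0, 1, 2, 4, 0, 3]
print(vectors[1])  # the vector standing in for "cat"
```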
BERT
bidirectional encoder representations from transformers
trained on Google Books and Wikipedia
implemented in Google Search in 2019
who invented BERT?
the Google “Brain” team led by Jacob Devlin in 2018
breakthrough of BERT
RNNs process text in one direction only, while transformers can use both left and right contexts across all units (i.e., the hidden layers)
transformers contextualize a given token within a “context window,” where an “attention mechanism” amplifies important tokens and diminishes less important ones (= encoders); see the sketch below
encoders track relationships between words in sentences and the sequences of words
they also mask words to test and improve the predicted relationships between words
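a minimal numpy sketch of scaled dot-product attention, the mechanism the cards above describe (token vectors here are random and the sizes are made up):

```python
import numpy as np

# Scaled dot-product attention: every token looks at every other token
# (left AND right context) and amplifies the ones most relevant to it.
rng = np.random.default_rng(0)
d = 8
Q = rng.normal(size=(4, d))   # queries: one row per token (4 tokens)
K = rng.normal(size=(4, d))   # keys
V = rng.normal(size=(4, d))   # values

scores = Q @ K.T / np.sqrt(d)                    # token-to-token relevance
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)    # softmax over each row
output = weights @ V                             # weighted mix of all tokens

print(weights[0])   # how much token 0 attends to each of the 4 tokens
```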
tensor processing units (TPU)
special computer chips designed to accelerate complex calculations for machine learning and NLP
OpenAI advancement is GPT
built on transformers architecture
generative pre-trained transformer
GPT is primarily a decoder focused on next-word prediction
“generative” — trained to create new content
true LLM
mental model
an internal model that helps humans understand how the world works
shots
references or examples supplied in a prompt (zero-shot, one-shot, few-shot)
chain of thought prompting
when you have a complex task in mind, you coach the AI through the steps of that task
guides an AI model to solve complex problems by breaking them down into a series of logical, intermediate steps before arriving at a final answer (see the example below)
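an illustrative chain-of-thought prompt (the wording and the arithmetic task are made up; any model would do):

```python
# A made-up chain-of-thought prompt: the "step by step" instruction and
# the worked structure coach the model through intermediate reasoning.
prompt = """Question: A library has 4 shelves with 12 books each.
It lends out 15 books. How many books remain?

Let's think step by step:
1. Count the books: 4 shelves x 12 books = 48 books.
2. Subtract the loans: 48 - 15 = 33 books.
Answer: 33"""
```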
what computational approach did watson use to win jeopardy
question answering
who developed Watson
IBM
adversarial attacks
intentionally tricking a computer system into outputting an incorrect answer, without humans detecting the manipulation
answer extraction = information retrieval = question-answering system = search
the question gets turned into a search query and the answer is extracted from a large database of information
information retrieval
the computer science term for the engineering behind QA systems
1980s-2000s difference between IBM and Google competition
competed for technical dominance in information retrieval and AI
IBM focused on QA challenges
Google focused on commercial search
when was IBM founded
1911 as a tabulating company (data processing)
IBM’s AI focus
created Deep Blue to compete at chess; beat world chess champion Garry Kasparov in 1997 using a rule-based approach
built Watson (named after IBM company founder) and competed on Jeopardy! in 2011 using an information retrieval approach
when was Google founded
1998, by Larry Page and Sergey Brin from Stanford, to compete with Yahoo!
Google’s goal vs Yahoo
to create a better WWW search experience
Yahoo! indexed pages by topics
Google developed an algorithm (PageRank) to rank the importance of Web pages based on in-links (see the sketch below)
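a minimal numpy sketch of the in-link idea behind PageRank, via power iteration on a made-up four-page link graph:

```python
import numpy as np

# PageRank by power iteration on a made-up 4-page web. Each page passes
# its importance evenly to the pages it links to; pages with many
# important in-links end up ranked higher.
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}   # page -> pages it links to
n, damping = 4, 0.85

# Column-stochastic transition matrix: M[i, j] = chance of hopping j -> i.
M = np.zeros((n, n))
for j, outs in links.items():
    for i in outs:
        M[i, j] = 1.0 / len(outs)

rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = (1 - damping) / n + damping * (M @ rank)

print(rank)   # page 2, with the most in-links, scores highest
```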
IBM vs Google
IBM hasn’t innovated further beyond Watson’s QA
Google pushed AI innovation
Word2Vec for NLP
tensor processing units and TensorFlow software for efficient computations
transformers as encoder-decoder architectures for NLP
Bard as Google’s first public LLM chatbot
layers in a convolutional neural network are comprised of
activation maps
how does the brain visually process objects?
feedforward flow of information
inputs from the eye are decomposed into small units
the brain processes those units into key elements and then analyzes them hierarchically
when the brain processes objects in its visual field, it processes the information
in a hierarchical series that starts with edges and ends in recognizing the object
convolution in CNNs is
the mathematical calculation computers use to define features of an object
a calculation that multiplies the values in a receptive field by the corresponding weights and sums the results (see the sketch below)
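a minimal numpy sketch of one 2-D convolution; the kernel values are made up (this one responds to vertical edges):

```python
import numpy as np

# One convolution: at each position, multiply the 3x3 receptive field by
# the kernel weights elementwise and sum, producing one activation-map cell.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((5, 5)); image[:, 2:] = 1.0     # dark left half, bright right
edge_kernel = np.array([[-1, 0, 1],              # made-up vertical-edge detector
                        [-1, 0, 1],
                        [-1, 0, 1]])
print(convolve2d(image, edge_kernel))            # strongest response at the edge
```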
neocognitron
one of the earliest neural networks trained for computer vision (Kunihiko Fukushima, 1980)
neural networks mimic…
the brain
computer neural networks do what for visual processing
break images into small pixels of light and convert them to numbers for computational processing
computers create hierarchies for processing visual information for object recognition
what are CNNs trained to do?
to focus on specific aspects of an image in order to learn how to recognize objects
compute = computation = processing
mathematical calculations computers must run to arrive at their outputs, usually expressed as probabilities
who created LeNet
Yann LeCun
what does LeNet do
recognize handwritten digits
why was ImageNet a breakthrough in image recognition?
it provided a massive dataset of labeled images for training and testing AI
why was Amazon Mechanical Turk important for image recognition?
human workers on that platform were able to categorize millions of images and improve the training data
what is WordNet’s relationship to ImageNet
provides the nouns that were used as the categories to label images
where did Fei Fei Li get the images for ImageNet from
Flickr and Google Image Search
who invented convolutional neural networks?
Yann LeCun, who in 1989 demonstrated that computers could decode handwriting
who developed ImageNet?
Fei-Fei Li, who released a massive dataset of labeled images in 2009
who developed AlexNet
Geoffrey Hinton, Alex Krizhevsky, and Ilya Sutskever from UToronto
when did AlexNet beat IBM and others in ImageNet image recognition competition
2012
when did Hinton, Sutskever, and Krizhevsky leave UToronto for Google
2013
Fei Fei Li
computer vision researcher who moved from the University of Illinois to Princeton (and later Stanford)
from 2007-2010, downloaded a billion images from publicly available photo platforms like Flickr
categorized the images into 22,000 categories using Amazon Mechanical Turk
where do ImageNet categories come from
WordNet
who developed WordNet
George Miller and Christiane Fellbaum at Princeton, starting in 1985, organized 155,000 words into synsets (word-sense pairs)
synsets
categories of synonymous words and their attributes (associations)
Clever Hans
a story about a horse that seemingly could do math
represents learned shortcuts (spurious correlations) in AI: an irrelevant pattern in the data that happens to correlate with the right answer will be learned by the AI, resulting in incorrect predictions on new data