Cognitive Modelling terms, chapters 1-7

Last updated 2:59 PM on 4/23/26

73 Terms


Nativists

Individuals who believe that certain skills or abilities are innate and not acquired through experience. The idea that we are a “mechanical mind” of inputs and outputs.


Empiricist

The view that experience is the foundation of all knowledge and that we are born as a clean slate (tabula rasa). Furthermore, the view that the mind can be broken down into elements that, when combined, produce the whole of consciousness.


Natural selection

The process by which a species evolves by adapting to its environment. Learning is also an evolved mechanism.


The brain

The brain originally existed for movement, but because of an arms race between carnivores and herbivores it evolved adaptive behavior. The brain is a prediction engine.


The neuron

A cell that carries signals throughout the brain. It consists of dendrites (input), the cell body, the axon, and the synapse (output).


Glutamate

An excitatory neurotransmitter: it activates receptors that tend to increase the likelihood of the postsynaptic neuron firing. It is the most common neurotransmitter.


GABA

An inhibitory (slowing down) neurotransmitter: it activates receptors that tend to decrease the likelihood of the postsynaptic neuron firing.


Synaptic plasticity

The ability of synapses to change as a result of experience.


long-term potentiation (LTP)

A process in which synaptic transmission becomes more effective as a result of recent activity.


long-term depression (LTD)

A process in which synaptic transmission becomes less effective as a result of recent activity.


Hebbian Learning

The principle that learning involves strengthening the connections of coactive neurons; often stated as, “Neurons that fire together, wire together.”


Synaptogenesis

The formation of new synapses. After birth the neurons are in place, but they still make many new connections.


Pruning

The elimination of many of the synapses formed during synaptogenesis, which results in experience-based fine-tuning of functional networks.


Latent Learning

Learning that happens without explicit training or reward, through automatic statistical analysis of the world.


Habituation

A decrease in the strength or occurrence of a behavior after repeated exposure to the stimulus that produces that behavior.


Dishabituation

A renewal of a response, previously habituated, that occurs when the organism is presented with a novel stimulus.


Spontaneous recovery

Reappearance (or increase in strength) of a previously habituated response after a short period of no stimulus presentation.


Sensitization

A phenomenon in which a salient stimulus (such as an electric shock) temporarily increases the strength of responses to other stimuli (including the habituated stimulus)


mere exposure learning

A combination of habituation (to similarities) and sensitization (to differences).


synaptic depression

A reduction in synaptic transmission; a neural mechanism underlying habituation.


homosynaptic

Occurring in one synapse without affecting nearby synapses.


Long-term habituation

Habituation that persists over a long period; its neural mechanism is the elimination of presynaptic terminals.


Hippocampus

It converts short-term memories into long-term memories by organizing, storing and retrieving memories within your brain. Your hippocampus also helps you learn more about your environment (spatial memory), so you’re aware of what’s around you, as well as remembering what words to say (verbal memory).


Spatial Learning

Building cognitive maps of an environment, for example a maze, that are used for learning and planning (replay and pre-play).


Pavlov's dog

Classical conditioning: when a dog is conditioned that a bell means food, it begins to salivate when it hears the bell. The key terms are:

Unconditioned response (UR): A response for which no training was necessary to establish it.

Unconditioned stimulus (US): A stimulus that elicits a response without training.

Conditioned response (CR): A response whose occurrence depends on particular conditions of training.

Conditioned stimulus (CS): A stimulus that, through training, elicits a response.

Conditioning leads to a general effect, whereas extinction leads to context-specific changes.


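Rescorla-Wagner

Formulas for modelling conditioning:

$V_{t+1} = V_t + \Delta V_t$

$\Delta V_t = a (V_{max} - V_t)$

V is the strength of the association made with the stimulus, $V_{max}$ is the maximum association strength that can be reached, and a is the learning rate.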
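
The update rule on this card can be sketched in a few lines of Python. This is only an illustration: the function name `rescorla_wagner` and the default values for `a`, `v_max`, and `v0` are assumptions chosen for the example, not values from the course.

```python
def rescorla_wagner(n_trials, a=0.3, v_max=1.0, v0=0.0):
    """Return the association strength V before trial 0 and after each trial."""
    v = v0
    history = [v]
    for _ in range(n_trials):
        dv = a * (v_max - v)  # change in V: learning rate times the remaining error
        v = v + dv            # new V is the old V plus the change
        history.append(v)
    return history


if __name__ == "__main__":
    # Acquisition: V climbs toward v_max, quickly at first and then more slowly.
    for trial, v in enumerate(rescorla_wagner(5)):
        print(trial, round(v, 3))
```

Setting `v_max=0.0` and starting from a positive `v0` gives a simple picture of extinction under the same rule.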


Latent inhibition

A phenomenon that the Rescorla-Wagner model does not explain: prior exposure to a stimulus without consequences makes it harder to later learn a conditioned response to that stimulus.


Pearce-Hall

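A model that tries to account for latent inhibition with a dynamic learning rate:

$V_{t+1} = V_t + \Delta V_t$

$\Delta V_t = S \cdot a_t \cdot V_{max}$

$a_t = |V_{max} - V_t|$ for $t \ge 1$

S is the intensity of the CS and $a_t$ is the learning rate on trial t.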


instrumental conditioning

A form of learning in which an animal learns how its own behavior is instrumental in causing specific consequences.


Law of effect

The observation that the probability of a particular behavioral response increases or decreases depending on the consequences that have followed that response in the past.


discriminative stimulus

A stimulus that signals whether a particular response will lead to a particular outcome.


reinforcement

The process of providing outcomes for a behavior that increase the probability of that behavior occurring again in the future.


free-operant paradigm

An operant conditioning paradigm in which the animal can operate the experimental apparatus “freely,” responding to obtain reinforcement (or avoid punishment) when it chooses.


primary reinforcer

A stimulus, such as food, water, sex, or sleep, that has innate biological value to the organism and can function as a reinforcer.


secondary reinforcer

A stimulus (such as money or tokens) that has no intrinsic biological value but that has been paired with primary reinforcers or that provides access to primary reinforcers.


drive reduction theory

The theory that organisms have innate drives to obtain primary reinforcers and that learning is driven by the biological need to reduce those drives.


Homeostasis

The tendency of the body to keep its internal state stable through compensatory responses.


continuous reinforcement schedule

A reinforcement schedule in which every instance of the response is followed by the consequence.


partial reinforcement schedule

A reinforcement schedule in which only some responses are reinforced.


dorsal striatum

Plays a critical role in operant conditioning


orbitofrontal cortex

Is, among other regions, involved in tracking the outcomes of behavior.


fixed-ratio (FR) schedule

Reinforcement after a fixed, predictable number of responses


variable-ratio (VR) schedule

Reinforcement after an unpredictable number of responses


fixed-interval (FI) schedule

Reinforcement after a specified amount of time


variable-interval (VI) schedule

Reinforcement after an unpredictable amount of time


Operant conditioning

A type of learning in which behavior changes based on its consequences. Reinforcement strengthens a behavior, while punishment reduces the likelihood of it occurring again. The timing and consistency of reinforcement or punishment play a key role in how quickly and effectively learning happens.


hedonic value

The subjective “goodness” or value of a reinforcer.


motivational value of a stimulus

The degree to which an organism is willing to work to obtain access to that stimulus.


insular cortex (insula)

A region involved in conscious awareness of bodily and emotional states that may play a role in signaling the aversive value of stimuli.


dorsal anterior cingulate cortex (dACC)

A subregion of prefrontal cortex that may play a role in the motivational value of pain.


Dopamine

Neurotransmitter that’s important in reinforcement.


Temporal Difference Learning

Temporal Difference (TD) Learning is a model-free reinforcement learning method used by algorithms like Q-learning to iteratively learn state value functions (V(s)) or state-action value functions (Q(s,a)). Rescorla-Wagner has no way to deal with time; TD does. It updates value estimates after each time step using the Bellman equation and the temporal difference error. Basically that thing you did in week 1 with that reward.
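
A rough illustration of the per-step update described above is the TD(0) sketch below. The chain of states, the function name `td0_chain`, and the parameters `alpha` (learning rate) and `gamma` (discount factor) are assumptions made for the example, not part of the course material.

```python
def td0_chain(n_states=4, episodes=500, alpha=0.1, gamma=0.9):
    """TD(0) value learning on a chain 0 -> 1 -> ... -> terminal.

    Only the final step (into the terminal state) gives a reward of 1.
    """
    values = [0.0] * (n_states + 1)  # extra entry: terminal state, value stays 0
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            reward = 1.0 if s_next == n_states else 0.0
            # TD error: what we got plus what we now expect, minus what we expected
            td_error = reward + gamma * values[s_next] - values[s]
            values[s] += alpha * td_error
    return values[:n_states]


if __name__ == "__main__":
    # States closer to the reward end up with higher values.
    print([round(v, 3) for v in td0_chain()])
```

Because each update happens inside the episode, value spreads backwards from the reward to earlier states over repeated episodes, which is the part a trial-level rule cannot capture.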


Markov Decision Process

A Markov decision process describes the world as a finite set of states (S), a set of actions (A), a set of transition probabilities (T), and rewards (R). Under the Markov assumption, once you enter a state, it does not matter where you came from when predicting the future outcome of your actions in terms of transitions.


Q-Learning

A model-free reinforcement learning algorithm that helps an agent learn how to make the best decisions by interacting with its environment. Instead of needing a model of the environment, the agent learns purely from experience by trying different actions and seeing their results.
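
To make "learning purely from experience" concrete, here is a minimal tabular Q-learning sketch. The corridor environment (start on the left, reward at the right end), the function name `q_learning_corridor`, and all parameter values are assumptions for illustration only.

```python
import random


def q_learning_corridor(n_states=5, episodes=500, alpha=0.5,
                        gamma=0.9, epsilon=0.2, seed=0):
    """Learn Q(s, a) in a corridor; action 0 = left, action 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1  # terminal state; its Q-values are never updated
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Pick an action: mostly greedy, sometimes random.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            # The agent only observes the result of the action it tried.
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Q-learning target uses the best action in the next state.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning_corridor()):
        print(state, round(left, 3), round(right, 3))
```

The agent is never given the transition rules; it only sees the state and reward that follow each action it tries.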


model based

Learning a “model” of action-outcome relationships and using it to plan actions by forecasting their outcomes. Example: a mouse that stops pulling a lever it associates with food once that food makes it sick. Another example is AlphaGo, which combines reinforcement learning, deep neural networks, and tree search.


model free

Learning a direct mapping from perceptual inputs to action outputs. Example: a mouse that keeps pulling a lever out of habit because it associates the lever with food, even though that food now makes it sick.


explore-exploit problem

Finding the right balance between exploring new options and exploiting a known resource, in order to optimize (speed up) learning and maximize rewards.


epsilon-greedy policy

With probability epsilon, choose a random bandit; otherwise choose the bandit with the highest expected value. It keeps exploring even when everything has been explored, so it is not always optimal and not optimal at all stages of the task.
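
The choice rule on this card can be sketched as follows. The function name `epsilon_greedy` and the optional `rng` argument are assumptions made for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return the index of the chosen bandit.

    q_values: current expected value per bandit.
    epsilon:  probability of picking a bandit at random.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: any bandit, uniformly
    # exploit: the bandit with the highest expected value
    return max(range(len(q_values)), key=lambda i: q_values[i])


if __name__ == "__main__":
    rng = random.Random(1)
    picks = [epsilon_greedy([0.1, 0.9, 0.3], 0.1, rng) for _ in range(20)]
    print(picks)
```

Because epsilon stays fixed, the random picks continue at the same rate late in the task, which is the weakness the card points out.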


Softmax

Allows the current estimates of Q(s, a) to influence the probability of exploration: actions with a higher estimated value are chosen with a higher probability.
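
One common way to write this down is to make each action's probability proportional to exp(Q / temperature). The sketch below assumes that form; the function name `softmax_policy` and the `temperature` parameter are not from the card itself.

```python
import math


def softmax_policy(q_values, temperature=1.0):
    """Turn Q-value estimates into action probabilities.

    Low temperature: almost always the best action (exploit).
    High temperature: close to uniform (explore).
    """
    best = max(q_values)
    # Subtracting the maximum leaves the probabilities unchanged
    # but avoids overflow in exp().
    exps = [math.exp((q - best) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    print(softmax_policy([1.0, 2.0, 3.0]))
    print(softmax_policy([1.0, 2.0, 3.0], temperature=0.1))
```

Unlike a purely random exploration step, clearly bad actions are tried less often than nearly-best ones.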


Upper Confidence Bound

Optimism in the face of uncertainty: an exploration bonus is added to the value of options that have been tried less often, which leads to directed exploration.
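
A minimal sketch of an exploration bonus, assuming the widely used UCB1-style form `c * sqrt(ln(t) / n)`. The function name `ucb_choice` and the constant `c` are assumptions; the course may use a different bonus.

```python
import math


def ucb_choice(q_values, counts, t, c=2.0):
    """Pick a bandit by estimated value plus an uncertainty bonus.

    q_values: current value estimate per bandit.
    counts:   how often each bandit has been tried.
    t:        total number of choices made so far (at least 1).
    """
    # Any bandit that has never been tried is chosen first.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # The bonus shrinks as a bandit is tried more often.
    scores = [q + c * math.sqrt(math.log(t) / n)
              for q, n in zip(q_values, counts)]
    return max(range(len(scores)), key=lambda i: scores[i])


if __name__ == "__main__":
    # Equal estimates, but bandit 1 has barely been tried, so it gets the bonus.
    print(ucb_choice([0.5, 0.5], [100, 1], t=101))
```

Exploration here is directed at the options the agent knows least about, rather than spread randomly over all options.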


Prediction error

The difference between the predicted values made by some model and the actual values.


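The SARSA algorithm

(State, Action, Reward, State, Action.) The expected variant takes into account all possible actions the agent might take in the next step:

$\delta_t = r_{t+1} + \gamma \, E[Q(s_{t+1}, a_{t+1})] - Q(s_t, a_t)$

$E[Q(s_{t+1}, a_{t+1})] = \epsilon \cdot \mathrm{mean}_a \, Q(s_{t+1}, a) + (1 - \epsilon) \cdot \max_a Q(s_{t+1}, a)$

Here $\delta_t$ is the prediction error, $\gamma$ the discount factor, and $\epsilon$ the probability of a random action under an epsilon-greedy policy.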
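
Read as an expected-SARSA update under an epsilon-greedy policy, the formulas on this card can be sketched as below. The function name `expected_sarsa_update` and the parameter names `alpha` (learning rate), `gamma` (discount factor), and `epsilon` (probability of a random action) are assumptions for the example.

```python
def expected_sarsa_update(q, s, a, r, s_next,
                          alpha=0.1, gamma=0.9, epsilon=0.1):
    """Update q[s][a] in place and return the prediction error.

    q is a list of lists: q[state][action].
    """
    next_q = q[s_next]
    # Expected next value under epsilon-greedy:
    # random part (average over actions) plus greedy part (best action).
    expected = (epsilon * sum(next_q) / len(next_q)
                + (1 - epsilon) * max(next_q))
    td_error = r + gamma * expected - q[s][a]
    q[s][a] += alpha * td_error
    return td_error


if __name__ == "__main__":
    q = [[0.0, 0.0], [1.0, 3.0]]
    error = expected_sarsa_update(q, s=0, a=1, r=1.0, s_next=1)
    print(round(error, 3), q[0])
```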


Learned Helplessness

The concept that you can learn to be helpless:
In the classic experiment, dogs in group one were strapped into harnesses temporarily and then released. Dogs in group two were harnessed and subjected to electric shocks, which they could stop by pressing a panel with their noses. Dogs in group three also received shocks but, unlike group two, could not control them; to this group the shocks appeared random. When later placed in a shuttle box, dogs from the first two groups quickly learned that jumping the barrier stopped the shock. Those in the third group did not attempt to avoid the shocks.


DQN

Deep Q-Network: basically Q-learning merged with neural networks. It is especially useful in environments where the number of possible situations (states) is very large, as in video games or robotics.


catastrophic forgetting

The tendency of an artificial neural network to abruptly and drastically forget previously learned information upon learning new information.


Expected Utility

How much you expect to get from a situation, in terms of subjective value. If you expect 100 dollars but there is a 5% chance you won't get it, the expected amount is 95 dollars, plus a feeling of disappointment when you don't get it. It is similar to Q-values. Utility expresses the satisfaction or happiness derived from a good, service, or money, while expected value only expresses the monetary value.
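
The difference between utility and monetary value can be made concrete with a small sketch. The square-root utility function is purely an assumption chosen to show diminishing satisfaction; the function name `expected_utility` is also just for this example.

```python
import math


def expected_utility(outcomes, utility=math.sqrt):
    """outcomes: list of (probability, amount) pairs.

    Each amount is first converted to subjective utility,
    then weighted by its probability.
    """
    return sum(p * utility(amount) for p, amount in outcomes)


if __name__ == "__main__":
    # 100 dollars with a 5% chance of getting nothing,
    # compared with 95 dollars for sure (same expected amount).
    gamble = [(0.95, 100.0), (0.05, 0.0)]
    sure_thing = [(1.0, 95.0)]
    print(expected_utility(gamble), expected_utility(sure_thing))
```

With a utility curve that flattens out, the sure 95 dollars scores higher than the gamble even though both have the same expected monetary amount.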


Expected Value

The probability-weighted average of all possible outcomes. For example, with a 0.5 chance of winning 10 dollars and a 0.5 chance of winning 20 dollars, the expected value is 0.5 × 10 + 0.5 × 20 = 15 dollars.
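
The example on this card can be checked with a short sketch. The function name `expected_value` and the check that the probabilities sum to 1 are assumptions added for illustration.

```python
def expected_value(outcomes):
    """outcomes: list of (probability, value) pairs covering every possibility."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Weight each value by how likely it is, then add everything up.
    return sum(p * value for p, value in outcomes)


if __name__ == "__main__":
    print(expected_value([(0.5, 10), (0.5, 20)]))  # 15.0
```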


Weber-Fechner Law

Weber’s law, also known as the Weber-Fechner law, states that the perceived intensity of a stimulus grows at a slower rate than its actual physical intensity. For intense stimuli (high initial intensity), a larger change is needed to detect a difference than for weak stimuli. The same rule applies to expected utility.


Prospect Theory

Rejects the expected utility hypothesis and states that probabilities are also not linearly weighted.


Discounted Utility

The idea that the subjective worth of a reward shrinks, approximately as a function of 1/t, as the time t between now and receiving the reward increases. For example, 35 euros in 210 days is worth less than 100 euros in 270 days, but 35 euros today is worth more than 100 euros in 60 days.
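
The preference reversal in this example can be reproduced with a hyperbolic discount curve. The exact form `amount / (1 + k * delay)` and the value `k = 0.05` per day are assumptions picked so that the card's numbers come out; the function name `discounted_value` is also just for this sketch.

```python
def discounted_value(amount, delay_days, k=0.05):
    """Subjective present worth of a reward received after a delay.

    k controls how steeply value drops with waiting time.
    """
    return amount / (1.0 + k * delay_days)


if __name__ == "__main__":
    # Near choice: 35 euros today versus 100 euros in 60 days.
    print(discounted_value(35, 0), discounted_value(100, 60))
    # The same pair pushed 210 days into the future.
    print(discounted_value(35, 210), discounted_value(100, 270))
```

Both pairs are 60 days apart, yet the preferred option flips once the choice is moved into the future, which a curve of this shape allows.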


Pavlovian System

One of multiple systems that can semi-independently drive decision-making:

Although diverse stimuli can participate in Pavlovian learning, the available actions remain limited (e.g., salivate, approach, avoid, freeze). Example: Pavlov's dogs salivating.


Habit System

One of multiple systems that can semi-independently drive decision-making:

The habit system entails an arbitrary association between a complexly recognized situation and a complex chain of actions (typical Q-RL). Once learned, cached actions are fast but can be hard to change.


goal-directed system

One of multiple systems that can semi-independently drive decision-making:

Deliberative action selection is a complex process that includes a search through the expected consequences of possible actions based on a world model. It is very flexible, but it is computationally expensive and slow.