What is the law of effect?
Proposed by Thorndike: a given stimulus in the environment can elicit a variety of behavioral responses
How does Thorndike describe S-R associations?
Based on relationship between “stimulus” (S) and “response” (R)
Satisfying outcome: R more likely to occur
Annoying outcome: R less likely to occur
What is the difference between reinforcement and punishment?
Reinforcement increases the frequency of a behavior
Punishment decreases the frequency of a behavior
Positive Reinforcement:
Addition of a stimulus causes behavior to become more frequent
Negative Reinforcement:
Removal of a stimulus causes behavior to become more frequent
Positive Punishment:
Addition of aversive stimulus causes behavior to become less frequent
Negative Punishment:
Removal of appetitive stimulus causes behavior to become less frequent
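The four terms above reduce to two yes/no questions: is a stimulus added or removed, and does the behavior become more or less frequent? A minimal Python sketch of that lookup, where the function name `classify_consequence` and its parameter names are illustrative assumptions rather than anything from the course material:

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant procedure from its two defining features."""
    # "Positive" means a stimulus is added; "Negative" means one is removed.
    sign = "Positive" if stimulus_added else "Negative"
    # Reinforcement raises the frequency of a behavior; punishment lowers it.
    kind = "Reinforcement" if behavior_increases else "Punishment"
    return f"{sign} {kind}"


# Removing a stimulus makes the behavior more frequent.
print(classify_consequence(stimulus_added=False, behavior_increases=True))
# prints "Negative Reinforcement"
```

Note that "positive" and "negative" describe the stimulus change, not whether the outcome is pleasant.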
What is the Skinnerian method of shaping?
Shaping by successive approximation
Shaping occurs when responses that are increasingly similar to the goal response are gradually reinforced
What are Stimulus-Response (S-R) associations?
Strengthened stimulus-response relationship with no learned association between response and outcome
Reflexive and automatic
Not “goal-directed”
What are Action-Outcome (A-O) associations?
Subject understands the value of the anticipated outcome
Goal Directed
What is Devaluation Procedure?
tests whether association is S-R or A-O
Works by devaluing the reward (e.g., making it harmful), then testing whether subjects continue to work for the reward (S-R) or
stop responding because the reward no longer has the same value (A-O)
When do A-O associations occur? Before or after SR associations? Why?
AO associations occur in shorter training sessions
They often occur before SR associations
AO requires more cognitive resources; once a strong association is made, it becomes SR and frees up cognitive resources
When do SR associations occur? Before or after AO associations? Why?
SR associations are made when the subject undergoes longer training sessions
Occur after AO associations
They help free up cognitive resources since they become more of a reflex
Which association (SR or AO) is subject to devaluation?
AO
not SR
Which structure encodes SR associations?
Dorsolateral Striatum (DLS)
Which structure encodes AO associations?
Dorsomedial Striatum (DMS)
What happens if a subject has damage to the DLS? What association will they show? Will they show devaluation?
No longer have SR association
Subject will default to AO associations
Show devaluation
What happens if a subject has damage to the DMS? What association will they show? Will they show devaluation?
No longer have AO association
Subject will default to SR association
Will not show devaluation
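The training-length and lesion cards above can be combined into one rule of thumb. The sketch below only encodes the rules stated in these notes; the function names `controlling_association` and `shows_devaluation` are hypothetical, and real behavior is less all-or-none than this.

```python
from typing import Optional


def controlling_association(extended_training: bool, lesion: Optional[str] = None) -> str:
    """Return "SR" or "AO" following the rules stated in these notes."""
    if lesion == "DLS":
        return "AO"  # SR is lost, so the subject defaults to AO
    if lesion == "DMS":
        return "SR"  # AO is lost, so the subject defaults to SR
    if lesion is not None:
        raise ValueError(f"unknown lesion site: {lesion}")
    # Intact subject: AO with shorter training, SR after longer training.
    return "SR" if extended_training else "AO"


def shows_devaluation(extended_training: bool, lesion: Optional[str] = None) -> bool:
    """Only AO associations are subject to devaluation."""
    return controlling_association(extended_training, lesion) == "AO"


# Extended training would normally produce SR, but a DLS lesion keeps behavior AO.
print(shows_devaluation(extended_training=True, lesion="DLS"))  # prints True
```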
What did the Olds & Milner experiment discover?
While aiming to stimulate the septum, they discovered the Medial Forebrain Bundle (MFB)
The ____ is an extremely powerful positive reinforcer
Medial Forebrain Bundle (MFB)
Why is the MFB highly associated with positive reinforcement?
It contains the axons of dopaminergic neurons
What kind of axons are in the MFB?
Dopamine and norepinephrine
Where do axons of the MFB synapse?
Nucleus Accumbens (NAcc) and Dorsal striatum (DS)
Increasing the intensity of MFB stimulation increases what?
Reinforcing efficacy of MFB stimulation
Increases dopamine release in the striatum
What did Wise & Stein ‘69 discover in their experiments?
Norepinephrine plays a role in reinforcement of behavior
What did researchers find that is important about norepinephrine?
Blocking NE synthesis decreased the reinforcing efficacy of MFB stimulation
NE moderates arousal and wakefulness
What is the Dopamine Theory of Reward (DTR)?
Neuroleptic drugs block dopamine receptors, making MFB stimulation less rewarding.
More drug = less MFB stimulation
Amphetamines stimulate dopamine release and make MFB stimulation more rewarding.
More drug = more MFB stimulation
Relating to Dopamine Theory of Reward, can animals still enjoy pleasure of food even without dopamine? (T/F)
True
Although this is contrary to what DTR states, it was found to be true in rodent experiments
What are some problems with the dopamine theory of reward?
States that reward is only associated with dopamine release
Evidence that proves DTR wrong?
Cannon & Palmiter: animals with no dopamine release still showed a preference for the reward over plain water, despite profound motor deficits
Salamone et al.: NAcc damage (no dopamine) at MFB terminals reduces the animal's motivation to search for reward, but subjects still choose food when not presented with a barrier
What are the two dopamine hypotheses?
Reward Hypothesis: dopamine release occurs following instrumental behavior. Has a hedonic impact making instrumental behavior more likely to occur
Incentive Motivation Hypothesis: Dopamine release occurs preceding instrumental behavior. Dopamine release has nothing to do with reward or hedonic impact once received. Motivates behavior
What turned out to be the true relation between dopamine and reward?
Dopamine signaling is based on learned expectation: dopamine neurons fire in response to an unexpected positive reinforcer
What type of learning model is the relationship between dopamine and reward consistent with?
Rescorla-Wagner learning
The least probable reinforcer causes the most dopamine release
What are the variables of Rescorla-Wagner and their meaning in dopamine and reward?
λ-ΣV = ΔV
λ: reinforcer
ΣV: Reinforcer expected
ΔV: New learning
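A short simulation shows why the prediction error shrinks as a reinforcer becomes expected. This is a sketch for a single cue (so ΣV is just V), and the learning rate `alpha` is an added assumption: the equation in these notes has no learning-rate term, which corresponds to `alpha = 1`.

```python
def rescorla_wagner(n_trials: int, lam: float = 1.0, alpha: float = 0.5, v0: float = 0.0) -> list:
    """Return the prediction error (lam - V) on each trial for one cue."""
    v = v0  # reinforcer expected (ΣV with a single cue)
    errors = []
    for _ in range(n_trials):
        error = lam - v     # λ − ΣV: reinforcer received minus reinforcer expected
        v += alpha * error  # ΔV scaled by the assumed learning rate
        errors.append(error)
    return errors


print(rescorla_wagner(3))  # prints [1.0, 0.5, 0.25]
```

The error is largest on the first trial, when the reinforcer is completely unexpected, and falls toward zero as the expectation catches up. This matches the card above: the least expected reinforcer produces the largest signal.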
What does dopamine do when a response is learned?
It turns into a maintenance signal
Dopamine release occurs at the earliest cue predicting the reinforcer/reward
What actually causes hedonic impact?
Endogenous opioids cause hedonic impact and drive reward
What are hedonic hotspots?
Brain sites where pharmacological stimulation of opioid receptors increases “liking” responses
Hedonic response
What are hedonic coldspots?
Brain sites where pharmacological stimulation of opioid receptors decreases “liking” expressions
Anhedonic response
Why do rats press a lever for MFB stimulation?
MFB stimulation causes a burst of dopamine release, like a reward prediction error
This makes each stimulation seem like a novel experience, reinforcing the behavior
How is reinforcement learning used in AI?
RL algorithms don’t need complete knowledge up front; they only need the ability to recognize and learn patterns relevant to what is coded into them. This allows them to teach themselves.
They function through reinforcement signals delivered after a behavior
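As an illustration of learning from reinforcement signals alone, here is a minimal epsilon-greedy "two-armed bandit" agent in Python. It is a toy sketch, not a description of any particular AI system: the function name `run_bandit`, the reward probabilities, and the parameter values are all assumptions chosen for the example.

```python
import random


def run_bandit(reward_probs, n_steps=2000, epsilon=0.1, seed=0):
    """Learn the value of each action purely from reward feedback."""
    rng = random.Random(seed)
    values = [0.0] * len(reward_probs)  # expected reward per action
    counts = [0] * len(reward_probs)    # times each action was tried
    for _ in range(n_steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(reward_probs))
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(range(len(values)), key=values.__getitem__)
        # The environment delivers the reinforcement signal after the behavior.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Update by the prediction error (reward received minus reward expected).
        values[action] += (reward - values[action]) / counts[action]
    return values


values = run_bandit([0.2, 0.8])
print(values)
```

The agent is never told which action is better. Its estimate for the second action ends up higher only because that action was reinforced more often.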
Benefits of RL in AI?
They can teach themselves very quickly
Simplifies coding process
What is the alignment problem in AI?
The system is designed to respond to unpredictable scenarios in flexible ways to attain the pre-specified goal. This might lead it to act in ways that we don’t want it to.