Week 6
Learning
Definition: Learning is the process of acquiring new knowledge, behaviours, or abilities through experience.
Key Characteristics:
Long Term: It results in a relatively permanent change in behaviour or knowledge.
Experience: It is caused by experience and cannot be the product of fatigue, motivation, or drugs.
Behavioural Evidence: Learning per se cannot be measured directly, but it is evidenced by changes in behaviour, knowledge, or brain function.
Comparison: Learning contrasts with innate behaviours, including instincts and reflexes.
Reflex
Definition: an automatic and involuntary action of an organ or muscle in response to an external stimulus. An innate behaviour—does not require learning to be performed.
For example: knee jerk, pupil constriction, shivering to generate heat
Instinct
Definition: behaviour or reaction in response to complex scenarios. An innate behaviour—does not require learning to be performed.
For example: migration, hibernation, eating, drinking and sleeping
Associative learning
Definition: This type of learning occurs when an organism learns to associate two or more stimuli. This connection allows organisms to anticipate or react to events based on the relationship between the stimuli.
Types:
Classical Conditioning: Involves learning to associate an involuntary response and a stimulus.
Operant Conditioning: Involves learning to associate a voluntary behaviour and a consequence.
Outline classical conditioning
Classical Conditioning
Definition: The process by which we learn to associate stimuli and consequently associate events with one another.
Timing: The events that we learn to associate happen close together in time.
Common Form: It is a very common form of learning with many applications in everyday life.
Examples:
Electric fences in farmland.
Responses triggered by the smell of food or coffee in the morning.
Smartphone notifications.
Traffic lights.
Brain Involvement:
Main areas involved: Cerebellum and prefrontal cortex.
Outline the history of classical conditioning.
1890s:
Jacques Loeb: animal behaviour can be explained purely in physical terms
1923:
Edward Tolman bridged behaviourism with cognitivism
1950s:
Cognitive psychologists, including George Miller and Noam Chomsky, focus on understanding cognitive processes
Outline the method and results of Pavlov’s conditioning experiment
Method:
Present an unconditioned stimulus (e.g. food), which elicits an unconditioned response (e.g. salivation; this is an innate response).
Then present a neutral stimulus (e.g. a bell), which produces no response from the dog.
Then present the food and the bell together, which produces the unconditioned response (this is the conditioning phase).
After conditioning, the conditioned stimulus (originally the neutral stimulus) produces a conditioned response on its own.
Association:
During conditioning: Food AND Bell == Dog salivates.
After conditioning: Food OR Bell == Dog salivates.
Types of Responses:
Unconditioned/Unlearned Responses: Reflexes.
Conditioned/Learned Responses: Elicited in the presence of conditioned stimuli (after an association has been made).
Outline Pavlov's principles of conditioning
Conditioning:
Responses can be conditioned by pairing stimuli.
Eventually, the stimulus alone will elicit the response.
Consistency:
The type of stimulus is irrelevant as long as it is consistent (e.g., metronome, buzzer, whistle, lights can all produce responses).
Negative Experiences:
The same principles can be applied to negative experiences (e.g., fear conditioning).
Describe the Little Albert Experiment (Watson & Rayner, 1920)
Purpose: Classical conditioning of a fear response in a human infant.
Method:
A loud noise, produced by striking an iron bar with a hammer (unconditioned stimulus), elicited a fear response and was presented together with a rat (initially a neutral stimulus, which became the conditioned stimulus).
A conditioned fear response developed: the child became scared of the rat.
Stimulus Generalization: Fear also developed for other objects that resembled the rat.
Outcome: Researchers intended to decondition the induced fear, but the experiment was concluded before that.
State the processes within classical conditioning.
Acquisition
Extinction
Spontaneous recovery
Stimulus generalisation
Outline the acquisition process in classical conditioning.
Definition: The initial period of learning.
Process:
During acquisition, the neutral stimulus begins to produce conditioned responses.
We learn to link the response to the event.
Timing: Very important for this stage of learning.
There needs to be only a short time interval between the presentation of the conditioned and unconditioned stimulus for the organism to make the association (less than 5 seconds).
Outline the extinction process in classical conditioning.
The decrease in the conditioned response that occurs when the conditioned stimulus is repeatedly presented without the unconditioned stimulus.
Outline the spontaneous recovery process in classical conditioning.
After extinction and a rest period, the conditioned response returns when the conditioned stimulus is presented again.
Outline the stimulus generalisation process in classical conditioning.
The conditioned response is also produced by stimuli that are similar to the conditioned stimulus.
Habituation
Definition: the process of not responding to stimuli that are presented repeatedly and without change.
Location: In the brain, the process involves the amygdala, with substantial contributions from the prefrontal cortex and hippocampus.
Operant conditioning
Definition: The type of learning where behaviour is controlled by consequences—more complex behaviour than what classical conditioning explains.
Key Concepts:
Positive Reinforcement: Adding a pleasant stimulus to increase a behaviour.
Negative Reinforcement: Removing an unpleasant stimulus to increase a behaviour.
Positive Punishment: Adding an unpleasant stimulus to decrease a behaviour.
Negative Punishment: Removing a pleasant stimulus to decrease a behaviour.
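The four key concepts form a two-by-two grid: a stimulus is either added or removed, and the behaviour either increases or decreases. The grid can be sketched in Python as below. The function name `classify_consequence` and its argument values are illustrative assumptions, not terms from the course material.

```python
def classify_consequence(stimulus_change, behaviour_effect):
    """Name the operant conditioning concept for one consequence.

    stimulus_change: "add" or "remove" (hypothetical labels)
    behaviour_effect: "increase" or "decrease" (hypothetical labels)
    """
    # "Positive" means a stimulus is added, "Negative" means one is removed.
    sign = {"add": "Positive", "remove": "Negative"}[stimulus_change]
    # Reinforcement increases a behaviour, punishment decreases it.
    kind = {"increase": "Reinforcement", "decrease": "Punishment"}[behaviour_effect]
    return f"{sign} {kind}"


print(classify_consequence("add", "increase"))     # Positive Reinforcement
print(classify_consequence("remove", "increase"))  # Negative Reinforcement
print(classify_consequence("add", "decrease"))     # Positive Punishment
print(classify_consequence("remove", "decrease"))  # Negative Punishment
```

Note that "positive" and "negative" describe only whether a stimulus is added or removed, not whether the consequence is pleasant.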
Compare operant and classical conditioning
Classical Conditioning:
Conditioning Approach: An unconditioned stimulus is paired with a neutral stimulus. The neutral stimulus eventually becomes the conditioned stimulus, which elicits the conditioned response.
Stimulus Timing: The stimulus occurs immediately before the response.
Operant Conditioning:
Conditioning Approach: The target behaviour is followed by reinforcement or punishment to either strengthen or weaken the behaviour, making the organism more or less likely to repeat that behaviour in the future.
Stimulus Timing: The stimulus occurs soon after the response.
Describe Edward Thorndike’s puzzle box.
Introduction: Edward Thorndike – first behaviourist to use animals in psychology experiments.
Puzzle Box:
Specially designed enclosures used in psychology experiments to study animal behaviour.
The animal needs to perform a specific action (e.g., pressing a lever or pulling a string) to escape and access a reward, usually food.
Observation: This setup demonstrated instrumental learning, where animals learned through trial and error
Outline B.F. Skinner's contribution to operant conditioning with the Skinner Box.
Skinner Box:
Operant Conditioning Setup – Similar to the Puzzle Box, designed to study learning through rewards.
Lever for Food Reward – Animals press a lever to receive food.
Stimulus and Response Tracking – Speakers and lights signal behaviors, while a recorder logs the number of responses.
Describe learning and reinforcement in operant conditioning
Learning and Reinforcement in Operant Conditioning
Learning: Occurs via reinforcement, which is used to increase the likelihood of a behavior occurring in the future.
Schedule of Reinforcement: Determines how reinforcement is distributed in time and space.
Types of Reinforcement:
Positive Reinforcement: Presenting a reinforcing stimulus after the desired behavior is exhibited, making the behavior more likely to occur.
Negative Reinforcement: Removing an aversive stimulus after a desired behavior to increase its likelihood.
Describe learning and punishment in operant conditioning
Learning and Punishment in Operant Conditioning
Learning: Can also occur via punishment, which is used to help decrease the likelihood of a behavior occurring in the future.
Types of Punishment:
Positive Punishment: Presenting an aversive consequence after an undesired behavior is exhibited, making behaviors less likely to happen in the future.
Negative Punishment: Removing the reinforcing stimulus after a particular behavior is exhibited, resulting in the behavior happening less often in the future.
Describe shaping in operant conditioning
Shaping in Operant Conditioning
Definition: An approach to learning a desired behaviour where, instead of only rewarding the target behaviour, we reward successive approximations of a target behaviour.
Purpose: Necessary for more complex learning, as it is improbable that an organism will display the exact desired behaviour spontaneously.
Example: Teaching a pigeon to play ping pong
Steps:
Reinforce any response that resembles the desired behaviour.
Reinforce any response that more closely resembles the desired behaviour.
Reinforce any response that even more closely resembles the desired behaviour.
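The steps above can be sketched as a loop in which the bar for reinforcement rises each time a response is rewarded. This is only a toy illustration: scoring each response as an integer from 0 (nothing like the target) to 5 (the target behaviour) is an assumption made here, as are the names `shape`, `criterion` and `target`.

```python
def shape(responses, start_criterion=1, target=5):
    """Reinforce successive approximations of a target behaviour.

    responses: integers from 0 to `target`, an assumed score of how closely
    each response resembles the desired behaviour.
    Returns the reinforced responses and the final criterion.
    """
    criterion = start_criterion
    reinforced = []
    for score in responses:
        if score >= criterion:
            reinforced.append(score)
            # After each reward, require a closer approximation next time.
            criterion = min(target, criterion + 1)
    return reinforced, criterion


reinforced, final_criterion = shape([0, 1, 1, 2, 1, 3, 4, 5])
print(reinforced)       # [1, 2, 3, 4, 5]
print(final_criterion)  # 5
```

Responses that would have been rewarded early on (a score of 1) stop being rewarded once the criterion has moved past them.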
Primary Reinforcer
Natural response (related to survival) to a stimulus – unconditioned stimulus, which activates more primitive regions in the brain.
Secondary Reinforcer
Must be learned and is consequently conditioned. These reinforcers activate “newer” regions, including the prefrontal cortex.
Partial reinforcement
Refers to a conditioning process in which a behavior or response is reinforced only part of the time, rather than every time it occurs.
Describe the four schedules of reinforcement.
Fixed Interval
Description: Reinforcement delivered at predictable time intervals, e.g. after 5, 10, 15 mins
Result: Moderate response rate with significant pauses after reinforcement
Variable Interval
Description: Reinforcement delivered at unpredictable time intervals, e.g. after 5, 7, 19 mins
Result: Moderate, yet steady response rate
Example: Instagram
Fixed Ratio
Description: Reinforcement delivered after a predictable number of responses, e.g. 5, 10, 15 responses
Result: High response rate with pauses after each reinforcement
Variable Ratio
Description: Reinforcement delivered after an unpredictable number of responses, e.g. 5, 7, 19 responses
Result: High and steady response rate
Example: Gambling
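The difference between the four schedules can be sketched as rules that decide which responses get reinforced. The sketch below rests on stated assumptions: interval schedules reinforce the first response made after the waiting time has elapsed, and the variable schedules draw their requirement from a uniform distribution around a mean. All function names are hypothetical.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the response numbers (1-based) reinforced on every `ratio`-th response."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce after an unpredictable number of responses averaging `mean_ratio`."""
    reinforced = []
    count = 0
    target = rng.randint(1, 2 * mean_ratio - 1)  # assumed: uniform around the mean
    for i in range(1, n_responses + 1):
        count += 1
        if count >= target:
            reinforced.append(i)
            count = 0
            target = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Reinforce the first response once `interval` minutes have passed since the last reward."""
    reinforced = []
    available_at = interval
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval
    return reinforced


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but each waiting time is drawn at random around `mean_interval`."""
    reinforced = []
    available_at = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + rng.uniform(0, 2 * mean_interval)
    return reinforced


rng = random.Random(0)  # fixed seed so the illustration is repeatable
print(fixed_ratio(15, 5))                           # [5, 10, 15]
print(fixed_interval([1, 4, 6, 7, 12, 13, 18], 5))  # [6, 12, 18]
print(variable_ratio(30, 5, rng))
print(variable_interval(list(range(1, 31)), 5, rng))
```

In the fixed schedules the next reward is predictable, which fits the pauses after reinforcement described above. In the variable schedules any response might be the rewarded one, which fits the steadier response rates.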
Outline the Rescorla Wagner model of reinforcement learning
Rescorla-Wagner Model of Reinforcement Learning
Developed By: Robert Rescorla and Allan Wagner in the 1970s.
Purpose: A mathematically precise explanation of how animals learn associations between conditioned stimuli (CS) and unconditioned stimuli (US).
Model:
Equation: ΔV = aβ(λ − ΣV)
Computational Psychiatry and Psychology:
Model Definition: A representation, in mathematical terms, of a process that we wish to study.
Parameter Changes: By changing the parameters of the model, we can observe how the outcome changes.
Define prediction error and its types (Rescorla Wagner model)
Definition: A prediction error is the difference between the expected outcome and the actual one. Prediction errors are a key element of learning.
Role in Learning: Prediction errors offer a way to conceptualize and model the element of surprise that is necessary for learning to occur.
Types:
Positive Prediction Error: The actual outcome is better than expected.
Negative Prediction Error: The actual outcome is worse than expected.
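As a small illustration, a prediction error can be computed as the actual outcome minus the expected outcome, so that its sign gives the type. The function name and the numeric outcome values below are assumptions chosen for the example.

```python
def classify_prediction_error(actual, expected):
    """Return the prediction error and its type for one outcome."""
    error = actual - expected  # difference between actual and expected outcome
    if error > 0:
        return error, "positive"  # outcome better than expected
    if error < 0:
        return error, "negative"  # outcome worse than expected
    return error, "none"          # outcome fully predicted, no surprise


print(classify_prediction_error(actual=1.0, expected=0.0))  # (1.0, 'positive')
print(classify_prediction_error(actual=0.0, expected=1.0))  # (-1.0, 'negative')
print(classify_prediction_error(actual=1.0, expected=1.0))  # (0.0, 'none')
```

When the outcome is fully predicted the error is zero, which corresponds to there being no surprise to drive learning.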
Outline the components of the equation in the Rescorla Wagner model
Equation: ΔV = aβ(λ − ΣV)
ΔV: change in associative strength (the strength of association between the conditioned stimulus and the unconditioned stimulus)
a: salience of the CS (often written α)
β: salience of the US
λ: the outcome, which is 1 when the US is present and 0 when the US is absent
ΣV: the sum of the associative strengths of all the stimuli present on that trial
λ − ΣV: this term represents the prediction error, because λ is what actually happens on the trial and ΣV is what is expected to happen based on the outcomes of previous trials
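The equation can be written as a one-line update. The sketch below follows the components listed above. The salience values 0.5 and 0.4 are arbitrary assumptions picked only to give a concrete number, and the function name is hypothetical.

```python
def rescorla_wagner_update(v_total, a, beta, lam):
    """Return the change in associative strength (delta V) for one trial.

    v_total: sum of associative strengths of all stimuli present (sigma V)
    a:       salience of the CS
    beta:    salience of the US
    lam:     outcome, 1 when the US is present and 0 when it is absent
    """
    prediction_error = lam - v_total
    return a * beta * prediction_error


# First pairing: nothing has been learned yet (sigma V = 0) and the US is present.
delta_v = rescorla_wagner_update(v_total=0.0, a=0.5, beta=0.4, lam=1.0)
print(delta_v)  # 0.2
```

With these values the prediction error is 1, so ΔV = 0.5 × 0.4 × 1 = 0.2. Once ΣV equals λ the prediction error is 0 and no further change occurs.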
Outline Rescorla Wagner Model learning curve
As learning progresses, the associative strength between stimuli asymptotically tends toward 1, meaning associations between the US and the CS have been learned.
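The shape of the learning curve can be reproduced by applying the update ΔV = aβ(λ − ΣV) on every trial. This is a minimal sketch assuming a single CS, a US present on every trial (λ = 1), and arbitrary salience values of 0.5 and 0.4. The function name is hypothetical.

```python
def simulate_acquisition(n_trials, a=0.5, beta=0.4, lam=1.0):
    """Return the associative strength after each CS-US pairing."""
    v = 0.0  # no association before training
    history = []
    for _ in range(n_trials):
        v += a * beta * (lam - v)  # update driven by the prediction error
        history.append(v)
    return history


curve = simulate_acquisition(20)
print([round(v, 3) for v in curve[:3]])  # [0.2, 0.36, 0.488]
print(round(curve[-1], 3))               # close to, but still below, 1
```

The early trials produce the largest gains because the prediction error is largest. Each later trial adds less, which gives the curve its asymptotic approach to 1.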
Outline Rescorla Wagner Model: learning and extinction curve
If the CS is now presented on its own (without the US), the associative strength decreases and the association is eventually “unlearned”
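Acquisition followed by extinction can be sketched with the same update rule, ΔV = aβ(λ − ΣV), by switching the outcome λ from 1 to 0 part-way through. The assumptions are a single CS and arbitrary salience values of 0.5 and 0.4. The function name `run_trials` is hypothetical.

```python
def run_trials(v_start, n_trials, lam, a=0.5, beta=0.4):
    """Return the associative strength after each trial, starting from v_start."""
    v = v_start
    history = []
    for _ in range(n_trials):
        v += a * beta * (lam - v)  # the prediction error is negative when lam = 0 and v > 0
        history.append(v)
    return history


acquisition = run_trials(0.0, 20, lam=1.0)             # CS paired with the US
extinction = run_trials(acquisition[-1], 20, lam=0.0)  # CS without the US

print(round(acquisition[-1], 3))  # near 1 after training
print(round(extinction[-1], 3))   # near 0 after extinction
```

During extinction the expected outcome exceeds the actual one, so every update is negative and the associative strength decays back toward 0.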