Chap 6: Learning

Classical Conditioning

Ivan Pavlov

Found that dogs learned to pair the sounds in the environment where they were fed with the food that was given to them and began to salivate simply upon hearing the sounds.
Classical Conditioning
- Learning to associate a neutral stimulus (e.g. a sound) with a stimulus that produces a reflexive, involuntary response (e.g. food), so that the learner comes to respond to the new stimulus as it did to the old one (e.g. by salivating)

Unconditioned Stimulus (US)

Something that elicits a natural, reflexive response (e.g. food)

Unconditioned Response (UR)

The response the US elicits (e.g. food eliciting salivation)

Through repeated pairings of the US with a neutral stimulus (e.g. bell), animals will come to associate the 2 stimuli. Ultimately, animals will salivate when hearing the bell alone.

Once the bell elicits salivation, a conditioned response (CR), it is no longer a neutral stimulus but rather a conditioned stimulus (CS)

TERMINOLOGY

Unconditioned Stimulus (US) - a stimulus that evokes an unconditioned response without any prior conditioning (no learning needed for the response to occur).
Unconditioned Response (UR) - an unlearned reaction/response to an unconditioned stimulus that occurs without prior conditioning.
Conditioned Stimulus (CS) - a previously neutral stimulus that has, through conditioning, acquired the capacity to evoke a conditioned response.
Conditioned Response (CR) - a learned reaction to a conditioned stimulus that occurs because of prior conditioning.

METHODS OF CONDITIONING

Delayed Conditioning - CS is presented before the US and stays on until the US is presented. This is the most effective method, especially when the delay is short.
Trace Conditioning - The presentation of the CS, followed by a short break, followed by the presentation of the US.
Simultaneous Conditioning - CS and US are presented at the same time
Backward Conditioning - US is presented first and is followed by the CS. This method is particularly ineffective.
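
The four timing arrangements above can be told apart from three time points. This is a minimal sketch, not a standard tool: the function name `classify_conditioning` and the parameters `cs_on`, `cs_off` and `us_on` are hypothetical, times are in arbitrary units, and the boundary rule (a CS that ends exactly when the US starts counts as delayed) is an assumption.

```python
def classify_conditioning(cs_on, cs_off, us_on):
    """Label a CS/US timing arrangement using the definitions in these notes.

    cs_on, cs_off: when the conditioned stimulus starts and stops.
    us_on: when the unconditioned stimulus starts.
    All names and tie-breaking rules are illustrative assumptions.
    """
    if cs_off <= cs_on:
        raise ValueError("CS must stop after it starts")
    if us_on < cs_on:
        return "backward"      # US comes first, CS follows
    if us_on == cs_on:
        return "simultaneous"  # CS and US start together
    if us_on <= cs_off:
        return "delayed"       # CS starts first and is still on when the US arrives
    return "trace"             # CS ends, then a gap, then the US


print(classify_conditioning(cs_on=0, cs_off=5, us_on=4))  # delayed
print(classify_conditioning(cs_on=0, cs_off=2, us_on=4))  # trace
print(classify_conditioning(cs_on=0, cs_off=5, us_on=0))  # simultaneous
print(classify_conditioning(cs_on=3, cs_off=5, us_on=0))  # backward
```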

Extinction

The process of unlearning a behaviour (has taken place when the CS no longer elicits the CR).
Achieved by repeatedly presenting the CS without the US, thus breaking the association between the two.

Spontaneous Recovery

Sometimes, after a conditioned response has been extinguished and no further training of the animals has taken place, the response briefly reappears upon presentation of the conditioned stimulus.

Generalization

Often animals conditioned to respond to a certain stimulus will also respond to similar stimuli, although the response is usually smaller in magnitude.

Discrimination

Training subjects to tell the difference between various stimuli

John B. Watson and Rosalie Rayner

Conditioned a little boy named Albert to fear a white rat.
Albert initially liked the white, fluffy rat, but by repeatedly pairing it with a loud noise, Watson and Rayner taught Albert to cry when he saw the rat.
Loud noise is the US because it elicits the involuntary, natural response of fear (UR).
Rat is a neutral stimulus that becomes the CS, and the CR is crying in response to presentation of the rat alone.
Albert also generalized, crying in response to a white rabbit, a man’s white beard, etc.
Example of aversive conditioning
- The process by which a noxious or unpleasant stimulus is paired with an undesired behaviour.
- Another example: placing unpleasant-tasting substances on the fingernails to discourage nail-chewing

Second-order or higher-order conditioning

Once a CS elicits a CR, it is possible, briefly, to use that CS as a US in order to condition a new response to a new stimulus.
For example, after the dog salivates to the bell (first-order conditioning), the bell can be paired repeatedly with a flash of light, and the dog will salivate to the light alone (second-order conditioning)

Research suggests that animals and humans are biologically prepared to make certain connections more easily than others.

Conditioned Taste Aversions

A learned association between the taste of a particular food and illness such that the food is considered to be the cause of the illness
Can result in powerful avoidance responses on the basis of a single pairing
The CS must be salient (distinctive or prominent) in order for us to avoid it

John Garcia and Robert Koelling

Performed a famous experiment illustrating how rats more readily learned to make certain associations than others.
Used 4 groups of subjects in their experiment and exposed each to a particular combination of CS and US

Rats learned to associate noise with shock and unusual-tasting water with nausea
They were unable to make the connection between noise and nausea and between unusual-tasting water and shock.
- Linking loud noise → shock and unusual-tasting water → nausea seems to be adaptive
- Garcia effect: the ease with which animals learn taste aversions

Operant Conditioning

Operant conditioning is a kind of learning based on association of consequences with one’s behaviours.

Edward Thorndike

Conducted a series of famous experiments using a cat in a puzzle box.
Hungry cat was locked in a cage next to a dish of food; cat had to get out of the cage in order to get the food
Found that the amount of time required for the cat to get out of the box decreased over a series of trials.
The time decreased gradually; the cat did not seem to understand, suddenly, how to get out of the cage
Led Thorndike to assert that the cat learned the new behaviour without mental activity, simply by connecting a stimulus and a response
Law of Effect
- If the consequences of a behaviour are pleasant, the stimulus-response (S-R) connection will be strengthened and the likelihood of the behaviour will increase.
- If the consequences of a behaviour are unpleasant, the S-R connection will weaken and the likelihood of the behaviour will decrease.
- Used the term instrumental learning to describe his work because he believed the consequence was instrumental in shaping future behaviours

B. F. Skinner

Used a Skinner box to research animal learning
Box has a way to deliver food to an animal and a lever to press or disk to peck in order to get the food
Food is the reinforcer and the process of giving the food is reinforcement
Reinforcement is defined by its consequences; anything that makes a behaviour more likely to occur is a reinforcer.
- Positive reinforcement - Addition of something pleasant
- Negative reinforcement - Removal of something unpleasant
If we give a rat in a Skinner box food when it presses a lever, we are using positive reinforcement. If pressing the lever turns off an electrical shock, we are using negative reinforcement. (Delivering a shock when the rat presses the lever would be punishment, not reinforcement.)
Escape Learning
- Allows one to terminate an aversive stimulus
Avoidance learning
- Enables one to avoid the unpleasant stimulus altogether

Punishment

Anything that makes a behaviour less likely through unpleasant consequences
Positive Punishment - Addition of something unpleasant
Omission Training or Negative Punishment - Removal of something pleasant
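
The reinforcement and punishment definitions above form a 2 x 2 grid: is the stimulus pleasant or unpleasant, and is it added or removed? This is a small lookup sketch of that grid. The function name `classify_consequence` and the string labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus, change):
    """Name the operant consequence for a stimulus that follows a behaviour.

    stimulus: "pleasant" or "unpleasant"
    change:   "added" or "removed"
    """
    grid = {
        ("pleasant", "added"): "positive reinforcement",
        ("unpleasant", "removed"): "negative reinforcement",
        ("unpleasant", "added"): "positive punishment",
        ("pleasant", "removed"): "negative punishment (omission training)",
    }
    try:
        return grid[(stimulus, change)]
    except KeyError:
        raise ValueError(f"unknown combination: {stimulus!r}, {change!r}")


# Food delivered after a lever press.
print(classify_consequence("pleasant", "added"))      # positive reinforcement
# A shock is switched off after a lever press.
print(classify_consequence("unpleasant", "removed"))  # negative reinforcement
# A shock is delivered after a lever press.
print(classify_consequence("unpleasant", "added"))    # positive punishment
```

Both kinds of reinforcement make the behaviour more likely and both kinds of punishment make it less likely. "Positive" and "negative" refer only to whether something is added or removed.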

Shaping

The production of new forms of operant behaviour by reinforcement of successive approximations to the behaviour
E.g. First the rat might be reinforced to go to the side of the box with the lever. Then we reinforce the rat to touch the lever with any part of its body.

Chaining

Teaching method based on task analysis, wherein all the smaller units of behaviour comprising a complex skill or task are first identified and broken down, and the series of related behaviours is then taught step by step.

Two main types of reinforcers:

Primary Reinforcers

Reinforcers that are naturally rewarding. (e.g. food, water, rest, etc.)

Secondary Reinforcers

Things we have learned to value. (e.g. praise, chance to play a video game, etc.)

Generalized Reinforcer: Money

Money is a special kind of secondary reinforcer, a generalized reinforcer, because it can be traded for virtually anything

Token Economy

In a token economy, every time people perform a desired behaviour, they are given a token (e.g. money). Periodically, they are allowed to trade their tokens for any one of a variety of reinforcers.
Used in prisons, mental institutions, schools, etc.

Premack Principle

Whichever of 2 activities is preferred can be used to reinforce the activity that is not preferred.

Reinforcement Schedules

Continuous Reinforcement

Rewarding the behaviour each time (best when first teaching it)

Partial-Reinforcement Effect

Increased resistance to extinction after intermittent reinforcement than after continuous reinforcement.

Reinforcement schedules differ in 2 ways:

What determines when reinforcement is delivered → the number of responses made (ratio schedules) or the passage of time (interval schedules)
The pattern of reinforcement → either constant (fixed schedules) or changing (variable schedules)

Variable schedules are more resistant to extinction than fixed schedules (a break in the pattern for fixed schedules can lead to extinction)

Ratio schedules promote higher rates of responding than interval schedules
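
The two dimensions above give four schedules: fixed ratio, variable ratio, fixed interval and variable interval. This is a rough sketch of how each one decides which responses are reinforced. The function name `reinforced_responses`, the abbreviations `FR`/`VR`/`FI`/`VI` as string codes, and the rule for drawing variable requirements (uniform between 1 and 2n-1, so the average is n) are all assumptions made for illustration.

```python
import random


def reinforced_responses(schedule, n, response_times, seed=0):
    """Return the indices of the responses that earn a reinforcer.

    schedule: "FR", "VR", "FI" or "VI".
    n: responses required (ratio) or time units required (interval).
    response_times: increasing times at which the animal responds.
    """
    if schedule not in ("FR", "VR", "FI", "VI"):
        raise ValueError(f"unknown schedule: {schedule!r}")
    rng = random.Random(seed)

    def next_requirement():
        if schedule in ("FR", "FI"):
            return n                      # fixed: always the same
        return rng.randint(1, 2 * n - 1)  # variable: changes, averages n

    rewarded = []
    required = next_requirement()
    count = 0      # responses since the last reinforcer
    last_time = 0  # time of the last reinforcer
    for i, t in enumerate(response_times):
        count += 1
        if schedule in ("FR", "VR"):
            earned = count >= required          # ratio: count responses
        else:
            earned = t - last_time >= required  # interval: watch the clock
        if earned:
            rewarded.append(i)
            count, last_time = 0, t
            required = next_requirement()
    return rewarded


slow = list(range(1, 21))                  # 20 responses in 20 time units
fast = [0.5 * k for k in range(1, 41)]     # 40 responses in 20 time units

print(len(reinforced_responses("FR", 5, slow)))  # 4
print(len(reinforced_responses("FR", 5, fast)))  # 8
print(len(reinforced_responses("FI", 5, slow)))  # 4
print(len(reinforced_responses("FI", 5, fast)))  # 4
```

In this toy run, doubling the response rate doubles the reinforcers earned on the ratio schedule but earns nothing extra on the interval schedule. That is one plausible way to see why ratio schedules promote higher rates of responding.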

Instinctive Drift

The tendency for animals to forgo rewards to pursue their typical patterns of behaviour

Cognitive Learning

Contiguity model (Pavlovian model)

The more times 2 things are paired, the greater the learning that will take place
Contiguity (togetherness) determines the strength of the response

Robert Rescorla

Revised the contiguity model to take into account a more complex set of circumstances
Contingency model
- A is contingent upon B when A depends upon B and vice versa.
- The predictability of occurrence of one stimulus from the presence of another

→ Pavlov’s contiguity model of classical conditioning holds that the strength of an association between two events is closely linked to the number of times they have been paired in time.

→ Rescorla’s contingency model of classical conditioning reflects more of a cognitive spin, positing that it is necessary for one event to reliably predict another for a strong association between the two to result.
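
The difference between the two models can be made concrete with a small calculation. In this sketch, contiguity is measured as the number of CS-US pairings, and contingency as how much more likely the US is when the CS is present than when it is absent. That second measure is one common way to quantify predictability; it is an assumption here, not a formula taken from these notes. The function names and the trial data are hypothetical.

```python
def contiguity(trials):
    """Count the trials on which the CS and the US occurred together."""
    return sum(1 for cs, us in trials if cs and us)


def contingency(trials):
    """P(US | CS) - P(US | no CS), ranging from -1 to 1.

    Returns None if there are no CS trials or no no-CS trials.
    """
    with_cs = [us for cs, us in trials if cs]
    without_cs = [us for cs, us in trials if not cs]
    if not with_cs or not without_cs:
        return None
    return sum(with_cs) / len(with_cs) - sum(without_cs) / len(without_cs)


# Each trial is (cs_present, us_present).
# Group A: the US only ever follows the CS.
predictive = [(True, True)] * 20 + [(False, False)] * 20
# Group B: the same 20 pairings, but the US also occurs without the CS.
unpredictive = [(True, True)] * 20 + [(False, True)] * 20

print(contiguity(predictive), contingency(predictive))      # 20 1.0
print(contiguity(unpredictive), contingency(unpredictive))  # 20 0.0
```

Both groups get the same number of pairings, so a pure contiguity account expects equal learning. Only in the first group does the CS actually predict the US, which is the case the contingency model says should produce the strong association.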

Observational Learning

Species specific → only occurs between members of the same species

Modelling

Two components: observation and imitation
Mental representation of the observed behaviour must exist in order to enable the person or animal to imitate it

Albert Bandura

Studied modelling in the Bobo Doll Experiment
- Demonstrated that children who had watched an adult being violent with a Bobo doll were more likely to behave aggressively toward the doll than were children who had watched an adult being nonviolent toward it.

Latent Learning

Latent means hidden, and latent learning is learning that becomes obvious only once a reinforcement is given for demonstrating it.

Behaviourists asserted that learning is evidenced by gradual changes in behaviour, but Tolman conducted a famous experiment illustrating that sometimes learning occurs but is not immediately evidenced.

Edward Tolman

Had 3 groups of rats run through a maze on a series of trials.
- Group 1: Got a reward each time it completed the maze; performance improved steadily over the trials
- Group 2: Never got the reward; performance improved only slightly over the trials
- Group 3: Not rewarded in first half, but was rewarded in second; performance improved dramatically and suddenly once it began to be rewarded
Reasoned that these rats must have learned their way around the maze during the first set of trials; performance did not improve because they had no reason to run the maze quickly.
Latent learning is demonstrated → The rats made a mental representation (cognitive map) of the maze during the first half

Abstract Learning

Acquiring knowledge of general or intangible material, such as the meanings of concepts and propositions and the logical and systematic relations between them

Insight Learning

Occurs when one suddenly realizes how to solve a problem.

Wolfgang Köhler

Argued that learning often happens in this sudden way due to insight rather than because of the gradual strengthening of the S-R connection
Suspended a banana from the ceiling and provided several boxes, none of which was high enough on its own to enable the chimpanzees to reach the banana. Köhler found that the chimps spent most of their time unproductively rather than slowly working towards a solution until, all of a sudden, they would pile the boxes on top of each other, climb up, and grab the banana.
Köhler believed that the solution could not occur until the chimpanzees had a cognitive insight about how to solve the problem