
Learning and Observational Learning Flashcards


Learning (Classical)

Organism and Environment

  • Behaviours selected by evolution:

    • Reflexive: Eye-blinking, sucking, and gripping in new-born humans.

    • Instinctual: Imprinting, homing, migratory behaviours.

  • Behaviours selected by experience

    • Learning: A relatively permanent change in behaviour or knowledge as a result of experience.

      • By habituation.

      • By the association of events - Classical Conditioning.

      • By the consequences of events - Instrumental conditioning.

      • By the observation of events - Observational learning.

Habituation

  • Habituation is the decline in the tendency to respond to stimuli that have become familiar due to repeated exposure.

  • The startle response to a new sight or sound decreases quickly with experience.

  • Young turkeys show an alarm response to "hawk" shape but not to "goose" shape.

    • The turkeys have habituated to the more frequent goose shape.

Classical Conditioning

  • Ivan Pavlov (1849-1936) - Russian

    • Personal life: impractical, absent-minded, sentimental

    • Professional side: punctual, perfectionist, tyrant.

    • Trained in medicine, turned to the digestive process (Nobel Prize in 1904 for physiology).

Pavlov's Experiment
  • Investigating the digestive system:

    • Present Food $\rightarrow$ Record salivation and gastric secretions

    • Food bowl alone $\rightarrow$ Salivation

    • Experimenter alone $\rightarrow$ Salivation

  • First, TONE alone $\rightarrow$ No change in salivation

  • Second, repeated pairings of TONE + FOOD $\rightarrow$ SALIVATION

  • Finally, TONE alone $\rightarrow$ SALIVATION.

General Description of Classical Conditioning
  1. Present stimuli in isolation:

    • Neutral Stimulus (NS) $\rightarrow$ No response

    • Unconditioned Stimulus (US) $\rightarrow$ Unconditioned Response (UR)

  2. NS immediately precedes US - pair repeatedly:

    • NS + US $\rightarrow$ UR

  3. Present previously neutral stimulus alone:

    • Conditioned Stimulus (CS) $\rightarrow$ Conditioned Response (CR)

Definition of Classical Conditioning
  • A neutral stimulus (NS) is repeatedly paired with a stimulus (US) that automatically elicits a particular response (UR).

  • The previously neutral stimulus becomes a conditioned stimulus (CS) that also elicits a similar response (CR).

  • Found in many species.

Human Classical Conditioning
  • In the laboratory:

    1. US (puff of air) $\rightarrow$ UR (eye-blink)

    2. NS (soft click) $\rightarrow$ no eye-blink

    3. NS (click) + US (air) $\rightarrow$ UR (eye blink)

    4. CS (click) $\rightarrow$ CR (eye-blink)

Applied Issue: Bed Wetting
  • Bladder feels full (US) $\rightarrow$ Child wets bed, keeps sleeping (UR)

  • Bladder feels full (CS) $\rightarrow$ Child wakes up (CR)

  • Treatment involves a device that sounds an alarm (US) when the child wets the bed (CS), leading to waking up (UR).

  • Later, the feeling of a full bladder (CS) elicits waking up (CR).

Conditioned Emotional Responses
  • Many emotions carry distinct physiological correlates, such as increased heart rate, "hair standing on end", flushes, and muscle tension.

  • Neutral stimuli (sounds, smells) associated with emotional events can elicit emotional responses.

Conditioned Fear
  • Children: "Little Albert" and J.B. Watson & Rosalie Rayner (Passer & Smith, p. 242).

    • Watson challenged therapies and approaches of that time (especially Freudian analysis).

  • Adults: Can be long lasting - Edwards & Acker (1972) found that WWII veterans had changes in GSR to the sounds of battle even 15 years after the war.

  • Everyday life (e.g., dentist's waiting room).

Advertising
  • McBurger + Cute children + Bubbly music $\rightarrow$ "The warm fuzzies"

  • McBurger $\rightarrow$ "The warm fuzzies"

  • Other advertisements create a mood (e.g., skiing, high-paced music).

Fetishes
  • A person has heightened sexual arousal in the presence of certain inanimate objects (e.g., shoes, rubber).

  • The object has become a conditioned stimulus that can elicit arousal on its own.

Other Classical Conditioning Examples
  • Allergic reactions, anticipatory nausea, immune responses (Passer & Smith, p. 243)

Relation between the UR and the CR

  • Pavlov believed that the CS came to elicit the CR by a process of stimulus substitution, i.e., the CS was equivalent to the US.

  • However, while UR and CR are often very similar, they are not necessarily identical.

    • Consider: Tone (CS) $\rightarrow$ Salivation (CR)

    • Salivation is less copious and has fewer digestive enzymes than if food itself is presented.

  • Classical conditioning is not so much a matter of the CS replacing the US; rather, it is a learning mechanism whereby the CS (and the CR) prepares the animal for the onset of the US (and the UR).

Compensatory-Reaction Hypothesis
  • Sometimes, the UR and the CR can be opposites.

  • Insulin injections: insulin lowers blood sugar. After a number of such injections, bodily reactions to the various CSs produce a response opposite to the drug's effect (i.e., blood sugar levels go up).

  • The body "prepares" itself for the drug, and "tilts" the other way.

Involved in Drug Tolerance?
  • Opiates (e.g., morphine, heroin) produce pain relief, euphoria, and relaxation.

  • After repeated injections, stimuli surrounding drug injections produce a compensatory reaction - depression, restlessness, increased sensitivity to pain.

  • The same effect requires more of the drug because the system has been "tilted" the other way.

Involved in Drug Overdose?
  • The compensatory reaction requires CSs to elicit the physiological "preparedness" for the drug.

  • What if the drug is administered without the compensatory reaction?

  • The same dose might be lethal because the body is unprepared.

  • Siegel (1989) tested the tolerance of rats for "overdoses" of heroin in novel or usual environments.

    • Results indicated that rats with experience in the usual environment had higher tolerance compared to those in a novel environment.

Major Phenomena of Classical Conditioning

Acquisition
  • The process by which a conditioned stimulus comes to produce a conditioned response, i.e., how a NS becomes a CS. (Passer & Smith, p.238)

  • There are a variety of important factors -

    • Number of NS and US pairings & US Intensity

    • The more intense the US, the stronger the CR, and the quicker the rate of conditioning.

CS-US Temporal Relations
  • The timing of the CS and US can be important.

  • Delayed (Forward) Conditioning

    • CS comes immediately before (and overlaps) with US.

    • CS (click) $\rightarrow$ US (air puff)

    • Most effective procedure for acquiring CR. Effective interval depends on the type of CR. (Eye-blink = 0.5 s).

  • Trace (Forward) Conditioning

    • The CS starts and finishes before the US.

    • CS (click) $\rightarrow$ US (air puff)

    • Procedure is less effective than delayed conditioning.

  • Simultaneous Conditioning

    • The CS and the US start and end together.

    • CS (click) + US (air puff) presented together

    • Often fails to produce a CR.

  • Backward Conditioning

    • The CS begins after the US.

    • US (air puff) $\rightarrow$ CS (click)

    • The least effective way to acquire the CR (Can actually produce the opposite effect).

Contingency

  • A simple contiguity between the US and CS is not sufficient for conditioning to occur.

  • The CS must also be a reasonable predictor of the US.

  • The strength of the conditioned response depends on how often the CS accompanies the US and how often the CS accompanies no US.

  • For example:

    • 50 trials: click + puff

    • 10 trials: click alone

    • "click" should be a CS $\rightarrow$ CR (eye-blink)

  • But:

    • 50 trials: click + puff

    • 100 trials: click alone

    • it is unlikely that "click" will elicit a CR, because "click" is a poor predictor of "puff".
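The contingency arithmetic in the two examples above can be sketched in a few lines of Python (an illustrative aid only; the function name is hypothetical and not part of the original notes):

```python
def p_us_given_cs(paired_trials, cs_alone_trials):
    """Proportion of CS presentations that were followed by the US."""
    return paired_trials / (paired_trials + cs_alone_trials)

# 50 click+puff trials, 10 click-alone trials: click is a good predictor
print(round(p_us_given_cs(50, 10), 2))   # 0.83

# 50 click+puff trials, 100 click-alone trials: click is a poor predictor
print(round(p_us_given_cs(50, 100), 2))  # 0.33
```

A full contingency analysis would also compare this with the probability of the US when the CS is absent, but the ratio above is enough to show why the second arrangement produces little conditioning.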

Extinction

  • If the CS is repeatedly presented without the US, then the CR will gradually decrease.

  • The rate of decrease depends on factors such as initial response strength.

Spontaneous Recovery
  • A CS $\rightarrow$ CR relation is extinguished. After a period with no CS presentations, the CS may elicit the CR again.

  • Revived CR is less intense. Although it can recur repeatedly, it re-extinguishes relatively quickly.

Behavior Therapy Application

Flooding
  • Fear elicited by a CS (certain phobias) is eliminated by the process of extinction.

  • Some therapists regard flooding as too stressful for the patient.

  • Spontaneous recovery has obvious implications for therapies such as flooding.

Stimulus Generalisation

  • A conditioned response formed to one conditioned stimulus will occur to other, similar stimuli.

  • "Little Albert" also feared a furry white rabbit, fur coat, Santa Claus mask.

Stimulus Discrimination

  • Stimulus discrimination occurs when an organism does not respond to stimuli that are similar to the stimulus used in training.

Generalisation Gradients
  • Continuous stimulus dimensions can produce generalisation gradients. Stimuli closer to the CS produce greater CRs.

Discrimination Training
  • Stimulus A is associated with the US, and Stimulus B is not. If the subject discriminates, the CR occurs only with A.

Behavior Therapy Application
Systematic Desensitization
  • Combines ideas from extinction, stimulus generalisation, and counter-conditioning.

  • Treatment for phobias and anxiety problems.

Blocking

  • Conditioning does not occur if a good predictor of the US already exists (Kamin, 1969).

  • Experimental Group: Phase 1: noise + shock $\rightarrow$ noise elicits fear. Phase 2: light + noise + shock. Test: light alone $\rightarrow$ no fear response (conditioning to the light is blocked by the noise).

  • Control Group: Phase 1: no training. Phase 2: light + noise + shock. Test: light alone $\rightarrow$ fear response.

Higher-Order Conditioning

  • Once a stimulus has become an effective CS for a certain CR, then that stimulus can be used to condition other stimuli.

    • NS₁ + US $\rightarrow$ NS₁ becomes CS₁

    • Then take a second neutral stimulus (NS₂)

    • NS₂ + CS₁ $\rightarrow$ NS₂ becomes CS₂

Sensory Preconditioning

  • Learning occurs in the absence of a US (and hence a UR). Classical conditioning then reveals an association already learnt between two events.

    1. Light + Tone (number of trials)

    2. Light + Meat powder $\rightarrow$ Salivation

    3. Light $\rightarrow$ Salivation

    4. Tone $\rightarrow$ Salivation

  • CS-US pairings were not necessary for conditioning.

  • Organisms can make more general associations between stimuli (S-S learning).

Summary of Classical Conditioning

  • Stimulus generalisation, higher-order conditioning, and sensory preconditioning allow learning in one context to extend to a wider range of situations.

  • Stimulus discrimination and blocking limit the extent that learning in one context influences behaviour in other situations.

Biological Constraints

  • So far, US-CS relations have been looked at as largely arbitrary; i.e., any discrete NS can become a CS for a CR.

  • Some behaviour theorists have treated this as a basic principle.

Taste Aversion Learning (Garcia & Koelling, 1966)
  • Group 1:

    • Training: NS = water + saccharine + light + click, US = mild electric shock

    • Test: Water + light + click $\rightarrow$ CR (avoid water), Water + saccharine $\rightarrow$ No effect

  • Group 2:

    • Training: NS = water + saccharine + light + click, US = X-ray exposure (or Lithium Chloride)

    • Test: Water + light + click $\rightarrow$ No effect, Water + saccharine $\rightarrow$ CR (avoid water)

Associations between US & CS
  • Associations between US & CS are more readily formed if they seem to belong together.

  • Lights, clicks, electric shock - "external" pain likely to have an external cause.

  • Saccharine, illness - "internal" pain (nausea) is likely to involve things eaten.

  • Human example: Aversion to certain distinctive alcoholic beverages after a bad experience.

Theoretical Implications
  1. US-CS connections are not arbitrary. Depends on biological constraints or predispositions.

    • A simple temporal contiguity (i.e., delayed conditioning) is not sufficient to produce conditioning.

  2. Conditioned taste aversions can occur after quite long delays between the CS and the US.

    • So, a close temporal contiguity is not always necessary for conditioning.

Applied Issues
The Hungry Coyotes (Gustavson et al., 1976)
  • Problem - Western USA, coyotes kill sheep.

  • Pilot Study - 6 coyotes and 2 wolves. LiCl treated rabbit and sheep meat. Attacks on live rabbits and sheep greatly reduced.

  • Method - 3000-acre ranch. 12 bait stations where tracks indicated heavy coyote activity. LiCl treated sheep carcasses and dog food in sheep hide.

  • Results - Best estimates suggest a 30% to 60% reduction.

Chemotherapy
  • Chemotherapy often produces severe nausea and vomiting in patients.

  • Loss of weight is not conducive to recovery from cancer - the individual is already sick enough.

  • Is some of the loss of appetite (and weight) due to learned taste-aversions?

  • Bernstein (1978) - cancer ward, children 2 to 16 years.

    • Group 1: Ice-cream + chemotherapy

    • Group 2: No ice-cream + chemotherapy

    • Group 3: Ice-cream + No chemotherapy

    • After 2-4 weeks patients given a choice between "Mapletoff" ice cream and playing a game.

    • Results indicated that Group 1 patients showed less preference for the ice cream after chemotherapy compared to the other groups.

Instrumental Conditioning

  • E. Thorndike (1874-1949) - American

  • In about 1900, he conducted research examining whether animals could solve problems or "think".

  • Thorndike designed a variety of "puzzle boxes" from which the cats had to learn to escape.

  • Thorndike argued "No", because the learning curves show no sudden "insightful" drop.

Law of Effect
  • "Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur"

  • In other words, positive consequences increased the likelihood or probability of a response (rather than a "reflexive" relation).

  • Punishment seen as the opposite of reinforcement.

  • "those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with the situation weakened, so that, when it recurs, they will be less likely to recur."

  • Behaviours are "stamped out" if followed by negative consequences.

  • In Thorndike's puzzle box, the behaviours followed by release were steadily strengthened, while behaviours unrelated to release faded with time.

  • The environment selects the "fittest" behaviours, in the same way it selects the "fittest" individuals of a species.

Classical versus Instrumental Conditioning
  • Classical conditioning is a relation between two stimuli (CS and US). The CS elicits the CR.

  • Instrumental conditioning concerns the probability or likelihood of a response changing as a function of its consequences. The subject emits the response in order to produce a reward.

B.F. Skinner (1904-1990) - American
  • One of the most famous psychologists. A central figure in the area of psychology known as Behaviourism.

  • This movement was a reaction against introspectionism towards more objective measurement in psychology.

  • Reared his infant daughter in an "air crib".

  • "Walden Two" - behavioural utopia.

  • "Verbal Behaviour"

  • WWII work - Project Pigeon.

  • Instrumental conditioning is also called Operant conditioning because the response operates on the environment (produces an effect).

  • The operant is the response defined in terms of its environmental effect.

Skinner's Version of the Law of Effect
  • When a response is followed by a reinforcer, the strength of the response increases. When a response is followed by a punisher, the strength of the response decreases.

Acquisition
  • Behaviour shaped by successive approximations.

  • Training a rat to press a lever - start with a broad response criterion and progressively narrow it.

  • The world is a "trainer".

  • Positive and negative consequences of actions are constantly shaping behaviour repertoires.

Reinforcement
  • Increases the likelihood of behaviour.

  • Positive reinforcement: Adding a stimulus or event contingent upon a response increases that behaviour.

    • Lab: Provide food to a food-deprived rat for lever-pressing.

    • Life: Child receives pocket-money for doing "chores", affection of partner for "kind" act.

  • Negative reinforcement: Removing a stimulus or event contingent upon a response increases that behaviour.

    • Lab: A rat presses a lever to terminate (escape) or prevent (avoidance) an electric shock through the floor.

    • Life: Child does homework to avoid detention (or corporal punishment) at school.

Punishment
  • Decreases the likelihood of behaviour.

  • Positive punishment: Adding a stimulus or event contingent upon a response decreases that behaviour.

    • Lab: Rat receives electric shock for pressing a lever.

    • Life: Corporal punishment, antics on a skateboard.

  • Negative punishment: Removing a stimulus or event contingent upon a response decreases that behaviour.

    • Lab: A lever-press retracts the water spigot from a rat for a fixed period of time.

    • Life: Pocket-money withheld, not allowed to go out.

Conditioned Reinforcers and Punishers
  • Primary reinforcers or punishers seem inherently reinforcing (e.g., food) or punishing (pain).

  • Other stimuli acquire reinforcing or punishing properties by association with primary reinforcers or punishers.

  • Praise is a conditioned reinforcer. "No!" is a conditioned punisher.

  • A token reinforcer (money) can be exchanged for primary reinforcers.

Experimental Analysis of Behaviour
  • Systematic study of the relation between behaviour and its consequences.

  • The Skinner Box (Passer & Smith, p. 253)

Schedules of Reinforcement
  • A schedule of reinforcement is a specific pattern of presenting reinforcers over time (Passer & Smith, p. 256).

  • Continuous reinforcement (CRF): Every instance of a response is reinforced.

    • Useful for learning new behaviours and influencing ongoing patterns of behaviour quickly.

  • Partial (or intermittent) reinforcement: A designated response is reinforced only some of the time.

    • Useful for maintaining behaviours.

Four of the most simple Schedules
  • Ratio: Reinforcement depends on the number of responses (fixed or variable).

  • Interval: Reinforcement depends largely on the passage of time (fixed or variable).

  • Fixed-ratio schedule: The reinforcer is given after a fixed number of non-reinforced responses.

    • Lab: A rat receives food for every tenth response (FR 10).

    • Cumulative record: Post-reinforcer pause, "burst" of responses until the next reinforcer.

  • Variable-ratio schedule: The reinforcer is given after a variable number of non-reinforced responses. The number of non-reinforced responses varies around a predetermined average.

    • Lab: A rat is reinforced for every 10th response on average, but the exact number required varies across trials.

    • Cumulative record: High steady rate of response, occasional pauses.

  • Fixed-interval schedule: The reinforcer is given for the first response after a fixed period of time has elapsed.

    • Lab: Rat is reinforced for the first lever press after 2 minutes has elapsed since the last reinforcer.

    • Cumulative record: Pause after reinforcement, steadily increasing response rate as interval elapses.

  • Variable-interval schedule: The reinforcer is given for the first response after a variable time interval has elapsed. The interval lengths vary around a predetermined average.

    • Lab: Rat is reinforced for the first lever press after 1 minute on average, but the interval length varies from trial to trial.

    • Cumulative record: High steady rate of response, although not quite as high as a comparable VR schedule, occasional pauses.
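The ratio schedules above can be illustrated with a small simulation (a minimal sketch with hypothetical helper names; it only models when a reinforcer is delivered, not response rates):

```python
import random

def fixed_ratio(n):
    """FR n: deliver a reinforcer for every nth response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

def variable_ratio(mean, seed=0):
    """VR `mean`: the required number of responses varies around `mean`."""
    rng = random.Random(seed)
    required = rng.randint(1, 2 * mean - 1)  # averages to `mean`
    count = 0
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False
    return respond

# On FR 10, exactly every 10th response is reinforced:
fr10 = fixed_ratio(10)
outcomes = [fr10() for _ in range(30)]
print(outcomes.count(True))  # 3
```

On the VR version, reinforcement is unpredictable from response to response, which is why VR schedules sustain the high, steady rates noted above.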

Applied Issue

  • Engineering compensation systems: Effects of commissioned versus wage payment (Gaetani et al., 1986, J. Org. Beh. Man).

  • Setting: 2 machinists - high-performance auto-machining shop.

    • Baseline, basic wages ($5-$7 per hour).

    • Feedback phase.

    • Return to baseline.

    • Commissioned compensation with daily feedback and quality control. If worked at baseline rate, would still get equivalent of baseline "wage".

  • However, must avoid competition between workers, especially if there is a limited amount of work (waiters and trappers).

Extinction
  • Reinforcers are no longer delivered contingent upon a response, and the strength of the response decreases (Passer & Smith, p. 258).

Partial-Reinforcement Extinction Effect

  • Partial reinforcement schedules (FR, VR, FI, VI) produce greater resistance to extinction than continuous reinforcement.

  • This resistance to extinction is useful from an applied perspective.

  • Why this partial reinforcement extinction effect?

    • A subject trained with partial reinforcement has learned that reinforcement follows non-reinforcement.

    • Learned to persist in the face of frustration produced by absence of reinforcement.

Applied Issue

  • The side-effects of extinction:

    • Extinction "bursts" of responses.

    • Extinction induced aggression.

    • Changes in response topography (e.g., more forceful responding).

  • Typical problems using extinction:

    • Inability to control all sources of reinforcement for the behaviour.

    • Failure to provide alternative (appropriate) behaviours leading to the same reinforcer.

    • Spontaneous recovery.

Time Out
  • NOT extinction - negative punishment (removing a broad range of +ve reinforcers).

  • Brief, safe procedure, but not without potential problems.

    • Ability to implement.

    • Time out should remove reinforcers.

    • Is time in reinforcing?

    • Potential for abuse, because time out can negatively reinforce the parent or teacher (it removes an aversive stimulus).

Premack's Principle
  • What will be reinforcing?

  • Transituational reinforcement

    • Theory focuses on general causal stimuli.

    • Reinforcers and punishers form unique and independent sets of transituationally effective stimuli.

  • Consummatory responses (assumed to be unreinforceable?).

  • Relatively long-term deprivation is necessary in order to use one of these "gold standard" reinforcers.

A Premack Experiment

Part 1: Non-deprived rats

  • Measure baseline rates of running and drinking (BASE).

  • More running than drinking.

  • Now the running wheel is only active for short periods following some drinking (FR 30).

  • Drinking increases.

Part 2: Deprive the rats of water

  • Baseline drinking now much higher (BASE).

  • Arrange it such that following drinking, the running wheel is activated, and the rat is forced to run (FR 15, FR 5).

  • The rate of drinking decreases.

Problems for the Idea of Transituational Reinforcers

  • In Part 1, "running" reinforces a consummatory response (drinking). This should not occur.

  • In Parts 1 and 2, wheel-running was both a reinforcer and a punisher of drinking. Transituational reinforcement doesn't allow this dual role.

  • Challenges concept of reinforcers as stimuli.

  • Instead, behaviours are characterised as either high probability or low probability. Behaviour is reinforced when it is followed by higher probability behaviours.

  • The probabilities of behaviours can be measured a priori, so it should be possible to predict which behaviours will reinforce other behaviours in a situation.

  • The probabilities of behaviours can vary from situation to situation or even as a function of time.

  • Major influence on behaviour modification.

    • Increases scope of what can be an effective reinforcer.

    • Procedures for identifying reinforcers and punishers are clear, yet relatively unobtrusive.

    • Reinforcers can be tailored for specific situations.

    • Deprivation is a means of changing the probabilities of certain behaviours in a situation.
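Premack's prediction rule (a higher-probability behaviour reinforces a lower-probability one) can be sketched as follows; the function and the baseline numbers are hypothetical illustrations, not data from the studies discussed here:

```python
def predict_reinforcers(baseline):
    """Premack: access to a higher-probability behaviour should reinforce
    a lower-probability behaviour in the same situation.
    `baseline` maps behaviour name -> observed baseline rate."""
    pairs = []
    for low, p_low in baseline.items():
        for high, p_high in baseline.items():
            if p_high > p_low:
                pairs.append((high, low))  # `high` can reinforce `low`
    return pairs

# Hypothetical baseline observation (e.g., minutes per hour spent on each):
baseline = {"running": 20, "drinking": 5}
print(predict_reinforcers(baseline))  # [('running', 'drinking')]
```

Because the baselines are measured per situation (and change with deprivation), the same pair of behaviours can swap roles, exactly as in the running/drinking experiment above.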

Examples from Human Studies

Premack principle to control classroom behaviour of nursery school children (Homme et al., 1963)

  • High probability behaviours were running, screaming, pushing chairs, and doing jigsaws.

  • Low probability behaviours were sitting quietly and attending.

  • Sitting quietly was intermittently followed by the sound of a bell and the instruction "Run and scream". After a while, a signal to stop and engage in another low or high probability behaviour.

Two chronic schizophrenics: Coil stripping as part of their "industrial therapy"

  • Reinforcers such as cigarettes, sweets, or fruit had proved ineffective.

  • The high probability behaviour was sitting down doing nothing obvious. Being able to sit was made contingent on coil stripping.

Stimulus Control in Instrumental Conditioning
  • To be effective in the environment, instrumental responses must occur at appropriate occasions.

  • Antecedent stimuli (cues, signals) control instrumental behaviour.

Stimulus Generalisation and Discrimination
  • Extent that stimulus dimensions control behaviour

  • Effects of reinforcement and discrimination training

  • S+ 1000 Hz

  • S+ 1000 Hz vs. S- No Tone

  • S+ 1000 Hz vs. S- 950 Hz

Generalisation of Punishment
  • VI Baseline rewards all key colours (wavelengths of light) equally.

  • Then add occasional punishment for responses to 550nm only.

Selective Stimulus Control
  • Environments are often complex - when information is redundant, organisms tend to ignore some stimuli even if they are relevant.

Natural Categories or Concepts
  • Can non-human animals form categories or concepts from complex stimuli?

  • Variety of experiments using a variety of concepts:

    • Humans vs. no-humans

    • Pigeons (various breeds) vs. other birds

    • Trees vs. no-trees & Water vs. no-water

    • White oak leaves vs. other leaves.

  • Some arguments that these sorts of natural concepts were "genetically pre-programmed" in a species such as the pigeon.

  • Herrnstein & de Villiers (1980).

  • Can pigeons learn the concept "fish"?

  • Underwater slides with fish and without fish: the pigeons found these easy to discriminate.

  • Other slides the pigeons found difficult to discriminate.

  • "Cognitive" performance of non-human animals?

  • Gives us a general understanding about behaviour, perception, and brain functioning in general.

  • Makes us consider what we think is special about human behaviour.

Biological Constraints on Instrumental Conditioning

The Misbehaviour of Organisms
  • Breland & Breland (1961) - using operant conditioning to train animals for advertising gimmicks.

Example 1
  • Train a raccoon to deposit tokens in a metal tin. Training it to pick up the coins was easy, but training it to deposit the coins (especially 2 coins) was a nightmare.

    • Didn't release coins.

    • "Now the raccoon really had problems. Not only could he not let go of the coins, but he spent seconds, even minutes, rubbing them together (in a most miserly fashion), and dipping them into the container. The rubbing became worse and worse as time went on."

Example 2
  • Training pigs to deposit large wooden $1 coins into a large "piggy" bank. Initially, no problem.

    • "Over a period of weeks the behaviour would become slower and slower. He might run over eagerly for each dollar, but on the way back, instead of carrying the dollar and depositing it simply and cleanly, he would repeatedly drop it, root it, drop it again, root it along the way, pick it up, toss it in the air, drop it, root it some more, and so on….Finally it would take the pig about 10 minutes to transport 4 coins a distance of about 6 feet."

  • After conditioning to a specific response, behaviour "drifted" to examples of instinctive behaviour related to food gathering.

  • Raccoons would rub the coins and dip them, while pigs would root the coins.

  • Importance of instinctive patterns, evolutionary history, and ecological niche for a complete understanding of behaviour.

Species-Specific Defense Reactions
  • Rats learn very easily to press a lever for food.

  • Rats learn very easily to jump out of a box, or over a hurdle, to avoid electric shock.

  • However, it is extremely difficult to train rats to press a lever to avoid shock.

  • Why? Fleeing and freezing are dominant responses for rats in defensive situations.

  • Again, there's a clash between instinct and operant.

Latent Learning
  • Learning from experience when there appears to be no obvious reinforcement or punishment for the specific behaviour.

  • Tolman and Honzik (1930). Rats first experienced a maze without food. Once food was introduced, they learned the maze much more quickly than control groups.

Observational Learning

  • Observational learning occurs when an organism's responding is influenced by the observation of others' behaviour (models).

Anecdotal Evidence
  • There is a variety of anecdotal evidence that social learning and mimicry take place in a variety of species.

    • English titmouse

    • Parrots' mimicry of speech.

    • "Monkey see, monkey do".

    • Children

Evolutionary Rationale
  • As a mechanism for acquiring adaptive behaviours, it compares favourably with other learning mechanisms.

  • Speed of acquisition is quicker than trial and error learning.

  • It is generally advantageous, or at least harmless, to copy others rather than to be innovative.

  • But is there evidence that animals show observational learning rather than “blind imitation” or “local enhancement”?

  • Palameta, B. & Lefebvre, L. (1985). Animal Behavior, 33, 892-896: Demonstrator bird trained to pierce the red-half of a piece of paper to obtain food.

Procedure Experiment 1
  • Each group has 5 subjects. 10 trials of 10 minutes.

    • Group NM (no model). Never observed the demonstrator.

    • Group BI (blind imitation). Observed demonstrator pierce paper, but no seed available.

    • Group LE (local enhancement). Observed demonstrator obtain seed, but paper pierced in advance.

    • Group OL (observational learning). Observed demonstrator pierce paper and eat seed.

  • NM (No Model): Never observed the demonstrator.

  • BI (Blind Imitation): Observed the demonstrator piercing the paper but no seed available.

  • LE (Local Enhancement): Observed the demonstrator obtain seed, but the paper pierced in advance.

  • OL (Observational Learning): Observed the demonstrator piercing the paper and eating the seed.

Procedure Experiment 2
  • Two groups, Deferred Observational Learning (DOL) and Deferred Local Enhancement (DLE), were trained as before but now tested in the model's absence.

Cook & Mineka, S. (1987). Journal of Abnormal Psychology, 98, 448-459
  • Tried to train fear of snakes in monkeys by observation.

Procedure
  • Videotape a monkey displaying fear of a boa constrictor and no-fear of artificial flowers. Edit the tapes.

  • Group 1: Monkeys see a monkey fear artificial flowers but show no fear of a toy snake.

  • Group 2: Monkeys see a monkey fear a toy snake but show no fear of artificial flowers.

  • Then test both groups. Monkeys must reach past a real snake, a toy snake, or artificial flowers to get food.

  • Fear of snakes learned by observation but also biological constraints – no fear of flowers learned.

Humans
  • Bandura et al. (1963, 1965) - aggression and "Bobo".

  • Implications for modelling aggression (e.g., TV).

  • Bandura et al. (1967) - children's fear of dogs.

    • Showing a boy playing fearlessly with a dog helped reduce fear.

Also, Importance of Consequences
  • Model Rewarded: higher number of imitative responses.

  • Model Punished: lower number of imitative responses.

  • No consequences: Intermediate number of imitative responses.

  • Albert Bandura's Four Key Processes Required for Observational Learning

    1. Attention: Extent to which we focus on others' behaviour.

    2. Retention: Retaining a mental representation of others' behaviour.

    3. Production: Ability to actually perform the actions we observe.

    4. Motivation: Need or incentive to perform the actions we witness.


Learning and Observational Learning Flashcards

Learning (Classical)

Organism and Environment

  • Behaviours selected by evolution:
    • Reflexive: Eye-blinking, sucking, and gripping in new-born humans.
    • Instinctual: Imprinting, homing, migratory behaviours.
  • Behaviours selected by experience
    • Learning: A relatively permanent change in behaviour or knowledge as a result of experience.
      • By habituation.
      • By the association of events - Classical Conditioning.
      • By the consequences of events - Instrumental conditioning.
      • By the observation of events - Observational learning.

Habituation

  • Habituation is the decline in the tendency to respond to stimuli that have become familiar due to repeated exposure.
  • The startle response to a new sight or sound decreases quickly with experience.
  • Young turkeys show an alarm response to "hawk" shape but not to "goose" shape.
    • The turkeys have habituated to the more frequent goose shape.

Classical Conditioning

  • Ivan Pavlov (1849-1936) - Russian
    • Personal life: impractical, absent-minded, sentimental
    • Professional side: punctual, perfectionist, tyrant.
    • Trained in medicine, turned to the digestive process (Nobel Prize in 1904 for physiology).

Pavlov's Experiment

  • Investigating the digestive system:
    • Present Food $\rightarrow$ Record Salivation and Gastric secretions
    • Food bowl alone $\rightarrow$ Salivation
    • Experimenter alone $\rightarrow$ Salivation
  • First, TONE alone $\rightarrow$ No change in salivation
  • Second, repeated pairings of TONE + FOOD $\rightarrow$ SALIVATION
  • Finally, TONE alone $\rightarrow$ SALIVATION.

General Description of Classical Conditioning

  1. Present stimuli in isolation:
    • Neutral Stimulus (NS) $\rightarrow$ No response
    • Unconditioned Stimulus (US) $\rightarrow$ Unconditioned Response (UR)
  2. NS immediately precedes US - pair repeatedly:
    • NS + US $\rightarrow$ UR
  3. Present previously neutral stimulus alone:
    • Conditioned Stimulus (CS) $\rightarrow$ Conditioned Response (CR)
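The three-step procedure above can be rendered as a toy simulation. Everything numeric here (the growth rate and CR threshold) is an illustrative assumption; the notes specify only the qualitative procedure:

```python
# Toy simulation of the three-step conditioning procedure above.  The
# association strength, growth rate, and CR threshold are illustrative
# assumptions, not quantities from the notes.

def elicits_cr(pairings, growth=0.2, threshold=0.5):
    """After `pairings` NS+US trials, does the stimulus elicit the CR alone?"""
    strength = 0.0                              # step 1: NS alone -> no response
    for _ in range(pairings):                   # step 2: repeated NS + US pairings
        strength += growth * (1.0 - strength)   # diminishing gains per pairing
    return strength >= threshold                # step 3: test the stimulus alone

print(elicits_cr(0))   # False: a neutral stimulus produces no CR
print(elicits_cr(10))  # True: after repeated pairings the NS has become a CS
```

The diminishing-returns growth reflects the negatively accelerated acquisition curves described later under "Acquisition"; a single pairing is not enough to cross the threshold in this sketch.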

Definition of Classical Conditioning

  • A neutral stimulus (NS) is repeatedly paired with a stimulus (US) that automatically elicits a particular response (UR).
  • The previously neutral stimulus becomes a conditioned stimulus (CS) that also elicits a similar response (CR).
  • Found in many species.

Human Classical Conditioning

  • In the laboratory:
    1. US (puff of air) $\rightarrow$ UR (eye-blink)
    2. NS (soft click) $\rightarrow$ no eye-blink
    3. NS (click) + US (air) $\rightarrow$ UR (eye blink)
    4. CS (click) $\rightarrow$ CR (eye-blink)

Applied Issue: Bed Wetting

  • Before treatment: Bladder feels full $\rightarrow$ Child wets the bed and keeps sleeping.
  • Goal: Bladder feels full (CS) $\rightarrow$ Child wakes up (CR).
  • Treatment involves a device that sounds an alarm (US) when the child wets the bed; the alarm elicits waking (UR) and closely follows the sensation of a full bladder (CS).
  • Later, the feeling of a full bladder (CS) elicits waking up (CR) before the child wets the bed.

Conditioned Emotional Responses

  • Many emotions carry distinct physiological correlates, such as increased heart rate, "hair standing on end", flushes, and muscle tension.
  • Neutral stimuli (sounds, smells) associated with emotional events can elicit emotional responses.
Conditioned Fear
  • Children: "Little Albert" and J.B. Watson & Rosalie Rayner (Passer & Smith, p. 242).
    • Watson challenged therapies and approaches of that time (especially Freudian analysis).
  • Adults: Can be long lasting - Edwards & Acker (1972) found that WWII veterans had changes in GSR to the sounds of battle even 15 years after the war.
  • Everyday life (e.g., dentist's waiting room).

Advertising

  • McBurger + Cute children + Bubbly music $\rightarrow$ "The warm fuzzies"
  • McBurger $\rightarrow$ "The warm fuzzies"
  • Other advertisements create a mood (e.g., skiing, high-paced music).

Fetishes

  • A person has heightened sexual arousal in the presence of certain inanimate objects (e.g., shoes, rubber).
  • The object has become a conditioned stimulus that can elicit arousal on its own.

Other Classical Conditioning Examples

  • Allergic reactions, anticipatory nausea, immune responses (Passer & Smith, p. 243)

Relation between the UR and the CR

  • Pavlov believed that the CS came to elicit the CR by a process of stimulus substitution, i.e., the CS was equivalent to the US.
  • However, while UR and CR are often very similar, they are not necessarily identical.
    • Consider: Tone (CS) $\rightarrow$ Salivation (CR)
    • Salivation is less copious and has fewer digestive enzymes than if food itself is presented.
  • Classical conditioning is not so much directed toward replacing the US with the CS, but a learning mechanism whereby the CS (and the CR) prepares the animal for the onset of the US and the UR.

Compensatory-Reaction Hypothesis

  • Sometimes, the UR and the CR can be opposites.
  • Insulin injections - insulin lowers blood sugar. After a number of such injections, bodily reactions to the various CSs produce a response opposite to the drug's effect (i.e., blood sugar levels go up).
  • The body "prepares" itself for the drug, and "tilts" the other way.
Involved in Drug Tolerance?
  • Opiates (e.g., morphine, heroin) produce pain relief, euphoria, and relaxation.
  • After repeated injections, stimuli surrounding drug injections produce a compensatory reaction - depression, restlessness, increased sensitivity to pain.
  • The same effect requires more of the drug because the system has been "tilted" the other way.
Involved in Drug Overdose?
  • The compensatory reaction requires CSs to elicit the physiological "preparedness" for the drug.

  • What if the drug is administered without the compensatory reaction?

  • The same dose might be lethal because the body is unprepared.

  • Siegel (1989) tested the tolerance of rats for "overdoses" of heroin in novel or usual environments.

    • Results indicated that rats with experience in the usual environment had higher tolerance compared to those in a novel environment.

Major Phenomena of Classical Conditioning

Acquisition

  • The process by which a conditioned stimulus comes to produce a conditioned response, i.e., how a NS becomes a CS. (Passer & Smith, p.238)
  • There are a variety of important factors -
    • Number of NS and US pairings & US Intensity
    • The more intense the US, the stronger the CR, and the quicker the rate of conditioning.
CS-US Temporal Relations
  • The timing of the CS and US can be important.
  • Delayed (Forward) Conditioning
    • The CS begins immediately before the US and overlaps with it.
    • CS (click) $\rightarrow$ US (air puff)
    • Most effective procedure for acquiring CR. Effective interval depends on the type of CR. (Eye-blink = 0.5 s).
  • Trace (Forward) Conditioning
    • The CS starts and finishes before the US.
    • CS (click) ends $\rightarrow$ gap $\rightarrow$ US (air puff)
    • Procedure is less effective than delayed conditioning.
  • Simultaneous Conditioning
    • The CS and the US start and end together.
    • CS (click) and US (air puff) presented together
    • Often fails to produce a CR.
  • Backward Conditioning
    • The CS begins after the US.
    • US (air puff) $\rightarrow$ CS (click)
    • The least effective way to acquire the CR (Can actually produce the opposite effect).

Contingency

  • A simple contiguity between the US and CS is not sufficient for conditioning to occur.
  • The CS must also be a reasonable predictor of the US.
  • The strength of the conditioned response depends on how often the CS accompanies the US and how often the CS accompanies no US.
  • For example:
    • 50 trials: click + puff
    • 10 trials: click alone
    • "click" should be a CS $\rightarrow$ CR (eye-blink)
  • But:
    • 50 trials: click + puff
    • 100 trials: click alone
    • unlikely "click" elicits a CR, because "click" is a poor predictor of "puff".
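The two cases above can be checked numerically. A simple way to express the click's predictive value (a sketch, not a formula from the notes; the 0.5 cut-off for a "reasonable predictor" is an illustrative assumption) is the proportion of click presentations followed by a puff:

```python
# Sketch: how reliably does the click predict the puff in the two examples?
# The 0.5 "reasonable predictor" cut-off is an illustrative assumption.

def p_us_given_cs(paired_trials, cs_alone_trials):
    """Proportion of click presentations that were followed by a puff."""
    return paired_trials / (paired_trials + cs_alone_trials)

case_1 = p_us_given_cs(50, 10)   # 50 click+puff trials, 10 click-alone trials
case_2 = p_us_given_cs(50, 100)  # 50 click+puff trials, 100 click-alone trials

print(round(case_1, 2))  # 0.83 -> click is a good predictor; CR expected
print(round(case_2, 2))  # 0.33 -> click is a poor predictor; CR unlikely
```

The contrast makes the contingency point concrete: the number of pairings is identical in both cases, yet the predictive value of the CS differs sharply.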

Extinction

  • If the CS is repeatedly presented without the US, then the CR will gradually decrease.
  • The rate of decrease depends on factors such as initial response strength.

Spontaneous Recovery

  • A CS $\rightarrow$ CR relation is extinguished. After a period with no CS presentations, the CS may elicit the CR again.
  • Revived CR is less intense. Although it can recur repeatedly, it re-extinguishes relatively quickly.

Behavior Therapy Application

Flooding

  • Fear elicited by a CS (certain phobias) is eliminated by the process of extinction.
  • Some therapists regard flooding as too stressful for the patient.
  • Spontaneous recovery has obvious implications for therapies such as flooding.

Stimulus Generalisation

  • A conditioned response formed to one conditioned stimulus will occur to other, similar stimuli.
  • "Little Albert" also feared a furry white rabbit, fur coat, Santa Claus mask.

Stimulus Discrimination

  • Stimulus discrimination occurs when an organism does not respond to stimuli that are similar to the stimulus used in training.

Generalisation Gradients

  • Continuous stimulus dimensions can produce generalisation gradients. Stimuli closer to the CS produce greater CRs.
Discrimination Training
  • Stimulus A is associated with the US, and Stimulus B is not. If the subject discriminates, the CR occurs only with A.
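A generalisation gradient can be given a toy quantitative form. The bell-shaped fall-off and its width below are illustrative assumptions (the notes only say that stimuli closer to the CS produce greater CRs), using a 1000 Hz training tone as an example CS:

```python
import math

# Toy generalisation gradient: CR strength falls off with distance from the
# training CS.  The bell shape and width are illustrative assumptions.

def cr_strength(stimulus_hz, cs_hz=1000.0, width_hz=100.0):
    """Relative CR strength for a test tone, peaking at the trained CS."""
    return math.exp(-((stimulus_hz - cs_hz) / width_hz) ** 2)

print(round(cr_strength(1000), 2))  # 1.0  -> strongest CR at the CS itself
print(round(cr_strength(950), 2))   # 0.78 -> similar tone, somewhat weaker CR
print(round(cr_strength(1500), 2))  # 0.0  -> distant tone, essentially no CR
```

Discrimination training can be thought of as narrowing this gradient: responding to Stimulus B (e.g., a nearby tone never paired with the US) drops while responding to Stimulus A is maintained.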

Behavior Therapy Application

Systematic Desensitization
  • Combines ideas from extinction, stimulus generalisation, and counter-conditioning.
  • Treatment for phobias and anxiety problems.

Blocking

  • Conditioning does not occur if a good predictor of the US already exists (Kamin, 1969).
  • Experimental Group: Phase 1: noise + shock (noise becomes a CS for fear). Phase 2: light + noise + shock. Test: light alone elicits no fear response - the noise "blocks" conditioning to the light.
  • Control Group: No Phase 1 training. Phase 2: light + noise + shock. Test: light alone elicits a fear response.

Higher-Order Conditioning

  • Once a stimulus has become an effective CS for a certain CR, then that stimulus can be used to condition other stimuli.
    • NS₁ + US $\rightarrow$ NS₁ becomes CS₁
    • Then take a second neutral stimulus (NS₂)
    • NS₂ + CS₁ $\rightarrow$ NS₂ becomes CS₂

Sensory Preconditioning

  • Learning occurs in the absence of a US; classical conditioning then reveals an association already learnt between two events.
    1. Light + Tone (number of trials)
    2. Light + Meat powder $\rightarrow$ Salivation
    3. Light $\rightarrow$ Salivation
    4. Tone $\rightarrow$ Salivation
  • CS-US pairings were not necessary for conditioning.
  • Organisms can make more general associations between stimuli (S-S learning).

Summary of Classical Conditioning

  • Stimulus generalisation, higher-order conditioning, and sensory preconditioning allow learning in one context to extend to a wider range of situations.
  • Stimulus discrimination and blocking limit the extent that learning in one context influences behaviour in other situations.

Biological Constraints

  • So far, US-CS relations have been looked at as largely arbitrary; i.e., any discrete NS can become a CS for a CR.
  • Some behaviour theorists have treated this as a basic principle.

Taste Aversion Learning

  • Group 1:
    • Training: NS = water + saccharin + light + click, US = mild electric shock
    • Test: Water + light + click $\rightarrow$ CR (avoid water), Water + saccharin $\rightarrow$ No effect
  • Group 2:
    • Training: NS = water + saccharin + light + click, US = X-ray exposure (or Lithium Chloride)
    • Test: Water + light + click $\rightarrow$ No effect, Water + saccharin $\rightarrow$ CR (avoid water)
Associations between US & CS
  • Associations between US & CS are more readily formed if they seem to belong together.
  • Lights, clicks, electric shock - "external" pain likely to have an external cause.
  • Saccharin, illness - "internal" pain (nausea) is likely to involve things eaten.
  • Human example: Aversion to certain distinctive alcoholic beverages after a bad experience.
Theoretical Implications
  1. US-CS connections are not arbitrary. Depends on biological constraints or predispositions.
    • A simple temporal contiguity (i.e., delayed conditioning) is not sufficient to produce conditioning.
  2. Conditioned taste aversions can occur after quite long delays between the CS and the US.
    • So, a close temporal contiguity is not always necessary for conditioning.
Applied Issues

The Hungry Coyotes (Gustavson et al., 1976)

  • Problem - Western USA, coyotes kill sheep.
  • Pilot Study - 6 coyotes and 2 wolves. LiCl treated rabbit and sheep meat. Attacks on live rabbits and sheep greatly reduced.
  • Method - 3000-acre ranch. 12 bait stations where tracks indicated heavy coyote activity. LiCl treated sheep carcasses and dog food in sheep hide.
  • Results - Best estimates suggest a 30% to 60% reduction.

Chemotherapy

  • Chemotherapy often produces severe nausea and vomiting in patients.
  • Loss of weight is not conducive to recovery from cancer - the individual is already sick enough.
  • Is some of the loss of appetite (and weight) due to learned taste-aversions?
  • Bernstein (1978) - cancer ward, children 2 to 16 years.
    • Group 1: Ice-cream + chemotherapy
    • Group 2: No ice-cream + chemotherapy
    • Group 3: Ice-cream + No chemotherapy
    • After 2-4 weeks patients given a choice between "Mapletoff" ice cream and playing a game.
    • Results indicated that Group 1 patients showed less preference for the ice cream after chemotherapy compared to the other groups.

Instrumental Conditioning

  • E. Thorndike (1874-1949) - American
  • In about 1900, he conducted research examining whether animals could solve problems or "think".
  • Thorndike designed a variety of "puzzle boxes" from which cats had to learn to escape.
  • Thorndike argued "No", because the learning curves showed no sudden "insightful" drop.

Law of Effect

  • "Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur"
  • In other words, positive consequences increased the likelihood or probability of a response (rather than a "reflexive" relation).
  • Punishment seen as the opposite of reinforcement.
  • "those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with the situation weakened, so that, when it recurs, they will be less likely to recur."
  • Behaviours are "stamped out" if followed by negative consequences.
  • In Thorndike's puzzle box, the behaviours followed by release were steadily strengthened, while behaviours unrelated to release faded with time.
  • The environment selects the "fittest" behaviours, in the same way it selects the "fittest" individuals of a species.

Classical versus Instrumental Conditioning

  • Classical conditioning is a relation between two stimuli (CS and US). The CS elicits the CR.
  • Instrumental conditioning concerns the probability or likelihood of a response changing as a function of its consequences. The subject emits the response in order to produce a reward.

B.F. Skinner (1904-1990) - American

  • One of the most famous psychologists. A central figure in the area of psychology known as Behaviourism.
  • This movement was a reaction against introspectionism towards more objective measurement in psychology.
  • Reared his infant daughter in an "air crib".
  • "Walden Two" - behavioural utopia.
  • "Verbal Behaviour"
  • WWII work - Project Pigeon.
  • Instrumental conditioning is also called Operant conditioning because the response operates on the environment (produces an effect).
  • The operant is the response defined in terms of its environmental effect.

Skinner's Version of the Law of Effect

  • When a response is followed by a reinforcer, the strength of the response increases. When a response is followed by a punisher, the strength of the response decreases.
Acquisition
  • Behaviour shaped by successive approximations.
  • Training a rat to press a lever - start with a broad response criterion and progressively narrow it.
  • The world is a "trainer".
  • Positive and negative consequences of actions are constantly shaping behaviour repertoires.
Reinforcement
  • Increases the likelihood of behaviour.
  • Positive reinforcement: Adding a stimulus or event contingent upon a response increases that behaviour.
    • Lab: Provide food to a food-deprived rat for lever-pressing.
    • Life: Child receives pocket-money for doing "chores", affection of partner for "kind" act.
  • Negative reinforcement: Removing a stimulus or event contingent upon a response increases that behaviour.
    • Lab: A rat presses a lever to terminate (escape) or prevent (avoidance) an electric shock through the floor.
    • Life: Child does homework to avoid detention (or corporal punishment) at school.
Punishment
  • Decreases the likelihood of behaviour.
  • Positive punishment: Adding a stimulus or event contingent upon a response decreases that behaviour.
    • Lab: Rat receives electric shock for pressing a lever.
    • Life: Corporal punishment, antics on a skateboard.
  • Negative punishment: Removing a stimulus or event contingent upon a response decreases that behaviour.
    • Lab: A lever-press retracts the water spigot from a rat for a fixed period of time.
    • Life: Pocket-money withheld, not allowed to go out.
Conditioned Reinforcers and Punishers
  • Primary reinforcers or punishers seem inherently reinforcing (e.g., food) or punishing (pain).
  • Other stimuli acquire reinforcing or punishing properties by association with primary reinforcers or punishers.
  • Praise is a conditioned reinforcer. "No!" is a conditioned punisher.
  • A token reinforcer (money) can be exchanged for primary reinforcers.

Experimental Analysis of Behaviour

  • Systematic study of the relation between behaviour and its consequences.
  • The Skinner Box (Passer & Smith, p. 253)

Schedules of Reinforcement

  • A schedule of reinforcement is a specific pattern of presenting reinforcers over time (Passer & Smith, p. 256).
  • Continuous reinforcement (CRF): Every instance of a response is reinforced.
    • Useful for learning new behaviours and influencing ongoing patterns of behaviour quickly.
  • Partial (or intermittent) reinforcement: A designated response is reinforced only some of the time.
    • Useful for maintaining behaviours.
Four of the most simple Schedules
  • Ratio: Depends on numbers of responses (fixed and variable).
  • Interval: Depends largely on the passage of time (fixed and variable).
  • Fixed-ratio schedule: The reinforcer is given after a fixed number of non-reinforced responses.
    • Lab: A rat receives food for every tenth response (FR 10).
    • Cumulative record: Post-reinforcer pause, "burst" of responses until the next reinforcer.
  • Variable-ratio schedule: The reinforcer is given after a variable number of non-reinforced responses. The number of non-reinforced responses varies around a predetermined average.
    • Lab: A rat is reinforced for every 10th response on average, but the exact number required varies across trials.
    • Cumulative record: High steady rate of response, occasional pauses.
  • Fixed-interval schedule: The reinforcer is given for the first response after a fixed period of time has elapsed.
    • Lab: Rat is reinforced for the first lever press after 2 minutes has elapsed since the last reinforcer.
    • Cumulative record: Pause after reinforcement, steadily increasing response rate as interval elapses.
  • Variable-interval schedule: The reinforcer is given for the first response after a variable time interval has elapsed. The interval lengths vary around a predetermined average.
    • Lab: Rat is reinforced for the first lever press after 1 minute on average, but the interval length varies from trial to trial.
    • Cumulative record: High steady rate of response, although not quite as high as a comparable VR schedule, occasional pauses.
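The delivery rules for the four schedules can be sketched as small predicate functions. This is a minimal sketch: the bookkeeping (response counts, elapsed seconds) is an illustrative assumption, not part of the notes.

```python
import random

# Minimal sketches of the four schedule rules described above.  Each function
# answers "is this response reinforced?".

def fixed_ratio(response_count, ratio=10):
    """FR: reinforce every `ratio`-th response (e.g., FR 10)."""
    return response_count % ratio == 0

def variable_ratio(rng, mean_ratio=10):
    """VR: reinforce each response with probability 1/mean_ratio, so the
    number of responses required varies around the mean."""
    return rng.random() < 1.0 / mean_ratio

def fixed_interval(seconds_elapsed, interval=120):
    """FI: reinforce the first response after `interval` seconds (e.g., FI 2 min)."""
    return seconds_elapsed >= interval

def variable_interval(seconds_elapsed, required_interval):
    """VI: as FI, but `required_interval` is re-drawn around a mean after each
    reinforcer (e.g., rng.uniform(0, 120) for a VI 1-minute schedule)."""
    return seconds_elapsed >= required_interval

print(fixed_ratio(10))      # True: the 10th response is reinforced (FR 10)
print(fixed_ratio(7))       # False: no reinforcer mid-ratio
print(fixed_interval(130))  # True: first response after the 2-minute interval
```

Note how the ratio rules depend only on response counts while the interval rules depend on elapsed time, which is the distinction the notes draw between the two families.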
Applied Issue
  • Engineering compensation systems: Effects of commissioned versus wage payment (Gaetani et al., 1986, Journal of Organizational Behavior Management).
  • Setting: 2 machinists - high-performance auto-machining shop.
    • Baseline, basic wages ($5 to $7 per hour).
    • Feedback phase.
    • Return to baseline.
    • Commissioned compensation with daily feedback and quality control. If worked at baseline rate, would still get equivalent of baseline "wage".
  • However, must avoid competition between workers, especially if there is a limited amount of work (waiters and trappers).
Extinction
  • Reinforcers are no longer delivered contingent upon a response, and the strength of the response decreases (Passer & Smith, p. 258).
Partial-Reinforcement Extinction Effect
  • Partial reinforcement schedules (FR, VR, FI, VI) provide greater resistance to extinction.
  • This resistance to extinction is useful from an applied perspective.
  • Why this partial reinforcement extinction effect?
    • A subject trained with partial reinforcement has learned that reinforcement follows non-reinforcement.
    • Learned to persist in the face of frustration produced by absence of reinforcement.
Applied Issue
  • The side-effects of extinction:
    • Extinction "bursts" of responses.
    • Extinction induced aggression.
    • Increased variability in response topography.
  • Typical problems using extinction:
    • Inability to control all sources of reinforcement for the behaviour.
    • Failure to provide alternative (appropriate) behaviours leading to the same reinforcer.
    • Spontaneous recovery.

Time Out

  • NOT extinction - negative punishment (removing access to a broad range of positive reinforcers).
  • Brief, safe procedure, but not without potential problems.
    • Ability to implement.
    • Time out should remove reinforcers.
    • Is time in reinforcing?
    • Potential for abuse: time out is negatively reinforcing for the parent or teacher because it removes an aversive stimulus (the child's behaviour).
Premack's Principle
  • What will be reinforcing?
  • Transituational reinforcement
    • Theory focuses on general causal stimuli.
    • Reinforcers and punishers form unique and independent sets of transituationally effective stimuli.
  • Consummatory responses (unreinforceable?).
  • Relatively long-term deprivation is necessary in order to use one of these "gold standard" reinforcers (e.g., food or water).
A Premack Experiment
Part 1: Non-deprived rats
  • Measure baseline rates of running and drinking (BASE).
  • More running than drinking.
  • Now the running wheel only active for short periods following some drinking (FR 30).
  • Drinking increases.
Part 2: Deprive the rats of water
  • Baseline drinking now much higher (BASE).
  • Arrange it such that following drinking, the running wheel is activated, and the rat is forced to run (FR 15, FR 5).
  • The rate of drinking decreases.
Problems for the Idea of Transituational Reinforcers
  • In Part 1, "running" reinforces a consummatory response (drinking). This should not occur.

  • In Parts 1 and 2, wheel-running was both a reinforcer and a punisher of drinking. Transituational reinforcement doesn't allow this dual role.

  • Challenges concept of reinforcers as stimuli.

  • Instead, behaviours are characterised as either high probability or low probability. Behaviour is reinforced when it is followed by higher probability behaviours.

  • Can measure the probabilities of behaviours a priori, therefore should be able to predict what behaviour will reinforce other behaviours in a situation.

  • The probabilities of behaviours can vary from situation to situation or even as a function of time.

  • Major influence on behaviour modification.

    • Increases scope of what can be an effective reinforcer.
    • Procedures for identifying reinforcers and punishers are clear, yet relatively unobtrusive.
    • Reinforcers can be tailored for specific situations.
    • Deprivation a means of changing probabilities of certain behaviours in a situation.
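The core prediction, that a behaviour is reinforced when followed by a higher-probability behaviour and punished when followed by a lower-probability one, can be sketched as follows. The baseline probabilities are illustrative numbers loosely modelled on the running/drinking experiment, not data from it:

```python
# Sketch of the Premack principle: the effect of making behaviour A
# contingent on behaviour B depends only on their baseline probabilities.
# The probability values below are illustrative, not Premack's data.

def contingency_effect(p_contingent, p_instrumental):
    """Predicted effect on the instrumental behaviour when access to the
    contingent behaviour is made to follow it."""
    if p_contingent > p_instrumental:
        return "reinforces"
    if p_contingent < p_instrumental:
        return "punishes"
    return "no change"

# Part 1 (non-deprived): running (0.6) made contingent on drinking (0.2)
print(contingency_effect(0.6, 0.2))  # reinforces -> drinking increases

# Part 2 (water-deprived): forced running (0.1) follows drinking (0.7)
print(contingency_effect(0.1, 0.7))  # punishes -> drinking decreases
```

This captures why the same activity (wheel-running) can act as a reinforcer in Part 1 and a punisher in Part 2: only the relative baseline probabilities change.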
Examples from Human Studies
Premack principle to control classroom behaviour of nursery school children (Homme et al., 1963)
  • High probability behaviours were running, screaming, pushing chairs, and doing jigsaws.
  • Low probability behaviours were sitting quietly and attending.
  • Sitting quietly was intermittently followed by the sound of a bell and the instruction "Run and scream". After a while, a signal to stop and engage in another low or high probability behaviour.
Two chronic schizophrenics: Coil stripping as part of their "industrial therapy"
  • Reinforcers such as cigarettes, sweets, or fruit had proved ineffective.
  • The high probability behaviour was sitting down doing nothing obvious. Being able to sit was made contingent on coil stripping.

Stimulus Control in Instrumental Conditioning

  • To be effective in the environment, instrumental responses must occur at appropriate occasions.
  • Antecedent stimuli (cues, signals) come to control instrumental behaviour.

Stimulus Generalisation and Discrimination

  • Extent to which stimulus dimensions control behaviour.

  • Effects of reinforcement and discrimination training, e.g., with tone stimuli:

    • S+ 1000 Hz only

    • S+ 1000 Hz vs. S- No Tone

    • S+ 1000 Hz vs. S- 950 Hz

Generalisation of Punishment
  • VI Baseline rewards all key colours (wavelengths of light) equally.
  • Then add occasional punishment for responses to 550nm only.
Selective Stimulus Control
  • Environments are often complex - there's a tendency to ignore redundant information even if it's relevant.

Natural Categories or Concepts

  • Can non-human animals form categories or concepts from complex stimuli?
  • Variety of experiments using a variety of concepts:
    • Humans vs. no-humans
    • Pigeons (various breeds) vs. other birds
    • Trees vs. no-trees & Water vs. no-water
    • White oak leaves vs. other leaves.
  • Some arguments that these sorts of natural concepts were "genetically pre-programmed" in a species such as the pigeon.
  • Herrnstein & DeVilliers (1980).
  • Can pigeons learn the concept "fish"?
  • Pigeons found underwater slides containing fish easy to discriminate from slides without fish.
  • Other slide discriminations proved difficult for the pigeons.
  • "Cognitive" performance of non-human animals?
  • Gives us a general understanding about behaviour, perception, and brain functioning in general.
  • Makes us consider what we think is special about human behaviour.

Biological Constraints on Instrumental Conditioning

The Misbehaviour of Organisms

  • Breland & Breland (1961) - using operant conditioning to train animals for advertising gimmicks.
Example 1
  • Train raccoons to deposit tokens in a metal tin. Training them to pick up the coins was easy, but training them to deposit the coins (especially 2 coins) was a nightmare.
    • They wouldn't release the coins.
    • "Now the raccoon really had problems. Not only could he not let go of the coins, but he spent seconds, even minutes, rubbing them together (in a most miserly fashion), and dipping them into the container. The rubbing became worse and worse as time went on."
Example 2
  • Training pigs to deposit large wooden coins into a large "piggy" bank. Initially, no problem.
    • "Over a period of weeks the behaviour would become slower and slower. He might run over eagerly for each dollar, but on the way back, instead of carrying the dollar and depositing it simply and cleanly, he would repeatedly drop it, root it, drop it again, root it along the way, pick it up, toss it in the air, drop it, root it some more, and so on….Finally it would take the pig about 10 minutes to transport 4 coins a distance of about 6 feet."
  • After conditioning to a specific response, behaviour "drifted" to examples of instinctive behaviour related to food gathering.
  • Raccoons would rub the coins and dip them, while pigs would root the coins.
  • Importance of instinctive patterns, evolutionary history, and ecological niche for a complete understanding of behaviour.

Species-Specific Defense Reactions

  • Rats learn very easily to press a lever for food.
  • Rats learn very easily to jump out of a box, or over a hurdle, to avoid electric shock.
  • However, it is extremely difficult to train rats to press a lever to avoid shock.
  • Why? Fleeing and freezing are dominant responses for rats in defensive situations.
  • Again, there's a clash between instinct and operant.

Latent Learning

  • Learning from experience when there appears to be no obvious reinforcement or punishment for the specific behaviour.
  • Tolman and Honzik (1930). Rats experienced a maze without food. Once food was introduced, their learning of the maze was much quicker than that of control groups.

Observational Learning

  • Observational learning occurs when an organism's responding is influenced by the observation of others' behaviour (models).

Anecdotal Evidence

  • There is a variety of anecdotal evidence that social learning and mimicry takes place in a variety of species.
    • English titmouse
    • Parrots' mimicry of speech.
    • "Monkey see, monkey do".
    • Children

Evolutionary Rationale

  • As a mechanism for acquiring adaptive behaviours, observational learning compares favourably with other learning mechanisms.
  • Acquisition is quicker than with trial-and-error learning.
  • It is generally advantageous, or at least harmless, to copy others rather than to innovate.
  • But is there evidence that animals show observational learning rather than “blind imitation” or “local enhancement”?
  • Palameta, B., & Lefebvre, L. (1985). Animal Behaviour, 33, 892-896: A demonstrator bird was trained to pierce the red half of a piece of paper to obtain food.
Procedure Experiment 1
  • Each group has 5 subjects. 10 trials of 10 minutes.
    • Group NM (no model). Never observed the demonstrator.
    • Group BI (blind imitation). Observed demonstrator pierce paper, but no seed available.
    • Group LE (local enhancement). Observed demonstrator obtain seed, but paper pierced in advance.
    • Group OL (observational learning). Observed demonstrator pierce paper and eat seed.
Procedure Experiment 2
  • Two groups, Deferred Observational Learning (DOL) and Deferred Local Enhancement (DLE), were trained as before but now tested in the model's absence.

Cook, M., & Mineka, S. (1989). Journal of Abnormal Psychology, 98, 448-459

  • Tried to train fear of snakes in monkeys by observation.
Procedure
  • Videotape a monkey displaying fear of a boa constrictor and no fear of artificial flowers; edit the tapes.
  • Group 1: Monkeys see a monkey fear artificial flowers but show no fear of a toy snake.
  • Group 2: Monkeys see a monkey fear a toy snake but show no fear of artificial flowers.
  • Then test both groups. Monkeys must reach past a real snake, a toy snake, or artificial flowers to get food.
  • Fear of snakes was learned by observation, but biological constraints also applied – no fear of flowers was learned.

Humans

  • Bandura et al. (1963, 1965) - aggression and "Bobo".
  • Implications for modelling aggression (e.g., TV).
  • Bandura et al. (1967) - children's fear of dogs.
    • Showing a boy playing fearlessly with a dog helped reduce fear.
Also, Importance of Consequences
  • Model rewarded: higher number of imitative responses.
  • Model punished: lower number of imitative responses.
  • No consequences: intermediate number of imitative responses.

  • Albert Bandura's Four Key Processes Required for Observational Learning

    1. Attention: Extent to which we focus on others' behaviour
    2. Retention: Retaining a representation of others' behaviour
    3. Production: Ability to actually perform actions we observe
    4. Motivation: Need to perform the actions we witness