Lecture on Reinforcement Schedules
Reinforcement and Punishment Contingencies
- Positive vs. Negative Contingencies:
- Positive contingency: Something is added (e.g., presenting food).
- Negative contingency: Something is removed (e.g., fining for speeding).
- Positive and negative do not mean good or bad; they indicate addition or subtraction.
- Reinforcement vs. Punishment:
- Reinforcement: Increases behavior.
- Punishment: Decreases behavior.
- Examples:
- Positive Punishment: Giving an electric shock reduces behavior by presenting something aversive.
- Contingency of being sent to prison: Aims to punish behavior but might inadvertently reinforce it.
Immediacy and contingencies
- Interventions might fail if consequences don't align with intended behaviors.
- Punishing a dog for returning after running away punishes the act of returning, not running away.
- Contingencies primarily affect the behavior immediately preceding them.
Barnabas Film Analysis
- Loss of support reduces behavior, so it is a punishment contingency.
- Specifically, negative punishment: an appetitive stimulus (support) is withdrawn.
- Tone as Discriminative Stimulus:
- Tone signaled lever availability -> associated with food -> secondary positive reinforcement.
- When the tone stopped, Barnabas pressed the lever harder, showing an extinction burst.
Primary and Secondary Reinforcers
- Primary Reinforcers:
- Have innate biological significance (e.g., food, warmth, escape from pain).
- Secondary Reinforcers:
- Previously neutral stimuli paired with primary reinforcers (e.g., "good dog" paired with pats and food).
- Primary Punishers:
- Have a punishing effect due to innate biological significance (e.g., pain, cold).
- Secondary Punishers:
- Previously neutral stimuli that become punishing through association with primary punishers (e.g., shouting paired with physical punishment).
- "bad dog" when paired with punishment become secondary punisher.
Secondary Reinforcement and Behavior Maintenance
- Secondary reinforcement helps maintain behavior even when primary reinforcers are infrequent.
- Many behaviors are maintained by secondary reinforcers.
Intermittent Reinforcement
- Not every response needs to be reinforced to maintain behavior.
- Continuous Reinforcement: Every response is reinforced.
- Intermittent Reinforcement: Only some responses are reinforced, according to a schedule.
- Schedule of Reinforcement: A rule specifying which responses will be reinforced (e.g., fixed ratio schedule).
- Skinner's accidental discovery led to the study of intermittent reinforcement.
- Skinner found that rats continued responding even when not reinforced every time.
- Infrequent rewards (e.g., a weekly paycheck) maintain most people's daily work.
Partial Reinforcement Extinction Effect
- Behavior maintained by intermittent reinforcement persists longer during extinction than continuously reinforced behavior.
- Partial reinforcement produces more resilient behavior.
- Example: Rats in a runway
- Two groups: one reinforced every time (100%), one reinforced 30% of the time.
- The 30% group persisted longer during extinction.
- When reinforcement was removed, the 100% group's behavior extinguished quickly.
Explaining the Partial Reinforcement Extinction Effect
- Transition to extinction is harder to detect with intermittent reinforcement.
- 100% group easily detects the change when reinforcement stops because reinforcement always happened.
- For the 30% group, the transition is harder to detect because they are used to not getting the reward 70% of the time.
- Bedtime Tantrums:
- Parents attending to a child's tantrums reinforces the behavior.
- Extinction (ignoring tantrums) can initially increase the behavior (extinction burst).
- Inconsistent extinction (giving in sometimes) leads to intermittent reinforcement, making the behavior more resistant to extinction.
- To break the behavior, first return to continuous reinforcement (so the later transition to extinction is easy to detect), then implement extinction and stick with it.
Schedules of Reinforcement
- Ratio Schedules: Based on the number of responses emitted.
- Fixed Ratio (FR): Reinforce every nth response (e.g., FR 5 reinforces every 5th response).
- Variable Ratio (VR): Reinforce every nth response on average, with the required count varying around n.
- Interval Schedules: Based on the passage of time.
- Fixed Interval (FI): Reinforce the first response after a fixed time.
- Variable Interval (VI): Reinforce the first response after a variable time.
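The four schedules above boil down to simple decision rules about when a response earns a reinforcer. The sketch below is illustrative only; the function names and parameters are my own, not terminology from the lecture.

```python
import random

def fixed_ratio(n, responses_since_reinforcer):
    # FR n: reinforce every nth response.
    return responses_since_reinforcer >= n

def variable_ratio(mean_n):
    # VR n: each response has a 1/n chance of reinforcement,
    # so reinforcement arrives every n responses on average.
    return random.random() < 1.0 / mean_n

def fixed_interval(interval, elapsed_since_reinforcer):
    # FI t: reinforce the first response made after t seconds
    # have passed since the last reinforcer.
    return elapsed_since_reinforcer >= interval

def variable_interval(sampled_interval, elapsed_since_reinforcer):
    # VI t: like FI, but the required interval is re-sampled
    # (with mean t) after each reinforcer.
    return elapsed_since_reinforcer >= sampled_interval
```

Note that the ratio rules depend only on counted responses, while the interval rules depend only on elapsed time since the last reinforcer; this is the distinction the lecture draws between the two families.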
Cumulative Records
- Graphs showing the number of responses over time.
- Rate of responding is shown by the slope of the graph.
- Flat slope represents slow responding.
- Steep slope represents fast responding.
- Tick marks: indicate when reinforcers were delivered.
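A cumulative record can be reconstructed from a list of response times; the slope of the resulting curve is the response rate. This is a minimal sketch with invented names, not an apparatus described in the lecture:

```python
def cumulative_record(response_times, t_max, step=1.0):
    # At each sampled time point, count the responses emitted so far.
    # A steep slope means fast responding; a flat slope, slow responding.
    times, counts = [], []
    t = 0.0
    while t <= t_max:
        times.append(t)
        counts.append(sum(1 for r in response_times if r <= t))
        t += step
    return times, counts
```

Plotting `counts` against `times`, with tick marks at reinforcer deliveries, reproduces the kind of record described above.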
Fixed Interval Schedules
- Reinforce the first response after a fixed amount of time has passed since the last reinforcement.
- Example: FI 15-second schedule.
- Produce pauses after reinforcement, followed by increasing response rate as the interval ends.
- Two possible reasons for pauses after reinforcement:
- Taking time to consume the reinforcement.
- Learning that no reinforcer will be available for the next 15 seconds.
- A variable interval schedule allows a test between these two explanations.
Variable Interval Schedules
- Responses reinforced when a variable amount of time has elapsed since the last reinforcer.
- Example: VI 30-second schedule.
- If the pauses on fixed interval schedules reflect predicting when the next reinforcer is due, animals should not pause on a VI schedule, where the timing is unpredictable.
- If the pauses disappear, this supports the second explanation (temporal prediction) rather than time spent consuming the reinforcer.
- Produce constant rate of responding.
- Reinforcers on a variable interval schedule are randomly spaced along the time axis of the cumulative record.
Differential Reinforcement of Other Behavior (DRO)
- Reinforcer delivered when a fixed amount of time has elapsed since the last response.
- DRO schedule explicitly reinforces non-responding.
- Omission training
- Example: DRO 5-second schedule.
- Each response postpones the reinforcer, so DRO combines negative punishment of responding with positive reinforcement of not responding.
- Example: Using DRO to stop a cat from meowing by waiting for quiet before putting the food down, rather than feeding it immediately, and gradually lengthening the required quiet period.
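The DRO rule, where every response resets the clock, can be sketched as a small simulation. Names and defaults below are hypothetical choices of mine:

```python
def simulate_dro(response_times, dro_interval, session_end, dt=1.0):
    # Deliver a reinforcer whenever dro_interval has elapsed since the
    # last response OR the last reinforcer. Any response resets the
    # clock, which is the negative-punishment side of DRO.
    reinforcers = []
    last_event = 0.0
    responses = sorted(response_times)
    i = 0
    t = 0.0
    while t <= session_end:
        while i < len(responses) and responses[i] <= t:
            last_event = responses[i]  # responding postpones reinforcement
            i += 1
        if t - last_event >= dro_interval:
            reinforcers.append(t)
            last_event = t  # reinforcement also restarts the clock
        t += dt
    return reinforcers
```

With a single response at t = 1 on a DRO 5-second schedule running 20 seconds, this sketch delivers reinforcers at t = 6, 11, and 16: not responding is what gets reinforced.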
Fixed Time Schedules
- Reinforcer is delivered regardless of a response when a fixed amount of time has elapsed since the last reinforcer.
- No requirement for a response.
- Non contingent reinforcement.
- Also called response-independent reinforcement.
- Skinner arranged fixed time 15 second schedules.
- Each pigeon repeated its own idiosyncratic behavior, presumably whatever it happened to be doing when the first reinforcers were delivered.
- The pigeons behave superstitiously, even though their actions do not cause the reinforcer.
The Law of Effect and Superstition
- Whatever the animal was doing immediately before the reinforcer should be reinforced, and therefore is likely to happen more often in the future.
- It doesn't say that that behavior has to cause the reinforcer
- There is no difference between behavior that causes a reinforcer to happen and behavior that just happens to be followed by a reinforcer by chance.
- Reinforcers like this, delivered by chance, produce what is called adventitious reinforcement.
- Accidental reinforcement
- This provides an experimental way to study the mechanism behind superstitions.
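The law-of-effect account of superstition can be made concrete with a toy simulation: food arrives on a fixed-time schedule regardless of behavior, and whatever behavior happened to precede it is strengthened. All names and parameters here are my own illustrative choices, not a model from the lecture:

```python
import random

def superstition_sim(n_behaviors=5, ft_interval=15, steps=600,
                     boost=0.5, seed=0):
    # Each second the animal emits one behavior, sampled by weight.
    # Every ft_interval seconds food arrives no matter what, and the
    # law of effect boosts whatever behavior was just emitted --
    # adventitious reinforcement.
    rng = random.Random(seed)
    weights = [1.0] * n_behaviors
    for t in range(1, steps + 1):
        current = rng.choices(range(n_behaviors), weights=weights)[0]
        if t % ft_interval == 0:
            weights[current] += boost
    return weights
```

Because boosted behaviors become more likely to be emitted at the next delivery, one arbitrary behavior tends to accumulate weight and dominate, mirroring each pigeon's idiosyncratic ritual.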
Superstition in Humans
- Superstition experiments with humans give similar results (Wagner and Morris, 1987).
- Robot mechanical clown, giving sweets.
- The children repeated idiosyncratic behaviors, such as making faces at the clown or giving it a kiss.
- Human superstitions are often about preventing a bad thing from happening.
- Third-light superstition:
- Avoid being the third person to light a cigarette from the same match, to save yourself from getting shot by a sniper.
- Such superstitions can be maintained by adventitious reinforcement.