Lecture on Reinforcement Schedules
Reinforcement and Punishment Contingencies
- Positive vs. Negative Contingencies:
- Positive contingency: Something is added (e.g., presenting food).
- Negative contingency: Something is removed (e.g., fining for speeding).
- Positive and negative do not mean good or bad; they indicate addition or subtraction.
- Reinforcement vs. Punishment:
- Reinforcement: Increases behavior.
- Punishment: Decreases behavior.
- Examples:
- Positive Punishment: Giving an electric shock reduces behavior by presenting something aversive.
- Contingency of being sent to prison: Aims to punish behavior but might inadvertently reinforce it.
Immediacy and contingencies
- Interventions might fail if consequences don't align with intended behaviors.
- Punishing a dog for returning after running away punishes the act of returning, not running away.
- Contingencies primarily affect the behavior immediately preceding them.
Barnabas Film Analysis
- Loss of support reduces behavior, so it is a punishment contingency.
- Specifically, negative punishment: an appetitive stimulus (support) is withdrawn.
- Tone as Discriminative Stimulus:
- Tone signaled lever availability -> associated with food -> secondary positive reinforcement.
- When the tone stopped, Barnabas pressed the lever harder, showing an extinction burst.
Primary and Secondary Reinforcers
- Primary Reinforcers:
- Have innate biological significance (e.g., food, warmth, escape from pain).
- Secondary Reinforcers:
- Previously neutral stimuli paired with primary reinforcers (e.g., "good dog" paired with pats and food).
- Primary Punishers:
- Have a punishing effect due to innate biological significance (e.g., pain, cold).
- Secondary Punishers:
- Previously neutral stimuli that become punishing through association with primary punishers (e.g., shouting paired with physical punishment).
- "bad dog" when paired with punishment become secondary punisher.
Secondary Reinforcement and Behavior Maintenance
- Secondary reinforcement helps maintain behavior even when primary reinforcers are infrequent.
- Many behaviors are maintained by secondary reinforcers.
Intermittent Reinforcement
- Not every response needs to be reinforced to maintain behavior.
- Continuous Reinforcement: Every response is reinforced.
- Intermittent Reinforcement: Only some responses are reinforced, according to a schedule.
- Schedule of Reinforcement: A rule specifying which responses will be reinforced (e.g., fixed ratio schedule).
- Skinner's accidental discovery led to the study of intermittent reinforcement.
- Skinner found that rats continued responding even when not reinforced every time.
- Infrequent rewards (e.g., a weekly paycheck) maintain most people's daily work.
Partial Reinforcement Extinction Effect
- Behavior maintained by intermittent reinforcement persists longer during extinction than continuously reinforced behavior.
- Partial reinforcement produces more resilient behavior.
- Example: Rats in a runway
- Two groups: one reinforced every time (100%), one reinforced 30% of the time.
- The 30% group persisted longer during extinction.
- When reinforcement was removed, the 100% group's behavior extinguished quickly.
Explaining the Partial Reinforcement Extinction Effect
- Transition to extinction is harder to detect with intermittent reinforcement.
- 100% group easily detects the change when reinforcement stops because reinforcement always happened.
- For the 30% group, the transition is harder to detect because they are used to not getting the reward 70% of the time.
- Bedtime Tantrums:
- Parents attending to a child's tantrums reinforces the behavior.
- Extinction (ignoring tantrums) can initially increase the behavior (extinction burst).
- Inconsistent extinction (giving in sometimes) leads to intermittent reinforcement, making the behavior more resistant to extinction.
- To break the behavior, first return to continuous reinforcement (so the later transition to extinction is easy to detect), then implement extinction and stick with it.
Schedules of Reinforcement
- Ratio Schedules: Based on the number of responses emitted.
- Fixed Ratio (FR): Reinforce every nth response (e.g., FR 5 reinforces every 5th response).
- Variable Ratio (VR): Reinforce every nth response on average, with the required count varying around n.
- Interval Schedules: Based on the passage of time.
- Fixed Interval (FI): Reinforce the first response after a fixed time.
- Variable Interval (VI): Reinforce the first response after a variable time.
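The four schedules above boil down to simple decision rules about when a response earns a reinforcer. The sketch below is illustrative only; the function names and parameters are my own, not terminology from the lecture.

```python
import random

def fixed_ratio(n, responses_since_reinforcer):
    # FR n: reinforce every nth response.
    return responses_since_reinforcer >= n

def variable_ratio(mean_n):
    # VR n: each response has a 1/n chance of reinforcement,
    # so reinforcement arrives every n responses on average.
    return random.random() < 1.0 / mean_n

def fixed_interval(interval, elapsed_since_reinforcer):
    # FI t: reinforce the first response made after t seconds
    # have passed since the last reinforcer.
    return elapsed_since_reinforcer >= interval

def variable_interval(sampled_interval, elapsed_since_reinforcer):
    # VI t: like FI, but the required interval is re-sampled
    # (with mean t) after each reinforcer.
    return elapsed_since_reinforcer >= sampled_interval
```

Note that the ratio rules depend only on counted responses, while the interval rules depend only on elapsed time since the last reinforcer; this is the distinction the lecture draws between the two families.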
Cumulative Records
- Graphs showing the number of responses over time.
- Rate of responding is shown by the slope of the graph.
- Flat slope represents slow responding.
- Steep slope represents fast responding.
- Tick marks: indicate when reinforcers were delivered.
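A cumulative record can be reconstructed from a list of response times; the slope of the resulting curve is the response rate. This is a minimal sketch with invented names, not an apparatus described in the lecture:

```python
def cumulative_record(response_times, t_max, step=1.0):
    # At each sampled time point, count the responses emitted so far.
    # A steep slope means fast responding; a flat slope, slow responding.
    times, counts = [], []
    t = 0.0
    while t <= t_max:
        times.append(t)
        counts.append(sum(1 for r in response_times if r <= t))
        t += step
    return times, counts
```

Plotting `counts` against `times`, with tick marks at reinforcer deliveries, reproduces the kind of record described above.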
Fixed Interval Schedules
- Reinforce the first response after a fixed amount of time has passed since the last reinforcement.
- Example: FI 15-second schedule.
- Produce pauses after reinforcement, followed by increasing response rate as the interval ends.
- Two possible reasons for pauses after reinforcement:
- Taking time to consume the reinforcement.
- Learning that no reinforcer will be available for the next 15 seconds.
- A variable interval schedule allows a test between these two explanations.
Variable Interval Schedules
- Responses reinforced when a variable amount of time has elapsed since the last reinforcer.
- Example: VI 30-second schedule.
- If the pauses on fixed interval schedules reflect predicting when the next reinforcer is due, animals should not pause on a VI schedule, where the timing is unpredictable.
- If the pauses disappear, this supports the second explanation (temporal prediction) rather than time spent consuming the reinforcer.
- Produce constant rate of responding.
- Reinforcers on a variable interval schedule are randomly spaced along the time axis of the cumulative record.
Differential Reinforcement of Other Behavior (DRO)
- Reinforcer delivered when a fixed amount of time has elapsed since the last response.
- DRO schedule explicitly reinforces non-responding.
- Omission training
- Example: DRO 5-second schedule.
- Each response postpones the reinforcer, so DRO combines negative punishment of responding with positive reinforcement of not responding.
- Example: Using DRO to stop a cat from meowing by waiting for quiet before putting the food down, rather than feeding it immediately, and gradually lengthening the required quiet period.
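The DRO rule, where every response resets the clock, can be sketched as a small simulation. Names and defaults below are hypothetical choices of mine:

```python
def simulate_dro(response_times, dro_interval, session_end, dt=1.0):
    # Deliver a reinforcer whenever dro_interval has elapsed since the
    # last response OR the last reinforcer. Any response resets the
    # clock, which is the negative-punishment side of DRO.
    reinforcers = []
    last_event = 0.0
    responses = sorted(response_times)
    i = 0
    t = 0.0
    while t <= session_end:
        while i < len(responses) and responses[i] <= t:
            last_event = responses[i]  # responding postpones reinforcement
            i += 1
        if t - last_event >= dro_interval:
            reinforcers.append(t)
            last_event = t  # reinforcement also restarts the clock
        t += dt
    return reinforcers
```

With a single response at t = 1 on a DRO 5-second schedule running 20 seconds, this sketch delivers reinforcers at t = 6, 11, and 16: not responding is what gets reinforced.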
Fixed Time Schedules
- Reinforcer is delivered regardless of a response when a fixed amount of time has elapsed since the last reinforcer.
- No requirement for a response.
- Non contingent reinforcement.
- Also called response-independent reinforcement.
- Skinner arranged fixed time 15 second schedules.
- Each pigeon repeated its own idiosyncratic behavior, presumably whatever it happened to be doing when the first reinforcers were delivered.
- The pigeons behave superstitiously, even though their actions do not cause the reinforcer.
The Law of Effect and Superstition
- Whatever the animal was doing immediately before the reinforcer should be reinforced, and therefore is likely to happen more often in the future.
- It doesn't say that that behavior has to cause the reinforcer
- There is no difference between behavior that causes a reinforcer to happen and behavior that just happens to be followed by a reinforcer by chance.
- Reinforcers like this, delivered by chance, produce what is called adventitious reinforcement.
- Accidental reinforcement
- This provides an experimental way to study the mechanism behind superstitions.
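The law-of-effect account of superstition can be made concrete with a toy simulation: food arrives on a fixed-time schedule regardless of behavior, and whatever behavior happened to precede it is strengthened. All names and parameters here are my own illustrative choices, not a model from the lecture:

```python
import random

def superstition_sim(n_behaviors=5, ft_interval=15, steps=600,
                     boost=0.5, seed=0):
    # Each second the animal emits one behavior, sampled by weight.
    # Every ft_interval seconds food arrives no matter what, and the
    # law of effect boosts whatever behavior was just emitted --
    # adventitious reinforcement.
    rng = random.Random(seed)
    weights = [1.0] * n_behaviors
    for t in range(1, steps + 1):
        current = rng.choices(range(n_behaviors), weights=weights)[0]
        if t % ft_interval == 0:
            weights[current] += boost
    return weights
```

Because boosted behaviors become more likely to be emitted at the next delivery, one arbitrary behavior tends to accumulate weight and dominate, mirroring each pigeon's idiosyncratic ritual.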
Superstition in Humans
- Superstition experiments with humans give similar results (Wagner and Morris, 1987).
- Robot mechanical clown, giving sweets.
- The children repeated idiosyncratic behaviors, such as making faces at the clown or giving it a kiss.
- Human superstitions are often about preventing a bad thing from happening.
- Third-light superstition:
- Avoid being the third person to light a cigarette from the same match, to save yourself from getting shot by a sniper.
- Such superstitions can be maintained by adventitious reinforcement.