Lecture on Reinforcement Schedules

Reinforcement and Punishment Contingencies

  • Positive vs. Negative Contingencies:
    • Positive contingency: Something is added (e.g., presenting food).
    • Negative contingency: Something is removed (e.g., fining for speeding).
    • Positive and negative do not mean good or bad; they indicate addition or subtraction.
  • Reinforcement vs. Punishment:
    • Reinforcement: Increases behavior.
    • Punishment: Decreases behavior.
  • Examples:
    • Positive Punishment: Giving an electric shock reduces behavior by presenting something aversive.
    • Contingency of being sent to prison: Aims to punish behavior but might inadvertently reinforce it.

Immediacy and contingencies

  • Interventions might fail if consequences don't align with intended behaviors.
    • Punishing a dog for returning after running away punishes the act of returning, not running away.
  • Contingencies primarily affect the behavior immediately preceding them.

Barnabas Film Analysis

  • Loss of support reduces behavior, so it is a punishment contingency.
  • Specifically, a negative punishment contingency, since an appetitive stimulus (support) is withdrawn.
  • Tone as Discriminative Stimulus:
    • Tone signaled lever availability -> associated with food -> secondary positive reinforcement.
    • When the tone stopped, Barnabas pressed the lever harder, showing an extinction burst.

Primary and Secondary Reinforcers

  • Primary Reinforcers:
    • Have innate biological significance (e.g., food, warmth, escape from pain).
  • Secondary Reinforcers:
    • Previously neutral stimuli paired with primary reinforcers (e.g., "good dog" paired with pats and food).
  • Primary Punishers:
    • Have a punishing effect due to innate biological significance (e.g., pain, cold).
  • Secondary Punishers:
    • Previously neutral stimuli that become punishing through association with primary punishers (e.g., shouting paired with physical punishment).
    • "Bad dog," when paired with punishment, becomes a secondary punisher.

Secondary Reinforcement and Behavior Maintenance

  • Secondary reinforcement helps maintain behavior even when primary reinforcers are infrequent.
    • Many behaviors are maintained by secondary reinforcers.

Intermittent Reinforcement

  • Not every response needs to be reinforced to maintain behavior.
  • Continuous Reinforcement: Every response is reinforced.
  • Intermittent Reinforcement: Only some responses are reinforced, according to a schedule.
  • Schedule of Reinforcement: A rule specifying which responses will be reinforced (e.g., fixed ratio schedule).
  • Skinner's accidental discovery led to the study of intermittent reinforcement.
    • Skinner found that rats continued responding even when not reinforced every time.
    • Intermittent rewards (e.g., a weekly paycheck) maintain most people's daily work.

Partial Reinforcement Extinction Effect

  • Behavior maintained by intermittent reinforcement persists longer during extinction than continuously reinforced behavior.
  • Partial reinforcement produces more resilient behavior.
  • Example: Rats in a runway
    • Two groups: one reinforced every time (100%), one reinforced 30% of the time.
    • The 30% group persisted longer during extinction.
      • When reinforcement was removed, the 100% group's behavior extinguished quickly.

Explaining the Partial Reinforcement Extinction Effect

  • Transition to extinction is harder to detect with intermittent reinforcement.
    • 100% group easily detects the change when reinforcement stops because reinforcement always happened.
    • The 30% group already goes unrewarded 70% of the time, so the transition to extinction is harder to detect.
  • Bedtime Tantrums:
    • Parents attending to a child's tantrums reinforces the behavior.
    • Extinction (ignoring tantrums) can initially increase the behavior (extinction burst).
    • Inconsistent extinction (giving in sometimes) leads to intermittent reinforcement, making the behavior more resistant to extinction.
    • To break the behavior, first return to continuous reinforcement (so the transition to extinction will be easy to detect), then switch to extinction and stick with it.

Schedules of Reinforcement

  • Ratio Schedules: Based on the number of responses emitted.
    • Fixed Ratio (FR): Reinforce every nth response.
    • Variable Ratio (VR): Reinforce every nth response on average.
  • Interval Schedules: Based on the passage of time.
    • Fixed Interval (FI): Reinforce the first response after a fixed time.
    • Variable Interval (VI): Reinforce the first response after a variable time.
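The four schedule rules above can be sketched as simple decision functions. This is a minimal illustrative sketch, not part of the lecture: the class names are my own, and modeling VI intervals with an exponential distribution is an assumption.

```python
import random

class FixedRatio:
    """FR n: reinforce every nth response."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def reinforce(self, t):
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """VR n: reinforce every nth response on average (each response has a 1/n chance)."""
    def __init__(self, n):
        self.n = n
    def reinforce(self, t):
        return random.random() < 1 / self.n

class FixedInterval:
    """FI t: reinforce the first response made t seconds after the last reinforcer."""
    def __init__(self, interval):
        self.interval, self.last = interval, 0.0
    def reinforce(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False

class VariableInterval:
    """VI t: like FI, but the required interval varies (here: exponentially) around a mean of t."""
    def __init__(self, mean):
        self.mean, self.last = mean, 0.0
        self.current = random.expovariate(1 / mean)
    def reinforce(self, t):
        if t - self.last >= self.current:
            self.last = t
            self.current = random.expovariate(1 / self.mean)
            return True
        return False
```

Note that the ratio schedules ignore the clock entirely, while the interval schedules ignore how many responses occurred: responding faster pays off on FR/VR but not on FI/VI.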

Cumulative Records

  • Graphs showing the number of responses over time.
  • Rate of responding is shown by the slope of the graph.
    • Flat slope represents slow responding.
    • Steep slope represents fast responding.
  • Tick marks: indicate when reinforcers were delivered.
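A cumulative record is just a running count of responses plotted against time, so it can be sketched in a few lines. The function names here are illustrative, not from the lecture:

```python
def cumulative_record(response_times):
    """Return (time, cumulative count) pairs: each response raises the count by one.
    Plotting these points gives a cumulative record; a steep slope means fast responding."""
    return [(t, i + 1) for i, t in enumerate(sorted(response_times))]

def response_rate(record, t0, t1):
    """Average responses per second between t0 and t1 -- the slope of the record there."""
    n = sum(1 for t, _ in record if t0 <= t <= t1)
    return n / (t1 - t0)
```

A flat stretch of the record (rate near zero) is a pause; a steep stretch is a burst of responding.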

Fixed Interval Schedules

  • Reinforce the first response after a fixed amount of time has passed since the last reinforcement.
  • Example: FI 15-second schedule.
  • Produce a pause after each reinforcement, followed by an increasing response rate as the interval ends.
  • Two possible reasons for the pause after reinforcement:
    • Taking time to consume the reinforcer.
    • Knowing no reinforcer will be available for another 15 seconds.
  • A variable interval schedule makes it possible to test between these two explanations.

Variable Interval Schedules

  • Responses reinforced when a variable amount of time has elapsed since the last reinforcer.
  • Example: VI 30-second schedule.
  • If the pauses on fixed interval schedules come from predicting when the next reinforcer is due, animals should not pause on a variable interval schedule, where the next reinforcer is unpredictable.
  • If the pauses disappear, that supports the second explanation, since consuming the reinforcer takes just as long on either schedule.
  • Produce a constant, steady rate of responding.
  • Reinforcers on a variable interval schedule are randomly spaced along the x-axis.

Differential Reinforcement of Other Behavior (DRO)

  • Reinforcer delivered when a fixed amount of time has elapsed since the last response.
  • DRO schedule explicitly reinforces non-responding.
  • Also called omission training.
  • Example: a DRO 5-second schedule.
  • Responding forfeits the scheduled reinforcer (a negative punishment contingency), while refraining from responding is positively reinforced.
  • Example: Using DRO to stop a cat from meowing: put the food down only after the cat has been quiet for a while, rather than feeding it while it meows.
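The DRO rule can be sketched as a timer that any response resets. This is an illustrative sketch with invented names, assuming a simple clock-based model:

```python
class DRO:
    """DRO t: deliver the reinforcer once t seconds pass with no response.
    Any response resets the timer, forfeiting the upcoming reinforcer."""
    def __init__(self, interval):
        self.interval = interval
        self.last_response = 0.0
    def respond(self, t):
        self.last_response = t  # responding resets the clock (negative punishment)
    def reinforcer_due(self, t):
        if t - self.last_response >= self.interval:
            self.last_response = t  # start timing the next interval
            return True
        return False
```

The key contrast with the other schedules: here the reinforcer is contingent on the *absence* of the target response, not on its occurrence.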

Fixed Time Schedules

  • Reinforcer is delivered regardless of a response when a fixed amount of time has elapsed since the last reinforcer.
  • No requirement for a response.
  • Noncontingent reinforcement.
  • Also called response-independent reinforcement.
  • Skinner arranged fixed time 15-second schedules for pigeons.
  • Each pigeon repeated the same behavior over and over, because it must have been doing something when the first reinforcers were delivered.
  • The pigeons are being superstitious even though their actions do not cause the reinforcer.

The Law of Effect and Superstition

  • Whatever the animal was doing immediately before the reinforcer is reinforced, and is therefore likely to happen again, and more often, in the future.
  • The law of effect does not say that the behavior has to cause the reinforcer.
  • It draws no distinction between behavior that causes a reinforcer and behavior that merely happens to be followed by one by chance.
  • Reinforcement that happens by chance like this is called adventitious reinforcement.
  • Also known as accidental reinforcement.
  • This offers a way to study the mechanism behind superstitions experimentally.
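The mechanism can be illustrated with a toy simulation: reinforcers arrive on a fixed time schedule regardless of behavior, and the law of effect strengthens whatever behavior happened to precede each one. The weight-update model below is my own simplification, not a model from the lecture:

```python
import random

def superstition_sim(behaviors, steps=300, ft_interval=15, seed=0):
    """Toy model of adventitious reinforcement on a fixed-time schedule.
    Each second the agent emits one behavior, chosen with probability
    proportional to its weight. Every ft_interval seconds a reinforcer
    arrives regardless of behavior, and -- per the law of effect --
    the behavior emitted just before it is strengthened."""
    rng = random.Random(seed)
    weights = {b: 1.0 for b in behaviors}
    for t in range(1, steps + 1):
        # pick a behavior with probability proportional to its weight
        total = sum(weights.values())
        r, acc = rng.random() * total, 0.0
        for b, w in weights.items():
            acc += w
            if r <= acc:
                emitted = b
                break
        if t % ft_interval == 0:      # reinforcer delivered regardless of response
            weights[emitted] += 1.0   # whatever preceded it gets strengthened
    return weights
```

Early chance pairings bias which behavior is emitted later, making further pairings with that same behavior more likely: a feedback loop that mirrors the pigeons' stereotyped rituals.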

Superstition in Humans

  • Superstition experiments with humans give similar results (Wagner and Morris, 1987).
    • A mechanical robot clown dispensed sweets on a fixed time schedule.
    • The children developed idiosyncratic rituals, such as making faces at the clown or giving it a kiss.
  • Many human superstitions are about preventing a bad outcome.
  • Third light superstition:
    • Avoid being the third person to light a cigarette from the same match, to save yourself from being shot by a sniper.
  • Superstitions like these are maintained by adventitious reinforcement.