Schedules and Theories of Reinforcement Notes

Types of Positive Reinforcement

  • Positive reinforcement can be further distinguished by:

    • Immediate vs. delayed

    • Primary vs. secondary

    • Intrinsic vs. extrinsic

    • Natural vs. contrived

Immediate vs. Delayed Reinforcement

  • The more immediate the reinforcer, the stronger its effect on the behavior.

  • Example:

    • Giving a treat to a child who is playing quietly while she is still playing quietly reinforces quiet playing.

  • The benefits of exercise and proper eating are delayed, so they are weak reinforcers for those behaviors.

Primary Reinforcers

  • A primary reinforcer is an event that is innately reinforcing.

  • Examples:

    • Food, water, proper temperature, sexual contact.

Secondary Reinforcers

  • A secondary reinforcer is an event that is reinforcing because it has been associated with some other reinforcer.

  • Examples:

    • Good marks, fine clothes, a nice car.

  • Conditioned stimuli that have been associated with appetitive unconditioned stimuli (USs) can also function as secondary reinforcers.

  • Example:

    • A metronome that has been associated with food can be a secondary reinforcer for the operant response of lever pressing.

Generalized Reinforcer

  • A type of secondary reinforcer that has been associated with several other reinforcers.

  • Example:

    • Money is associated with food, clothing, furnishings, entertainment, and even dates.

    • Social attention is associated with food, play, and comfort.

  • In a “token economy,” tokens are given as reinforcers and can later be traded for other rewards.

Intrinsic Reinforcement

  • Reinforcement provided by the mere act of performing the behavior.

  • Examples:

    • Rollerblading is invigorating.

    • Attending parties is fun.

Extrinsic Reinforcement

  • The reinforcement provided by some consequence that is external to the behavior.

  • Examples:

    • You drive to get somewhere.

    • You work for money.

    • You date an attractive individual merely to enhance your prestige.

Natural Reinforcers

  • Reinforcers that are naturally provided for a certain behavior.

  • They are a typical consequence of the behavior within that setting.

  • Examples:

    • Money is a natural consequence of selling merchandise.

    • Gold medals are a natural consequence of hard training and a great performance.

Contrived Reinforcers

  • Reinforcers that have been deliberately arranged to modify a behavior.

  • They are not a typical consequence of the behavior in that setting.

  • Examples:

    • Watching television after a study session is a contrived reinforcer for studying.

Continuous Reinforcement Schedule

  • Abbreviated as CRF

  • Each instance of the specified behavior is reinforced.

  • Particularly useful in teaching new behavior.

  • Speeds up conditioning.

  • Used frequently in shaping procedures.

  • Produces responding that has a low resistance to extinction but a high extinction burst.

Intermittent Reinforcement Schedule

  • Only some responses are reinforced.

  • Particularly useful in maintaining new behavior.

  • Used frequently in maintenance procedures.

  • Produces responding that has a high resistance to extinction but a low extinction burst.

4 Basic Intermittent Schedules

  • Fixed ratio

  • Variable ratio

  • Fixed interval

  • Variable interval

Fixed Ratio Schedules

  • Fixed Ratio: reinforcement is contingent on a fixed number of responses.

  • Abbreviated FR, e.g., FR5, FR500

  • Produce a high rate of responses and a short post-reinforcement pause.

  • See figure 7.1

  • Ratio strain: a disruption in performance due to an overwhelming response requirement.

Responses to FR Schedules

  • Usually a high rate of response with a short postreinforcement pause.

  • Example:

    • On an FR 25 schedule, a rat rapidly emits 25 lever presses, munches down the food pellet it receives, and then snoops around before emitting more lever presses.

  • A postreinforcement pause is a short pause following the attainment of each reinforcer.

  • Higher ratio requirements produce longer postreinforcement pauses.

  • Example:

    • You will take a longer break after completing a long assignment than after completing a short one.

  • There may be little or no pausing with an FR1 or FR2 schedule because the reinforcer is so close.
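
  • A minimal sketch of the FR rule in Python. The class name `FixedRatio` and its `respond` method are hypothetical (they are not from these notes); the sketch only counts responses and does not model pausing.

```python
class FixedRatio:
    """Hypothetical FR schedule: every `ratio`-th response is reinforced."""

    def __init__(self, ratio):
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0  # responses emitted since the last reinforcer

    def respond(self):
        """Register one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0  # the count restarts after each reinforcer
            return True
        return False


fr25 = FixedRatio(25)
outcomes = [fr25.respond() for _ in range(50)]
reinforced_at = [i + 1 for i, earned in enumerate(outcomes) if earned]
print(reinforced_at)  # [25, 50]
```

  • With `FixedRatio(1)` every response is reinforced, which matches the continuous (CRF) schedule described earlier.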

Variable Ratio Schedules

  • Variable Ratio: reinforcement is contingent on a variable number of responses.

  • Abbreviated VR, e.g., VR50, VR500

  • Produce a high rate of responses and almost NO post-reinforcement pause.

  • See figure 7.1

  • Ratio strain: a disruption in performance due to an overwhelming response requirement.

Real Examples of VR Schedules

  • Only some of a cheetah’s attempts at chasing down prey are successful.

  • Only some acts of politeness receive acknowledgment.

  • Only some songs that we download are enjoyable.
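
  • A minimal sketch of a VR rule in Python. The class `VariableRatio` is hypothetical, and drawing each requirement uniformly from 1 to 2 × mean − 1 is an assumption made here so that the average requirement equals the schedule value; real procedures may use other distributions.

```python
import random


class VariableRatio:
    """Hypothetical VR schedule: the response requirement varies around a mean."""

    def __init__(self, mean_ratio, seed=None):
        if mean_ratio < 1:
            raise ValueError("mean_ratio must be at least 1")
        self.mean_ratio = mean_ratio
        self.rng = random.Random(seed)
        self.count = 0  # responses since the last reinforcer
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean - 1, whose average is mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Register one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()  # the next requirement is unpredictable
            return True
        return False


vr10 = VariableRatio(10, seed=0)
n_responses = 100_000
n_reinforcers = sum(vr10.respond() for _ in range(n_responses))
responses_per_reinforcer = n_responses / n_reinforcers
print(round(responses_per_reinforcer, 1))  # close to 10 on a VR 10 schedule
```

  • Because the next requirement cannot be predicted from the last one, any single response might be the reinforced one, which is consistent with the near absence of a post-reinforcement pause.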

Fixed Interval Schedules

  • Fixed Interval: reinforcement is contingent on the first response after a fixed period of time.

  • Abbreviated FI (e.g., FI 20). Note that here the number refers to the duration of the interval and not the number of responses.

  • Produce a scalloped pattern of responding, which reflects timing of the interval.

  • See figure 7.1

Responses to FI Schedules

  • Responses consist of a postreinforcement pause followed by a gradually increasing rate of response as the interval draws to a close.

  • Example:

    • A rat on an FI 30-sec schedule will emit few or no lever presses at the start of the 30-second interval, but its rate of responding will gradually increase as the end of the 30 seconds approaches.

    • Your studying was probably infrequent at the beginning of the semester, and it will increase as the semester draws to a close.
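
  • A minimal sketch of the FI rule in Python. The class `FixedInterval` is hypothetical, and timing each interval from the previous reinforcer delivery is an assumption made here for simplicity.

```python
class FixedInterval:
    """Hypothetical FI schedule: the first response after the interval is reinforced."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforcer = 0.0  # assumption: the session starts at time 0

    def respond(self, t):
        """Register a response at time t (seconds); return True if reinforced."""
        if t - self.last_reinforcer >= self.interval:
            self.last_reinforcer = t  # the next interval starts now
            return True
        return False  # responses before the interval ends earn nothing


fi30 = FixedInterval(30)
response_times = [5, 20, 29, 31, 40, 59, 62, 100]
reinforced = [t for t in response_times if fi30.respond(t)]
print(reinforced)  # [31, 62, 100]
```

  • Responses at 5, 20, and 29 seconds are wasted, which is one way to see why responding early in the interval tends to drop out.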

Variable Interval Schedules

  • Variable Interval: reinforcement is contingent on the first response after a variable period of time.

  • Abbreviated VI (e.g., VI 20). Note that here the number refers to the duration of the interval and not the number of responses.

  • Produce a moderate, steady rate of responding with little or no post-reinforcement pause; as with FI schedules, reinforcement depends on timing.

  • See figure 7.1 and table 7.1
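
  • A minimal simulation of a VI rule in Python, comparing a fast and a slow responder. The class `VariableInterval`, the helper `reinforcers_earned`, the uniform distribution of intervals, and the response rates (one response every 1 or 5 seconds) are all assumptions chosen for illustration.

```python
import random


class VariableInterval:
    """Hypothetical VI schedule: first response after a varying interval is reinforced."""

    def __init__(self, mean_interval, seed=None):
        self.mean_interval = mean_interval
        self.rng = random.Random(seed)
        self.available_at = self._draw()  # time the next reinforcer becomes available

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean, so the average interval is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        """Register a response at time t (seconds); return True if reinforced."""
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


def reinforcers_earned(seconds_between_responses, session_seconds=10_000):
    """Count reinforcers earned on VI 30-sec when responding at a steady pace."""
    schedule = VariableInterval(30, seed=1)
    times = range(0, session_seconds, seconds_between_responses)
    return sum(schedule.respond(t) for t in times)


fast = reinforcers_earned(1)  # one response per second
slow = reinforcers_earned(5)  # one response every five seconds
print(fast, slow)
```

  • Under these assumptions the fast responder emits five times as many responses but earns only slightly more reinforcers, which fits the moderate, steady rate noted above.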

Response Rate Schedules

  • Differential Reinforcement of High Rates (DRH). Reinforcement is contingent upon emitting at least a certain number of responses in a given time.

  • Example: a salesman has to make at least 15 sales per week to earn his bonus.

  • Differential Reinforcement of Low Rates (DRL). A minimum amount of time must pass between responses in order for reinforcement to be delivered.

  • Example: a teacher will only call on her student if there's an interval of at least 5 minutes between the times he raises his hand.
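
  • A minimal sketch of both rate rules in Python. The function names `drl_reinforced` and `drh_met` and the sample data are hypothetical; the DRL sketch also assumes the first response of a session cannot be reinforced because no earlier response exists to time from.

```python
def drl_reinforced(response_times, min_gap):
    """DRL: return the responses that follow the previous one by at least min_gap."""
    reinforced = []
    previous = None
    for t in response_times:
        if previous is not None and t - previous >= min_gap:
            reinforced.append(t)
        previous = t  # every response, reinforced or not, restarts the clock
    return reinforced


def drh_met(response_times, min_responses, window):
    """DRH: True if at least min_responses occur within the first `window` time units."""
    in_window = [t for t in response_times if 0 <= t < window]
    return len(in_window) >= min_responses


# DRL example: hand raises at these minutes, with a 5-minute minimum gap.
hand_raises = [0, 2, 8, 10, 16]
print(drl_reinforced(hand_raises, 5))  # [8, 16]

# DRH example: the day of the week (0-6) on which each of 16 sales was made.
sale_days = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6]
print(drh_met(sale_days, 15, 7))       # True
print(drh_met(sale_days[:10], 15, 7))  # False
```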

Noncontingent Schedules

  • Fixed Time (FT). Reinforcement is delivered after a fixed interval of time, regardless of the organism’s behavior.

  • Example:

    • On a fixed time 30-second (FT 30-sec) schedule, a pigeon receives food every 30 seconds regardless of its behavior.

    • On an FT 1-year schedule, people receive Christmas gifts each year regardless of their behavior.

  • Variable Time (VT). Reinforcement is delivered after a variable interval of time, regardless of the organism's behavior.

  • Example:

    • A pigeon receives food after an average interval of 30 seconds (VT 30-sec schedule).
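
  • A minimal sketch of noncontingent delivery in Python. The function names and the uniform distribution used for VT are assumptions made here; the point is that neither function takes the organism's responses as input.

```python
import random


def fixed_time_deliveries(period, session_length):
    """FT: delivery times at every `period` seconds, regardless of behavior."""
    return list(range(period, session_length + 1, period))


def variable_time_deliveries(mean_period, session_length, seed=None):
    """VT: delivery times separated by varying gaps that average mean_period."""
    rng = random.Random(seed)
    times = []
    t = 0.0
    while True:
        # Assumption: gaps are uniform on 0 .. 2*mean, so they average mean_period.
        t += rng.uniform(0, 2 * mean_period)
        if t > session_length:
            return times
        times.append(t)


ft = fixed_time_deliveries(30, 120)
print(ft)  # [30, 60, 90, 120]

vt = variable_time_deliveries(30, 3000, seed=2)
print(len(vt))  # roughly 100 deliveries in 3000 seconds
```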

Superstitious Behavior

  • Skinner: “Superstition” in the Pigeon experiments (1948).

  • Noncontingent reinforcement may account for some forms of superstitious behavior.

  • Behaviors may be accidentally reinforced by the coincidental presentation of reinforcements.

  • Superstitious behavior of this kind is also seen in athletes and gamblers.

  • Unusual events that precede a fine performance may be quickly identified and then deliberately reproduced in the hopes of reproducing that performance.

  • Superstitious behavior can be seen as an attempt to make an unpredictable situation more predictable.

Complex Schedules of Reinforcement

  • A combination of two or more simple schedules.

  • They include:

    • Conjunctive Schedules

    • Adjusting Schedules

    • Chained Schedules

    • Multiple Schedules

    • Concurrent Schedules

Complex Schedules

  • Conjunctive: Requirements of two or more schedules must be met before reinforcement is delivered.

  • Example: to get paid, you must both produce a required number of items and stay at work, actually working, for 8 hours (FR + FI).

  • Most real-life schedules are conjunctive.

  • Adjusting: Response requirement changes as a function of the organism's behavior while responding on the preceding schedule (working for a previous reinforcer).

  • Example: training may start with FR 5 and then move up to FR 50, and so on, as the organism performs well.

  • Chained: Two or more simple schedules where each link has its own SD/Sr+ and the last response produces a terminal reinforcer that reinforces the entire chain.

  • Green key, peck --> red key, peck --> food.

  • Goal gradient effect: an increase in the strength and/or efficiency of behavior as reinforcement approaches. This is why you are more likely to become distracted when you are just starting a paper; secondary reinforcement is likely needed in the early links.

  • Examples: working more efficiently near pay time, taking shorter breaks when you're almost done with an essay.
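
  • A minimal sketch of two of the complex schedules above in Python. The function `conjunctive_met`, the class `ChainedSchedule`, and the requirement of 100 items are hypothetical; each link in the chain is modeled as a simple FR for brevity.

```python
def conjunctive_met(responses, elapsed, required_responses, required_time):
    """Conjunctive: reinforcement only when BOTH requirements are satisfied."""
    return responses >= required_responses and elapsed >= required_time


class ChainedSchedule:
    """Links are completed in order; only the final link yields the terminal reinforcer."""

    def __init__(self, links):
        # links: list of (stimulus, required_responses) pairs
        self.links = links
        self.index = 0  # which link is currently in effect
        self.count = 0  # responses emitted in the current link

    @property
    def stimulus(self):
        return self.links[self.index][0]

    def respond(self):
        """Return 'food' when the last link is completed, else the stimulus now shown."""
        self.count += 1
        if self.count >= self.links[self.index][1]:
            self.count = 0
            self.index += 1
            if self.index == len(self.links):
                self.index = 0  # the chain starts over after the terminal reinforcer
                return "food"
        return self.stimulus


# Conjunctive example: 100 items AND 8 hours are both required.
print(conjunctive_met(100, 8, 100, 8))  # True
print(conjunctive_met(150, 6, 100, 8))  # False, the time requirement is unmet

# Chained example: green key, peck --> red key, peck --> food.
chain = ChainedSchedule([("green", 1), ("red", 1)])
events = [chain.respond() for _ in range(4)]
print(events)  # ['red', 'food', 'red', 'food']
```

  • In the chain, the appearance of the red key is the only consequence of pecking the green key, which is why it must act as a secondary reinforcer for that first link.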

Theories of Reinforcement

  • Highly influential theory to this day, developed by David Premack (1965):

  • Premack Principle: reinforcers are viewed as behaviors rather than as stimuli.

  • According to the Premack principle, behavior is reinforced by eating the food, drinking the water, riding a bike, not by food, water, or bike.

  • High probability behaviors can be used as reinforcers for low probability behaviors (e.g., a thirsty rat will run in a wheel, a low probability behavior, to gain access to drinking water, a high probability behavior).
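
  • A minimal sketch of the Premack comparison in Python. The function `can_reinforce` is hypothetical, and the baseline proportions are made-up illustrative values, not data from any study.

```python
def can_reinforce(baseline, reinforcer, target):
    """Premack principle: a behavior can reinforce a less probable behavior."""
    return baseline[reinforcer] > baseline[target]


# Illustrative proportions of free time a thirsty rat spends on each behavior.
baseline = {"drinking": 0.30, "wheel running": 0.05}

print(can_reinforce(baseline, "drinking", "wheel running"))  # True
print(can_reinforce(baseline, "wheel running", "drinking"))  # False
```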

Response Deprivation Hypothesis (Timberlake & Allison, 1974)

  • An extension of the Premack principle.

  • Behavior can serve as a reinforcer if:

    • Access to the behavior is restricted.

    • Its frequency falls below the preferred levels of occurrence.

  • Example:

    • A rat typically runs for 1 hour a day whenever it has free access to a running wheel.

    • If the rat is then allowed access to the wheel for only 15 minutes per day, it will be unable to reach this preferred level.

    • The rat will be in a state of deprivation with regard to running.

    • The rat will now be willing to work to obtain additional time on the wheel.
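
  • The arithmetic of the rat example can be sketched in Python. The function names `deprivation_minutes` and `can_serve_as_reinforcer` are hypothetical; the numbers (60 minutes preferred, 15 minutes allowed) come from the example above.

```python
def deprivation_minutes(baseline_minutes, allowed_minutes):
    """Minutes by which allowed access falls short of the preferred level (0 if none)."""
    return max(0, baseline_minutes - allowed_minutes)


def can_serve_as_reinforcer(baseline_minutes, allowed_minutes):
    """Response deprivation: the behavior is a reinforcer when access is below baseline."""
    return deprivation_minutes(baseline_minutes, allowed_minutes) > 0


print(deprivation_minutes(60, 15))      # 45
print(can_serve_as_reinforcer(60, 15))  # True, the rat will work for wheel time
print(can_serve_as_reinforcer(60, 90))  # False, access already exceeds baseline
```

  • Unlike the Premack comparison, this check needs only the one behavior's own baseline, not the relative probabilities of two behaviors.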