Schedules of Reinforcement
Contingent relations between the response and the consequence
Rule that determines when a response will be followed by a reinforcer (includes discriminative stimulus - where)
Continuous Reinforcement (CRF)
Each and every response produces a reinforcer
Continues until satiation
Typically used for acquisition of new behaviours (shaping)
Conjugate reinforcement
Conjugate Reinforcement
Properties of reinforcement (rate, acquisition, intensity) are tied to particular dimensions of a response
A more effortful response produces a proportionally larger reinforcer
Response Stereotypy on CRF
Topography of response becomes stereotypical on CRF; only see variability with extinction
Response variability may vary inversely with rate of reinforcement
Organism becomes more variable in responding as reinforcement becomes less frequent or predictable
Intermittent Reinforcement
Not every response produces a reinforcer
Advantages of Intermittent Schedules:
Less likely to produce satiation
Produce higher rates of responding
Maintain behaviour for a longer period of time
Maintain behaviour longer under extinction
Fixed Ratio Schedule
Reinforcement occurs after a fixed number of responses (ratio) (e.g., FR10, every 10th response produces a reinforcer)
Produces a pattern of responding known as "Break and Run" or "Stop and Go"
Break and Run
Ratio Strain
Break and Run
Following reinforcement, a period of no responding (post-reinforcement pause, PRP) followed by a rapid run of responses (Ratio run)
Ratio Strain
If the ratio requirement increases too rapidly or the increment is too large, animals begin to pause for long periods before completing the ratio requirement (e.g., FR10 to FR100 to FR1000)
Explanations for PRP
Skinner: the occurrence of reinforcement becomes an SΔ (a signal for non-reinforcement), leading the animal to pause
But… duration of the PRP increases as the ratio requirement increases (inconsistent with Skinner’s explanation)
Upcoming ratio requirement plays a greater role in determining PRP
On a multiple FR10 FR100 schedule (an FR10 schedule alternates with an FR100 schedule; with FR10 signalled by a red light and FR100 by a blue light)
If the PRP is due to fatigue, then we should always see a long pause after FR100 and a short pause after FR10
Instead there is a relation between duration of PRP and stimulus signalling the upcoming ratio
Pause is longer when the stimulus is blue (predicts FR100) and shorter when the stimulus is red (predicts FR10)
Consistent with remaining responses hypothesis: PRP is a period furthest from the next reinforcer and immediately following the previous reinforcer when the tendency to respond is weakest
Tendency is weaker when the number of upcoming responses is greater
"Pre-ratio" rather than "post-reinforcement" pause
Variable Ratio Schedule
Reinforcement occurs after a number of responses, but that number varies from reinforcer to reinforcer (e.g., VR 10: the average number of responses is 10, but it varies between 1 and 40)
Response requirement is unpredictable
Produces the highest rate of response
Very short PRP; occasional occurrence of a reinforcer after a small number of responses (short run) reduces the likelihood of pausing
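A minimal sketch of the two ratio contingencies above (hypothetical helper names, not from the source): the FR counter is fixed, while the VR requirement is redrawn after each reinforcer, which is what makes it unpredictable.

```python
import random

def make_fixed_ratio(n):
    """FR n: every nth response produces a reinforcer."""
    count = 0
    def on_response():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True   # reinforcer delivered
        return False
    return on_response

def make_variable_ratio(mean_n):
    """VR mean_n: the required count is redrawn after each reinforcer,
    so it is unpredictable but averages mean_n."""
    required = random.randint(1, 2 * mean_n - 1)
    count = 0
    def on_response():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response

fr10 = make_fixed_ratio(10)
print(sum(fr10() for _ in range(100)))   # exactly 10 reinforcers per 100 responses
```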
Fixed Interval Schedule
The first response after a fixed interval of time has elapsed produces reinforcement
Produces a pattern of responding known as scalloping (also see break and run)
Do not confuse with a fixed time (FT) schedule - reinforcement is delivered after a fixed period of time without any response required
FI schedules are like timetables for trains or buses, but with a limited hold (the reinforcer is available only for a limited time after the interval elapses)
Scalloping
Pause following reinforcement (~1/2 of the interval), then an acceleration in responding up to the time when a response produces a reinforcer
Human vs Animal Performance on Fixed Interval Schedule
Rats/pigeons show scallop; humans show either steady high rate or low rate break and run
Difference: language? - human performance follows self-generated rules
Implies: preverbal humans should show characteristic schedule effects
Lowe, Beasty, & Bentall (1983) showed that infants who were not yet verbal produced a scallop pattern on FI schedules
Confound: adult humans have greater experience with ratio schedules; history may affect performance on FI schedules
Wanchisen, Tatham, & Mooney (1989) exposed rats to VR before FI schedules; the rats showed either a high-rate pattern or a low-rate break-and-run pattern
Variable Interval Schedule
The first response after an interval of time has elapsed produces reinforcement; however, the interval that must elapse varies from reinforcer to reinforcer (e.g., VI 10 s produces reinforcement on average every 10 s, but the intervals vary between 1 s and 40 s)
Reinforcement requirement is unpredictable
Produces a steady moderate rate of responding
Pause after reinforcement typically does not occur
Steady response rate during extinction
Frequently used to produce a baseline rate of responding to evaluate the effect of independent variables
Moderate rate can increase or decrease
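A companion sketch for the interval schedules (same hypothetical style). Note that a response is still required once the interval has elapsed, unlike an FT schedule, where the reinforcer is delivered regardless of behaviour.

```python
import random

class IntervalSchedule:
    """FI/VI: the first response after the interval elapses is reinforced.
    (Contrast FT, where reinforcement needs no response at all.)"""
    def __init__(self, mean_s, variable=False):
        self.mean_s = mean_s
        self.variable = variable
        self.next_available = self._draw()

    def _draw(self):
        # VI: intervals vary around the mean; FI: always the same interval
        return random.uniform(1, 2 * self.mean_s - 1) if self.variable else self.mean_s

    def on_response(self, t):
        if t >= self.next_available:                 # reinforcer has been "set up"
            self.next_available = t + self._draw()   # start the next interval
            return True                              # first response after interval wins
        return False                                 # early responses do nothing
```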
Response-Reinforcer Correlation Theory
Molar theory based on feedback functions
On a VR schedule, there is a linear relationship between response rate and reinforcement rate
Feedback: the faster you respond, the quicker the reinforcers come
On a VI schedule, as response rate initially increases, reinforcement rate increases until the point at which each reinforcer is collected as soon as it is set up; beyond this point, further increases in response rate produce no change in reinforcement rate
Maximum reinforcement rate is determined by the schedule (e.g., VI 30 s: at most 3600/30 = 120 reinforcers/hour)
Feedback: responding faster does not make reinforcers come any quicker
Response Rate
How quickly you respond
Reinforcement Rate
How quickly reinforcers are delivered
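A rough sketch of the two feedback functions described above (the min() cap is a simplification; empirical VI feedback functions bend smoothly toward the scheduled maximum rather than hitting it abruptly):

```python
def vr_feedback(response_rate_per_hr, ratio):
    # VR: reinforcement rate is a linear function of response rate
    return response_rate_per_hr / ratio

def vi_feedback(response_rate_per_hr, interval_s):
    # VI: rises with response rate, then flattens at the scheduled maximum
    maximum = 3600 / interval_s                 # e.g., VI 30 s -> 120 reinforcers/hour
    return min(response_rate_per_hr, maximum)   # crude approximation of the curve

print(vi_feedback(500, 30))   # 120: responding faster gains nothing past the cap
print(vr_feedback(500, 10))   # 50: responding faster always pays on VR
```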
IRT Reinforcement Theory
Molecular theory
On a VI schedule, the probability that a response will produce reinforcement increases as the time between responses [the inter-response time (IRT)] increases
Probability that a response will produce reinforcement is higher for longer IRTs than for shorter IRTs
Selectively strengthens long pauses between successive responses
On a VR schedule, probability that a response will produce reinforcement is constant and independent of IRT
Each response has the same probability of paying off since the schedule does not advance unless a response is made (does not advance with time)
Probability = 1/VR
Tendency for the animal to respond in bursts may lead to strengthening of short IRTs
Evidence
VI+ schedule: molar properties of a VR schedule, but molecular (IRT) properties of a VI schedule
Humans pressed at a high rate on both VR and VI+ schedules (evidence for the molar account)
Rats showed response rates on the VI+ schedule similar to a plain VI schedule when matched for reinforcement rate (evidence for the molecular account)
Not “either or”; sensitivity to molar vs molecular depends on response rate
Low response rate contacts molecular contingencies related to IRTs while high response rate contacts molar contingencies in terms of the correlation between response and reinforcement rates
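A worked comparison of the molecular claim, assuming a random-interval implementation of VI (reinforcers set up at a constant rate, so P(reinforced | IRT) = 1 − e^(−IRT/t)), versus the constant 1/VR probability on ratio schedules:

```python
import math

def p_reinforced_vi(irt_s, mean_interval_s):
    # Random-interval VI: the longer you wait between responses, the more
    # likely a reinforcer has been set up -> long IRTs are favoured
    return 1 - math.exp(-irt_s / mean_interval_s)

def p_reinforced_vr(ratio):
    # VR: every response has the same chance of paying off, regardless of IRT
    return 1 / ratio

for irt in (1, 5, 30):
    print(irt, round(p_reinforced_vi(irt, 30), 2))   # 0.03, 0.15, 0.63
print(p_reinforced_vr(10))                            # 0.1 for every response
```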
Progressive Ratio Schedule
Reinforcement occurs after a number of responses (ratio) with the required number of responses increasing systematically after each reinforcement
Arithmetic progression
Geometric progressions
Breakpoint
Breakpoint is used as an index of reinforcement efficacy
Comparison of breakpoints shows the relative reinforcement efficacy of different drugs as well as different doses of the same drug
Arithmetic Progression
Increment is constant (e.g., 5, 10, 15, 20, 25, 30, 35 responses)
Geometric Progressions
Each ratio is determined by multiplying the previous ratio by a fixed number (e.g., ×2: 5, 10, 20, 40, 80, 160, 320 responses)
Breakpoint
The highest ratio value the animal completed before it stopped responding (failed to complete the next ratio)
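A small sketch of the two progressions and the breakpoint measure (the max_completed_ratio argument is hypothetical; in practice the breakpoint is read off the animal's behaviour, not set in advance):

```python
def arithmetic_steps(start, increment, n):
    # constant increment: 5, 10, 15, 20, ...
    return [start + increment * i for i in range(n)]

def geometric_steps(start, factor, n):
    # multiply the previous ratio by a fixed number: 5, 10, 20, 40, ...
    return [start * factor ** i for i in range(n)]

def breakpoint(steps, max_completed_ratio):
    # highest ratio value the animal actually completed
    done = [s for s in steps if s <= max_completed_ratio]
    return done[-1] if done else None

steps = geometric_steps(5, 2, 7)   # [5, 10, 20, 40, 80, 160, 320]
print(breakpoint(steps, 100))      # 80: the animal quit before completing 160
```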
PRPs on FI Schedules
PRP duration varies with the inter-reinforcement interval (IRI); as the IRI gets longer, the PRP gets longer
PRP is ~1/2 the IRI
PRP on FR also increases as the ratio increases; however, IRI is partly determined by the animal (high response rate - shorter IRI, low response rate - longer IRI)
On FR, rate of responses increases as ratio initially increases
Molar Account of PRP
Animals seek to maximize overall reinforcement rate (maximum reinforcement for least amount of effort)
PRP is ~1/2 the interval duration; if the pause were longer than 1/2, overall reinforcement rate would decrease (the animal would more often pause past the point at which the reinforcer is set up); if shorter than 1/2, the animal would make more responses for the same overall reinforcement
Molecular Account of PRP
Animals obtain automatic reinforcement from engaging in other behaviours (e.g., grooming, sniffing, scratching) during the PRP
Maximize total reinforcement from both sources (extrinsic reinforcement from lever pressing and automatic reinforcement from schedule-induced behaviours early in the interval)
A high rate of responding for the prior reinforcer reduces the value of a subsequent reinforcer
Reduced value leads to a longer pause before initiating responding
Response Rate Schedules
DRL (Differential Reinforcement of a Low Rate of Responding)
DRH (Differential Reinforcement of a High Rate of Responding)
DRL (Differential Reinforcement of a Low Rate of Responding)
Reinforcement occurs if interval of time between successive responses > X
DRL 10 s: every response that occurs at least 10 s after the last response is reinforced; if a response occurs less than 10 s after the last response, no reinforcer occurs (and, since the criterion is the time since the last response, each response restarts the timing)
Generates low rates of responding; average pause is slightly shorter than the required interval and about 50% of responses go unreinforced
DRH (Differential Reinforcement of a High Rate of Responding)
Reinforcement occurs if a certain number of responses occur within a fixed amount of time
Example: Reinforcer only occurs if the animal has made 10 responses in 10 s or less
Generates high rates of responding
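A minimal sketch of the two response-rate contingencies (hypothetical function names, parameters matching the examples above):

```python
def drl_reinforced(irt_s, min_irt_s=10):
    # DRL 10 s: only responses spaced at least 10 s apart pay off
    return irt_s >= min_irt_s

def drh_reinforced(response_times_s, n=10, window_s=10):
    # DRH: reinforce if the last n responses fell within window_s seconds
    if len(response_times_s) < n:
        return False
    return response_times_s[-1] - response_times_s[-n] <= window_s

print(drl_reinforced(12))                        # True: waited long enough
print(drh_reinforced(list(range(10)), 10, 10))   # True: 10 responses in 9 s
```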
Concurrent Schedules
Subject is presented with two or more response alternatives, each associated with its own reinforcement schedule
Both schedules operate simultaneously
Dependent measure = relative allocation of time and behaviour to each alternative
Used to study choice
Chain Schedules
Sequence of reinforcement schedules where completion of previous schedule produces opportunity to respond on the next schedule
E.g., chain (white) FI 30 s → (green) FR 10 → food
Transition from one schedule to another is signalled by a change in a stimulus (e.g., light over lever changes white to green)
The change in stimulus serves as a conditioned reinforcer that maintains responding in the previous link (the terminal reinforcer [food] conveys value to this stimulus)
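A sketch of the chain structure, assuming each link exposes a satisfied(t) test (which could wrap the FI/FR logic sketched earlier); the stimulus change on advancing a link is what the notes identify as the conditioned reinforcer:

```python
class ChainSchedule:
    """Sketch of chain (white) FI 30 s -> (green) FR 10 -> food.
    Each link is a (stimulus, satisfied_fn) pair; satisfied_fn returns
    True once that link's schedule requirement has been met."""
    def __init__(self, links):
        self.links = links
        self.index = 0

    @property
    def stimulus(self):
        return self.links[self.index][0]

    def on_response(self, t):
        _, satisfied = self.links[self.index]
        if satisfied(t):
            self.index += 1
            if self.index == len(self.links):   # terminal link completed
                self.index = 0
                return "food"                   # primary reinforcer
            return self.stimulus                # stimulus change: conditioned reinforcer
        return None
```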
Delay
General rule: the more immediately a reinforcer occurs following a behaviour, the more effective the consequence for strengthening that behaviour [contiguity]
Delay decreases the effectiveness of a reinforcer
Reason 1
During the delay, other behaviours occur that are also reinforced by the consequence
What is reinforced is another behaviour plus the target behaviour rather than just the target behaviour alone. This weakens the strengthening effect of reinforcement on the target behaviour
Reason 2
Value of a reinforcing consequence decreases with delay
Example: if I offer you the choice between $100 tomorrow or $100 a month from tomorrow, which would you choose?
The equation describing the decrease in value as a function of delay is:
Value = Amount / (1 + K × Delay)
K refers to a temporal discounting rate
A large value of K means that the rate of temporal discounting is high; that is, value declines rapidly with delay
A small value of K means that the rate of temporal discounting is low; that is, value declines slowly with delay
K varies across species (pigeons have high values of K, people have low values)
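A worked example of the discounting equation (the K values and per-day units here are chosen purely for illustration):

```python
def discounted_value(amount, delay, k):
    # Value = Amount / (1 + K * Delay)
    return amount / (1 + k * delay)

# $100 delayed by 30 days, for a low (patient) and a high (impulsive) K
for k in (0.01, 0.5):
    print(k, round(discounted_value(100, 30, k), 2))
# k=0.01 -> 76.92 (value declines slowly with delay)
# k=0.5  ->  6.25 (value declines rapidly with delay)
```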
Magnitude or Amount
Magnitude of reinforcement effect: the larger the reinforcer, the higher the rate of responding maintained by a reinforcer
Exception: Paradoxical Incentive Effect - Bizo et al. showed that rats will respond at higher rates when the reinforcer is 1 food pellet than when it is 2 pellets
Propose: larger magnitude leads to less efficient coupling of the consequence to the response
Effectiveness of Reinforcement Magnitude also Depends on:
Type of reinforcer
Effort
Type of Reinforcer
With consumable reinforcers, larger reinforcers will produce satiation more quickly than with non-consumable reinforcers (e.g., money)
Effort
It takes more reinforcement to maintain a more effortful response than a less effortful response
Motivational Operation
Satiation
Deprivation
Satiation
Decrease in responding for a reinforcer as a function of recent consumption of that reinforcer
Satiation decreases the effectiveness of a reinforcer
Deprivation
Increase in responding for a reinforcer as a function of withholding that reinforcer
Deprivation increases the effectiveness of a reinforcer
Behavioural Momentum
Different approach to assessing the strength of a reinforced response
Assumes that a stronger reinforced response will be less readily disrupted
Resistance to change depends on association between discriminative stimulus and reinforcer
Example: in the presence of a green light, pigeons respond on a VI 20 s schedule; in the presence of a red light, they respond on a VI 60 s schedule
Pigeons peck at a higher rate and get more reinforcers on the green key
Disrupt behaviour by providing free reinforcers
Responding on green key decreases by 60%; responding on red key decreases by 80%
Green key associated with higher rate of reinforcement has greater momentum (less disrupted)
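The arithmetic behind this example (scheduled reinforcement rates and the proportion of baseline responding that survives disruption):

```python
green_rf_per_hr = 3600 / 20   # VI 20 s -> 180 reinforcers/hour
red_rf_per_hr = 3600 / 60     # VI 60 s -> 60 reinforcers/hour

green_remaining = 1 - 0.60    # 40% of baseline responding survives disruption
red_remaining = 1 - 0.80      # 20% survives

# The richer (green) key loses the smaller share: greater behavioural momentum
print(green_rf_per_hr, red_rf_per_hr, green_remaining, red_remaining)
```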
Contingency vs Rule Governed
Contingency-governed
Rule-governed
Reason for different patterns of responding on FI schedules in humans versus animals
Contingency-Governed
Pattern of responding on a reinforcement schedule is a function of the reinforcement contingencies
Rule-Governed
Pattern of responding on a schedule is a function of a rule generated by the human or given by the experimenter in instructions