Schedules of Reinforcement
Contingent relations between the response and the consequence
Rule that determines when a response will be followed by a reinforcer (includes discriminative stimulus - where)
Continuous Reinforcement (CRF)
Each and every response produces a reinforcer
Continues until satiation
Typically used for acquisition of new behaviours (shaping)
Conjugate reinforcement
Conjugate Reinforcement
Properties of reinforcement (rate, acquisition, intensity) are tied to particular dimensions of a response
A more effortful response produces a proportionally larger reinforcer
Response Stereotypy on CRF
Topography of response becomes stereotypical on CRF; only see variability with extinction
Response variability may vary inversely with rate of reinforcement
Organism becomes more variable in responding as reinforcement becomes less frequent or predictable
Intermittent Reinforcement
Not every response produces a reinforcer
Advantages of Intermittent Schedules:
Less likely to produce satiation
Produce higher rates of responding
Maintain behaviour for a longer period of time
Maintain behaviour longer under extinction
Fixed Ratio Schedule
Reinforcement occurs after a fixed number of responses (ratio) (e.g., FR10, every 10th response produces a reinforcer)
Produces a pattern of responding known as "Break and Run" or "Stop and Go"
Break and Run
Ratio Strain
Break and Run
Following reinforcement, a period of no responding (post-reinforcement pause, PRP) followed by a rapid run of responses (Ratio run)
Ratio Strain
If the ratio requirement increases too rapidly or the increment is too large, animals begin to pause for long periods before completing the ratio requirement (e.g., FR10 to FR100 to FR1000)
Explanations for PRP
Skinner: the occurrence of reinforcement becomes an SΔ (a signal for non-reinforcement), leading the animal to pause
But… duration of the PRP increases as the ratio requirement increases (inconsistent with Skinner’s explanation)
Upcoming ratio requirement plays a greater role in determining PRP
On a multiple FR10 FR100 schedule (an FR10 schedule alternates with an FR100 schedule; with FR10 signalled by a red light and FR100 by a blue light)
If the PRP is due to fatigue, then we should always see a long pause after FR100 and a short pause after FR10
Instead there is a relation between duration of PRP and stimulus signalling the upcoming ratio
Pause is longer when the stimulus is blue (predicts FR100) and shorter when the stimulus is red (predicts FR10)
Consistent with remaining responses hypothesis: PRP is a period furthest from the next reinforcer and immediately following the previous reinforcer when the tendency to respond is weakest
Tendency is weaker when the number of upcoming responses is greater
"Pre-ratio" rather than "post-reinforcement" pause
Variable Ratio Schedule
Reinforcement occurs after a number of responses, but that number varies from reinforcer to reinforcer (e.g., VR 10: the average number of responses is 10, but it varies between 1 and 40)
Response requirement is unpredictable
Produces the highest rate of response
Very short PRP; occasional occurrence of a reinforcer after a small number of responses (short run) reduces the likelihood of pausing
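A minimal sketch of the two ratio contingencies above (hypothetical helper names, not from the source): the FR counter is fixed, while the VR requirement is redrawn after each reinforcer, which is what makes it unpredictable.

```python
import random

def make_fixed_ratio(n):
    """FR n: every nth response produces a reinforcer."""
    count = 0
    def on_response():
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True   # reinforcer delivered
        return False
    return on_response

def make_variable_ratio(mean_n):
    """VR mean_n: the required count is redrawn after each reinforcer,
    so it is unpredictable but averages mean_n."""
    required = random.randint(1, 2 * mean_n - 1)
    count = 0
    def on_response():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return on_response

fr10 = make_fixed_ratio(10)
print(sum(fr10() for _ in range(100)))   # exactly 10 reinforcers per 100 responses
```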
Fixed Interval Schedule
The first response after a fixed interval of time has elapsed produces reinforcement
Produces a pattern of responding known as scalloping (also see break and run)
Do not confuse with a fixed time (FT) schedule - reinforcement is delivered after a fixed period of time without any response required
FI schedules are like timetables for trains or buses, but with a limited hold (the reinforcer is available only for a limited time after the interval elapses)
Scalloping
Pause following reinforcement (~1/2 of the interval), then an acceleration in responding up to the time when a response produces a reinforcer
Human vs Animal Performance on Fixed Interval Schedule
Rats/pigeons show scallop; humans show either steady high rate or low rate break and run
Difference: language? - human performance follows self-generated rules
Implies: preverbal humans should show characteristic schedule effects
Lowe, Beasty, & Bentall (1983) showed that infants who were not yet verbal produced a scallop pattern on FI schedules
Confound: adult humans have greater experience with ratio schedules; history may affect performance on FI schedules
Wanchisen, Tatham, & Mooney (1989) exposed rats to VR before FI schedules; the rats showed either a high-rate pattern or a low-rate break-and-run pattern
Variable Interval Schedule
The first response after an interval of time has elapsed produces reinforcement; however, the interval that must elapse varies from reinforcer to reinforcer (e.g., VI 10 s produces reinforcement on average every 10 s, but the intervals vary between 1 s and 40 s)
Reinforcement requirement is unpredictable
Produces a steady moderate rate of responding
Pause after reinforcement typically does not occur
Steady response rate during extinction
Frequently used to produce a baseline rate of responding to evaluate the effect of independent variables
Moderate rate can increase or decrease
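A companion sketch for the interval schedules (same hypothetical style). Note that a response is still required once the interval has elapsed, unlike an FT schedule, where the reinforcer is delivered regardless of behaviour.

```python
import random

class IntervalSchedule:
    """FI/VI: the first response after the interval elapses is reinforced.
    (Contrast FT, where reinforcement needs no response at all.)"""
    def __init__(self, mean_s, variable=False):
        self.mean_s = mean_s
        self.variable = variable
        self.next_available = self._draw()

    def _draw(self):
        # VI: intervals vary around the mean; FI: always the same interval
        return random.uniform(1, 2 * self.mean_s - 1) if self.variable else self.mean_s

    def on_response(self, t):
        if t >= self.next_available:                 # reinforcer has been "set up"
            self.next_available = t + self._draw()   # start the next interval
            return True                              # first response after interval wins
        return False                                 # early responses do nothing
```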
Response-Reinforcer Correlation Theory
Molar theory based on feedback functions
On a VR schedule, there is a linear relationship between response rate and reinforcement rate
Feedback: the faster you respond, the quicker the reinforcers come
On a VI schedule, as response rate initially increases, reinforcement rate increases until the point at which each reinforcer is collected as soon as it is set up; beyond this point, further increases in response rate produce no change in reinforcement rate
Maximum reinforcement rate is determined by the schedule (e.g., VI 30 s: at most 3600/30 = 120 reinforcers/hour)
Feedback: responding faster does not make reinforcers come any quicker
Response Rate
How quickly you respond
Reinforcement Rate
How quickly reinforcers are delivered
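A rough sketch of the two feedback functions described above (the min() cap is a simplification; empirical VI feedback functions bend smoothly toward the scheduled maximum rather than hitting it abruptly):

```python
def vr_feedback(response_rate_per_hr, ratio):
    # VR: reinforcement rate is a linear function of response rate
    return response_rate_per_hr / ratio

def vi_feedback(response_rate_per_hr, interval_s):
    # VI: rises with response rate, then flattens at the scheduled maximum
    maximum = 3600 / interval_s                 # e.g., VI 30 s -> 120 reinforcers/hour
    return min(response_rate_per_hr, maximum)   # crude approximation of the curve

print(vi_feedback(500, 30))   # 120: responding faster gains nothing past the cap
print(vr_feedback(500, 10))   # 50: responding faster always pays on VR
```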
IRT Reinforcement Theory
Molecular theory
On a VI schedule, the probability that a response will produce reinforcement increases as the time between responses [the inter-response time (IRT)] increases
Probability that a response will produce reinforcement is higher for longer IRTs than for shorter IRTs
Selectively strengthens long pauses between successive responses
On a VR schedule, probability that a response will produce reinforcement is constant and independent of IRT
Each response has the same probability of paying off since the schedule does not advance unless a response is made (does not advance with time)
Probability = 1/VR
Tendency for the animal to respond in bursts may lead to strengthening of short IRTs
Evidence
VI+ schedule: molar properties of a VR schedule, but molecular (IRT) properties of a VI schedule
Humans pressed at a high rate on both VR and VI+ schedules (evidence for the molar account)
Rats showed response rates on the VI+ schedule similar to a plain VI schedule when matched for reinforcement rate (evidence for the molecular account)
Not “either or”; sensitivity to molar vs molecular depends on response rate
Low response rate contacts molecular contingencies related to IRTs while high response rate contacts molar contingencies in terms of the correlation between response and reinforcement rates
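A worked comparison of the molecular claim, assuming a random-interval implementation of VI (reinforcers set up at a constant rate, so P(reinforced | IRT) = 1 − e^(−IRT/t)), versus the constant 1/VR probability on ratio schedules:

```python
import math

def p_reinforced_vi(irt_s, mean_interval_s):
    # Random-interval VI: the longer you wait between responses, the more
    # likely a reinforcer has been set up -> long IRTs are favoured
    return 1 - math.exp(-irt_s / mean_interval_s)

def p_reinforced_vr(ratio):
    # VR: every response has the same chance of paying off, regardless of IRT
    return 1 / ratio

for irt in (1, 5, 30):
    print(irt, round(p_reinforced_vi(irt, 30), 2))   # 0.03, 0.15, 0.63
print(p_reinforced_vr(10))                            # 0.1 for every response
```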
Progressive Ratio Schedule
Reinforcement occurs after a number of responses (ratio) with the required number of responses increasing systematically after each reinforcement
Arithmetic progression
Geometric progressions
Breakpoint
Breakpoint is used as an index of reinforcement efficacy
Comparison of breakpoints shows the relative reinforcement efficacy of different drugs as well as different doses of the same drug
Arithmetic Progression
Increment is constant (e.g., 5, 10, 15, 20, 25, 30, 35 responses)
Geometric Progressions
Each ratio is determined by multiplying the previous ratio by a fixed number (e.g., ×2: 5, 10, 20, 40, 80, 160, 320 responses)
Breakpoint
The highest ratio value the animal completed before it stopped responding (failed to complete the next ratio)
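A small sketch of the two progressions and the breakpoint measure (the max_completed_ratio argument is hypothetical; in practice the breakpoint is read off the animal's behaviour, not set in advance):

```python
def arithmetic_steps(start, increment, n):
    # constant increment: 5, 10, 15, 20, ...
    return [start + increment * i for i in range(n)]

def geometric_steps(start, factor, n):
    # multiply the previous ratio by a fixed number: 5, 10, 20, 40, ...
    return [start * factor ** i for i in range(n)]

def breakpoint(steps, max_completed_ratio):
    # highest ratio value the animal actually completed
    done = [s for s in steps if s <= max_completed_ratio]
    return done[-1] if done else None

steps = geometric_steps(5, 2, 7)   # [5, 10, 20, 40, 80, 160, 320]
print(breakpoint(steps, 100))      # 80: the animal quit before completing 160
```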
PRPs on FI Schedules
PRP duration varies with the inter-reinforcement interval (IRI); as the IRI gets longer, the PRP gets longer
PRP is ~1/2 the IRI
PRP on FR also increases as the ratio increases; however, IRI is partly determined by the animal (high response rate - shorter IRI, low response rate - longer IRI)
On FR, rate of responses increases as ratio initially increases
Molar Account of PRP
Animals seek to maximize overall reinforcement rate (maximum reinforcement for least amount of effort)
PRP is ~1/2 the interval duration; if the pause were longer than 1/2, overall reinforcement rate would decrease (the animal would more often pause past the point at which the reinforcer is set up); if shorter than 1/2, the animal would make more responses for the same overall reinforcement
Molecular Account of PRP
Animals obtain automatic reinforcement from engaging in other behaviours (e.g., grooming, sniffing, scratching) during the PRP
Maximize total reinforcement from both sources (extrinsic reinforcement from lever pressing and automatic reinforcement from schedule-induced behaviours early in the interval)
A high rate of responding for the prior reinforcer reduces the value of a subsequent reinforcer
Reduced value leads to a longer pause before initiating responding
Response Rate Schedules
DRL (Differential Reinforcement of a Low Rate of Responding)
DRH (Differential Reinforcement of a High Rate of Responding)
DRL (Differential Reinforcement of a Low Rate of Responding)
Reinforcement occurs if interval of time between successive responses > X
DRL 10 s: every response that occurs at least 10 s after the last response is reinforced; if a response occurs less than 10 s after the last response, no reinforcer occurs (and, since the criterion is the time since the last response, each response restarts the timing)
Generates low rates of responding; average pause is slightly shorter than the required interval and about 50% of responses go unreinforced
DRH (Differential Reinforcement of a High Rate of Responding)
Reinforcement occurs if a certain number of responses occur within a fixed amount of time
Example: Reinforcer only occurs if the animal has made 10 responses in 10 s or less
Generates high rates of responding
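A minimal sketch of the two response-rate contingencies (hypothetical function names, parameters matching the examples above):

```python
def drl_reinforced(irt_s, min_irt_s=10):
    # DRL 10 s: only responses spaced at least 10 s apart pay off
    return irt_s >= min_irt_s

def drh_reinforced(response_times_s, n=10, window_s=10):
    # DRH: reinforce if the last n responses fell within window_s seconds
    if len(response_times_s) < n:
        return False
    return response_times_s[-1] - response_times_s[-n] <= window_s

print(drl_reinforced(12))                        # True: waited long enough
print(drh_reinforced(list(range(10)), 10, 10))   # True: 10 responses in 9 s
```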
Concurrent Schedules
Subject is presented with two or more response alternatives, each associated with its own reinforcement schedule
Both schedules operate simultaneously
Dependent measure = relative allocation of time and behaviour to each alternative
Used to study choice
Chain Schedules
Sequence of reinforcement schedules where completion of previous schedule produces opportunity to respond on the next schedule
E.g., chain (white) FI 30 s → (green) FR 10 → food
Transition from one schedule to another is signalled by a change in a stimulus (e.g., light over lever changes white to green)
The change in stimulus serves as a conditioned reinforcer that maintains responding in the previous link (the terminal reinforcer [food] conveys value to this stimulus)
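A sketch of the chain structure, assuming each link exposes a satisfied(t) test (which could wrap the FI/FR logic sketched earlier); the stimulus change on advancing a link is what the notes identify as the conditioned reinforcer:

```python
class ChainSchedule:
    """Sketch of chain (white) FI 30 s -> (green) FR 10 -> food.
    Each link is a (stimulus, satisfied_fn) pair; satisfied_fn returns
    True once that link's schedule requirement has been met."""
    def __init__(self, links):
        self.links = links
        self.index = 0

    @property
    def stimulus(self):
        return self.links[self.index][0]

    def on_response(self, t):
        _, satisfied = self.links[self.index]
        if satisfied(t):
            self.index += 1
            if self.index == len(self.links):   # terminal link completed
                self.index = 0
                return "food"                   # primary reinforcer
            return self.stimulus                # stimulus change: conditioned reinforcer
        return None
```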
Delay
General rule: the more immediately a reinforcer occurs following a behaviour, the more effective the consequence for strengthening that behaviour [contiguity]
Delay decreases the effectiveness of a reinforcer
Reason 1
During the delay, other behaviours occur that are also reinforced by the consequence
What is reinforced is another behaviour plus the target behaviour rather than just the target behaviour alone. This weakens the strengthening effect of reinforcement on the target behaviour
Reason 2
Value of a reinforcing consequence decreases with delay
Example: if I offer you the choice between $100 tomorrow or $100 a month from tomorrow, which would you choose?
The equation describing the decrease in value as a function of delay is:
Value = Amount / (1 + K × Delay)
K refers to a temporal discounting rate
A large value of K means that the rate of temporal discounting is high; that is, value declines rapidly with delay
A small value of K means that the rate of temporal discounting is low; that is, value declines slowly with delay
K varies across species (pigeons have high values of K, people have low values)
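A worked example of the discounting equation (the K values and per-day units here are chosen purely for illustration):

```python
def discounted_value(amount, delay, k):
    # Value = Amount / (1 + K * Delay)
    return amount / (1 + k * delay)

# $100 delayed by 30 days, for a low (patient) and a high (impulsive) K
for k in (0.01, 0.5):
    print(k, round(discounted_value(100, 30, k), 2))
# k=0.01 -> 76.92 (value declines slowly with delay)
# k=0.5  ->  6.25 (value declines rapidly with delay)
```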
Magnitude or Amount
Magnitude of reinforcement effect: the larger the reinforcer, the higher the rate of responding maintained by a reinforcer
Exception: Paradoxical Incentive Effect - Bizo et al. showed that rats will respond at higher rates when the reinforcer is 1 food pellet than when it is 2 pellets
Propose: larger magnitude leads to less efficient coupling of the consequence to the response
Effectiveness of Reinforcement Magnitude also Depends on:
Type of reinforcer
Effort
Type of Reinforcer
With consumable reinforcers, larger reinforcers will produce satiation more quickly than with non-consumable reinforcers (e.g., money)
Effort
It takes more reinforcement to maintain a more effortful response than a less effortful response
Motivational Operation
Satiation
Deprivation
Satiation
Decrease in responding for a reinforcer as a function of recent consumption of that reinforcer
Satiation decreases the effectiveness of a reinforcer
Deprivation
Increase in responding for a reinforcer as a function of withholding that reinforcer
Deprivation increases the effectiveness of a reinforcer
Behavioural Momentum
Different approach to assessing the strength of a reinforced response
Assumes that a stronger reinforced response will be less readily disrupted
Resistance to change depends on association between discriminative stimulus and reinforcer
Example: in the presence of a green light, pigeons respond on a VI 20 s schedule; in the presence of a red light, they respond on a VI 60 s schedule
Pigeons peck at a higher rate and get more reinforcers on the green key
Disrupt behaviour by providing free reinforcers
Responding on green key decreases by 60%; responding on red key decreases by 80%
Green key associated with higher rate of reinforcement has greater momentum (less disrupted)
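The arithmetic behind this example (scheduled reinforcement rates and the proportion of baseline responding that survives disruption):

```python
green_rf_per_hr = 3600 / 20   # VI 20 s -> 180 reinforcers/hour
red_rf_per_hr = 3600 / 60     # VI 60 s -> 60 reinforcers/hour

green_remaining = 1 - 0.60    # 40% of baseline responding survives disruption
red_remaining = 1 - 0.80      # 20% survives

# The richer (green) key loses the smaller share: greater behavioural momentum
print(green_rf_per_hr, red_rf_per_hr, green_remaining, red_remaining)
```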
Contingency vs Rule Governed
Contingency-governed
Rule-governed
Reason for different patterns of responding on FI schedules in humans versus animals
Contingency-Governed
Pattern of responding on a reinforcement schedule is a function of the reinforcement contingencies
Rule-Governed
Pattern of responding on a schedule is a function of a rule generated by the human or given by the experimenter in instructions