Chapter 5: Conditioning Schedules of Reinforcement

47 Terms

1

Schedules of Reinforcement

  • Contingent relations between the response and the consequence

  • Rule that determines when a response will be followed by a reinforcer (includes discriminative stimulus - where)

2

Continuous Reinforcement (CRF)

  • Each and every response produces a reinforcer

  • Continues until satiation

  • Typically used for acquisition of new behaviours (shaping)

  • Conjugate reinforcement

3

Conjugate Reinforcement

  • Properties of reinforcement (rate, acquisition, intensity) are tied to particular dimensions of a response

    • More effortful response results in a proportionally larger reinforcement

4

Response Stereotypy on CRF

  • Topography of response becomes stereotypical on CRF; only see variability with extinction

  • Response variability may vary inversely with rate of reinforcement

  • Organism becomes more variable in responding as reinforcement becomes less frequent or predictable

5

Intermittent Reinforcement

  • Not every response produces a reinforcer

6

Advantages of Intermittent Schedules:

  • Less likely to produce satiation

  • Produce higher rates of responding

  • Maintain behaviour for a longer period of time

  • Maintain behaviour longer under extinction

7

Fixed Ratio Schedule

  • Reinforcement occurs after a fixed number of responses (ratio) (e.g., FR10, every 10th response produces a reinforcer)

  • Produces a pattern of responding known as “Break and Run” or “Stop and Go”

  • Break and Run

  • Ratio Strain

8

Break and Run

  • Following reinforcement, a period of no responding (post-reinforcement pause, PRP) followed by a rapid run of responses (Ratio run)

9

Ratio Strain

  • If the ratio requirement increases too rapidly or the increment is too large, animals begin to pause for long periods before completing the ratio requirement (e.g., FR10 to FR100 to FR1000)

10

Explanations for PRP

  • Skinner: occurrence of reinforcement becomes an S-delta (S^Δ) for non-reinforcement, leading the animal to pause

  • But… duration of the PRP increases as the ratio requirement increases (inconsistent with Skinner’s explanation)

  • Upcoming ratio requirement plays a greater role in determining PRP

  • On a multiple FR10 FR100 schedule (an FR10 schedule alternates with an FR100 schedule; with FR10 signalled by a red light and FR100 by a blue light)

  • If the PRP is due to fatigue, then we should always see a long pause after FR100 and a short pause after FR10

  • Instead there is a relation between duration of PRP and stimulus signalling the upcoming ratio

  • Pause is longer when the stimulus is blue (predicts FR100) and shorter when the stimulus is red (predicts FR10)

  • Consistent with remaining responses hypothesis: PRP is a period furthest from the next reinforcer and immediately following the previous reinforcer when the tendency to respond is weakest

  • Tendency is weaker when the number of upcoming responses is greater

  • “Pre-ratio” rather than “post-reinforcement” pause

11

Variable Ratio Schedule

  • Reinforcement occurs after a number of responses, but that number varies from reinforcer to reinforcer (e.g., VR10: the average number of responses is 10, but it varies between 1 and 40 responses)

  • Response requirement is unpredictable

  • Produces the highest rate of response

  • Very short PRP; occasional occurrence of a reinforcer after a small number of responses (short run) reduces the likelihood of pausing
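The FR and VR rules can be sketched as a small response counter. This is only an illustration: the class name `RatioSchedule` is hypothetical, and the VR requirement is drawn uniformly from 1 to 2 × mean − 1 (an assumption chosen so the average equals the nominal ratio; it is not the 1 to 40 range quoted in the card).

```python
import random


class RatioSchedule:
    """Delivers a reinforcer after a required number of responses."""

    def __init__(self, mean_ratio, variable=False, rng=None):
        self.mean_ratio = mean_ratio
        self.variable = variable
        self.rng = rng or random.Random(0)
        self.count = 0
        self.requirement = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            # FR: the same requirement after every reinforcer
            return self.mean_ratio
        # VR (assumed distribution): uniform on 1..(2*mean - 1), so the mean is mean_ratio
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Register one response; return True if it produces a reinforcer."""
        self.count += 1
        if self.count >= self.requirement:
            self.count = 0
            self.requirement = self._next_requirement()
            return True
        return False


fr10 = RatioSchedule(10)
outcomes = [fr10.respond() for _ in range(30)]
reinforced = [i + 1 for i, hit in enumerate(outcomes) if hit]
# On FR10 the 10th, 20th and 30th responses are reinforced
```

Because the schedule advances only when a response is made, pausing never brings the next reinforcer closer on either ratio schedule.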

12

Fixed Interval Schedule

  • The first response after a fixed interval of time has elapsed produces reinforcement

  • Produces a pattern of responding known as scalloping (also see break and run)

  • Do not confuse with a fixed time (FT) schedule - reinforcement is delivered after a fixed period of time without any response required

  • FI schedules are like timetables for trains or buses, but with a limited hold

13

Scalloping

  • Pause following reinforcement (~1/2 of interval), then an acceleration in responding up to the time when a response produces a reinforcer

14

Human vs Animal Performance on Fixed Interval Schedule

  • Rats/pigeons show scallop; humans show either steady high rate or low rate break and run

  • Difference: language? - human performance follows self-generated rules

  • Implies: preverbal humans should show characteristic schedule effects

  • Lowe, Beasty, & Bentall (1983) showed that infants who were not yet verbally skilled produced a scallop pattern on an FI schedule

  • Confound: adult humans have greater experience with ratio schedules; history may affect performance on FI schedules

  • Wanchisen, Tatham, & Mooney (1989) exposed rats to VR before FI schedules; rats showed either a high rate pattern or a low rate break and run pattern

15

Variable Interval Schedule

  • The first response after an interval of time has elapsed produces reinforcement; however, the interval that must elapse varies from reinforcer to reinforcer (e.g., VI 10 s produces reinforcement on average every 10 s, but the intervals vary between 1 s and 40 s)

  • Reinforcement requirement is unpredictable

  • Produces a steady moderate rate of responding

  • Pause after reinforcement typically does not occur

  • Steady response rate during extinction

  • Frequently used to produce a baseline rate of responding to evaluate the effect of independent variables

  • Moderate rate can increase or decrease
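The interval rule (only the first response after the interval has elapsed is reinforced) can be sketched in the same way. The class name `IntervalSchedule` is hypothetical, each interval is timed from the previous reinforcer, and the VI intervals are drawn from an exponential distribution, which is an assumption rather than anything stated in the cards.

```python
import random


class IntervalSchedule:
    """Reinforces the first response made after an interval has elapsed."""

    def __init__(self, mean_interval_s, variable=False, rng=None):
        self.mean_interval_s = mean_interval_s
        self.variable = variable
        self.rng = rng or random.Random(0)
        # Time (in seconds from session start) at which a reinforcer becomes available
        self.available_at = self._next_interval()

    def _next_interval(self):
        if not self.variable:
            # FI: the same interval every time
            return self.mean_interval_s
        # VI (assumed distribution): exponential with the nominal mean
        return self.rng.expovariate(1.0 / self.mean_interval_s)

    def respond(self, now_s):
        """Register a response at time now_s; return True if it is reinforced."""
        if now_s >= self.available_at:
            # The next interval starts timing from this reinforcer
            self.available_at = now_s + self._next_interval()
            return True
        return False


fi30 = IntervalSchedule(30)
outcomes = [fi30.respond(t) for t in (10, 29, 31, 40, 61.5)]
# Responses at 10 s and 29 s are too early; 31 s is reinforced;
# 40 s is too early again; 61.5 s is reinforced
```

Unlike a fixed time (FT) schedule, nothing is delivered here unless `respond` is called after the interval has elapsed.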

16

Response-Reinforcer Correlation Theory

  • Molar theory based on feedback functions

  • On a VR schedule, there is a linear relationship between response rate and reinforcement rate

  • Feedback: the faster you respond, the quicker the reinforcers come

  • On a VI schedule, reinforcement rate initially increases with response rate until each reinforcer is being collected as soon as it is set up; beyond this point, further increases in response rate produce no change in reinforcement rate

  • Maximum reinforcement rate determined by the schedule (VI 30 s max 120 reinforcers/hour)

  • Feedback: respond faster, but reinforcers do not occur quicker
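The two feedback functions can be compared with a short calculation. This is a rough sketch: the function names are hypothetical, and the VI function is a crude piecewise idealisation (reinforcement rate tracks response rate until it hits the schedule ceiling), not a fitted model.

```python
def vr_reinforcers_per_hour(responses_per_hour, ratio):
    """VR feedback: reinforcement rate is proportional to response rate."""
    return responses_per_hour / ratio


def vi_ceiling_per_hour(mean_interval_s):
    """Maximum reinforcement rate a VI schedule allows."""
    return 3600.0 / mean_interval_s


def vi_reinforcers_per_hour(responses_per_hour, mean_interval_s):
    """Idealised VI feedback (assumption): rises with response rate, then flattens."""
    return min(responses_per_hour, vi_ceiling_per_hour(mean_interval_s))


# VI 30 s: 3600 / 30 = 120 reinforcers per hour at most
ceiling = vi_ceiling_per_hour(30)

# Doubling response rate doubles reinforcement on VR 10 (100 -> 200 per hour) ...
vr_slow = vr_reinforcers_per_hour(1000, 10)
vr_fast = vr_reinforcers_per_hour(2000, 10)

# ... but leaves it unchanged on VI 30 s (120 per hour in both cases)
vi_slow = vi_reinforcers_per_hour(1000, 30)
vi_fast = vi_reinforcers_per_hour(2000, 30)
```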

17

Response Rate

  • How quickly you respond

18

Reinforcement Rate

  • How quickly reinforcers are delivered

19

IRT Reinforcement Theory

  • Molecular theory

    • On a VI schedule, probability that a response will produce reinforcement increases as the duration of the time between responses [inter response time (IRT)] increases

    • Probability that a response will produce reinforcement is higher for longer IRTs than for shorter IRTs

    • Selectively strengthens long pauses between successive responses

    • On a VR schedule, probability that a response will produce reinforcement is constant and independent of IRT

    • Each response has the same probability of paying off since the schedule does not advance unless a response is made (does not advance with time)

    • Probability = 1/VR

    • Tendency for the animal to respond in bursts may lead to strengthening of short IRTs
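The contrast between the two schedules can be made concrete with a small calculation. As a sketch only, it assumes the VI intervals are exponentially distributed (so the chance that a reinforcer has been set up during a pause of length t is 1 − exp(−t / mean interval)) and that the VR schedule pays off with a constant probability per response; the function names are hypothetical.

```python
import math


def p_reinforced_vi(irt_s, mean_interval_s):
    """Probability that a response ending an IRT of irt_s is reinforced on VI.

    Assumes exponentially distributed intervals (an illustrative assumption).
    """
    return 1.0 - math.exp(-irt_s / mean_interval_s)


def p_reinforced_vr(ratio):
    """Probability that any single response is reinforced on VR: 1/VR."""
    return 1.0 / ratio


# On VI 30 s, longer pauses are more likely to end in reinforcement
short_irt = p_reinforced_vi(1, 30)
long_irt = p_reinforced_vi(60, 30)

# On VR 10, the probability is 0.1 regardless of IRT
vr_prob = p_reinforced_vr(10)
```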

20

Evidence

  • VI+ schedule: molar properties of VR schedule, but molecular properties of VI schedule

  • Humans pressed at high rate on both VR and VI+ schedule (evidence for molar)

  • Rats showed response rates on VI+ schedule similar to VI schedule when matched for reinforcement rate (evidence for molecular)

  • Not “either or”; sensitivity to molar vs molecular depends on response rate

  • Low response rate contacts molecular contingencies related to IRTs while high response rate contacts molar contingencies in terms of the correlation between response and reinforcement rates

21

Progressive Ratio Schedule

  • Reinforcement occurs after a number of responses (ratio) with the required number of responses increasing systematically after each reinforcement

  • Arithmetic progression

  • Geometric progressions

  • Breakpoint

  • Breakpoint is used as an index of reinforcement efficacy

  • Comparison of breakpoints shows the relative reinforcement efficacy of different drugs as well as different doses of the same drug

22

Arithmetic Progression

  • Increment is constant (e.g., adding 5 each time: 5, 10, 15, 20, 25, 30, 35 responses)

23

Geometric Progressions

  • Ratio determined by multiplying previous ratio by a fixed number (5, 10, 20, 40, 80, 160, 320 responses)

24

Breakpoint

  • Highest ratio value completed, before animal failed to complete
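The two progressions and the breakpoint can be sketched as follows. The function names and the completion record are hypothetical; the sketch only illustrates the definitions given in these cards.

```python
def arithmetic_progression(start, step, n):
    """Ratios that grow by a constant increment."""
    return [start + step * i for i in range(n)]


def geometric_progression(start, factor, n):
    """Ratios that grow by multiplying the previous ratio by a fixed number."""
    return [start * factor ** i for i in range(n)]


def find_breakpoint(ratios, completed):
    """Highest ratio completed before the first ratio the animal failed to complete."""
    last_completed = None
    for ratio, done in zip(ratios, completed):
        if not done:
            break
        last_completed = ratio
    return last_completed


arith = arithmetic_progression(5, 5, 7)   # [5, 10, 15, 20, 25, 30, 35]
geom = geometric_progression(5, 2, 7)     # [5, 10, 20, 40, 80, 160, 320]

# Hypothetical session: the animal completes the first four ratios, then stops
bp = find_breakpoint(geom, [True, True, True, True, False])  # 40
```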

25

PRPs on FI Schedules

  • PRP duration varies with the inter-reinforcement interval (IRI): as the IRI gets longer, the PRP gets longer

  • PRP is ~1/2 the IRI

  • PRP on FR also increases as the ratio increases; however, IRI is partly determined by the animal (high response rate - shorter IRI, low response rate - longer IRI)

  • On FR, rate of responses increases as ratio initially increases

26

Molar Account of PRP

  • Animals seek to maximize overall reinforcement rate (maximum reinforcement for least amount of effort)

  • PRP ~1/2 interval duration; if longer than 1/2, then the overall rate decreases because the animal more often pauses past the end of the reinforcement interval; if shorter than 1/2, then the animal makes more responses for the same overall reinforcement

27

Molecular Account of PRP

  1. Animals obtain automatic reinforcement from engaging in other behaviours (e.g., grooming, sniffing, scratching) during the PRP

    1. Maximize total reinforcement from both sources (extrinsic reinforcement from lever pressing and automatic reinforcement from schedule-induced behaviours early in the interval)

  2. High rate of responding for a prior reinforcement reduces the value of a subsequent reinforcement

    1. Reduced value leads to a longer pause before initiating responding

28

Response Rate Schedules

  • DRL (Differential Reinforcement of a Low Rate of Responding)

  • DRH (Differential Reinforcement of a High Rate of Responding)

29

DRL (Differential Reinforcement of a Low Rate of Responding)

  • Reinforcement occurs if interval of time between successive responses > X

  • DRL 10 s: every response that occurs at least 10 s after the last response is reinforced; if a response occurs less than 10 s after the last response, no reinforcer occurs

  • Generates low rates of responding; average pause is slightly shorter than the required interval and about 50% of responses go unreinforced
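The DRL rule can be sketched as a check on each inter-response time. The function name is hypothetical, and two details are assumptions: the first response is timed from the start of the session, and every response (reinforced or not) restarts the timer.

```python
def drl_outcomes(response_times_s, min_irt_s):
    """Return, for each response, whether it is reinforced under DRL."""
    outcomes = []
    last_response_s = 0.0  # assumption: timing starts at session start
    for t in response_times_s:
        # Reinforce only if enough time has passed since the previous response
        outcomes.append(t - last_response_s >= min_irt_s)
        # Any response, reinforced or not, restarts the timer
        last_response_s = t
    return outcomes


# DRL 10 s with responses at 12, 18, 30, 41 and 45 s
result = drl_outcomes([12, 18, 30, 41, 45], 10)
# [True, False, True, True, False]
```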

30

DRH (Differential Reinforcement of a High Rate of Responding)

  • Reinforcement occurs if a certain number of responses occur within a fixed amount of time

  • Example: Reinforcer only occurs if the animal has made 10 responses in 10 s or less

  • Generates high rates of responding
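One way to sketch the DRH rule is with a sliding window over the most recent responses. The function name is hypothetical, and the choice to reset the count after each reinforcer is an assumption, since the card does not say how the count restarts.

```python
from collections import deque


def drh_outcomes(response_times_s, n_required, window_s):
    """Return, for each response, whether it is reinforced under DRH."""
    recent = deque()
    outcomes = []
    for t in response_times_s:
        recent.append(t)
        # Keep only the most recent n_required responses
        if len(recent) > n_required:
            recent.popleft()
        # Reinforce when n_required responses fit inside the time window
        met = len(recent) == n_required and (t - recent[0]) <= window_s
        outcomes.append(met)
        if met:
            recent.clear()  # assumption: the count restarts after a reinforcer
    return outcomes


# Require 3 responses within 2 s
result = drh_outcomes([0, 0.5, 1.0, 5, 8, 11, 11.5, 12], 3, 2)
# [False, False, True, False, False, False, False, True]
```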

31

Concurrent Schedules

  • Subject is presented with two or more response alternatives, each associated with its own reinforcement schedule

  • Both schedules operate simultaneously

  • Dependent measure = relative allocation of time and behaviour to each alternative

  • Used to study choice
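The dependent measure, relative allocation, is simply each alternative's share of the total. A minimal sketch, with a hypothetical function name and made-up response counts:

```python
def relative_allocation(counts):
    """Proportion of responses (or time) allocated to each alternative."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no responses recorded")
    return [c / total for c in counts]


# Hypothetical session: 75 responses on the left key, 25 on the right key
shares = relative_allocation([75, 25])  # [0.75, 0.25]
```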

32

Chain Schedules

  • Sequence of reinforcement schedules where completion of previous schedule produces opportunity to respond on the next schedule

  • E.g., Chain (White) FI 30 s (Green) FR 10 - Food

  • Transition from one schedule to another is signalled by a change in a stimulus (e.g., light over lever changes white to green)

  • Change in stimulus serves as a conditioned reinforcer that maintains responding in the previous link (terminal reinforcer [food] conveys value to this stimulus)

33

Delay

  • General rule: the more immediately a reinforcer occurs following a behaviour, the more effective the consequence for strengthening that behaviour [contiguity]

  • Delay decreases the effectiveness of a reinforcer

34

Reason 1

  • During the delay, other behaviours occur that are also reinforced by the consequence

    • What is reinforced is another behaviour plus the target behaviour rather than just the target behaviour alone. This weakens the strengthening effect of reinforcement on the target behaviour

35

Reason 2

  • Value of a reinforcing consequence decreases with delay

    • Example: if I offer you the choice between $100 tomorrow or $100 a month from tomorrow, which would you choose?

36
  • The equation describing the decrease in value as a function of delay is:

  • Value = Amount / (1 + K × Delay)

    • K refers to a temporal discounting rate

    • Large value of K means that the rate of temporal discounting is high, that is, value declines rapidly with delay

    • A small value of K means that the rate of temporal discounting is low, that is, value declines slowly with delay

    • K varies across species (pigeons have high values of K, people have low values)
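A worked example of the discounting equation, Value = Amount / (1 + K × Delay), applied to the $100 choice. The K value of 0.05 per day and the 30-day month are made-up figures used only for illustration.

```python
def discounted_value(amount, delay, k):
    """Hyperbolic discounting: Value = Amount / (1 + K * Delay)."""
    return amount / (1.0 + k * delay)


K = 0.05  # hypothetical discounting rate per day

tomorrow = discounted_value(100, 1, K)        # 100 / 1.05, about 95.24
month_later = discounted_value(100, 31, K)    # 100 / 2.55, about 39.22

# A larger K makes value fall faster over the same delay
steep = discounted_value(100, 31, 0.5)        # 100 / 16.5, about 6.06
```

With these numbers the same $100 is worth far less after the longer delay, which is why the sooner option is preferred.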

37

Magnitude or Amount

  • Magnitude of reinforcement effect: the larger the reinforcer, the higher the rate of responding maintained by a reinforcer

  • Exception: Paradoxical Incentive Effect - Bizo et al. showed that rats will respond at higher rates when the reinforcer is 1 food pellet than when it is 2 pellets

  • Proposed explanation: larger magnitude leads to less efficient coupling of the consequence to the response

38

Effectiveness of Reinforcement Magnitude also Depends on:

  • Type of reinforcer

  • Effort

39

Type of Reinforcer

  • With consumable reinforcers, larger reinforcers will produce satiation more quickly than with non consumable reinforcers (e.g., money)

40

Effort

  • It takes more reinforcement to maintain a more effortful response than a less effortful response

41

Motivational Operation

  • Satiation

  • Deprivation

42

Satiation

  • Decrease in responding for a reinforcer as a function of recent consumption of that reinforcer

    • Satiation decreases the effectiveness of a reinforcer

43

Deprivation

  • Increase in immediate responding for a reinforcer as a function of withholding the reinforcer

    • Deprivation increases the effectiveness of a reinforcer

44

Behavioural Momentum

  • Different approach to assessing the strength of a reinforced response

  • Assumes that a stronger reinforced response will be less readily disrupted

  • Resistance to change depends on association between discriminative stimulus and reinforcer

  • Example: in presence of green light, pigeons respond on VI 20 s schedule, in presence of a red light pigeons respond on a VI 60 s schedule

  • Peck at a higher rate and get more reinforcers on green key

  • Disrupt behaviour by providing free reinforcers

  • Responding on green key decreases by 60%; responding on red key decreases by 80%

  • Green key associated with higher rate of reinforcement has greater momentum (less disrupted)
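Resistance to change is usually expressed as responding during disruption relative to baseline. In the sketch below the baseline rates (100 and 60 pecks per minute) are made-up numbers; only the 60% and 80% decreases come from the example above.

```python
def proportion_of_baseline(baseline_rate, disrupted_rate):
    """Responding during disruption as a proportion of the baseline rate."""
    return disrupted_rate / baseline_rate


# Hypothetical baselines: 100 pecks/min on green, 60 pecks/min on red
green = proportion_of_baseline(100, 40)  # 60% decrease leaves 0.4 of baseline
red = proportion_of_baseline(60, 12)     # 80% decrease leaves 0.2 of baseline

# The larger proportion marks the response with greater momentum
more_resistant = "green" if green > red else "red"
```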

45

Contingency vs Rule Governed

  • Contingency-governed

  • Rule-governed

  • Reason for different patterns of responding on FI schedules in humans versus animals

46

Contingency-Governed

  • Pattern of responding on a reinforcement schedule is a function of the reinforcement contingencies

47

Rule-Governed

  • Pattern of responding on a schedule is a function of a rule generated by the human or given by the experimenter in instructions