Chapter 16: Schedule Combinations and Behavior Synthesis
Schedules don't operate in isolation; they combine in various ways, influencing behavior and shedding light on complex phenomena like choice, procrastination, and self-control. This chapter explores these combinations, emphasizing their relevance to broader behavioral principles rather than their intricacies alone.
Multiple and Mixed Schedules: Observing Responses
Multiple schedules alternate between two or more schedules, each signaled by a distinct stimulus, which allows for stimulus control. Mixed schedules, by contrast, alternate schedules without correlated stimuli.
Example: A pigeon pecks during a green light on an FI schedule and during a red light on a VI schedule.
Multiple schedules are useful baselines for studying variables that affect behavior, such as drug effects, because drug effects often vary with the schedule that maintains responding.
Observing Responses: Pigeons can be trained to "observe" discriminative stimuli by pecking a key. For example, pecking an observing key changes a yellow key to green during VR (variable ratio) reinforcement and to red during EXT (extinction).
The reinforcing effectiveness of discriminative stimuli depends on their relation to food reinforcers. The pigeon observes because they allow it to behave more efficiently with respect to the component schedules.
An important question revolves around whether observing is maintained because the stimuli are conditional reinforcers or because they are informative.
Observing behavior is maintained when it produces the stimulus correlated with reinforcement but not when it produces only the stimulus correlated with extinction. Similarly, stimuli correlated with differential punishment do not maintain observing responses very well.
The reinforcing effectiveness of a discriminative stimulus depends not on its informativeness but rather on the particular consequences with which it's correlated; it works well if it is correlated with good news but not if it is correlated with bad news.
A central problem in discrimination learning may simply be that of getting the organism to observe the relevant stimuli.
The effectiveness of a message depends more on whether its content is reinforcing or aversive than on whether it's correct or complete.
Organisms do not work for information per se.
Stimuli differentially correlated with avoidance or escape contingencies should maintain observing behavior.
Multiple Schedules: Inhibitory Interactions
In multiple schedules, behavior in one component can be affected by what happens in the other.
Behavioral Contrast: If one schedule changes from VI reinforcement to extinction while VI reinforcement continues on another, the response rate decreases during the first stimulus and is accompanied by increased pecking during the second, even though the schedule that operates during the second is unchanged.
Contrast effects vary with reinforcers, responses, and organisms.
These effects have sometimes been interpreted as the summation of two types of pecking: operant pecking and respondent pecking.
Contrast effects are more appropriately treated in terms of interactions among reinforced responses than in terms of side-effects of inhibitory processes during extinction.
Inhibitory interactions can also be seen in the visual system, where the firing of each of two neighboring cells reduces the rate of firing of the other.
Behavioral contrast is an example limited to just two response classes, analogous to graphs 1a through 1d of Figure 16-3. Effects analogous to graph 3 have been observed when a discontinuity between reinforcement and extinction was arranged along a spatial array.
Response classes are strengthened by the reinforcers they produce, but those reinforcers also inhibit other response classes.
Chained, Tandem, and Second-Order Schedules
Chained schedules involve a sequence of schedules, each signaled by a distinct stimulus, leading to a terminal reinforcer. Tandem schedules are similar but lack distinct stimuli for each component. Second-order schedules reinforce the completion of an entire schedule sequence.
Chained schedules have been used extensively to study conditioned or conditional reinforcers.
The feeder light becomes a reinforcer only through its relation to food in the feeder, and the clicker becomes a reinforcer only through its relation to the various reinforcers arranged by the pet owner.
The conditional reinforcing functions of stimuli have something in common with their discriminative functions.
Extended Chains
Adding stimuli to a chain can surprisingly slow down responding. Stimulus changes in chained schedules had some reinforcing effects, but they were mostly restricted to the late components, close to the reinforcer.
Breaking a sequence into distinct units disrupts cohesive behavior.
In chained schedules, a stimulus supports less responding the further it is from the end of the sequence.
These differences occur with various schedules.
These effects depend on a constant ordering of the chained stimuli. The long pauses decrease markedly if the stimulus order changes from one reinforcer to the next.
Relative to tandem schedules, chained schedules of punishment reduce responding mostly in the later components of the chain.
Punishment after a deed is done probably has its greatest effect on the behavior that precedes getting caught and only minimal effects on the much earlier behavior that led up to the misdeed.
Brief Stimuli
Stimuli in chained schedules can become conditional reinforcers, but they combine with discriminative effects in such a way that responding is reduced.
When reinforcement schedules were applied to their analysis, schedules were arranged not only for the production of conditional reinforcers by responses but also for the contingent relation between conditional and primary reinforcers.
In second-order schedules, the completion of one schedule is a behavioral unit that is reinforced according to another schedule.
In contrast to chained schedules, second-order schedules with brief stimuli can greatly amplify reinforced responding.
Variables such as the relation between the brief stimuli and primary reinforcers determine the effectiveness of second-order schedules.
Chained schedules may attenuate responding, whereas second-order schedules may amplify it.
Second-order scheduling can also include other kinds of operants, as when correct responses in matching-to-sample are reinforced according to various schedules.
Individual pecks are functional units, but within FR performance the entire ratio may function as a unit. As long as the higher-order class is reinforced, the subclasses within it may also be maintained even though they are no longer reinforced.
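The arrangement can be sketched in code. The following is an illustrative simulation, not the text's own procedure; the particular parameter values (an FI 60-s schedule of FR 10 units) are hypothetical:

```python
def second_order(response_times, fi=60.0, fr=10):
    """Second-order schedule FI 60-s (FR 10): each completed FR 10 run
    produces a brief stimulus, and the first run completed after the FI
    has elapsed produces food (after which the FI restarts).

    Returns (brief_stimulus_times, food_times) for a list of response times.
    """
    brief, food = [], []
    count, fi_start = 0, 0.0
    for t in response_times:
        count += 1
        if count == fr:            # one behavioral unit (an FR run) completed
            count = 0
            brief.append(t)        # the unit produces the brief stimulus
            if t - fi_start >= fi:
                food.append(t)     # the unit is itself reinforced on the FI
                fi_start = t
    return brief, food
```

Here the completed FR run, not the individual peck, is the unit that the FI schedule reinforces, which is what distinguishes the second-order arrangement from a simple chain.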
Concurrent Schedules: Matching and Maximizing
Concurrent schedules involve two or more schedules operating simultaneously for different responses, allowing for the study of choice and preference.
Variables with small effects in single-response schedules often have large effects in concurrent schedules, which are therefore useful for studying effects of reinforcement variables.
The most general feature of concurrent performances is that increases in the reinforcement of one response reduce the rates of other responses.
If the response rate generated by a given rate of VI reinforcement is independent of how these reinforcers are distributed to the two keys, it follows that increasing the reinforcement of one response will reduce the rate of the other.
Although pigeons distribute their pecks to both keys with concurrent VI VI schedules, the reinforcer may act on both responses.
For this reason, concurrent VI procedures have often incorporated a changeover delay, which prevents a response from being reinforced immediately after a changeover from the other.
With a changeover delay, the pigeon distributes its responses to concurrent VI VI schedules roughly in proportion to the distribution of reinforcers they arrange.
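The changeover-delay contingency can be sketched as follows; this is a minimal illustration, with a hypothetical 2-s delay value:

```python
class ChangeoverDelay:
    """After each switch between keys, reinforcement is withheld until
    `delay` seconds of responding on the new key have elapsed."""

    def __init__(self, delay=2.0):
        self.delay = delay
        self.current_key = None
        self.changeover_at = 0.0

    def respond(self, key, t):
        """Record a response at time t; return True if it is eligible for
        reinforcement (should the key's VI schedule have one set up)."""
        if key != self.current_key:     # this response is a changeover
            self.current_key = key
            self.changeover_at = t
            return False
        return (t - self.changeover_at) >= self.delay
```

The point of the contingency is that a reinforcer delivered immediately after a switch would act on the changeover itself rather than on pecking the key that produced it.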
Matching Law: relative responding matches the relative reinforcement produced by that responding.
The matching law summarizes performances in a variety of schedules, but its status as a convenient description or as a fundamental property of behavior rests on whether it can be derived from simpler processes.
As the pigeon pecks one key, time passes during which the VI schedule for the other key may set up a reinforcer. A time will come when the reinforcement probability for changing over to the other key exceeds that for continuing to peck the same key. These probabilities shift back and forth as time passes, so the pigeon distributes its responses to both keys in concurrent VI VI schedules.
Maximizing: Emitting the response with the maximum reinforcement probability.
Momentary maximizing at the molecular level may lead to matching at the molar level.
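This molecular-to-molar relation can be illustrated with a simulation. The following is a simplified sketch, not a claim about actual pigeon data: two VI schedules are approximated by per-step arming probabilities (here 0.02 and 0.01, hypothetical values), and the simulated organism always pecks the key with the higher momentary probability that a reinforcer has been set up since its last peck on that key:

```python
import random

def simulate(p=(0.02, 0.01), steps=20000, seed=1):
    """Concurrent VI VI: key k arms with probability p[k] per time step and
    stays armed until the next peck on it collects the reinforcer. The agent
    momentarily maximizes, pecking whichever key is more likely to be armed."""
    rng = random.Random(seed)
    armed = [False, False]
    since = [0, 0]            # time steps since the last peck on each key
    pecks = [0, 0]
    rft = [0, 0]
    for _ in range(steps):
        for k in (0, 1):
            if not armed[k] and rng.random() < p[k]:
                armed[k] = True
            since[k] += 1
        # probability that a reinforcer has been set up on key k by now
        prob = [1 - (1 - p[k]) ** since[k] for k in (0, 1)]
        k = 0 if prob[0] >= prob[1] else 1
        pecks[k] += 1
        since[k] = 0
        if armed[k]:
            rft[k] += 1
            armed[k] = False
    return pecks, rft
```

Although every choice is determined purely by momentary probabilities, the molar outcome is that the proportion of pecks on each key approximates the proportion of reinforcers it delivers, i.e., matching.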
Matching and maximizing may seem contradictory alternatives, but they are measured in different ways.
Concurrent performances can be described as optimization, satisficing, or melioration.
Generalized Matching Law: takes bias and other factors into account and transforms the data to logarithmic rather than linear coordinates.
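In logarithmic form the generalized matching law is commonly written as log(B1/B2) = a log(r1/r2) + log(b), where a is sensitivity and b is bias; a = b = 1 recovers strict matching. A brief sketch:

```python
def predicted_response_ratio(r1, r2, a=1.0, b=1.0):
    """Response ratio B1/B2 predicted from the reinforcer ratio r1/r2
    under generalized matching: B1/B2 = b * (r1/r2) ** a."""
    return b * (r1 / r2) ** a

# With a < 1 (undermatching, commonly observed), the response ratio is
# less extreme than the reinforcer ratio.
strict = predicted_response_ratio(40, 20)          # 2.0
undermatched = predicted_response_ratio(40, 20, a=0.8)
```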
The characteristic decrease of one response with increases in the reinforcement of another shown in the top data set does not occur with the second data set, where both responses first increase together before their rates begin to change in opposite directions.
Whether generalized matching is a fundamental process that in some way dictates the details of schedule performances or is a derivative of the moment-to-moment responding generated by reinforcement schedules has long been a source of controversy.
Concurrent-Chain Schedules: Preference
Concurrent-chain schedules separate the reinforcing effectiveness of the terminal link from the contingencies that maintain responding in that link.
A dominance of one alternative over others in a sequence of choices is typically called a preference, and concurrent-chain procedures are particularly well-suited for the analysis of preferences.
We judge preferences among situations not by how much behavior they produce but by the relative likelihoods with which an organism enters them.
We must distinguish between response rates and choices when we study preferences.
Concurrent chains have shown that reinforcement rate is a more important determinant of preference than the number of responses per reinforcer and that variable schedules are preferred to fixed schedules.
Studies of preferences among various parameters of reinforcement schedules can be technically complex, because they must control for differences in time or responses per reinforcer in terminal links and for occasional biases toward particular colors or sides.
Self-Control
We usually speak of self-control when we forgo a relatively immediate consequence in favor of a later larger one. Both the immediate and the deferred consequences may be reinforcing or aversive.
An example of behavior synthesis with concurrent-chain schedules is provided by the procedure in Figure 16-9.
Confronted with both red and green in terminal-link A, the pigeon almost invariably pecks red, producing the small immediate reinforcer and not the large delayed one; this has been called impulsiveness. Confronted with only green in terminal-link B, the pigeon necessarily produces the large delayed reinforcer.
When the opportunity to choose was coming up very soon, the pigeon was likely to be impulsive; when it came up later, the pigeon was likely to show self-control.
By producing terminal-link B when T is long, the pigeon commits itself to the large reinforcer even though it would not do so at the onset of green if red were also present; pecks that produce terminal-link B have therefore been called commitment responses.
Other syntheses can be created with other temporal arrangements. In Figure 16-11, if at any moment the pigeon's peck is determined by which gradient has the higher value, then before time C, where the B gradient is higher, it will commit to reinforcer B, showing self-control. At time C the gradients cross over, so the A gradient is now higher; between times C and A it will choose reinforcer A, showing impulsiveness.
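The gradient crossover can be illustrated with a hyperbolic discounting function, V = A / (1 + kD), a common model of delay gradients (the chapter itself does not commit to a particular equation; the amounts and delays below are hypothetical):

```python
def value(amount, delay, k=1.0):
    """Hyperbolically discounted value of a reinforcer `delay` s away."""
    return amount / (1 + k * delay)

def small_soon(t):
    """Reinforcer A: 2 units, available t seconds from now."""
    return value(2, t)

def large_late(t):
    """Reinforcer B: 6 units, available 4 seconds after A."""
    return value(6, t + 4)
```

Evaluated far in advance, large_late exceeds small_soon (the commitment, or self-control, region); evaluated at the choice point itself, small_soon wins (impulsiveness), so the two gradients must cross somewhere in between.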
With the concurrent-chain procedure we can examine impulsiveness and commitment with immediate and delayed reinforcers or with immediate and delayed aversive stimuli.
These behavior syntheses provide an essential reference performance for the analysis of self-control and illustrate the relevance of reinforcement schedules to human behavior.
Behavior Synthesis: Natural Foraging
Concurrent-chains have been broadly applied to the synthesis of complex behavior. A successful synthesis supports the interpretation; an unsuccessful one may reveal inadequacies in the assumptions about what was going on in the natural setting.
In the field of behavioral ecology, this strategy is illustrated by studies of natural foraging.
In their foraging, animals in the wild travel from one patch of food to another, staying or moving on to new ones depending on what they find.
Progressive schedules have been useful tools because their requirements change progressively, as availability and other conditions do in natural settings.
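As a simple illustration of the idea (a sketch with hypothetical parameter values, not a procedure from the text), a progressive-ratio schedule raises the response requirement after each reinforcer, much as a food patch yields less as it is depleted:

```python
def progressive_ratio(n, start=5, step=5):
    """Response requirement for the (n+1)th reinforcer: 5, 10, 15, ..."""
    return start + step * n
```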
Concurrent-chain schedules in the laboratory that simulate those in natural habitats have revealed some properties of foraging.
Natural foraging may be treated in terms of concurrent-chain schedules, and properties of natural foraging, in turn, may suggest variables that are important in concurrent-chain performances.
A Schedule Taxonomy
Reinforcement schedules are tools that can be applied to the study of a variety of behavioral phenomena relevant to human concerns. The complexity of schedule effects has made schedule analysis highly technical.
Adjusting schedules vary their requirements as a function of some property of performance. A schedule in which time and number requirements interact is an interlocking schedule.
Table 16-1 shows the major schedule combinations.
We can test our interpretation of behavior in a natural habitat by trying to assemble its components in a laboratory setting. When we attempt synthesis we probably profit more from our failures than from our successes.
It may be a general principle of scientific research that we learn the most when our experiments produce data we didn't expect.
Addendum 16A: Behavioral Economics
Behavioral economics began when a few behavior analysts recognized that some performances and concepts derived from schedules of reinforcement were relevant to properties of the behavior studied by economists.
Economic behavior should share properties with other varieties of behavior, so it is likely to be profitable to study how this behavior arises out of simpler behavior. Schedules of reinforcement provide useful tools for such an endeavor.
An experimenter can arrange a supply of some reinforcer, such as food, and the demand for food by an experimental organism can be altered by establishing operations such as deprivation and satiation.
The experimenter can manipulate the cost of food as a commodity by changing response requirements.
Not so obvious is the difference between a situation in which the only food reinforcers available to the organism are those it earns within an experimental session and one in which what it earns is supplemented by additional feeding outside of the session: The former is analogous to what economists call a closed economy and the latter to what they call an open economy.
When the demand for a commodity changes substantially with changes in price, its demand is said to be elastic; to the extent that demand does not change flexibly with changes in price, it is said to be inelastic.
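In behavioral terms, price can be taken as the response requirement per reinforcer and consumption as the reinforcers earned per session. A minimal sketch of one common way of measuring elasticity (the midpoint, or arc, method; the data values below are hypothetical):

```python
def arc_elasticity(q1, q2, p1, p2):
    """Percent change in consumption divided by percent change in price,
    computed against the midpoints of the two observations."""
    dq = (q2 - q1) / ((q1 + q2) / 2)
    dp = (p2 - p1) / ((p1 + p2) / 2)
    return dq / dp

# Doubling the ratio requirement from FR 10 to FR 20:
# consumption barely drops (100 -> 90 pellets): inelastic demand.
inelastic = arc_elasticity(100, 90, 10, 20)
# consumption collapses (100 -> 20 pellets): elastic demand.
elastic = arc_elasticity(100, 20, 10, 20)
```

An absolute value greater than 1 marks demand as elastic; less than 1, inelastic.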
One determinant of elasticity is substitutability.
Discounting: We typically discount the value of delayed consequences relative to immediate ones.
A problem with interpretations of economic behavior in terms of changes in value is that descriptions in such terms may leave out the behavior.
Verbal estimates differ from choices that are products of actual contingencies. Whether verbal mediation is involved or not, we cannot assume that choices such as these or economic behavior in general are based on rational decisions. Instead, we must examine the contingencies.
Addendum 16B: Schedules and Attention Deficit Hyperactivity Disorder
Schedules of reinforcement have lent themselves to many applications. One example is in the analysis and interpretation of the components of attention-deficit hyperactivity disorder (ADHD).
Delay-of-reinforcement gradients describe how the effects of reinforcers vary as a function of the time separating them from the responses that preceded them.
The differential strengthening of rapid responding takes time, which may also explain why hyperactivity often takes a while to develop and develops separately in different environments.
Hyperactivity is an appropriate name for behavior in which rapid sequences have displaced more leisurely ones.
A child who does not look or listen appropriately might be described as having an attention deficit.
The child for whom the delay gradient is unusually steep will be less able to deal with or tolerate longer delays, which is perhaps why the attention of such children is so easily captured by computer games, which typically provide very rapid feedback.
Consider also the inverse relation between impulsivity and self-control. Forgoing a small immediate reinforcer for a later larger one is described as self-control, but whether this happens depends on the steepness of the delay gradient. The steeper the gradient the less potent the larger reinforcers that follow longer delays, so the greater the likelihood of impulsive behavior.
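The role of gradient steepness can be made concrete with a hyperbolic discounting sketch, V = A / (1 + kD), where larger k means a steeper delay gradient (the model choice and all numerical values here are illustrative assumptions, not from the text):

```python
def value(amount, delay, k):
    """Hyperbolically discounted value; larger k = steeper delay gradient."""
    return amount / (1 + k * delay)

SMALL_NOW = (2, 0)     # 2 units, available immediately
LARGE_LATER = (6, 4)   # 6 units after a 4-s delay (hypothetical numbers)

def choice(k):
    """Which alternative has the higher discounted value at steepness k?"""
    if value(*LARGE_LATER, k) > value(*SMALL_NOW, k):
        return "self-control"
    return "impulsive"
```

Holding the reinforcers constant and varying only k flips the choice: a shallow gradient favors the larger delayed reinforcer, while a steep one favors the small immediate reinforcer, consistent with treating impulsivity as a consequence of gradient steepness.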
These three different manifestations of ADHD, hyperactivity, attention deficit and impulsivity, may all be derivatives of a single variable, the steepness of the delay gradient.