History of Reinforcement
History of Reinforcement
A person's exposure to various schedules or contingencies of reinforcement that are no longer in place
Operant Behavior
behavior that's EMITTED & operates on the environment to produce consequences that strengthen or weaken the response
Contingency
a dependent relationship between a response & one or more stimulus classes
The Three-Term Contingency (ABC)
The relationship between environmental variables & the behavior they control
Antecedent (Sd/discriminative stimulus): sets the occasion for operant behavior
Behavior (R/operant class)
Consequence (Sr/reinforcement): the reinforcing consequence that follows the behavior
The Three-Term Contingency Example
Discriminative stimulus (Sd): telephone rings
Operant class (R): answers phone
Reinforcement (Sr): talks to others
Discriminative Stimulus (Sd)
a stimulus that's present when a behavior is reinforced (high probability of emitting the behavior because it has been reinforced in its presence in the past) ex. you drive when the traffic light is green but not when it turns red
S-delta (SΔ)
a stimulus present when reinforcement is not available --> extinction (low probability of emitting the operant) ex. you need to use the bathroom but there's an "out of order" sign; this signals the non-availability of relieving yourself (negative reinforcement) in that bathroom
Positive Reinforcement
a stimulus is PRESENTED after a response & that behavior is STRENGTHENED/INCREASED
Negative Reinforcement
a stimulus is REMOVED after a response & that behavior is STRENGTHENED/INCREASED
Factors that Influence the Effectiveness of Reinforcers
Immediacy: delays make reinforcers less effective
Concurrent Schedules: the organism can choose to respond on two or more reinforcement schedules available simultaneously
Motivating Operations: EO & AO
Establishing Operation (EO) - Motivating Operations
antecedent event that makes a reinforcer MORE potent & the behavior MORE LIKELY to occur
Positive Punishment
a stimulus is PRESENTED after a behavior occurs & that behavior is WEAKENED/DECREASED
Negative Punishment
a stimulus is REMOVED after a behavior occurs & that behavior is WEAKENED/DECREASED
Why is "Reinforcement/Punishment doesn't work" incorrect?
reinforcement & punishment, as principles, always work; when a procedure seems to fail, consider factors such as:
stimulus is not a reinforcer or no longer one
response-reinforcer contingency was arranged, but not contacted
stimulus contingent on wrong response
stimulus not sufficiently effective to reinforce that response
Premack Principle
Making access to a behavior engaged in at high levels contingent on completing a behavior engaged in at low levels (ex. HW is a low-frequency behavior & watching TV is a high-frequency behavior: you can watch TV if you finish your HW)
Deprivation (an establishing motivating operation)
a reduction in access to or intake of a reinforcer that momentarily increases its effectiveness & the behavior it reinforces (ex. poverty)
Satiation (an abolishing motivating operation)
the temporary loss of effectiveness of a reinforcer due to its repeated presentation that momentarily decreases behavior (ex. "I love going out for dinner to celebrate, but if I did that every day I would get sick of it.")
Shaping
the use of reinforcement of successive approximations of desired behavior
differential reinforcement: reinforce 1 behavior & not others
Successive Approximation
behaviors that are increasingly closer to the target response
Behavioral Variability
Refers to the animal’s tendency to emit variations in response form in a given situation. The range of behavioral variation is related to an animal’s capabilities based on genetic endowment, degree of neuroplasticity, and previous interactions with the environment. Behavioral variability in a shaping procedure allows for selection by reinforcing consequences and is analogous to the role of genetic variability in natural selection.
Abolishing Operation (AO) - Motivating Operation
antecedent event that makes reinforcer LESS potent & behavior LESS LIKELY to occur
Extinction
the gradual weakening of a conditioned response that results in the behavior stopping. As a procedure, extinction is a change in the contingency of reinforcement: reinforcement is withdrawn, & the probability of the operant response falls toward zero.
How does extinction of positively reinforced behavior differ from extinction of negatively reinforced behavior?
positively reinforced behavior = stimulus no longer DELIVERED following behavior
negatively reinforced behavior = stimulus no longer REMOVED following behavior
Extinction Burst
an increase in the frequency, duration, or intensity of the unreinforced behavior early in extinction, along with increased variability & emotional/aggressive behaviors
Why does extinction occur more rapidly after continuous reinforcement?
The change from continuous reinforcement (reinforcing each response) to no reinforcement (extinction) is easy to discriminate, so responding declines quickly; after intermittent reinforcement the change is harder to detect & responding persists longer
Reinstatement
recovery of behavior when the reinforcer is presented alone after a period of extinction (ex. the dog tilts its head when it sees a treat, no treat is given during extinction, then the treat is shown again & the dog tilts its head again)
Renewal
recovery of responding when the animal is removed from the extinction context (ex. extinction is carried out at the park, so the dog stops tilting its head when shown a treat; back home, showing the treat makes the dog tilt its head again)
What does it mean that “the rat is always right?”
when you run an experiment with a rat & the rat's behavior proves your hypothesis wrong, the hypothesis (or the procedure) must be wrong, not the rat; the behavior is a lawful product of the conditions you arranged
Intermittent Reinforcement
not every response is followed by a reinforcer (ex. situationships where a guy is being hot & cold towards you)
Continuous Reinforcement (CRF)
each response is followed by the reinforcer (ex. receiving chocolate every time you wash the dishes)
Difference between a RATIO and INTERVAL schedule of reinforcement
ratio schedules: response based, set to deliver reinforcement following a prescribed # of responses
interval schedules: pay off when one response is made after some amount of time has passed
RATIO SCHEDULES PRODUCE A HIGHER RATE OF RESPONSE THAN INTERVAL SCHEDULES
Fixed-Ratio (FR) Schedule
reinforcement occurs after a FIXED # of RESPONSES
ex. losing your driver's license after 5 violations
How does an FR5 schedule differ from an FR10 schedule?
A “thicker” schedule means decreasing the number of correct responses needed to earn reinforcement so the amount of reinforcement is increased (“thicker” = “more” reinforcement). FR5 is a thicker schedule than FR10, so the child would now have to give only 5 correct responses before earning reinforcement.
Variable-Ratio (VR) Schedule
reinforcement occurs after a VARYING # of RESPONSES
ex. playing the lottery
strongest reinforcement schedule
How does VR differ from an FR schedule of reinforcement? How does a VR 5 schedule differ from a VR10 schedule?
A VR reward is given after a VARYING number of responses, whereas an FR reward comes after a FIXED # of responses. A VR5 schedule delivers the reward after an average of 5 responses; a VR10 schedule after an average of 10 responses.
Fixed-Interval (FI) Schedule
an EXACT/FIXED amount of TIME passes between each reinforcement
ex. receiving your paycheck every 2 weeks
weakest reinforcement schedule
How does FI differ from an FR schedule of reinforcement? How does an FI 90 seconds differ from an FI 120 seconds?
An FI reward is given after a fixed amount of TIME, whereas FR is after a fixed # of RESPONSES. On FI 90s, the first bar press after 90s have passed produces reinforcement; on FI 120s, reinforcement requires 120s to have passed.
Variable-Interval (VI) Schedule
a VARYING amount of TIME passes between each reinforcement
ex. winning a video game
How does VI differ from a FI schedule of reinforcement? How does a VI 90 seconds differ from an VI 120 seconds?
A VI reward is given after a VARYING amount of time, whereas FI is after a FIXED amount of time. The time periods that must pass before reinforcement becomes available “vary” but average out at a specific time interval. On VI 90s the reward is given after an average of 90s; on VI 120s after an average of 120s.
4 Types of Reinforcement Schedules Graph
variable ratio (strongest)
fixed ratio
variable interval
fixed interval (weakest)
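A minimal Python sketch of the four schedules above (the class names, response rate, & simulation setup are illustrative assumptions, not part of the course material): ratio schedules reinforce based on a count of responses, while interval schedules reinforce the first response after time has elapsed, so a steadily responding subject earns far more reinforcers on FR/VR than on FI/VI.

import random

def simulate(schedule, responses_per_sec=2.0, seconds=600):
    # A subject responds at a steady rate; count how many reinforcers it earns.
    reinforcers = 0
    for i in range(int(seconds * responses_per_sec)):
        if schedule.response(i / responses_per_sec):  # time (s) of this response
            reinforcers += 1
    return reinforcers

class FixedRatio:
    # FR n: reinforce every nth response.
    def __init__(self, n):
        self.n, self.count = n, 0
    def response(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio(FixedRatio):
    # VR n: reinforce after a varying number of responses averaging n.
    def __init__(self, n):
        self.mean = n
        super().__init__(random.randint(1, 2 * n - 1))
    def response(self, t):
        reinforced = super().response(t)
        if reinforced:
            self.n = random.randint(1, 2 * self.mean - 1)
        return reinforced

class FixedInterval:
    # FI t: the first response after a fixed time has elapsed is reinforced.
    def __init__(self, secs):
        self.secs, self.next_available = secs, secs
    def response(self, t):
        if t >= self.next_available:
            self.next_available = t + self.secs
            return True
        return False

class VariableInterval(FixedInterval):
    # VI t: like FI, but the required time varies around an average of t.
    def __init__(self, secs):
        self.mean = secs
        super().__init__(random.uniform(0, 2 * secs))
    def response(self, t):
        reinforced = super().response(t)
        if reinforced:
            self.next_available = t + random.uniform(0, 2 * self.mean)
        return reinforced

if __name__ == "__main__":
    schedules = [("FR10", FixedRatio(10)), ("VR10", VariableRatio(10)),
                 ("FI 90s", FixedInterval(90)), ("VI 90s", VariableInterval(90))]
    for name, sched in schedules:
        print(name, simulate(sched), "reinforcers in 10 minutes at 2 responses/sec")

Run for 10 minutes at 2 responses per second, this yields on the order of 120 reinforcers on FR10/VR10 but only about 6-7 on FI 90s/VI 90s, which mirrors why ratio schedules sustain higher response rates than interval schedules.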
Post-Reinforcement Pause (PRP)
The flat part of the graph (cumulative record) that indicates pausing after reinforcement; the "break" part of the "break-and-run" pattern.