Operant conditioning is used to teach behaviors like an elephant walking on hind legs or a child saying "please."
Both classical and operant conditioning are forms of associative learning, but they differ in important ways.
Classical Conditioning:
Forms associations between stimuli (a conditioned stimulus, CS, and an unconditioned stimulus, US).
Involves respondent behavior: automatic responses to a stimulus (e.g., salivation in response to meat powder or a tone).
Operant Conditioning:
Organisms associate their own actions with consequences.
Actions followed by reinforcers increase in frequency, while those followed by punishers decrease.
Behavior that operates on the environment to produce rewarding or punishing stimuli is called operant behavior.
B.F. Skinner (1904-1990) was a key figure in modern behaviorism.
Skinner's work elaborated on Edward L. Thorndike's law of effect: Rewarded behavior tends to recur.
In 1943, Skinner and his students successfully taught pigeons how to bowl by shaping their natural behaviors.
Skinner also taught pigeons other behaviors such as walking in a figure eight, playing table tennis, and guiding missiles by pecking at a screen target.
Thorndike used a puzzle box to study cats, rewarding them with fish for escaping.
The cats' performance improved over successive trials, illustrating the law of effect.
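The improvement across trials can be illustrated with a toy model. The sketch below is not Thorndike's analysis; it assumes, purely for illustration, that each response has a numeric "strength," that the cat picks responses with probability proportional to strength, and that each reward adds a fixed boost to the rewarded response. The function names and starting values are hypothetical.

```python
def expected_attempts(rewarded_strength, other_strength):
    """Mean number of tries before the rewarded response occurs, assuming
    each try picks a response with probability proportional to its strength."""
    return (rewarded_strength + other_strength) / rewarded_strength


def run_trials(n_trials, rewarded_strength=1.0, other_strength=4.0, boost=1.0):
    """Return the expected attempts on each trial of a toy puzzle box.

    Assumption: every escape is rewarded, and each reward strengthens the
    escape response by `boost` (law of effect: rewarded behavior recurs).
    """
    curve = []
    for _ in range(n_trials):
        curve.append(expected_attempts(rewarded_strength, other_strength))
        rewarded_strength += boost  # reward strengthens the response
    return curve


curve = run_trials(5)
for trial, attempts in enumerate(curve, start=1):
    print(f"trial {trial}: about {attempts:.2f} attempts expected")
```

Under these assumptions the expected number of attempts falls on every trial, which mirrors the shape of a learning curve without claiming to reproduce Thorndike's data.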
Skinner designed an operant chamber, or Skinner box, for his studies.
The box contains a bar or lever that an animal presses to release a reward (food or water) and a device to record responses.
Reinforcement: Any event that strengthens or increases the frequency of a preceding response.
What is reinforcing varies; it can be praise, attention, a paycheck, or drugs.
Food and water work well for hungry and thirsty rats.
Shaping involves gradually guiding an animal's actions toward the desired behavior.
This is done by rewarding successive approximations of the desired behavior.
Example: Training a rat to press a bar by first rewarding it for approaching the bar, then for touching it, and finally for pressing it.
We can also shape our own behavior (e.g., training for a 5k race by gradually increasing walking and running distances and rewarding each stage).
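The bar-pressing example can be sketched as a small program. This is a simplified illustration, assuming the criterion moves to the next stage after a single reinforced response (real training usually requires many); the `shape` function and the behavior labels are hypothetical.

```python
def shape(stages, behaviors):
    """Reward successive approximations of a target behavior.

    A behavior is reinforced only if it meets the current criterion.
    After each reinforcement the criterion is raised to the next stage,
    so earlier approximations are no longer rewarded.
    """
    stage = 0
    log = []
    for behavior in behaviors:
        if stage < len(stages) and behavior == stages[stage]:
            log.append((behavior, "reinforced"))
            stage += 1  # raise the criterion
        else:
            log.append((behavior, "ignored"))
    return log, stage == len(stages)


STAGES = ["approach bar", "touch bar", "press bar"]
observed = ["groom", "approach bar", "approach bar", "touch bar", "press bar"]

log, trained = shape(STAGES, observed)
for behavior, outcome in log:
    print(f"{behavior}: {outcome}")
print("target behavior reached:", trained)
```

Note that the second "approach bar" is ignored: once the rat has been rewarded for approaching, only the next approximation (touching) earns food.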
Reinforcers depend on circumstances (e.g., a heat lamp for a cold meerkat).
They also vary among individuals (e.g., chocolate vs. vanilla preference).
Shaping can reveal what nonverbal organisms perceive (e.g., distinguishing colors or tones).
Animals can be trained to form concepts. For example, pigeons can be reinforced for pecking after seeing a human face.
In this case, the human face acts as a discriminative stimulus.
Discriminative Stimulus: A signal that a response will be reinforced.
Pigeons have been trained to discriminate among classes of objects and even between the music of Bach and Stravinsky.
We continually reinforce and shape others' behaviors, often unintentionally.
Example: A child's nagging is reinforced when it leads to a desired outcome, while the parent's response is reinforced by the cessation of nagging.
Pigeons can be trained to spot tumors with skill similar to that of humans.
Other animals can be shaped to detect landmines or locate people in rubble.
Teachers should use operant conditioning to reinforce gradual improvements in students, rather than only rewarding perfect work.
Positive Reinforcement:
Strengthens responding by presenting a pleasurable stimulus after a response.
Negative Reinforcement:
Strengthens a response by reducing or removing something negative.
Example: Taking aspirin to relieve a headache.
Important to note: Negative reinforcement is NOT punishment. It provides relief from an aversive event.
Sometimes, negative and positive reinforcement coincide (e.g., studying harder to reduce anxiety and get a better grade).
Primary Reinforcers:
Innate and unlearned (e.g., food when hungry).
Conditioned Reinforcers:
Also called secondary reinforcers.
Get their power through learned association with primary reinforcers (e.g., a light signaling food delivery).
Examples: money, good grades, approving words, social media likes.
Immediate reinforcement is more effective for learning.
Delays longer than 30 seconds can prevent learning in rats.
Humans can respond to delayed reinforcers (e.g., paychecks, grades).
The ability to delay gratification is important for long-term success.
Children who can delay gratification tend to become more socially competent and high-achieving adults.
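Continuous Reinforcement:
Reinforces the desired response every time it occurs.
Learning occurs rapidly, making it ideal for mastering a behavior.
Extinction also occurs rapidly when reinforcement stops.
Partial (Intermittent) Reinforcement:
Responses are sometimes reinforced, sometimes not.
Learning is slower, but resistance to extinction is greater.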
Four types of partial reinforcement schedules:
Fixed-Ratio Schedules
Reinforce behavior after a set number of responses (e.g., free coffee after 10 purchases).
Animals will pause briefly after reinforcement before resuming a high rate of responding.
Variable-Ratio Schedules
Provide reinforcers after an unpredictable number of responses (e.g., slot machines, fishing).
Produce high rates of responding because reinforcement is unpredictable.
Fixed-Interval Schedules
Reinforce the first response after a fixed time period.
Animals respond more frequently as the anticipated time for reward nears.
Variable-Interval Schedules
Reinforce the first response after varying time intervals.
Produce slow, steady responding because the waiting time is unpredictable.
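The four schedules differ only in the rule that decides whether a given response earns a reinforcer. A minimal sketch of those rules follows, assuming responses arrive at known times in seconds; the function names, the uniform distribution for variable intervals, and the per-response probability for variable ratios are illustrative choices, not part of the definitions.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses (mean_n on average)."""
    def respond(t):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have passed
    since the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Reinforce the first response after a waiting time that changes
    unpredictably (assumed uniform between 0 and 2 * mean_seconds)."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_reinforcers(schedule, response_times):
    """Count how many of the given responses are reinforced."""
    return sum(schedule(t) for t in response_times)


times = range(1, 21)  # one response per second for 20 seconds
rng = random.Random(0)
fr = count_reinforcers(fixed_ratio(5), times)
vr = count_reinforcers(variable_ratio(5, rng), times)
fi = count_reinforcers(fixed_interval(6), times)
vi = count_reinforcers(variable_interval(6, rng), times)
print("FR-5:", fr, "VR-5:", vr, "FI-6s:", fi, "VI-6s:", vi)
```

With one response per second for 20 seconds, the fixed-ratio schedule pays out on responses 5, 10, 15, and 20, and the fixed-interval schedule pays out at seconds 6, 12, and 18. The variable schedules depend on the random draws, which is exactly why the learner cannot predict them. The sketch models only the payout rules, not how animals adjust their response rates.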
Physical punishment may increase aggression by modeling violence.
Many psychologists encourage time-outs from positive reinforcement.
Effective time-outs involve clear expectations for alternative positive behaviors.
Focus on positive incentives and reinforcement rather than threats of punishment.
Provide feedback that emphasizes successes rather than failures.
Reinforcement tells you what to do, while punishment tells you only what not to do.
Punishment may teach little more than how to avoid being punished.
Focus on praising what people do right.
Correlation measures the extent to which two factors vary together.
Correlation research involves observation and measurement, without manipulation of variables.
Correlation does not allow for cause-and-effect conclusions.
The correlation coefficient ranges from -1.00 to +1.00.
Positive correlation: Variables move in the same direction.
Negative correlation: Variables move in opposite directions.
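As an illustration of how the coefficient behaves, here is a minimal sketch that computes the Pearson correlation coefficient in plain Python. The data are made up for the example, and `pearson_r` is a hypothetical helper name rather than a library function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: both variables rise together (positive correlation)
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [2, 4, 6, 8, 10]
r_pos = pearson_r(hours_studied, quiz_score)

# Hypothetical data: one rises as the other falls (negative correlation)
errors_made = [10, 8, 6, 4, 2]
r_neg = pearson_r(hours_studied, errors_made)

print(round(r_pos, 2), round(r_neg, 2))
```

These made-up data sets are perfectly linear, so the coefficients sit at the two ends of the range (+1.00 and -1.00). Real data usually fall somewhere in between, and even a strong coefficient says nothing about which variable, if either, causes the other.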
Skinner insisted that external influences shape behavior, not internal thoughts and feelings.
He advocated using operant conditioning principles to influence behavior in various settings.
Critics argued that Skinner dehumanized people by neglecting personal freedom and seeking to control actions.
Skinner countered that external consequences already control behavior and that reinforcement is more humane than punishment.
Operant conditioning is used to help people with challenges ranging from moderating high blood pressure to gaining social skills.
Machines and textbooks can shape learning in small steps by immediately reinforcing correct responses.
Online adaptive quizzing provides immediate feedback and personalized study plans.
Reinforce small successes and gradually increase the challenge.
Reinforcement principles are also used to create AI programs that mimic human learning.
Reward specific, achievable behaviors immediately.
Effective managers praise good work.
Parents can reinforce good behavior by giving attention and other reinforcers when children are behaving well.
Avoid yelling or hitting; instead, explain misbehavior and remove privileges.
Use operant conditioning to reinforce desired behaviors and extinguish undesired ones. Steps include:
State a realistic goal in measurable terms.
Decide how, when, and where you will work toward your goal.
Monitor how often you engage in your desired behavior.
Reinforce the desired behavior with immediate rewards.
Reduce the rewards gradually as the new behavior becomes habitual.
Classical and operant conditioning are both forms of associative learning, and both involve acquisition, extinction, spontaneous recovery, generalization, and discrimination.
They differ in how associations are formed and the nature of the responses.
Classical Conditioning: Learning associations between events we do not control; involves involuntary, automatic responses.
Operant Conditioning: Learning associations between our behavior and its consequences; involves voluntary behaviors that operate on the environment.