Key Operant Conditioning Concepts to Know for AP Psychology
What You Need to Know
Operant conditioning = learning where behavior changes because of its consequences. You’re “operating” on the environment, and the environment “feeds back” with reinforcement or punishment.
Why it matters on AP Psych: a big chunk of behaviorism questions are really testing whether you can (1) label consequences correctly (+/-, reinforcement vs punishment), (2) predict behavior patterns under different schedules, and (3) apply behavior-modification logic (shaping, extinction, token economies, etc.).
The core rule (the whole unit in one line)
- Reinforcement increases the behavior it follows.
- Punishment decreases the behavior it follows.
- Positive means add, negative means remove (you’re describing what happens to the stimulus, not whether it’s “good” or “bad”).
Key people you’re expected to recognize
- B. F. Skinner: operant conditioning, Skinner box/operant chamber, reinforcement schedules.
- Edward Thorndike: Law of Effect — behaviors followed by satisfying outcomes become more likely; behaviors followed by unpleasant outcomes become less likely.
Critical reminder: Don’t confuse negative reinforcement with punishment. Negative reinforcement increases behavior by removing something unpleasant.
Operant vs. classical (only the contrast you need)
- Classical conditioning: association between two stimuli (automatic/reflexive responses).
- Operant conditioning: association between a behavior and its consequence (voluntary/goal-directed behavior).
Step-by-Step Breakdown
How to classify any consequence question (fast, reliable method)
When the prompt says “After the behavior, what happens?” do this:
- Circle the target behavior (what action are we trying to increase/decrease?).
- Ask: Did the behavior increase or decrease afterward?
- If it increases → reinforcement.
- If it decreases → punishment.
- Ask: Was something added or removed after the behavior?
- Added stimulus → positive.
- Removed stimulus → negative.
- Combine your answers: positive reinforcement, negative punishment, etc.
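The two questions above are independent, which is why the method is reliable: behavior direction picks reinforcement vs. punishment, and stimulus change picks positive vs. negative. If it helps to see the logic laid bare, here is a tiny sketch of the decision procedure (the function name and inputs are invented for illustration, not AP terminology):

```python
def classify(behavior_change: str, stimulus_change: str) -> str:
    """Classify an operant consequence from two observations.

    behavior_change: "increase" or "decrease" (what the behavior did afterward)
    stimulus_change: "added" or "removed" (what happened to the stimulus)
    """
    # Question 1: did the behavior go up or down? → reinforcement vs. punishment
    effect = "reinforcement" if behavior_change == "increase" else "punishment"
    # Question 2: was a stimulus added or removed? → positive vs. negative
    sign = "positive" if stimulus_change == "added" else "negative"
    return f"{sign} {effect}"

# Seatbelt example: beeping (a stimulus) is removed, buckling increases
print(classify("increase", "removed"))  # negative reinforcement
```

Notice the answer never depends on whether the stimulus feels “good” or “bad,” only on whether it was added or removed and what the behavior did next.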
Mini worked classifications
You buckle your seatbelt; the beeping stops; you buckle faster next time.
- Behavior increased → reinforcement
- Something removed (beeping) → negative
- Negative reinforcement
You text in class; teacher takes your phone; you text less.
- Behavior decreased → punishment
- Something removed (phone) → negative
- Negative punishment (a.k.a. response cost if it’s removal of a valued item/privilege)
You finish homework; you get $10; you do homework more.
- Increased → reinforcement
- Added stimulus (money) → positive
- Positive reinforcement
You talk back; you get detention; you talk back less.
- Decreased → punishment
- Added stimulus (detention) → positive
- Positive punishment
Behavior modification (how operant conditioning is used on purpose)
If a question asks “How would you change behavior?” think like this:
- Define the target behavior (observable + measurable).
- Measure baseline (how often it happens now).
- Choose a strategy:
- Increase behavior → reinforcement (often start with continuous then thin to partial).
- Decrease behavior → punishment and/or reinforcement of alternatives (preferred).
- Pick the reinforcer/punisher (must matter to that person/animal).
- Shape if the behavior is complex (reinforce successive approximations).
- Set the schedule (FR/VR/FI/VI) and monitor data.
- Fade prompts/reinforcement to maintain behavior long-term.
- Plan for generalization (behavior happens in new settings) and maintenance.
Key Formulas, Rules & Facts
Reinforcement vs. punishment (the core table)
| Term | What it does to behavior | What happens to stimulus | Example cue words | Notes |
|---|---|---|---|---|
| Positive reinforcement | Increases behavior | Add desirable stimulus | “get,” “earn,” “receive,” “reward” | Works best when immediate + contingent on behavior |
| Negative reinforcement | Increases behavior | Remove aversive stimulus | “stop,” “avoid,” “escape,” “relief” | Two common types: escape and avoidance |
| Positive punishment | Decreases behavior | Add aversive stimulus | “spank,” “shock,” “scold,” “extra chores” | Can create fear/avoidance; often suppresses not teaches |
| Negative punishment | Decreases behavior | Remove desirable stimulus | “take away,” “lose,” “grounded,” “time-out” | Includes response cost and time-out |
Types of reinforcers (know these labels)
| Reinforcer type | Definition | Examples | Exam angle |
|---|---|---|---|
| Primary (unconditioned) | Naturally reinforcing; no learning needed | food, water, warmth | Works across species; tied to biology |
| Secondary (conditioned) | Reinforcing because learned association | money, grades, praise | Depends on learning/history |
| Generalized conditioned | Conditioned reinforcer linked to many primary reinforcers | money, tokens | Powerful because flexible |
Escape vs avoidance (classic AP trap)
| Term | What it is | Operant label | Example |
|---|---|---|---|
| Escape learning | Behavior ends something unpleasant that’s already happening | Negative reinforcement | Taking aspirin to stop a headache |
| Avoidance learning | Behavior prevents something unpleasant | Negative reinforcement | Buckling seatbelt to prevent beeping |
Shaping, chaining, and stimulus control
| Concept | What it means | Quick example | Why AP asks it |
|---|---|---|---|
| Shaping | Reinforce successive approximations toward a target behavior | Reward “sit” closer and closer to correct form | Used when behavior doesn’t occur naturally |
| Chaining | Break a complex behavior into steps; reinforce links | Teaching handwashing step-by-step | Often confused with shaping |
| Discriminative stimulus (Sd) | Cue signaling that a response will be reinforced | “OPEN” sign → ordering gets you food | Tests whether you know behavior depends on context |
| Stimulus discrimination | Responding differently to different stimuli | Dog sits for owner’s hand signal, not random gestures | Opposite of generalization |
| Stimulus generalization | Responding similarly to similar stimuli | Fear of all large dogs after bite | Can occur in operant too |
Extinction (operant version)
- Extinction: behavior decreases when reinforcement is withheld.
- Extinction burst: temporary increase in frequency/intensity when extinction starts.
- After extinction, behavior can return via:
- Spontaneous recovery (after a rest period)
- Renewal (behavior returns in a different context)
Warning: Extinction is not “punishment.” Nothing aversive is added and nothing desirable is taken away; reinforcement is simply withheld.
Reinforcement schedules (high-yield)
Continuous reinforcement
- Reinforce every response.
- Fast acquisition, fast extinction.
Partial (intermittent) reinforcement
- Reinforce some responses.
- Slower acquisition, greater resistance to extinction (partial reinforcement extinction effect).
The 4 core partial schedules
| Schedule | Reinforcement rule | Behavior pattern | Classic example |
|---|---|---|---|
| Fixed Ratio (FR) | After a set number of responses | High rate with post-reinforcement pause | “Buy 10 coffees, get 1 free” |
| Variable Ratio (VR) | After an unpredictable number of responses | Highest, steady responding; very resistant to extinction | Slot machines, gambling |
| Fixed Interval (FI) | First response after a set time | “Scallop” pattern: slow then fast as time approaches | Checking oven as timer nears |
| Variable Interval (VI) | First response after varying time | Steady, moderate responding | Checking for texts/emails |
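The four schedules are really a 2×2 grid: what the reinforcement is based on (responses vs. time) crossed with whether the rule is predictable (fixed vs. variable). As a memory check, the grid can be sketched as a lookup (names are invented for illustration):

```python
def schedule_label(based_on: str, predictable: bool) -> str:
    """Name a partial reinforcement schedule from the 2x2 grid.

    based_on: "responses" (→ ratio) or "time" (→ interval)
    predictable: True → fixed, False → variable
    """
    kind = "ratio" if based_on == "responses" else "interval"
    timing = "fixed" if predictable else "variable"
    return f"{timing} {kind} ({timing[0].upper()}{kind[0].upper()})"

# Slot machine: payout after an unpredictable NUMBER of pulls (responses)
print(schedule_label("responses", predictable=False))  # variable ratio (VR)
```

If you can answer those two questions about any scenario, the schedule labels itself.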
Examples & Applications
Example 1: Token economy (real-world operant conditioning)
Scenario: A teacher gives students tokens for turning in homework; tokens can be exchanged for privileges.
- Tokens are generalized conditioned reinforcers.
- The privileges tokens are exchanged for are backup reinforcers.
- If tokens are given for every completed assignment at first → continuous reinforcement; switching later to occasional bonuses → partial reinforcement to maintain the behavior.
Exam variation: If a student stops doing homework when tokens stop, that’s extinction (and you might see an extinction burst first).
Example 2: Shaping vs chaining
Scenario: Training a rat to press a lever.
- Shaping: reinforce turning toward lever → moving toward lever → touching lever → pressing lever.
- Not chaining because it’s not a sequence of distinct steps you must complete in order; it’s approximations toward a single response.
Scenario: Teaching a child to tie shoes.
- Chaining: make loops → cross → pull through → tighten (reinforce completion of each step/link).
Example 3: Negative reinforcement (avoidance)
Scenario: A student studies early to avoid the stress of cramming.
- Stress avoided → aversive stimulus removed/prevented.
- Studying increases → negative reinforcement.
Exam variation: If the student studies less after getting grounded, that would be punishment (negative punishment, since a privilege is removed), not negative reinforcement.
Example 4: Schedules in everyday life
Scenario: A worker gets paid every two weeks.
- That’s fixed interval (FI), because the paycheck is time-based. Effort often rises as payday or an evaluation approaches (the scallop pattern).
Scenario: Refreshing social media to see new posts.
- New posts appear unpredictably over time → variable interval (VI).
Common Mistakes & Traps
Mixing up negative reinforcement and punishment
- Wrong: “Negative reinforcement is when something bad happens.”
- Why wrong: “Negative” means remove, and reinforcement means increase behavior.
- Fix: Always ask: did behavior go up (reinforcement) or down (punishment)?
Assuming “reward” always means positive reinforcement
- Wrong: Calling anything pleasant a reinforcer.
- Why wrong: A reinforcer is defined by its effect (behavior increases). Praise that doesn’t increase behavior isn’t a reinforcer.
- Fix: Look for the outcome: “more likely” vs “less likely.”
Forgetting that punishment doesn’t teach the desired behavior
- Wrong: Thinking punishment is the best long-term tool.
- Why wrong: Punishment often suppresses behavior temporarily and can create fear/avoidance; it doesn’t build an alternative.
- Fix: Pair consequences with reinforcement of an incompatible/alternative behavior.
Mislabeling time-out
- Wrong: Calling time-out “negative reinforcement” because it removes the child.
- Why wrong: Time-out removes access to reinforcement (attention, fun), aiming to decrease behavior.
- Fix: Time-out is typically negative punishment.
Confusing discrimination and generalization
- Wrong: “Discrimination means responding to lots of stimuli.”
- Why wrong: That’s generalization.
- Fix: Discrimination = difference; Generalization = general.
Thinking extinction means the behavior disappears immediately
- Wrong: Expecting a smooth decline.
- Why wrong: Extinction often starts with an extinction burst.
- Fix: If you see a spike in behavior after reinforcement stops, that’s classic extinction burst.
Getting schedules mixed up (ratio vs interval; fixed vs variable)
- Wrong: Calling “paid every 10 sales” an interval schedule.
- Why wrong: Sales are responses (ratio), not time.
- Fix: Ratio = responses; Interval = time. Fixed = predictable; Variable = unpredictable.
Assuming continuous reinforcement is most resistant to extinction
- Wrong: “If you reinforce every time, it will last longer.”
- Why wrong: Behaviors learned under continuous reinforcement extinguish faster when reinforcement stops.
- Fix: Remember the partial reinforcement extinction effect.
Memory Aids & Quick Tricks
| Trick / mnemonic | What it helps you remember | When to use it |
|---|---|---|
| “R = Rise, P = Plunge” | Reinforcement increases; Punishment decreases | Any consequence classification |
| “Positive = Plus (add), Negative = Minus (remove)” | Positive/negative refer to adding/removing a stimulus | Avoid the “negative = bad” trap |
| “VR = Vegas Ratio” | Gambling = variable ratio; high, steady responding | Schedule questions |
| “FI = ‘Finals’ cause scallops” | Fixed interval produces the scalloped curve | Schedule pattern questions |
| “Ratio = Responses, Interval = In-terval (time)” | Distinguish response-based vs time-based | Schedule ID |
| “Sd = Signal: reinforcement is available” | Discriminative stimulus cues when behavior pays off | Stimulus control scenarios |
| “Extinction burst = last-ditch effort” | Behavior may spike before dropping | Extinction graphs/situations |
Quick Review Checklist
- You can define operant conditioning as behavior shaped by consequences.
- You can apply: Reinforcement increases, punishment decreases.
- You can correctly label positive vs negative (add vs remove) without mixing it up with good/bad.
- You know primary vs secondary vs generalized conditioned reinforcers.
- You can explain escape vs avoidance as negative reinforcement.
- You can distinguish shaping (successive approximations) from chaining (linked steps).
- You can identify Sd, generalization, and discrimination in context.
- You know extinction, extinction burst, and why partial reinforcement resists extinction.
- You can identify FR/VR/FI/VI and predict their response patterns.
You’ve got this—if you classify consequences by what happens to the behavior first, most operant questions become automatic.