Key Operant Conditioning Concepts to Know for AP Psychology
What You Need to Know
Operant conditioning = learning where behavior changes because of its consequences. You’re “operating” on the environment, and the environment “feeds back” with reinforcement or punishment.
Why it matters on AP Psych: a big chunk of behaviorism questions are really testing whether you can (1) label consequences correctly (+/-, reinforcement vs punishment), (2) predict behavior patterns under different schedules, and (3) apply behavior-modification logic (shaping, extinction, token economies, etc.).
The core rule (the whole unit in one line)
- Reinforcement increases the behavior it follows.
- Punishment decreases the behavior it follows.
- Positive means add, negative means remove (you’re describing what happens to the stimulus, not whether it’s “good” or “bad”).
Key people you’re expected to recognize
- B. F. Skinner: operant conditioning, Skinner box/operant chamber, reinforcement schedules.
- Edward Thorndike: Law of Effect — behaviors followed by satisfying outcomes become more likely; behaviors followed by unpleasant outcomes become less likely.
Critical reminder: Don’t confuse negative reinforcement with punishment. Negative reinforcement increases behavior by removing something unpleasant.
Operant vs. classical (only the contrast you need)
- Classical conditioning: association between two stimuli (automatic/reflexive responses).
- Operant conditioning: association between a behavior and its consequence (voluntary/goal-directed behavior).
Step-by-Step Breakdown
How to classify any consequence question (fast, reliable method)
When the prompt says “After the behavior, what happens?” do this:
- Circle the target behavior (what action are we trying to increase/decrease?).
- Ask: Did the behavior increase or decrease afterward?
- If it increases → reinforcement.
- If it decreases → punishment.
- Ask: Was something added or removed after the behavior?
- Added stimulus → positive.
- Removed stimulus → negative.
- Combine your answers: positive reinforcement, negative punishment, etc.
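The two questions above are independent, which is why the method is reliable: behavior direction picks reinforcement vs. punishment, and stimulus change picks positive vs. negative. If it helps to see the logic laid bare, here is a tiny sketch of the decision procedure (the function name and inputs are invented for illustration, not AP terminology):

```python
def classify(behavior_change: str, stimulus_change: str) -> str:
    """Classify an operant consequence from two observations.

    behavior_change: "increase" or "decrease" (what the behavior did afterward)
    stimulus_change: "added" or "removed" (what happened to the stimulus)
    """
    # Question 1: did the behavior go up or down? → reinforcement vs. punishment
    effect = "reinforcement" if behavior_change == "increase" else "punishment"
    # Question 2: was a stimulus added or removed? → positive vs. negative
    sign = "positive" if stimulus_change == "added" else "negative"
    return f"{sign} {effect}"

# Seatbelt example: beeping (a stimulus) is removed, buckling increases
print(classify("increase", "removed"))  # negative reinforcement
```

Notice the answer never depends on whether the stimulus feels “good” or “bad,” only on whether it was added or removed and what the behavior did next.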
Mini worked classifications
You buckle your seatbelt; the beeping stops; you buckle faster next time.
- Behavior increased → reinforcement
- Something removed (beeping) → negative
- Negative reinforcement
You text in class; teacher takes your phone; you text less.
- Behavior decreased → punishment
- Something removed (phone) → negative
- Negative punishment (a.k.a. response cost if it’s removal of a valued item/privilege)
You finish homework; you get $10; you do homework more.
- Increased → reinforcement
- Added stimulus (money) → positive
- Positive reinforcement
You talk back; you get detention; you talk back less.
- Decreased → punishment
- Added stimulus (detention) → positive
- Positive punishment
Behavior modification (how operant conditioning is used on purpose)
If a question asks “How would you change behavior?” think like this:
- Define the target behavior (observable + measurable).
- Measure baseline (how often it happens now).
- Choose a strategy:
- Increase behavior → reinforcement (often start with continuous then thin to partial).
- Decrease behavior → punishment and/or reinforcement of alternatives (preferred).
- Pick the reinforcer/punisher (must matter to that person/animal).
- Shape if the behavior is complex (reinforce successive approximations).
- Set the schedule (FR/VR/FI/VI) and monitor data.
- Fade prompts/reinforcement to maintain behavior long-term.
- Plan for generalization (behavior happens in new settings) and maintenance.
Key Formulas, Rules & Facts
Reinforcement vs. punishment (the core table)
| Term | What it does to behavior | What happens to stimulus | Example cue words | Notes |
|---|---|---|---|---|
| Positive reinforcement | Increases behavior | Add desirable stimulus | “get,” “earn,” “receive,” “reward” | Works best when immediate + contingent on behavior |
| Negative reinforcement | Increases behavior | Remove aversive stimulus | “stop,” “avoid,” “escape,” “relief” | Two common types: escape and avoidance |
| Positive punishment | Decreases behavior | Add aversive stimulus | “spank,” “shock,” “scold,” “extra chores” | Can create fear/avoidance; often suppresses not teaches |
| Negative punishment | Decreases behavior | Remove desirable stimulus | “take away,” “lose,” “grounded,” “time-out” | Includes response cost and time-out |
Types of reinforcers (know these labels)
| Reinforcer type | Definition | Examples | Exam angle |
|---|---|---|---|
| Primary (unconditioned) | Naturally reinforcing; no learning needed | food, water, warmth | Works across species; tied to biology |
| Secondary (conditioned) | Reinforcing because learned association | money, grades, praise | Depends on learning/history |
| Generalized conditioned | Conditioned reinforcer linked to many primary reinforcers | money, tokens | Powerful because flexible |
Escape vs avoidance (classic AP trap)
| Term | What it is | Operant label | Example |
|---|---|---|---|
| Escape learning | Behavior ends something unpleasant that’s already happening | Negative reinforcement | Taking aspirin to stop a headache |
| Avoidance learning | Behavior prevents something unpleasant | Negative reinforcement | Buckling seatbelt to prevent beeping |
Shaping, chaining, and stimulus control
| Concept | What it means | Quick example | Why AP asks it |
|---|---|---|---|
| Shaping | Reinforce successive approximations toward a target behavior | Reward “sit” closer and closer to correct form | Used when behavior doesn’t occur naturally |
| Chaining | Break a complex behavior into steps; reinforce links | Teaching handwashing step-by-step | Often confused with shaping |
| Discriminative stimulus (Sd) | Cue signaling that a response will be reinforced | “OPEN” sign → ordering gets you food | Tests whether you know behavior depends on context |
| Stimulus discrimination | Responding differently to different stimuli | Dog sits for owner’s hand signal, not random gestures | Opposite of generalization |
| Stimulus generalization | Responding similarly to similar stimuli | Fear of all large dogs after bite | Can occur in operant too |
Extinction (operant version)
- Extinction: behavior decreases when reinforcement is withheld.
- Extinction burst: temporary increase in frequency/intensity when extinction starts.
- After extinction, behavior can return via:
- Spontaneous recovery (after a rest period)
- Renewal (behavior returns in a different context)
Warning: Extinction is not “punishment.” Nothing aversive is added and nothing desirable is taken away; reinforcement is simply withheld.
Reinforcement schedules (high-yield)
Continuous reinforcement
- Reinforce every response.
- Fast acquisition, fast extinction.
Partial (intermittent) reinforcement
- Reinforce some responses.
- Slower acquisition, greater resistance to extinction (partial reinforcement extinction effect).
The 4 core partial schedules
| Schedule | Reinforcement rule | Behavior pattern | Classic example |
|---|---|---|---|
| Fixed Ratio (FR) | After a set number of responses | High rate with post-reinforcement pause | “Buy 10 coffees, get 1 free” |
| Variable Ratio (VR) | After an unpredictable number of responses | Highest, steady responding; very resistant to extinction | Slot machines, gambling |
| Fixed Interval (FI) | First response after a set time | “Scallop” pattern: slow then fast as time approaches | Checking oven as timer nears |
| Variable Interval (VI) | First response after varying time | Steady, moderate responding | Checking for texts/emails |
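The four schedules are really a 2×2 grid: what the reinforcement is based on (responses vs. time) crossed with whether the rule is predictable (fixed vs. variable). As a memory check, the grid can be sketched as a lookup (names are invented for illustration):

```python
def schedule_label(based_on: str, predictable: bool) -> str:
    """Name a partial reinforcement schedule from the 2x2 grid.

    based_on: "responses" (→ ratio) or "time" (→ interval)
    predictable: True → fixed, False → variable
    """
    kind = "ratio" if based_on == "responses" else "interval"
    timing = "fixed" if predictable else "variable"
    return f"{timing} {kind} ({timing[0].upper()}{kind[0].upper()})"

# Slot machine: payout after an unpredictable NUMBER of pulls (responses)
print(schedule_label("responses", predictable=False))  # variable ratio (VR)
```

If you can answer those two questions about any scenario, the schedule labels itself.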
Examples & Applications
Example 1: Token economy (real-world operant conditioning)
Scenario: A teacher gives students tokens for turning in homework; tokens can be exchanged for privileges.
- Tokens are generalized conditioned reinforcers.
- The privileges tokens are exchanged for are backup reinforcers.
- If tokens are given for every completed assignment at first → continuous reinforcement; switching later to occasional bonuses → partial reinforcement to maintain the behavior.
Exam variation: If a student stops doing homework when tokens stop, that’s extinction (and you might see an extinction burst first).
Example 2: Shaping vs chaining
Scenario: Training a rat to press a lever.
- Shaping: reinforce turning toward lever → moving toward lever → touching lever → pressing lever.
- Not chaining because it’s not a sequence of distinct steps you must complete in order; it’s approximations toward a single response.
Scenario: Teaching a child to tie shoes.
- Chaining: make loops → cross → pull through → tighten (reinforce completion of each step/link).
Example 3: Negative reinforcement (avoidance)
Scenario: A student studies early to avoid the stress of cramming.
- Stress avoided → aversive stimulus removed/prevented.
- Studying increases → negative reinforcement.
Exam variation: If the student studies less after getting grounded, that would be punishment (negative punishment, since a privilege is removed), not negative reinforcement.
Example 4: Schedules in everyday life
Scenario: A worker gets paid every two weeks.
- That’s fixed interval (FI), because the paycheck is time-based. Effort often rises as payday or an evaluation approaches (the scallop pattern).
Scenario: Refreshing social media to see new posts.
- New posts appear unpredictably over time → variable interval (VI).
Common Mistakes & Traps
Mixing up negative reinforcement and punishment
- Wrong: “Negative reinforcement is when something bad happens.”
- Why wrong: “Negative” means remove, and reinforcement means increase behavior.
- Fix: Always ask: did behavior go up (reinforcement) or down (punishment)?
Assuming “reward” always means positive reinforcement
- Wrong: Calling anything pleasant a reinforcer.
- Why wrong: A reinforcer is defined by its effect (behavior increases). Praise that doesn’t increase behavior isn’t a reinforcer.
- Fix: Look for the outcome: “more likely” vs “less likely.”
Forgetting that punishment doesn’t teach the desired behavior
- Wrong: Thinking punishment is the best long-term tool.
- Why wrong: Punishment often suppresses behavior temporarily and can create fear/avoidance; it doesn’t build an alternative.
- Fix: Pair consequences with reinforcement of an incompatible/alternative behavior.
Mislabeling time-out
- Wrong: Calling time-out “negative reinforcement” because it removes the child.
- Why wrong: Time-out removes access to reinforcement (attention, fun), aiming to decrease behavior.
- Fix: Time-out is typically negative punishment.
Confusing discrimination and generalization
- Wrong: “Discrimination means responding to lots of stimuli.”
- Why wrong: That’s generalization.
- Fix: Discrimination = difference; Generalization = general.
Thinking extinction means the behavior disappears immediately
- Wrong: Expecting a smooth decline.
- Why wrong: Extinction often starts with an extinction burst.
- Fix: If you see a spike in behavior after reinforcement stops, that’s classic extinction burst.
Getting schedules mixed up (ratio vs interval; fixed vs variable)
- Wrong: Calling “paid every 10 sales” an interval schedule.
- Why wrong: Sales are responses (ratio), not time.
- Fix: Ratio = responses; Interval = time. Fixed = predictable; Variable = unpredictable.
Assuming continuous reinforcement is most resistant to extinction
- Wrong: “If you reinforce every time, it will last longer.”
- Why wrong: Behaviors learned under continuous reinforcement extinguish faster when reinforcement stops.
- Fix: Remember the partial reinforcement extinction effect.
Memory Aids & Quick Tricks
| Trick / mnemonic | What it helps you remember | When to use it |
|---|---|---|
| “R = Rise, P = Plunge” | Reinforcement increases; Punishment decreases | Any consequence classification |
| “Positive = Plus (add), Negative = Minus (remove)” | Positive/negative refer to adding/removing a stimulus | Avoid the “negative = bad” trap |
| “VR = Vegas Ratio” | Gambling = variable ratio; high, steady responding | Schedule questions |
| “FI = ‘Finals’ cause scallops” | Fixed interval produces the scalloped curve | Schedule pattern questions |
| “Ratio = Responses, Interval = In-terval (time)” | Distinguish response-based vs time-based | Schedule ID |
| “Sd = Signal: reinforcement is available” | Discriminative stimulus cues when behavior pays off | Stimulus control scenarios |
| “Extinction burst = last-ditch effort” | Behavior may spike before dropping | Extinction graphs/situations |
Quick Review Checklist
- You can define operant conditioning as behavior shaped by consequences.
- You can apply: Reinforcement increases, punishment decreases.
- You can correctly label positive vs negative (add vs remove) without mixing it up with good/bad.
- You know primary vs secondary vs generalized conditioned reinforcers.
- You can explain escape vs avoidance as negative reinforcement.
- You can distinguish shaping (successive approximations) from chaining (linked steps).
- You can identify Sd, generalization, and discrimination in context.
- You know extinction, extinction burst, and why partial reinforcement resists extinction.
- You can identify FR/VR/FI/VI and predict their response patterns.
You’ve got this—if you classify consequences by what happens to the behavior first, most operant questions become automatic.