Chapter 8 – Intermittent Reinforcement & Schedules

Definition: Schedule of Reinforcement

A schedule of reinforcement = the rule specifying which occurrences of a target behaviour will be followed by a reinforcer.
- Answers both how and when a response produces a consequence.
- Dichotomy:
  - Continuous Reinforcement (CRF) – every response reinforced.
  - Intermittent Reinforcement (INT) – only some responses reinforced.

Jan’s Math‐Problem Programme (Applied Illustration)

Initial problem: Jan off-task, many errors.
Intervention (ratio‐thinning procedure):
- FR $2$ → praise after every 2 completed problems.
- After stability, raised to FR $4$, then FR $8$, then FR $16$.
Outcomes:
- Work rate ↑ sharply; non-attending dropped to $0\%$.
- Demonstrates power of INT for maintenance after acquisition.
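
The thinning sequence above can be sketched in a few lines of Python. This is only an illustration: the function name `thinning_steps` and the doubling rule are assumptions inferred from the FR 2 → 4 → 8 → 16 progression, not a procedure stated in the notes.

```python
def thinning_steps(start, target, factor=2):
    """List the FR values visited while thinning from `start` to `target`.

    Assumption: each step multiplies the previous ratio by `factor`
    (doubling by default, as in Jan's programme).
    """
    if start < 1 or factor < 2 or target < start:
        raise ValueError("need start >= 1, factor >= 2 and target >= start")
    steps = [start]
    while steps[-1] * factor <= target:
        steps.append(steps[-1] * factor)
    return steps


print(thinning_steps(2, 16))  # [2, 4, 8, 16]
```

In practice each step would be held until responding is stable before moving on; the sketch only lists the ratio values.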

Continuous vs. Intermittent Reinforcement

CRF (a.k.a. FR 1)
- Examples: faucet gives water, light switch gives light, puddle splashes.
- Functionally the opposite of extinction; behaviour never contacts non-reinforcement.
INT
- Reinforcement only occasionally = "sometimes".
- Extremely resistant to extinction; drives persistent behaviours (e.g. slot-machine gambling).
- Ethically neutral: can sustain prosocial habits or addictive/problem actions.

Acquisition → Maintenance Heuristic (Always / Sometimes / Not-Never)

During acquisition (learning phase) – use ALWAYS (CRF).
During maintenance – thin to SOMETIMES (INT).
Avoid NEVER (the "Not-Never" rule); stopping reinforcement entirely risks extinction.

Advantages of Intermittent Schedules

Less satiation; capitalises on deprivation.
Greater resistance to extinction.
Produces more consistent response patterns.
Transfers more readily to natural reinforcers (praise, money, smiles).

RATIO Schedules ("How Many")

1. Fixed Ratio (FR n)

Reinforcer after a fixed number $n$ of responses.
Produces: high steady responding, post-reinforcement pause (PRP).
"Higher the FR value ⇒ longer the PRP."
Example: Piecework pay $\text{FR}\,10$ ($1$ USD per 10 items).
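
A minimal sketch of the FR rule, using the piecework example. The function name `fixed_ratio_reinforced` and the figure of 35 completed items are hypothetical, chosen only for illustration.

```python
def fixed_ratio_reinforced(n, total_responses):
    """Return the 1-based response counts that earn a reinforcer under FR n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Every n-th response (n, 2n, 3n, ...) is reinforced.
    return list(range(n, total_responses + 1, n))


reinforced = fixed_ratio_reinforced(10, 35)
print(reinforced)       # [10, 20, 30]
print(len(reinforced))  # 3 -> 3 USD earned at 1 USD per 10 items
```

The five items after the 30th earn nothing until the 40th is completed, which is what makes the schedule a ratio rather than a per-item payment.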

2. Variable Ratio (VR $\bar{n}$)

Reinforcer after a variable number of responses around a mean $\bar{n}$.
Little/no PRP; very high rate; most resistant to extinction.
Examples: slot machines, unpredictable praise for math problems.
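
One way to picture a VR schedule is to draw each response requirement at random so that the requirements average out to $\bar{n}$. The sketch below assumes a uniform draw between 1 and $2\bar{n} - 1$; this distribution is an illustrative choice, not something the notes specify, and `variable_ratio_requirements` is a hypothetical name.

```python
import random


def variable_ratio_requirements(mean_n, count, seed=None):
    """Draw `count` response requirements whose expected value is `mean_n`.

    Assumption: requirements are uniform on 1 .. 2*mean_n - 1, which has
    mean `mean_n`. Real VR schedules may use other distributions.
    """
    if mean_n < 1:
        raise ValueError("mean_n must be at least 1")
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_n - 1) for _ in range(count)]


requirements = variable_ratio_requirements(5, 10, seed=1)
print(requirements)                           # ten values between 1 and 9
print(sum(requirements) / len(requirements))  # sample mean, close to 5 in the long run
```

Because the learner cannot predict which response will pay off, there is no safe moment to pause, which fits the little/no PRP pattern noted above.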

Post-Reinforcement Pause (PRP)

Brief pause after reinforcement before next response run.
Length governed by schedule value (FR size).
- E.g. Nurse: work 8 h → 16 h off; work 16 h → 32 h off (longer value, longer pause).

Ratio Strain

Sudden, large increase in ratio requirement ⇒ response breakdown.
Example: Daily Pokémon card (FR 1) abruptly changed to every 14 days (FR 14) ⇒ child quits.
Remedy: gradual thinning to avoid ratio strain.

INTERVAL Schedules ("How Long")

3. Fixed Interval (FI t)

First response after a fixed time $t$ is reinforced.
"Scallop" pattern: low rate early, accelerating near interval end; PRP lengthens as the interval lengthens.
Example: TV show every Thursday 7 p.m. ⇒ $\text{FI 1 week}$.
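
The FI rule can be sketched as a filter over response times. The sketch assumes the interval restarts at each reinforced response (clock-based versions, like the weekly TV show, would instead restart at fixed times); the function name and the sample times are hypothetical.

```python
def fixed_interval_reinforced(t, response_times):
    """Return the response times that are reinforced under FI t.

    Assumption: the interval timer starts at 0 and restarts whenever a
    response is reinforced.
    """
    reinforced = []
    interval_start = 0
    for time in sorted(response_times):
        # Only the first response after the interval has elapsed pays off.
        if time - interval_start >= t:
            reinforced.append(time)
            interval_start = time
    return reinforced


# Responses at these minutes on an FI 10 min schedule:
print(fixed_interval_reinforced(10, [2, 5, 9, 11, 12, 19, 22, 35]))  # [11, 22, 35]
```

Responses made early in each interval (2, 5, 9, 12, 19) earn nothing, which is why responding tends to be low early and to accelerate near the end of the interval.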

4. Variable Interval (VI $\bar{t}$)

First response after a variable interval (mean $\bar{t}$) reinforced; steady, moderate rate; no PRP.
Example: Checking email/Facebook – update after 3 min, 15 min, 6 min ⇒ $\text{VI 8 min}$.
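
The schedule value in the example is simply the mean of the observed intervals: (3 + 15 + 6) / 3 = 8. A small helper, with the hypothetical name `variable_schedule_label`, makes the arithmetic explicit:

```python
def variable_schedule_label(kind, samples, unit=""):
    """Build a label such as 'VI 8 min' from the observed schedule values."""
    if not samples:
        raise ValueError("need at least one sample")
    mean = sum(samples) / len(samples)
    # ':g' drops a trailing '.0', so 8.0 is shown as 8.
    return f"{kind} {mean:g} {unit}".strip()


print(variable_schedule_label("VI", [3, 15, 6], "min"))  # VI 8 min
```

The same averaging gives the value of any variable schedule, whether the samples are response counts, intervals, or durations.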

LIMITED HOLD (LH)

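Deadline added: on an interval schedule the reinforcer is available only for a brief window after the interval elapses; on a ratio schedule the required responses must be completed within the time limit.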
Notation: FI 10 min / LH 30 s, FR 20 / LH 3 min, etc.
Powerful motivator → induces urgency.
Illustrations:
- FR 20 burpees within 3 min to earn water (FR 20/LH 3 min).
- Subway every 10 min, doors stay open 30 s (FI 10 min/LH 30 s).
- Police ticket quota: 50 tickets in 30 days (FR 50/LH 30 days).
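
The subway illustration can be sketched as a simple check. The sketch assumes a clock-based timetable measured in seconds, with a train arriving at every multiple of the interval and the doors closing when the hold expires; the function name `reinforced_with_limited_hold` is hypothetical.

```python
def reinforced_with_limited_hold(response_s, interval_s=600, hold_s=30):
    """Return True if a response at `response_s` seconds is reinforced.

    Assumptions: trains arrive at interval_s, 2*interval_s, ... (FI part)
    and stay available for hold_s seconds after each arrival (LH part).
    """
    if response_s < interval_s:
        return False  # the first train has not arrived yet
    seconds_since_arrival = response_s % interval_s
    return seconds_since_arrival <= hold_s


print(reinforced_with_limited_hold(610))  # True: 10 s after the first train arrives
print(reinforced_with_limited_hold(640))  # False: doors closed 10 s ago
```

Without the limited hold, the response at 640 s would still have been reinforced; the deadline is what makes arriving promptly matter.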

DURATION Schedules

Require the behaviour to continue for a period, not just occur once.

5. Fixed Duration (FD t)

Behaviour must persist continuously for a fixed period $t$ to access reinforcement.
Produces PRP.
Example: Paid for working continuously for 1 h (FD 1 h).
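
A rough sketch of the FD rule for the hourly-pay example. It assumes the duration timer restarts after each reinforcer and resets to zero whenever the behaviour is interrupted; the function name and the bout lengths are hypothetical.

```python
def fixed_duration_reinforcers(t, bout_lengths):
    """Count reinforcers earned under FD t from uninterrupted bouts.

    Assumption: each bout is one continuous stretch of the behaviour, and
    every full period `t` inside a bout earns one reinforcer.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    return sum(int(length // t) for length in bout_lengths)


# Three work bouts of 45, 130 and 60 minutes on FD 60 min:
print(fixed_duration_reinforcers(60, [45, 130, 60]))  # 3
```

The 45-minute bout earns nothing because the behaviour stopped before the full hour, even though total time worked across bouts is nearly four hours.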

6. Variable Duration (VD $\bar{t}$)

Continuous behaviour for a variable period around mean $\bar{t}$.
No PRP.
Example: Rubbing sticks: 10 min today, 20 min tomorrow; $\text{VD 15 min}$.

Interval + LH Combinations (Schedules 7 & 8)

FI/LH and VI/LH already illustrated; treated as distinct intermittent types.

Concurrent Schedules & The Matching Law

Concurrent schedules: two or more schedules available simultaneously for different behaviours.
- Example: Holding urine (VD – unknown wait for restroom) vs. drinking beer (FR 1 sip per lift) while in line.
Matching Law: Allocation of responses/time matches relative reinforcement rates.
- Behaviour with richer/denser reinforcement wins when they compete.
- Influenced by schedule type, immediacy, magnitude, and response effort.
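
In its simplest (strict) form, the matching law says the share of behaviour given to each option equals that option's share of the total reinforcement obtained. The sketch below applies that proportion; the function name and the reinforcement rates are hypothetical, and real allocation also depends on the other factors listed above.

```python
def matching_law_allocation(reinforcement_rates):
    """Predict the share of responding for each option under strict matching.

    `reinforcement_rates` maps an option name to its reinforcement rate
    (e.g. reinforcers per hour). Each share is rate / total rate.
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("total reinforcement rate must be positive")
    return {name: rate / total for name, rate in reinforcement_rates.items()}


# Hypothetical rates: homework pays off 30 times per hour, texting 90 times.
print(matching_law_allocation({"homework": 30, "texting": 90}))
# {'homework': 0.25, 'texting': 0.75}
```

Under these assumed rates the richer option takes three quarters of the responding, which is the sense in which denser reinforcement "wins".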

Ethical, Philosophical & Practical Reflections

INT can sustain altruistic acts (holding doors) and persistence in learning, but also gambling addiction or, if misapplied, maladaptive tantrums.
Designers must weigh social impact; with great resistance to extinction comes responsibility.

Common Pitfalls

Unintentionally creating INT for problem behaviours (e.g., sporadically giving in to tantrums) ⇒ hard-to-extinguish patterns.
Over-thinning → ratio strain → apparent extinction.

Implementation Guidelines

Match schedule to behaviour (ratio for discrete responses, interval/duration for sustained activity).
Choose convenient parameters for observers & setting.
Use objective tools (timers, counters) to detect correct reinforcement moments.
Thin gradually to avoid strain; follow always → sometimes continuum.
Explain “rules of the game” to learner to maximise predictability & cooperation.

Quick Reference: Eight Intermittent Schedules Covered

FR, VR, FI, VI, FD, VD, FI + LH, VI + LH.

Exam Focus Reminders

Define each schedule (ratio/interval/duration, fixed/variable, LH).
Be able to write notation: e.g., $\text{VR 2}$, $\text{FI 15 min/LH 1 min}$.
Explain post-reinforcement pause rule (higher value ⇒ longer pause).
Explain ratio strain and limited hold conceptually & with examples.
Discuss advantages of INT and its powerful resistance to extinction.