Learning Theory – Key Vocabulary

Historical Context

  • Turn of the 20th century: two biological traditions gave rise to empirical learning theory.

    • Reflex tradition → Ivan Pavlov → classical (Pavlovian) conditioning.

      • Famous bell–food pairing → dog salivates to the bell.

      • Provided a method for analyzing associative mechanisms.

    • Comparative-psychology tradition → Edward Thorndike → instrumental/operant conditioning.

      • Puzzle-box experiments with cats: latch-opening → escape + food.

      • Led to a focus on the consequences of behavior.

  • 1920s: John B. Watson demonstrates conditioned emotion in humans ("Little Albert": rat–noise pairing → fear that generalized to a rabbit, dog, and fur coat).

  • 1930s: B. F. Skinner refines Thorndike's method → free-operant lever-pressing in rats; establishes the operant chamber.

  • 1950s–60s: Behavior-therapy movement (e.g., Wolpe 1958) + applied behavior analysis; analogous to the bench-to-bedside model in medicine.

  • 1970s–present: Broader cognitive & evolutionary perspectives; integration with social psychology; cognitive–behavior therapy (CBT).

Why Learning Theory Matters Clinically

  • Principles of learning operate constantly: "laws of learning are always in effect" (like gravity).

  • Learning processes contribute to:

    • Etiology & maintenance of psychiatric disorders (e.g., phobias, substance use, compulsions, maladaptive eating).

    • Mechanisms of therapeutic change (exposure, skills training, medication compliance).

  • Even medication management involves learning: acquiring knowledge of effects/side-effects, adherence routines, overcoming resistance.

Basic Concepts

  • Two foundational forms:

    • Pavlovian (classical) conditioning.

    • Operant (instrumental) conditioning.

  • Key Pavlovian terminology:

    • US (Unconditional Stimulus)

    • CS (Conditional Stimulus)

    • UR (Unconditional Response)

    • CR (Conditional Response)

  • Key Operant terminology:

    • Operant (response R)

    • Reinforcer (O; event increasing R)

    • Positive/negative reinforcement & punishment (Fig 3.3-1 logic).

  • Shared variables influencing both forms:

    • Magnitude & immediacy of US/reinforcer.

    • Extinction when US/reinforcer removed.

    • Contextual control & relapse phenomena.

Pavlovian Conditioning

Effects on Behavior (Fig 3.3-2 Sign-Tracking)

  • CSs can elicit complex response “systems” (not rigid reflexes):

    • Preparatory physiology: gastric acid, insulin, arousal, temperature rise.

    • Behavioral approach/withdrawal depending on predicted valence.

    • Motivation of ongoing operants (Pavlovian-Instrumental Transfer).

  • Eating-related learning:

    • Flavor–nutrient preferences; flavor-illness aversion (chemo patients, alcohol sickness).

    • External cues trigger craving/overeating.

  • Drug-related learning:

    • Drug = US; paraphernalia/rooms = CSs.

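    • Compensatory CRs oppose the drug UR (morphine analgesia → CR of heightened pain sensitivity; alcohol hypothermia → CR of raised temperature) ⇒ tolerance in the usual drug context & overdose risk in novel contexts, where the compensatory CR is absent.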

    • Unpleasant compensatory CRs yield negative reinforcement (“escape” drug-taking).

    • Not universal—cocaine often produces drug-like CR.

  • Anxiety/Fear:

    • CSs paired with shock evoke cardio-resp changes, analgesia, blinks, potentiated startle.

    • Panic disorder: external & interoceptive CSs to panic symptoms → anticipatory anxiety.

    • PTSD: CS retrieves both emotional & sensory aspects → flashbacks.

  • Pavlovian-Instrumental Transfer:

    • Fear CS increases avoidance responses; drug CS motivates drug seeking.

Nature of the Learning Process

  • Conditioning depends on information value, not simple pairing.

    • Blocking: prior CS predicts US → second CS adds no info → no learning.

    • Relative contingency: P(US|CS) > P(US|\lnot CS) needed for excitation; reverse for inhibition.

  • Conditioned Inhibition: CS predicts "no US"; clinically may suppress pathological CR until inhibition lost.

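  • Learning captured by Rescorla–Wagner rule: \Delta V_{CS}=\alpha\beta(\lambda-\Sigma V), where \alpha = CS salience, \beta = US learning rate, \lambda = maximum strength the US supports, \Sigma V = summed strength of all CSs present (prediction-error driven; \Delta V is positive for excitation and negative for inhibition).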

  • Other modulators:

    • Salience, novelty (latent inhibition, US pre-exposure), attention, surprise.

    • Higher-order variants: sensory preconditioning (B→A; then A→US ⇒ B elicits CR), second-order conditioning (A→US; then B→A ⇒ B elicits CR), intra-event associations (onset of panic predicting full attack).

    • Observational (vicarious) conditioning & evolutionary preparedness (fear conditions more readily to snakes than to flowers).
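
A minimal Python sketch of the Rescorla–Wagner rule may help make blocking and conditioned inhibition concrete. The function name rescorla_wagner, the cue names, the trial counts, and the parameter values (\alpha = 0.5, \beta = 0.5, \lambda = 1) are illustrative assumptions, not values from the source.

```python
def rescorla_wagner(trials, alpha, beta=0.5, lam_present=1.0):
    """Apply Delta V = alpha * beta * (lambda - sum V) over a trial sequence.

    trials: list of (cues, us_present); cues is a tuple of cue names.
    alpha:  dict of cue name -> salience (hypothetical values).
    Returns a dict of final associative strength V per cue.
    """
    V = {cue: 0.0 for cue in alpha}
    for cues, us_present in trials:
        lam = lam_present if us_present else 0.0
        error = lam - sum(V[c] for c in cues)  # shared prediction error
        for c in cues:
            V[c] += alpha[c] * beta * error
    return V


alpha = {"A": 0.5, "B": 0.5, "X": 0.5}

# Blocking group: A -> US first, then the compound AB -> US.
blocking = [(("A",), True)] * 50 + [(("A", "B"), True)] * 50
# Control group: compound AB -> US only.
control = [(("A", "B"), True)] * 50
# Conditioned inhibition: A -> US trials alternate with AX -> no US trials.
inhibition = [(("A",), True), (("A", "X"), False)] * 100

v_block = rescorla_wagner(blocking, alpha)
v_ctrl = rescorla_wagner(control, alpha)
v_inh = rescorla_wagner(inhibition, alpha)

print("B after blocking:", round(v_block["B"], 3))
print("B in control:    ", round(v_ctrl["B"], 3))
print("X as inhibitor:  ", round(v_inh["X"], 3))
```

Under these assumptions, B gains almost no strength in the blocking group because A already predicts the US (the error term is near zero), whereas B shares the available strength with A in the control group. X ends with a negative value, which is how the rule represents a cue that predicts "no US".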

Erasing/Modifying Pavlovian Memories

  • Extinction: repeated CS without US; basis of exposure therapy.

  • Counter-conditioning: CS paired with incompatible US (systematic desensitization).

  • Extinction & counter-conditioning = new inhibitory learning; original memory intact ⇒ relapse phenomena:

    • Spontaneous recovery (time).

    • Renewal (context switch), reinstatement (US re-exposure), rapid reacquisition.

  • Context defined broadly (Table 3.3-1): exteroceptive, internal drug/hormone states, moods, time, recent events.

  • State-dependent extinction: extinction conducted under a benzodiazepine or alcohol can renew once the patient is sober; caution is warranted with drug-assisted exposure.

  • Pharmacological enhancers of extinction (e.g., D-cycloserine acting at NMDA receptors) speed learning but do NOT prevent renewal.

  • Reconsolidation interference: Reactivate memory → protein-synthesis blocker (anisomycin) can attenuate CR; effects may still show recovery.

  • Practical direction: accept persistent memory; design multi-context, cue-retrieval, relapse-prevention strategies.
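
The idea that extinction adds context-specific inhibition while leaving the original memory intact can be illustrated with a toy Python sketch. This is not an established model: the function names, the learning rate, the trial count, and the assumption that excitation is context-free while inhibition is stored per context are all simplifications chosen for illustration.

```python
def conditioned_response(v_excit, inhibition, context):
    """Net CR: context-free excitation minus inhibition learned in this context."""
    return max(0.0, v_excit - inhibition.get(context, 0.0))


def extinguish(v_excit, inhibition, context, trials, rate=0.3):
    """Present the CS alone; inhibition grows in the current context only."""
    for _ in range(trials):
        # The US is expected but does not arrive, so the remaining CR is the error.
        error = conditioned_response(v_excit, inhibition, context)
        inhibition[context] = inhibition.get(context, 0.0) + rate * error


v_excit = 1.0    # assumed strength after acquisition in context "A"
inhibition = {}  # context name -> inhibitory strength

extinguish(v_excit, inhibition, context="B", trials=30)

cr_in_b = conditioned_response(v_excit, inhibition, "B")  # extinction context
cr_in_a = conditioned_response(v_excit, inhibition, "A")  # back in acquisition context
print(f"CR in B: {cr_in_b:.3f}, CR in A: {cr_in_a:.3f}")
```

In this sketch the CR is close to zero in the extinction context but returns at full strength in the acquisition context (renewal), because v_excit itself was never reduced.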

Operant / Instrumental Conditioning

Relation of Behavior to Payoff

  • Extinction parallels classical: lever-press without pellets ⇒ decline, but with spontaneous recovery, renewal, reinstatement, rapid reacquisition.

  • Resurgence: extinguish old R, teach replacement R, then extinguish replacement ⇒ old R resurfaces.

  • Schedules of reinforcement:

    • Ratio (fixed, variable) → behavior contingent on response count; variable ratio yields high rates (slot machines).

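    • Interval (fixed, variable) → reinforcement for the first response after a time period elapses; variable interval → steady, moderate rates (email checking).

  • Quantitative law of effect: B_1 = K\frac{R_1}{R_1+R_O}, where B_1 = rate of the target behavior, R_1 = its reinforcement rate, R_O = reinforcement from all other sources, K = maximum response rate.

    • Increase behavior by ↑R_1 or ↓R_O; decrease it by ↓R_1 or ↑R_O.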

    • Protective factor: rich alternative reinforcement (sports, hobbies) lowers drug/alcohol uptake.

  • Choice, delay discounting (Fig 3.3-5): value decreases hyperbolically with delay; impulsivity arises when a small immediate reward outweighs a larger delayed one.

    • Self-control interventions: commit early, augment the delayed reward's value, or add costs to the immediate reward.
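
The quantitative law of effect above can be tried out numerically. The following Python sketch uses hypothetical reinforcement rates and an arbitrary K = 100; the function name response_rate is also an assumption made for illustration.

```python
def response_rate(r1, r_other, k=100.0):
    """Quantitative law of effect: B_1 = K * R_1 / (R_1 + R_O)."""
    total = r1 + r_other
    if total == 0:
        return 0.0  # no reinforcement from any source
    return k * r1 / total


baseline = response_rate(r1=10, r_other=10)  # target and alternatives equal
enriched = response_rate(r1=10, r_other=40)  # richer alternative reinforcement
boosted = response_rate(r1=30, r_other=10)   # richer target reinforcement

print(baseline, enriched, boosted)  # 50.0 20.0 75.0
```

With these numbers, quadrupling R_O drops the target behavior from 50 to 20 without touching its own reinforcer, which is the logic behind enriching alternative reinforcement.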

Theories of Reinforcement

  • Skinner’s empirical definition: reinforcer = any consequence that increases response.

  • Premack Principle:

    • High-probability activity reinforces low-probability activity.

    • Preference test discovers idiosyncratic reinforcers.

    • Deprivation can invert hierarchy.

  • Token economies, social praise, attention as conditioned reinforcers.

Motivational Factors & Incentive Learning

  • Reinforcer shifts produce contrast effects.

    • Positive contrast (small→large) ↑ responding.

    • Negative contrast (large→small) ↓ responding; involves frustration.

  • Partial-reinforcement extinction effect: intermittent history → greater persistence.

  • Incentive learning (Balleine 1992): motivational state invigorates action only after experiencing outcome in that state.

    • Clinical extrapolations: withdrawal must be paired with drug relief to motivate seeking; depressed patients must learn which activities elevate depressed mood.

Pavlovian & Operant Together

Two-Factor / Avoidance Theory

  • Pavlovian CS → fear; instrumental R → removes CS/fear (negative reinforcement).

  • Clinical analogs: compulsive washing, agoraphobic avoidance, bulimic purging.

  • Species-Specific Defensive Reactions (SSDRs): innate actions (freeze, flee) learned rapidly without explicit reinforcement; show importance of Pavlovian control.

  • Learned helplessness: uncontrollable vs controllable shock alters future escape learning; currently viewed as stress-modulation by controllability.

Stimulus Control, Sign-Tracking & Habit

  • Many operants are actually Pavlovian sign-tracking (pigeon key-peck, punished lever predicting shock).

  • Synthetic associative structure (Fig 3.3-6):

    • S-O (Pavlovian), R-O (goal-directed), S-R (habit), & S→(R-O) occasion setting.

  • Reinforcer devaluation (Fig 3.3-7): devalued food suppresses its linked action ⇒ evidence for R-O cognition.

  • Occasion setting: stimuli specify which R produces which O (noise vs light setting for lever vs chain contingencies).

  • Categorization studies: multiple exemplars improve generalization—train in varied contexts to maximize transfer.

  • Habit formation: extensive practice → S-R dominates; accelerated by drugs of abuse; but cognitive routes persist and can re-emerge given context/attention shifts.

Extinction, Context & Relapse in Operant Learning

  • Same relapse patterns as Pavlovian (spontaneous recovery, renewal, reinstatement, resurgence).

  • Extinction context specificity advises conducting therapy across multiple settings and incorporating retrieval cues.

Ethical, Philosophical & Practical Implications

  • Behavioral laws are value-neutral; evolutionarily adaptive mechanisms can generate maladaptive modern behaviors (overeating, addiction).

  • Therapists must assess whether problematic acts are respondent-elicited or operant-maintained to choose appropriate antecedent vs consequence interventions.

  • Pharmacological adjuncts (e.g., NMDA modulators) should be integrated with robust behavioral principles to avoid state-dependent pitfalls.

  • Prevention: enrich natural reinforcement environments (high R_O) to buffer against emergence of pathological operants.

Numerical & Formal References

  • Law of effect decision matrix (Fig 3.3-1; conceptual).

  • Sign-tracking approach/withdrawal matrix (Fig 3.3-2).

  • Blocking & contingency demonstration (Fig 3.3-3).

  • Quantitative law of effect curve (Fig 3.3-4) & delay discounting curves (Fig 3.3-5).

  • Rescorla–Wagner equation: \Delta V_{CS}=\alpha\beta(\lambda-\Sigma V).

  • Reinforcement choice equation: B_1 = K\frac{R_1}{R_1+R_O}.

Connections to Foundational & Contemporary Research

  • Rescorla–Wagner (1972) foundation of error-driven learning.

  • Contemporary models include attention, memory priming, surprise (Pearce-Hall, Mackintosh, comparator hypotheses).

  • NMDA-dependent plasticity underlies extinction (D-cycloserine translational work by Davis et al., 2006).

  • Corticostriatal circuits differentiate goal-directed vs habitual control (Balleine & O’Doherty, 2010).

  • Inhibitory-learning model of exposure (Craske et al., 2014) integrates context & expectancy principles.

  • Contingency-management (Davis et al., 2016) applies quantitative law of effect to substance use treatment.

Study Tips & Mnemonics

  • "ABC" of behavior analysis = Antecedent (Pavlovian S), Behavior (R), Consequence (O).

  • BLOCKING: "Old predictor blocks new pretender."

  • PREMACK: “Grandma’s Rule” – first eat veggies (low-prob), then ice-cream (high-prob).

  • EXTINCTION relapse acronym "SNiRReR": Spontaneous recovery, Novel-context Renewal, Reinstatement, Rapid reacquisition.

Potential Exam Equations & Figures to Reproduce

  • \Delta V_{CS}=\alpha\beta(\lambda-\Sigma V) (Rescorla–Wagner)

  • B_1 = K\frac{R_1}{R_1+R_O} (Quantitative law of effect)

  • Hyperbolic delay discounting: V=\frac{A}{1+kD} where A = amount, D = delay, k = discount rate.
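
A short Python sketch of the hyperbolic discounting equation can show why preference reverses as rewards draw near. The amounts (50 vs 100), the 10-unit gap between rewards, and k = 0.2 are hypothetical values, and the function names are assumptions made for illustration.

```python
def discounted_value(amount, delay, k=0.2):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def preferred(delay_to_small, small=50.0, large=100.0, gap=10.0, k=0.2):
    """Compare a small-sooner reward with a larger reward arriving `gap` later."""
    v_small = discounted_value(small, delay_to_small, k)
    v_large = discounted_value(large, delay_to_small + gap, k)
    return "small-sooner" if v_small > v_large else "larger-later"


choice_now = preferred(delay_to_small=0.0)     # small reward available immediately
choice_ahead = preferred(delay_to_small=20.0)  # both rewards still far off

print(choice_now, choice_ahead)  # small-sooner larger-later
```

Under these assumptions the larger reward is preferred when both options are distant, but the smaller one wins once it becomes immediate. This is one way to read the advice to commit early, while the larger-later option still has the higher discounted value.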

Integrated Clinical Checklist

  • Identify if presenting behavior is respondent (CS-elicited) or operant (consequence-maintained).

  • Map antecedent CSs, operant Rs, outcomes Os, and context factors.

  • Consider potential blocking/inhibition histories that may influence current learning capacity.

  • During exposure:

    • Vary contexts; use expectancy violation; monitor spontaneous recovery.

  • During reinforcement-based interventions:

    • Schedule reinforcement (continuous reinforcement for acquisition, then thin gradually toward a variable-ratio schedule for persistence).

    • Bolster alternative reinforcers to manipulate R_O.

    • Use Premack assessments to individualize rewards.

  • Address habits:

    • Increase mindfulness/attention to shift S-R back to R-O control.

    • Introduce costs/punishers to degrade habit value.

  • Plan relapse-prevention by embedding retrieval cues and practicing in high-risk contexts.