Classical Conditioning Theory Part 2

Rescorla–Wagner model: classical-conditioning learning rule formalised in the early 1970s.
Canonical equation: $\Delta V = \alpha \beta (\lambda - V)$
- $V$ = current associative strength.
- $\lambda$ = maximum associative strength the US can support.
- $\alpha,\,\beta$ = salience / learning-rate parameters for CS and US.
Essence: use feedback error ("what happened minus what was expected") to update associations.
Idea has been broadened far beyond animal conditioning into modern machine-learning, neural-prosthetics and artificial intelligence.
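A minimal sketch of the canonical equation, to make the trial-by-trial arithmetic concrete. The function names and the parameter values ($\alpha = 0.3$, $\beta = 1$, $\lambda = 1$) are illustrative assumptions, not values from the lecture.

```python
def rescorla_wagner_step(v, alpha, beta, lam):
    """Return associative strength V after one CS-US pairing."""
    prediction_error = lam - v  # what happened minus what was expected
    return v + alpha * beta * prediction_error


def simulate(n_trials, alpha=0.3, beta=1.0, lam=1.0, v0=0.0):
    """Return V before training (index 0) and after each of n_trials pairings."""
    history = [v0]
    for _ in range(n_trials):
        history.append(rescorla_wagner_step(history[-1], alpha, beta, lam))
    return history


for trial, v in enumerate(simulate(5)):
    print(f"trial {trial}: V = {v:.3f}")
```

Because each update is proportional to the remaining error $(\lambda - V)$, the increments shrink as $V$ approaches $\lambda$, which gives the familiar negatively accelerated acquisition curve.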

Participant: individual with high-level spinal-cord injury (no voluntary hand/leg movement).
Implant: “Utah” micro-electrode array (square, < penny-sized) inserted into arm/hand region of sensorimotor cortex.
- Dozens of needle-like electrodes record population spiking patterns.
Task progression
1. Cursor control phase (first ~70 days):
- Cursor starts at screen centre; participant tries to move it to illuminated goal targets.
- Machine-learning algorithm (foundation = Rescorla–Wagner-style error correction) maps cortical patterns → 2-D cursor velocity.
- Continuous feedback: if the generated pattern matches the stored template for “up”, the cursor moves up, and likewise for other directions.
- Performance visibly improves across training; occasional mistrials illustrate online model updating.
2. Prosthetic-hand phase:
- Same decoded signals routed to a robotic gripper.
- Participant learns to open/close hand, grasp objects ⇒ tangible improvement in activities of daily living.
Significance
- Demonstrates feedback-driven plasticity at cortical level.
- Provides real-world validation of error-based learning rules in neuro-prosthetics.
- Opens pathway toward combining recording + eventual stimulus-based rehabilitation (discussed later).
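The cursor-control mapping described above can be illustrated with a delta-rule (error-correcting) linear decoder. This is only a sketch under stated assumptions: the notes do not specify the actual algorithm used with the implant, and the channel count, learning rate, synthetic firing rates and the hypothetical "true" tuning matrix are all invented for illustration.

```python
import random


def decode(weights, rates):
    """Linear read-out: velocity[d] = sum_i weights[d][i] * rates[i]."""
    return [sum(w * r for w, r in zip(row, rates)) for row in weights]


def update(weights, rates, intended, lr=0.05):
    """Delta-rule step: move weights along the (intended - decoded) velocity error."""
    decoded = decode(weights, rates)
    for d, row in enumerate(weights):
        err = intended[d] - decoded[d]
        for i, r in enumerate(rates):
            row[i] += lr * err * r


def train(n_channels=8, n_trials=2000, seed=0):
    """Train on synthetic data; return learned weights and per-trial squared error."""
    rng = random.Random(seed)
    # Hypothetical ground-truth tuning, used only to generate synthetic targets.
    true_w = [[rng.uniform(-1, 1) for _ in range(n_channels)] for _ in range(2)]
    weights = [[0.0] * n_channels for _ in range(2)]  # 2 rows: x and y velocity
    errors = []
    for _ in range(n_trials):
        rates = [rng.random() for _ in range(n_channels)]  # fake firing rates
        intended = decode(true_w, rates)                   # where the cursor should go
        decoded = decode(weights, rates)                   # where the decoder sends it
        errors.append(sum((a - b) ** 2 for a, b in zip(intended, decoded)))
        update(weights, rates, intended)
    return weights, errors


weights, errors = train()
print("mean error, first 100 trials:", sum(errors[:100]) / 100)
print("mean error, last 100 trials: ", sum(errors[-100:]) / 100)
```

The update has the same shape as the conditioning rule: each weight changes in proportion to a learning rate, the input activity, and the difference between the obtained and the expected outcome.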

Watson (IBM) was trained on a massive corpus of historical Jeopardy! question–answer pairs ("training trials").
Straight keyword search failed (many absurd answers) → needed weight-adjustment algorithm based on expected vs obtained outcome.
Rescorla–Wagner analogy:
- Each candidate answer has weight $V_i$; feedback from correct/incorrect outcomes updates $V_i$.
Applications: meteorology (Weather Channel forecasting), customer-service chat-bots, medical diagnosis.
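The weight-adjustment analogy above can be sketched as a compound-cue Rescorla–Wagner update, in which every cue present on a trial shares one prediction error. This is not Watson's actual algorithm; the cue names (`keyword_match`, `category_match`), the learning rate and the training trials are hypothetical.

```python
def predict(weights, cues):
    """Summed associative strength of all cues present on this trial."""
    return sum(weights.get(c, 0.0) for c in cues)


def update(weights, cues, outcome, rate=0.2):
    """Shared-error update: each present cue changes by rate * (outcome - prediction)."""
    error = outcome - predict(weights, cues)
    for c in cues:
        weights[c] = weights.get(c, 0.0) + rate * error
    return error


weights = {}
# Hypothetical training trials: a keyword match alone leads to a wrong answer (0),
# a keyword match plus a category match leads to a correct answer (1).
trials = [({"keyword_match"}, 0.0), ({"keyword_match", "category_match"}, 1.0)]
for _ in range(200):
    for cues, outcome in trials:
        update(weights, cues, outcome)

print({cue: round(w, 3) for cue, w in weights.items()})
```

In this toy setting the weight for the keyword cue decays toward zero while the category cue absorbs the predictive value, which mirrors why plain keyword search was not sufficient and outcome-based weighting was needed.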
Pedagogical note: Documentary + guided questions are examinable content.

First human lab (1920s, pre-IRB): Clark Hull (in visor) slaps grad-student Ernest Hilgard’s face after a tone to induce blink.
Modern humane protocol (rabbits, humans):
- CS = pure tone.
- US = mild corneal air-puff.
- CR = anticipatory eyeblink.
Widely adopted because:
- Simple, quantifiable, high trial-throughput.
- Cerebellar circuitry well mapped, enabling fine neurobiological analysis.
- Diagnostic probe for clinical disorders affecting cerebellum, brainstem, or learning processes.

Tone activates auditory relay → pontine nuclei.
Pontine nuclei send:
- Excitatory collaterals to interpositus nucleus (deep cerebellar nucleus).
- Mossy-fibre projections to cerebellar cortex → granule cells → Purkinje cells.
Purkinje cells exert inhibitory (GABAergic) influence on interpositus.

Air-puff triggers trigeminal reflex AND ascends via brainstem → inferior olive.
Climbing-fibre outputs excite both Purkinje cells and interpositus nucleus.

Once interpositus activity surpasses threshold, its output drives red nucleus / cranial-facial motor nuclei → eyelid muscles.

Purkinje cells (receive CS via mossy + US via climbing fibres).
Interpositus nucleus (receives direct excitatory input from both pathways + inhibitory gating from Purkinje).

Focal electrolytic or pharmacological lesion of interpositus:
- Retrograde amnesia: previously learned CR abolished although reflex blink remains intact.
- Anterograde amnesia: post-lesion animals cannot acquire new CS–US association.
Purkinje cells distributed across cortex → global lesion impossible; later genetic knock-out / mutation models show parallel learning deficits, confirming their necessity.

Pre-training: CS evokes little firing; US gives small burst; blink occurs after US (pure reflex).
Post-training: CS alone produces gradual ramp-up of spikes peaking just before predicted US; blink now anticipatory.

Baseline: high tonic firing (inhibitory).
During CS after training: marked pause in firing ("Purkinje pause") coincident with interpositus ramp → removes inhibition → disinhibits interpositus → CR triggered.
After US time-point: Purkinje firing gradually resumes, reinstating inhibition.

Learning corresponds to CS-induced, time-specific decrease in Purkinje activity + complementary increase in interpositus firing.
Temporal precision encodes inter-stimulus interval.
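The disinhibition logic can be made concrete with a toy rate model: the interpositus receives constant CS excitation minus Purkinje inhibition, and a CR is triggered when the net drive crosses a threshold. All numbers (50 Hz baseline, 250 ms inter-stimulus interval, gains, threshold) are illustrative assumptions, and the linear pause is a simplification of the recorded firing profiles.

```python
CS_ON_MS = 0    # CS onset
US_MS = 250     # time at which the US is expected (inter-stimulus interval)


def purkinje_rate(t_ms, trained, baseline=50.0, pause_depth=0.9):
    """Purkinje firing (Hz): tonic baseline, with a learned pause deepening toward the US."""
    if not trained or not (CS_ON_MS <= t_ms <= US_MS):
        return baseline
    progress = (t_ms - CS_ON_MS) / (US_MS - CS_ON_MS)
    return baseline * (1.0 - pause_depth * progress)


def interpositus_drive(t_ms, trained, cs_excitation=30.0, inhibition_gain=0.8):
    """Net interpositus drive: CS excitation minus Purkinje inhibition, floored at zero."""
    excitation = cs_excitation if CS_ON_MS <= t_ms <= US_MS else 0.0
    return max(0.0, excitation - inhibition_gain * purkinje_rate(t_ms, trained))


def cr_onset(trained, threshold=10.0, step_ms=10):
    """Return the first time (ms) the drive reaches threshold, or None if it never does."""
    for t_ms in range(CS_ON_MS, US_MS + 1, step_ms):
        if interpositus_drive(t_ms, trained) >= threshold:
            return t_ms
    return None


print("untrained CR onset:", cr_onset(trained=False))
print("trained CR onset (ms after CS):", cr_onset(trained=True))
```

Before training, tonic Purkinje inhibition outweighs the CS excitation and no CR occurs; after training, the pause releases the interpositus partway through the interval, so the blink begins before the expected US.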

Test whether mere activation of neural pathways (no external tone or air-puff) is sufficient for acquisition, extinction, inhibition, spontaneous recovery.

US Replacement (Mauk et al., 1986)
- Group A: real air-puff (US).
- Group B: electrical stimulation (ES) of inferior olive (start of US pathway).
- Acquisition curves identical → ES-US is functionally equivalent.
CS Replacement
- ES-pontine nuclei used as CS; real air-puff as US.
- Rabbits acquired CR; showed normal extinction when ES-CS presented alone; could undergo inhibitory conditioning (CS− trials reduce responding).
Full Internal Pairing
- ES-pontine (CS) + ES-inferior-olive (US) paired with appropriate temporal offset.
- Animals never experience tone or puff, yet develop robust CR to pontine-stimulation alone.
- Exhibit standard phenomena: extinction, spontaneous recovery, reacquisition, delayed re-learning after explicit CS− training.
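As a rough illustration of why acquisition and extinction to a stimulated CS follow the usual curves, the same error-correction rule can be run in two phases: pairing ($\lambda = 1$) and CS-alone presentation ($\lambda = 0$). The learning rate and trial counts are assumptions. Note that the basic Rescorla–Wagner rule treats extinction as unlearning, so on its own it does not reproduce spontaneous recovery or faster reacquisition.

```python
def run_phase(v, n_trials, lam, rate=0.2):
    """Apply the error-correction update for n_trials; return V after each trial."""
    history = []
    for _ in range(n_trials):
        v += rate * (lam - v)  # same rule whether the CS/US are sensory or electrical
        history.append(v)
    return history


# Phase 1: ES-CS paired with ES-US (US present, so lambda = 1).
acquisition = run_phase(0.0, 30, lam=1.0)
# Phase 2: ES-CS alone (no US, so lambda = 0).
extinction = run_phase(acquisition[-1], 30, lam=0.0)

print(f"V at end of acquisition: {acquisition[-1]:.3f}")
print(f"V at end of extinction:  {extinction[-1]:.3f}")
```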

Learning rule operates on neural-activity patterns irrespective of sensory modality.
Supports feasibility of therapeutic “electrical rehabilitation” paradigms (e.g., spinal-cord plasticity, BCI closed-loop stimulation).

Excitatory cells (e.g., mossy-fibre targets, interpositus neurons): low baseline; fire bursts only when driven.
Inhibitory cells (Purkinje): high tonic rate; learning often manifests as decrease in their activity.
Net effect during CS after conditioning = shift of excitation/inhibition balance favouring motor output.

Eye-blink paradigm foundational for dissecting memory engrams; parallels seen in fear conditioning, habit learning, addiction models.
Ethical transition from face-slap (1920s) → humane air-puff underscores evolution of research oversight (IRB protocols).
Electrical-pathway conditioning foreshadows questions about autonomy, consent and potential misuse of neuro-stimulation technologies ("Matrix" analogy).

Builds upon earlier classical-conditioning concepts (CS, US, CR, extinction, spontaneous recovery, inhibitory conditioning).
Next lecture will extend conditioning framework to drug tolerance, dependence, and relapse.