Models are used to construct approximate theories, perform experiments and simulations, and compare experimental results, simulation results, and theoretical predictions.
The goal is to compare and improve the model and theory.
Assumes stimuli gain association with the US based solely on contiguity.
Learning occurs according to a local error reduction rule.
Learning occurs to the extent that the CS does not predict the US accurately.
The goal of learning is to reduce predictive error.
When the CS fully and completely predicts the US, then no more learning occurs.
Shows associative value increasing over trials as predictive error decreases.
∆V = (λ - V_n)
λ refers to the total amount of conditioning that a US can support on a given trial.
V refers to the associative strength of the particular cue in question.
n refers to the trial number.
(λ - V) = predictive error.
Note: λ = 1 when US is present; λ = 0 when US is absent.
Goal of learning is to get ∆V = 0 → no more predictive error (the occurrence of the US is no longer surprising).
Acquisition (US present, λ = 1):
Trial 1: ∆V = (1 – 0.0) → predictive error is 1.0
Trial 2: ∆V = (1 – 0.1) → predictive error is 0.9
Trial 3: ∆V = (1 – 0.2) → predictive error is 0.8
…
Trial 30: ∆V = (1 – 1.0) → predictive error is 0.0 → Goal of learning has been achieved: the occurrence of the US is no longer surprising.
Extinction (US absent, λ = 0):
Trial 1: ∆V = (0 – 1.0) → predictive error is -1.0
Trial 2: ∆V = (0 – 0.9) → predictive error is -0.9
Trial 3: ∆V = (0 – 0.8) → predictive error is -0.8
…
Trial 30: ∆V = (0 – 0.0) → predictive error is 0.0 → Goal of learning has been achieved: the absence of the US is no longer surprising.
The amount and type of learning that occurs to a CS depends on how surprising the US is based on that CS alone.
Most learning occurs in early trials and grows at a negatively accelerated rate.
When US is no longer surprising, learning reaches asymptote.
Cannot explain blocking (A+ | AX+) in which there is perfect contiguity between X and the US but little responding is observed relative to the control group.
Also cannot explain overshadowing (AX+), which also has perfect contiguity between A and X and the US but less responding is observed relative to elemental training.
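The local error reduction rule, and its failure to produce blocking, can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from the notes: the `rate` parameter (the fraction of the error corrected on each trial) and all function names are assumptions introduced here so that learning is negatively accelerated.

```python
def local_update(v, lam, rate=0.1):
    """One trial of the local rule: move V toward lambda by a fraction of (lambda - V).

    `rate` is an assumed step size; the formula in the notes has no explicit rate.
    """
    return v + rate * (lam - v)


def train(v, lam, trials, rate=0.1):
    """Return V before the first trial and after each subsequent trial."""
    history = [v]
    for _ in range(trials):
        v = local_update(v, lam, rate)
        history.append(v)
    return history


def x_after_compound(v_a_start, trials=50):
    """AX+ training under the local rule: each cue's error ignores the other cue."""
    strengths = {"A": v_a_start, "X": 0.0}
    for _ in range(trials):
        strengths = {cue: local_update(v, 1.0) for cue, v in strengths.items()}
    return strengths["X"]


acquisition = train(0.0, lam=1.0, trials=50)             # US present
extinction = train(acquisition[-1], lam=0.0, trials=50)  # US absent

x_blocking_group = x_after_compound(v_a_start=1.0)  # A pretrained (A+ then AX+)
x_control_group = x_after_compound(v_a_start=0.0)   # no pretraining

print(round(acquisition[-1], 3), round(extinction[-1], 3))
print(x_blocking_group == x_control_group)
```

Because X's update never looks at A, the blocking and control groups end with the same value for X, which is why a purely local rule cannot account for blocking.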
Uses total error reduction instead of local error reduction.
Learning depends on the level of surprise of the US based on the predictive value of ALL cues present on a given trial (local error reduction is based on the predictive value of only the CS being tested).
Satisfactorily accounts for most cue competition situations including blocking and overshadowing.
∆V = αβ(λ - ∑V_n)
∑V (also noted as VT or V(present cues)) refers to the total associative strength of all CSs present on trial number n.
α is the learning rate parameter that refers to the associability of the CS.
β is the learning rate parameter that refers to the associability of the US.
Associability roughly corresponds to salience or intensity.
0 < α < 1
0 < β < 1
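A single trial of the total error rule ∆V = αβ(λ - ∑V) might be written as follows. This is a sketch under stated assumptions: the function name, the dictionary layout, and the example values of α and β are hypothetical and chosen only for illustration.

```python
def rescorla_wagner_trial(strengths, present, lam, alphas, beta):
    """Apply one trial of the total error rule to the cues listed in `present`.

    Returns the updated strengths and the predictive error shared by all cues.
    """
    total = sum(strengths[cue] for cue in present)  # sum of V over present cues
    error = lam - total
    updated = dict(strengths)
    for cue in present:
        updated[cue] = strengths[cue] + alphas[cue] * beta * error
    return updated, error


strengths = {"A": 0.0, "X": 0.0}
alphas = {"A": 0.5, "X": 0.2}  # assumed CS associabilities
strengths, error = rescorla_wagner_trial(
    strengths, ["A", "X"], lam=1.0, alphas=alphas, beta=0.5
)
print(strengths, error)
```

Every present cue is updated from the same error term, so the cues compete for a fixed amount of associative strength; only the size of each cue's step differs, through its own α.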
Formula: ∆V = (λ – ∑V_n)
Trial n: ∆V = (λ – [V1 + V2])
Trial 1: ∆V = (1 – [0.0 + 0.0]) → predictive error is 1.0
Trial 2: ∆V = (1 – [0.1 + 0.1]) → predictive error is 0.8
Trial 3: ∆V = (1 – [0.2 + 0.2]) → predictive error is 0.6
Trial 4: ∆V = (1 – [0.3 + 0.3]) → predictive error is 0.4
Trial 5: ∆V = (1 – [0.4 + 0.4]) → predictive error is 0.2
Trial 6: ∆V = (1 – [0.5 + 0.5]) → predictive error is 0.0
Goal of learning has been achieved, the occurrence of the US is no longer surprising
Goal of learning is to get ∆V = 0 → no more predictive error
When trained in compound, the associative value of all CSs cannot exceed the value of the US
X acquired negative associative strength to balance out A’s excitatory associative strength.
X = -1; A = +1
On the negative summation test, B = 1 and X = -1
On the retardation test, X = -1 and Y = 0
Group: CI Train | Transfer Train | Neg Sum Test | Ret Test
Exp: A+ / AX- | B+ | BX → cr | X+ → cr
Control: A+ / AX- | B+ | B → CR | Y+ → CR
∆V = (λ – ∑V_n)
Trial n: ∆V = (λ – [A + X])
Trial 1: ∆V = (0 – [1.0 + (0.0)]) → predictive error is -1.0
Trial 2: ∆V = (0 – [1.0 + (-0.1)]) → predictive error is -0.9
Trial 3: ∆V = (0 – [1.0 + (-0.2)]) → predictive error is -0.8
Trial 4: ∆V = (0 – [1.0 + (-0.3)]) → predictive error is -0.7
Trial 5: ∆V = (0 – [1.0 + (-0.4)]) → predictive error is -0.6
…
Trial 30: ∆V = (0 – [1.0 + (-1.0)]) → predictive error is 0.0
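The conditioned inhibition walk-through can be reproduced with a short simulation of interleaved A+ / AX- trials. The code below is a sketch only: `RATE` stands in for an assumed product αβ shared by both cues, and `v_b = 1.0` is an assumed asymptote for the transfer excitor B.

```python
RATE = 0.2  # assumed product of alpha and beta, the same for both cues


def trial(strengths, present, lam, rate=RATE):
    """One total-error trial: each present cue changes by rate * (lam - sum of V)."""
    error = lam - sum(strengths[c] for c in present)
    for c in present:
        strengths[c] += rate * error
    return error


strengths = {"A": 0.0, "X": 0.0}
for _ in range(200):                       # interleaved A+ / AX- trials
    trial(strengths, ["A"], lam=1.0)       # A+  (US present)
    trial(strengths, ["A", "X"], lam=0.0)  # AX- (US absent)

v_b = 1.0  # assumed strength of B after B+ transfer training
summation_prediction = v_b + strengths["X"]  # BX on the negative summation test
print(round(strengths["A"], 3), round(strengths["X"], 3), round(summation_prediction, 3))
```

Under these assumptions A settles near +1 and X near -1, so the summed prediction for the BX compound is close to zero, in line with the weak response (cr) expected on the summation test.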
Reinforcing two independently trained conditioned excitors in compound results in less responding to each CS individually than reinforcing the CS in compound with a neutral stimulus (Con1) or reinforcing the two CSs elementally (Con2).
The animal expected a US of magnitude 2 (∑V = 2) during AX trials but received only λ = 1.
The resulting negative predictive error reduces the associative strength of X (and of A).
Group: Phase 1 | Phase 2 | Test
Exp: A+ / X+ | AX+ | X → cr
Con1: B+ / X+ | AX+ | X → CR
Con2: A+ / X+ | A+ / X+ | X → CR
∆V = (λ – ∑V_n)
Trial n: ∆V = (λ – [A + X])
Trial 1: ∆V = (1 – [1.0 + 1.0]) → predictive error is -1.0
Trial 2: ∆V = (1 – [0.9 + 0.9]) → predictive error is -0.8
Trial 3: ∆V = (1 – [0.8 + 0.8]) → predictive error is -0.6
Trial 4: ∆V = (1 – [0.7 + 0.7]) → predictive error is -0.4
Trial 5: ∆V = (1 – [0.6 + 0.6]) → predictive error is -0.2
Trial 6: ∆V = (1 – [0.5 + 0.5]) → predictive error is 0.0
Presenting either CS A or X by itself will now yield a smaller conditioned response (cr)
Goal of learning is to get ∆V = 0 → no more predictive error
When trained in compound, the associative value of all CSs cannot exceed the value of the US
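The overexpectation effect described above can be checked numerically. In this sketch, `rate` is an assumed αβ shared by both cues, so the per-trial step sizes differ from the rounded values in the walk-through, but the end point is the same.

```python
def compound_trial(v_a, v_x, lam=1.0, rate=0.1):
    """One reinforced AX+ trial; `rate` is an assumed alpha * beta for both cues."""
    error = lam - (v_a + v_x)
    return v_a + rate * error, v_x + rate * error, error


v_a, v_x = 1.0, 1.0  # both cues start as fully trained excitors
errors = []
for _ in range(50):
    v_a, v_x, error = compound_trial(v_a, v_x)
    errors.append(error)

print(round(v_a, 3), round(v_x, 3), errors[0])
```

The first error is negative because the summed prediction exceeds λ, and both cues lose strength until their sum matches the US, leaving each at roughly half its starting value.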
Total error reduction rule in blocking (assume A was a previously trained excitor with 0.8 value):
∆V = (λ – ∑V_n)
Trial n: ∆V = (λ – [A + X])
Trial 1: ∆V = (1 – [0.8 + 0.0]) → predictive error is 0.2
Trial 2: ∆V = (1 – [0.8 + 0.1]) → predictive error is 0.1
Trial 3: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Trial 4: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Trial 5: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Trial 6: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Previously trained excitor Cue A has blocked Cue X from gaining any more excitatory value
Goal of learning is to get ∆V = 0 → no more predictive error
When trained in compound, the associative value of all CSs cannot exceed the value of the US
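Blocking under the total error rule can be simulated by comparing a group given A+ before AX+ with a group given AX+ only. This is a minimal sketch: the trial counts and `rate` (an assumed αβ for every cue) are illustrative, and here A is trained to near asymptote instead of the 0.8 assumed in the walk-through.

```python
def reinforce(strengths, present, trials, lam=1.0, rate=0.2):
    """Run reinforced trials; `rate` is an assumed alpha * beta for every cue."""
    for _ in range(trials):
        error = lam - sum(strengths[c] for c in present)
        for c in present:
            strengths[c] += rate * error
    return strengths


blocking = {"A": 0.0, "X": 0.0}
reinforce(blocking, ["A"], trials=30)       # Phase 1: A+
reinforce(blocking, ["A", "X"], trials=30)  # Phase 2: AX+

control = {"A": 0.0, "X": 0.0}
reinforce(control, ["A", "X"], trials=30)   # AX+ only, no pretraining

print(round(blocking["X"], 3), round(control["X"], 3))
```

Because A already predicts the US when the compound is introduced, almost no error remains for X to absorb in the blocking group, whereas in the control group X shares the full error with A.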
Total error reduction rule in overshadowing (assume A is more salient):
∆V = (λ – ∑V_n)
Trial n: ∆V = (λ – [A + X])
Trial 1: ∆V = (1 – [0.0 + 0.0]) → predictive error is 1.0
Trial 2: ∆V = (1 – [0.4 + 0.1]) → predictive error is 0.5
Trial 3: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Trial 4: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Trial 5: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Trial 6: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
The more salient Cue A has overshadowed the less salient Cue X, preventing it from gaining any more excitatory value
Goal of learning is to get ∆V = 0 → no more predictive error
When trained in compound, the associative value of all CSs cannot exceed the value of the US
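Overshadowing falls out of the same rule once the two cues are given different α values. The sketch below assumes α = 0.4 for A, α = 0.1 for X, and β = 1; these numbers are illustrative choices, not values stated in the notes.

```python
ALPHAS = {"A": 0.4, "X": 0.1}  # assumed saliences: A is more salient than X
BETA = 1.0                      # assumed US learning rate


def reinforce(strengths, present, trials, lam=1.0):
    """Run reinforced trials; each cue's step is scaled by its own alpha."""
    for _ in range(trials):
        error = lam - sum(strengths[c] for c in present)
        for c in present:
            strengths[c] += ALPHAS[c] * BETA * error
    return strengths


compound = reinforce({"A": 0.0, "X": 0.0}, ["A", "X"], trials=100)  # AX+
elemental = reinforce({"X": 0.0}, ["X"], trials=100)                # X+ alone

print(round(compound["A"], 3), round(compound["X"], 3), round(elemental["X"], 3))
```

With these assumed saliences the compound-trained cues settle near 0.8 and 0.2, matching the asymptotes in the walk-through, while X trained on its own approaches the full value of the US.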
Spontaneous recovery
Latent inhibition (CS pre-exposure)
Pav C.I. Train: A+ / AX-
CI Ext: X-
Test
Exp: X → Con
Con: X → cr
Rescorla-Wagner model during Phase CI Ext, Trial 1: ∆V = (0 – [-1.0]) → predictive error is 1.0.
PavCI Train: A+ / AX-
Acq: XY-
Test
Exp: Y → Con
Con: Y → cr
R-W model during Phase Acq, Trial n: ∆V = (λ – [X + Y])
Trial 1: ∆V = (0 – [-1.0 + 0.0]) → predictive error is 1.0
Trial 2: ∆V = (0 – [-1.0 + 0.1]) → predictive error is 0.9
Trial 3: ∆V = (0 – [-1.0 + 0.2]) → predictive error is 0.8
…
Trial 30: ∆V = (0 – [-1.0 + 1.0]) → predictive error is 0.0
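Both predictions above (an inhibitor presented alone, and an inhibitor presented with a novel cue, each without the US) can be simulated with the total error rule. This is a sketch with an assumed `rate` (αβ) shared by all cues. Under that assumption the positive error is split between X and Y, so the asymptotes differ from the simplified walk-through, in which only Y changes.

```python
def nonreinforced(strengths, present, trials, rate=0.1):
    """Run nonreinforced trials (lambda = 0); `rate` is an assumed alpha * beta."""
    for _ in range(trials):
        error = 0.0 - sum(strengths[c] for c in present)
        for c in present:
            strengths[c] += rate * error
    return strengths


# CI Ext: the inhibitor X (V = -1.0) presented alone without the US.
alone = nonreinforced({"X": -1.0}, ["X"], trials=100)

# Acq: the inhibitor X presented together with a novel cue Y without the US.
paired = nonreinforced({"X": -1.0, "Y": 0.0}, ["X", "Y"], trials=100)

print(round(alone["X"], 3), round(paired["X"], 3), round(paired["Y"], 3))
```

In both cases the model predicts a positive error on nonreinforced trials: X presented alone drifts back toward zero, and Y gains excitatory strength despite never being paired with the US.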
Change in response to the target CS as a function of manipulating the associative status of a related CS.
Example:
Phase 1: A+
Phase 2: AX+
Phase 3: A-
Test: X → CR
The Rescorla-Wagner model during Phase 2:
Trial 30: ∆V = (1 – [0.8 + 0.2]) → predictive error is 0.0
Not predicted by R-W model
Rescorla-Wagner model assumes that changes in the associative status of the CS occur ONLY when the CS is present, i.e., α > 0.
Retrospective revaluation shows changes in the associative status of the target CS when it is absent, i.e., α = 0.
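The reason the model cannot produce retrospective revaluation is visible in a simulation: cues that are absent on a trial are never updated. The sketch below uses an assumed `rate` (αβ) and illustrative trial counts.

```python
def run_phase(strengths, present, lam, trials, rate=0.2):
    """Total-error updates for present cues only; absent cues are left untouched."""
    for _ in range(trials):
        error = lam - sum(strengths[c] for c in present)
        for c in present:
            strengths[c] += rate * error
    return strengths


v = {"A": 0.0, "X": 0.0}
run_phase(v, ["A"], lam=1.0, trials=5)        # Phase 1: A+
run_phase(v, ["A", "X"], lam=1.0, trials=30)  # Phase 2: AX+
x_after_phase2 = v["X"]
run_phase(v, ["A"], lam=0.0, trials=30)       # Phase 3: A-
x_after_phase3 = v["X"]

print(x_after_phase2 == x_after_phase3, round(v["A"], 3))
```

Extinguishing A in Phase 3 drives A toward zero but leaves X exactly where it was after Phase 2, so the model predicts no change in responding to X at test.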
Non-competitive learning.
Contiguity is necessary and sufficient for learning.
We associate anything that is presented together (i.e., has good contiguity), but we don’t express all learning.
The predictive value of a CS is compared to all other stimuli it is associated with, which are also associated with the US.
Other CSs are called “comparator stimuli”.
Behaviour dependent on relative status of stimuli at time of testing.
If the target CS (X) is more strongly associated with the US than other comparator stimuli, then responding to X is strong.
If another comparator stimulus (Y) is more strongly associated with the US than target CS X, then responding to X is weak.
Which CS is more strongly associated (and therefore will more strongly control behaviour) is determined by the comparator process.
Involves direct and indirect activation of the US representation.
Target CS (X) directly activates the US representation (Link 1).
The target CS also activates the representation of a comparator stimulus, such as the context (Link 2), which in turn indirectly activates the US representation (Link 3).
Comparator term calculated via Link 2 × Link 3.
Phase 1: 5XA+
Phase 2: 20A+
Link 1 does not get weaker, but it is weak relative to the comparator term (links 2 and 3).
The context, due to its lower salience, becomes a second-order comparator stimulus.
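The comparison between the direct link and the comparator term can be illustrated numerically. This is a sketch only: the subtractive comparison in `predicted_response` is an assumed simplification, and all link strengths are hypothetical values chosen to mirror the example above, where further A+ training strengthens Link 3 while Link 1 stays the same.

```python
def comparator_term(link2, link3):
    """Indirect activation of the US: Link 2 (target-comparator) times Link 3 (comparator-US)."""
    return link2 * link3


def predicted_response(link1, link2, link3):
    """Assumed subtractive comparison: direct activation minus the comparator term."""
    return link1 - comparator_term(link2, link3)


# Hypothetical link strengths for target X with comparator A.
before = predicted_response(link1=0.5, link2=0.8, link3=0.25)  # weak A-US link
after = predicted_response(link1=0.5, link2=0.8, link3=1.0)    # A-US link strengthened

print(before, after)
```

Link 1 is identical in both calls; only the comparator term grows, which is enough to lower the predicted response to X.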
Phase 1: A+
Phase 2: AX+
Phase 3: A-
Test: X → Cr
X: Directly activated US representation
A: Comparator process
Indirectly activated US representation via Links 1, 2, and 3
Phase 1: AX+
Phase 2: A-
Test: X → Cr
Directly activated US representation
Comparator process
Indirectly activated US representation
Mackintosh's (1975) Model
Pearce & Hall's (1980) Model
The outcome of a trial will determine how much attention is given to the CS on the next trial.
Attention modulates learning.
Pearce & Hall proposed that attention is determined by how surprising the US was on the preceding trial.
More surprising means more attention is paid on the subsequent trial.
Mackintosh proposed that attention increases to cues that are reliable predictors of the US.
Looking for action – attention that a stimulus commands after it has become a good predictor of the US and can generate a CR with minimal cognitive effort.
Similar to Mackintosh’s attentional mechanism.
Looking for learning – attention that is involved in processing cues that are not yet good predictors of the US and therefore have much to be learned about.
Similar to Pearce & Hall’s attentional mechanism.
Looking for liking – attention that stimuli command because of their emotional value.
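The idea that surprise on one trial sets attention on the next can be sketched as a simple loop. This is a deliberately simplified illustration and not the published equations of either attentional model: it assumes that attention on the next trial equals the absolute predictive error on the current trial, and that V changes by `salience * alpha * error`. All names and parameter values are hypothetical.

```python
def simulate_surprise_driven_attention(trials, lam=1.0, salience=0.5):
    """Track associative strength V and attention alpha across reinforced trials.

    Assumed simplification: alpha on the next trial is the absolute
    predictive error on the current trial.
    """
    v, alpha = 0.0, 1.0  # a novel cue is assumed to start with full attention
    history = []
    for _ in range(trials):
        error = lam - v
        history.append((v, alpha))
        v += salience * alpha * error
        alpha = abs(error)  # more surprise now means more attention next trial
    return history


history = simulate_surprise_driven_attention(trials=10)
alphas = [alpha for _, alpha in history]
print([round(a, 3) for a in alphas])
```

As the US becomes better predicted, the error shrinks and attention to the cue declines with it, which is the pattern described for cues that no longer have much to be learned about.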
The CS-US interval (ISI) is important for strength and rate of learning and CR (think about trace conditioning).
When something occurs is as important as what occurs.
The intertrial interval (ITI) can influence conditioning.
Generally, longer ITIs lead to better conditioning than massed trials (i.e., short ITIs).
ITI and ISI interact to determine responding.
Responding is determined by the length of the ISI relative to the length of the ITI.
Not absolute value.
Organisms compare how long they have to wait for the US during the CS (T) relative to how long they wait for the US during the intertrial interval (I).
When the US waiting time during the CS is shorter than during the ITI, the I/T ratio is high.
The CS becomes an informative predictor of the next occurrence of the US.
Strong responding to the CS is observed.
When the US waiting time during the CS is longer than or similar to during the ITI, the I/T ratio is low.
The CS provides little information about the next US.
Weak responding to the CS is observed.
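The I/T comparison can be made concrete with a small helper. The function name and the example waiting times (in seconds) are hypothetical and serve only to illustrate the ratio.

```python
def it_ratio(iti_wait, cs_wait):
    """Ratio of the US waiting time in the intertrial interval (I) to that in the CS (T)."""
    if cs_wait <= 0:
        raise ValueError("cs_wait must be positive")
    return iti_wait / cs_wait


informative = it_ratio(iti_wait=60.0, cs_wait=10.0)    # short wait during the CS
uninformative = it_ratio(iti_wait=10.0, cs_wait=10.0)  # similar waits in CS and ITI

print(informative, uninformative)
```

A larger ratio corresponds to a CS that signals a much shorter wait for the US than the background does, and therefore to stronger predicted responding.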
The idea that organisms compare the relative predictiveness/informative value of the CS and ITI at the time of testing is similar to the idea of the Comparator Hypothesis.
Time is encoded as part of the association between two stimuli.
We know this from observation of inhibition of delay.
Studies suggest that organisms do acquire CS-US associations in simultaneous and backward conditioning procedures.
But this learning is only expressed in a predictive relationship.
Responding to a CS must be assessed by an anticipatory measure.
Associative learning theories are simplified models of how we learn and how that learning determines behaviour.
The Rescorla-Wagner model assumes that learning depends on how surprising the outcome is, given the associative value of all stimuli present.
Comparator Hypothesis is a model of performance. It assumes that stimuli compete at the time of testing for behavioural control based on their relative informative value.
Attention models assume that the outcome of a trial will determine attention, which modulates learning on subsequent trials.
Timing models incorporate time into the learned association, which influences behaviour.