Lecture Notes on Learning Theories: Nonassociative Learning, Classical Conditioning, and Operant Conditioning
Overview and scope
Today’s content centers on nonassociative learning and associative learning with two stimuli, i.e., classical conditioning. Observational learning is acknowledged but not covered in depth due to time; readers should consult textbook sections on observational learning.
Historical context: Learning theory emerged in the early 20th century, strongly tied to behaviorism, which focused on stimuli and responses and treated the mind as a black box. The aim was to formulate universal learning principles applicable across species, emphasizing environment-driven behavior over internal mental states. Current understanding recognizes limits to this view, but many principles remain valid.
Three major types of learning discussed: nonassociative learning, classical conditioning (associative between two stimuli), and operant conditioning (behavior and its consequences). The two-stimulus associative form is the core focus of the lecture.
Nonassociative learning: habituation and sensitization
Definition: Learning that occurs from repeated exposure to a single stimulus, altering the strength of the response.
Habituation
Core idea: Decrease in behavioral response to a repeatedly presented, non-threatening stimulus.
Key distinction from sensory adaptation: You still perceive the stimulus, but you stop responding due to learning that it is not important or changing.
Example: A constant air-conditioning hum may annoy at first, but after repeated exposure, you stop attending to it; attention re-emerges if the stimulus changes (a sudden stop can startle).
Cognitive interpretation: The system learns that the stimulus is not informative about future events.
Use in research: Habituation is used with infants to test perceptual discrimination by measuring response decrement to repeated stimuli.
Sensitization
Core idea: Increase in response to a stimulus, typically when the stimulus is threatening or painful.
Opposite pattern to habituation; intensifies responsiveness when the stimulus is salient or harmful.
Relationship to memory and attention
Both are simple forms of learning deeply tied to memory of prior exposures and the predictive value of the stimulus.
Practical implication
Habituation allows focus on new information; constant, unchanging stimuli become less attended to over time.
Associative learning: introduction to classical conditioning
Definition: A form of learning where a neutral stimulus becomes associated with a stimulus that already elicits a response, to produce a conditioned response.
Key terms
Unconditioned stimulus (US): Naturally elicits a response (e.g., meat powder causing salivation).
Unconditioned response (UR): The natural reflex (salivation to US).
Conditioned stimulus (CS): Neutral stimulus that, after pairing with US, elicits a response.
Conditioned response (CR): The learned response to the CS.
Classical conditioning basics (Pavlovian framework)
Initial setup: A neutral stimulus (CS) is paired repeatedly with an unconditioned stimulus (US) that naturally elicits a response (UR).
After acquisition, the CS alone elicits a conditioned response (CR).
Example: Pavlov’s dogs salivate to meat (US → UR); a bell (CS) paired with meat eventually elicits salivation (CR).
Key paradigmatic insights and consequences
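Second-order conditioning: Once a stimulus (CS1) has become a CS through pairing with the US, a new neutral stimulus (CS2) can be paired with CS1 alone; CS2 then comes to evoke the CR on its own, even though it was never paired with the US directly.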
Extinction: Repeated presentation of the CS without the US reduces the CR over time; the association weakens.
Spontaneous recovery: After extinction, a rest period can lead to a reappearance of the CR upon CS re-presentation.
Stimulus generalization: The CR transfers to stimuli similar to the CS (e.g., a dog conditioned to respond to 1000 Hz tone will also respond to nearby frequencies).
Discrimination learning: With training, the generalization gradient becomes narrower, allowing the organism to distinguish between similar stimuli and respond mainly to the precise CS.
Contingency and informativity: Conditioning is most robust when the CS provides information about the probability of the US. A CS with predictive value (P(US|CS) > P(US|not CS)) leads to stronger conditioning; a non-predictive CS yields weaker or no conditioning.
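The acquisition and extinction pattern described above can be sketched with a toy simulation. The lecture does not specify a quantitative model, so the update rule below is an assumption made only for illustration: associative strength moves a fixed fraction of the way toward 1.0 on CS+US trials and toward 0.0 on CS-alone trials. The names update_strength, run_trials, and learning_rate are hypothetical, and the sketch does not attempt to capture spontaneous recovery.

```python
def update_strength(strength, us_present, learning_rate=0.3):
    """Move associative strength a fraction of the way toward its target.

    Assumed rule (not from the lecture): the target is 1.0 when the US
    follows the CS and 0.0 when the CS is presented alone.
    """
    target = 1.0 if us_present else 0.0
    return strength + learning_rate * (target - strength)


def run_trials(strength, us_schedule, learning_rate=0.3):
    """Return the strength after each trial; us_schedule is a list of bools."""
    history = []
    for us_present in us_schedule:
        strength = update_strength(strength, us_present, learning_rate)
        history.append(strength)
    return history


# Acquisition: 10 CS+US pairings. Extinction: 10 CS-alone trials afterwards.
acquisition = run_trials(0.0, [True] * 10)
extinction = run_trials(acquisition[-1], [False] * 10)

print("strength after acquisition:", round(acquisition[-1], 3))
print("strength after extinction: ", round(extinction[-1], 3))
```

Under this assumed rule, strength climbs quickly at first and then levels off during acquisition, and it declines in the same negatively accelerated way during extinction.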
Timing of CS–US pairing
Conditioning is strongest when the CS precedes the US in time; it is weaker when the two are presented simultaneously or when the US precedes the CS.
This temporal arrangement makes the CS a warning signal for the upcoming US rather than a mere coincident event.
Classical conditioning as a move beyond reflexes
The mind is not simply a passive receiver of reflexes; expectations and preparation play a role in conditioned responses.
Key examples and implications
Fear and anxiety conditioning: A car accident (US) that co-occurs with particular cues (e.g., squeaky brakes or the scene) can condition fear responses to those cues (CS).
Hospital shots and fear: Medical contexts where neutral cues (nurse’s words like 'don’t worry, this won’t hurt') become predictors of pain, leading to conditioned fear responses.
Music and emotion: A neutral musical cue followed by an aversive event can come to evoke anxious or fearful responses when heard later.
Everyday life marketing and media: Conditioning concepts explain how brands create emotional associations with products; the CS predicts US-related feelings.
The Little Albert study (Watson & Rayner)
Description: A famous, ethically controversial demonstration of classical conditioning in a child: a neutral stimulus (white rat) paired with a loud noise to induce fear.
Outcome: Albert developed fear responses to the rat and generalized fear to similar stimuli (e.g., a rabbit).
Ethical concerns: Highly problematic by modern standards; taught as a cautionary example of research ethics and the potential for abuse in behaviorist approaches.
Takeaway: Demonstrates how easily a fear response can be learned and generalized; highlights limitations of aggressive behaviorist methods.
Biological constraints on conditioning (Garcia & Koelling, 1966)
Taste aversion: Rats learned to avoid a taste that had been followed by illness (US), but did not learn to avoid a visual or auditory cue that had been paired with illness.
Selective pairing with different USs: Illness is readily associated with taste, whereas shock or pain is readily associated with visual or auditory cues.
Implication: Not all stimuli can be equally conditioned with all USs; there are species-specific and stimulus-type predispositions that constrain learning.
Contingency and predictive value in conditioning
A CS must be informative about the likelihood of the US to produce conditioning.
Example design: In a baseline condition, a bell or no bell is followed by a shock with probability 0.4 in both cases (no contingency). When the bell is predictive (P(Shock|Bell) > P(Shock|NoBell)), conditioning occurs; the CS becomes a predictor of the US.
Summary: Conditioning is driven by the expectancy of the upcoming US, not merely by temporal contiguity.
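The contingency condition above can be made concrete by comparing the two conditional probabilities from trial counts. The sketch below is illustrative only: the function name contingency and all trial counts are hypothetical, chosen so that the baseline matches the 0.4 probability in the example design. The lecture gives no numbers for the predictive case, so the value P(Shock|NoBell) = 0.1 is an assumption.

```python
def contingency(us_with_cs, cs_trials, us_without_cs, no_cs_trials):
    """Return P(US|CS) - P(US|not CS), estimated from trial counts."""
    p_us_given_cs = us_with_cs / cs_trials
    p_us_given_no_cs = us_without_cs / no_cs_trials
    return p_us_given_cs - p_us_given_no_cs


# Baseline: shock follows 40 of 100 bell trials and 40 of 100 no-bell trials.
baseline = contingency(40, 100, 40, 100)

# Predictive bell (hypothetical counts): 40 of 100 bell trials end in shock,
# but only 10 of 100 no-bell trials do.
predictive = contingency(40, 100, 10, 100)

print("baseline contingency:  ", round(baseline, 2))
print("predictive contingency:", round(predictive, 2))
```

A difference of zero means the bell carries no information about the shock, so little or no conditioning is expected. A positive difference means the bell signals an increased likelihood of shock.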
Cognitive reinterpretation of conditioned responses
The CR produced by the CS is not simply a copy of the UR; it is a preparatory reaction reflecting anticipation of the US.
Different CS–US pairings can yield distinct conditioned responses (e.g., fear conditioning often results in heart-rate deceleration and freezing rather than the UR pattern).
Applications: Drug conditioning and tolerance
Conditioned compensatory responses: Drug use often occurs in specific contexts; cues associated with drug use (syringe, paraphernalia, room) become conditioned stimuli that provoke conditioned reactions opposite to the drug’s unconditioned effects (e.g., dysphoria, increased sensitivity to pain).
Consequences: This conditioning helps explain tolerance, craving, and increased overdose risk in unfamiliar environments lacking the conditioning cues.
Basic mechanisms that broaden beyond the lab: bridging to real-world scenarios
Conditioning and advertising: Conditioning principles are used to attach positive feelings to brands; a logo or jingle (CS) is paired with marketing content that already evokes pleasure or warmth, so that the brand cue alone comes to evoke those feelings.
Contingency and consumer expectations: Advertisers aim to create informative cues that reliably predict positive outcomes to strengthen consumer associations.
Transition to operant conditioning (learning is about actions and consequences)
Core idea: The relationship between a behavior and its consequences governs future likelihood of that behavior.
Reinforcement vs punishment
Reinforcement (increases probability of a response): positive reinforcement (adding something pleasant, e.g., food) or negative reinforcement (removing something unpleasant).
Punishment (decreases probability of a response): positive punishment (adding something unpleasant) or negative punishment (removing something pleasant).
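The four cases above form a two-by-two grid: whether a stimulus is added or removed, and whether that stimulus is pleasant or unpleasant. A minimal lookup sketch follows; the function names and string labels are hypothetical, chosen only for illustration.

```python
def classify_consequence(stimulus_change, stimulus_valence):
    """Name the consequence type.

    stimulus_change: "added" or "removed"
    stimulus_valence: "pleasant" or "unpleasant"
    """
    table = {
        ("added", "pleasant"): "positive reinforcement",
        ("removed", "unpleasant"): "negative reinforcement",
        ("added", "unpleasant"): "positive punishment",
        ("removed", "pleasant"): "negative punishment",
    }
    return table[(stimulus_change, stimulus_valence)]


def effect_on_behavior(consequence):
    """Reinforcement increases response probability; punishment decreases it."""
    return "increases" if consequence.endswith("reinforcement") else "decreases"


# Example: a seat-belt alarm stops (unpleasant stimulus removed) once you buckle up.
label = classify_consequence("removed", "unpleasant")
print(label, "->", effect_on_behavior(label), "the behavior")
```

"Positive" and "negative" here refer only to adding or removing a stimulus, not to whether the outcome is good or bad for the learner.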
The Law of Effect (Thorndike; popularized by Skinner)
Core concept: Behavior is governed by its consequences; responses followed by rewards tend to increase in frequency.
Classic demonstration: A hungry cat in a puzzle box must perform a behavior (pull a lever) to escape and obtain food. Across trials, the time to exit decreases, showing incremental learning without a clear insight jump.
B.F. Skinner and operant conditioning
Skinner boxes: Experimental chambers where an animal can perform behaviors (e.g., press a lever) to receive reinforcement (food) or avoid punishment.
Shaping and successive approximations: A process where gradually closer behaviors to the desired target are reinforced, shaping complex actions over time.
Primary vs secondary reinforcement
Primary reinforcement: Innate rewards (e.g., food, warmth).
Secondary reinforcement: Conditioned rewards (e.g., a tone or 'well done' that has acquired value via association with primary rewards).
Reinforcement schedules and their effects on behavior
Continuous reinforcement (CRF): Reward after every correct response; often rapid initial learning but quicker extinction when rewards stop.
Partial reinforcement schedules (PRF): Not rewarding every time, which often leads to more persistent responding after reinforcement stops.
Fixed ratio (FR): Reward after a fixed number of responses (e.g., FR-4: reward after every fourth response).
Variable ratio (VR): Reward after a variable number of responses around an average (e.g., VR-4; average four responses). This typically yields high, steady response rates and is highly resistant to extinction.
Fixed interval (FI): Reward after the first response following a fixed time interval (e.g., FI-4 minutes).
Variable interval (VI): Reward after the first response following a variable time interval (average around four minutes).
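The four partial schedules differ only in what has to happen before the next reward becomes available: a count of responses (ratio) or elapsed time (interval), either fixed or variable. The sketch below is one way to express that. Several details are assumptions rather than anything stated in the lecture: the function names, the uniform distributions used for the variable schedules (1 to 7 responses and 0 to 8 minutes, both averaging 4), and timing each interval from the previous rewarded response.

```python
import random


def ratio_schedule(n_responses, requirement_fn):
    """Return the 1-based indices of rewarded responses under a ratio schedule.

    requirement_fn() gives how many responses are needed for the next reward.
    """
    rewarded = []
    needed = requirement_fn()
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= needed:
            rewarded.append(i)
            count = 0
            needed = requirement_fn()
    return rewarded


def interval_schedule(response_times, interval_fn):
    """Return the times of rewarded responses under an interval schedule.

    The first response at or after the current interval's end is rewarded;
    the next interval is timed from that response (an assumption).
    """
    rewarded = []
    available_at = interval_fn()
    for t in response_times:
        if t >= available_at:
            rewarded.append(t)
            available_at = t + interval_fn()
    return rewarded


rng = random.Random(0)  # seeded so that repeated runs give the same draws

fr4 = ratio_schedule(12, lambda: 4)                    # [4, 8, 12]
vr4 = ratio_schedule(12, lambda: rng.randint(1, 7))    # average requirement of 4
fi4 = interval_schedule([1, 2, 5, 6, 7, 10], lambda: 4)  # [5, 10]
vi4 = interval_schedule([1, 2, 5, 6, 7, 10], lambda: rng.uniform(0, 8))

print("FR-4 rewarded responses:", fr4)
print("VR-4 rewarded responses:", vr4)
print("FI-4 rewarded times (min):", fi4)
print("VI-4 rewarded times (min):", vi4)
```

On the interval schedules, responding before the interval has elapsed earns nothing, whereas on the ratio schedules every response counts toward the next reward.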
Consequences of reinforcement schedules
Variable schedules (VR, VI) tend to produce steadier response rates than fixed schedules (FR, FI), which typically show a pause in responding right after each reward; ratio schedules generally produce higher response rates than interval schedules.
Real-world parallel: Gambling often relies on a variable ratio schedule, where rewards are unpredictable, leading to sustained gambling behavior.
Partial reinforcement extinction effect (PRE)
With partial reinforcement, extinction (cessation of reward) happens more slowly than with continuous reinforcement; the behavior persists longer once rewards stop.
Everyday example: Crying baby and parental response
If parents reinforce crying intermittently (partial reinforcement), crying tends to persist longer and is harder to extinguish than if the parents always responded immediately (continuous reinforcement).
Practical guidance: In parenting or therapy, using consistent reinforcement strategies is important; alternating strategies can unintentionally prolong problematic behaviors.
Cognitive perspectives and implications for learning
Latent learning and cognitive maps: Organisms can acquire knowledge even without immediate reinforcement. For example, rats exploring a maze develop a cognitive map that aids later navigation even if they were not being reinforced at the time.
These findings challenged strict behaviorist views, highlighting internal representations and planning.
Contingency and control in human development
Early studies show that a sense of control affects learning and motivation:
Babies who can control a mobile’s movement by their own actions show more positive affect and engagement than those who cannot.
Learned helplessness: When individuals experience lack of control over aversive events, they may develop passivity and depressive-like symptoms; later, when given the opportunity to control, some can relearn control but others persist in a passive state.
Learned helplessness (classic experiment by Seligman and colleagues): Dogs exposed to unavoidable shocks later fail to escape when possible, suggesting that perceived lack of control can create enduring motivational deficits and depressive-like states in both animals and humans.
Practical implications and ethical considerations
Behavioral shaping and reinforcement strategies are powerful tools in education, therapy, animal training, and behavior modification.
Ethical constraints: Historical experiments (e.g., Little Albert) illustrate the ethical concerns in psychological research; current standards emphasize minimizing harm, informed consent, and welfare.
Connections to foundational principles and real-world relevance
The lecture emphasizes the evolution from strict behaviorism toward a more nuanced view incorporating cognitive processes, expectations, and contingency information.
Practical relevance spans education (feedback schedules), marketing (conditioning of affective responses to brands), clinical psychology (fear conditioning, phobias), addiction science (cue-induced craving and relapse), and parenting strategies (reinforcement patterns).
Formulas and key notation (for quick reference)
Classical conditioning basics
US → UR
CS + US → UR (during acquisition); CS → CR (after acquisition)
Extinction: CS alone reduces CR
Spontaneous recovery: after rest, CS elicits CR again
Contingency and information content
Conditioning is stronger when: P(US|CS) > P(US|not CS)
Generalization and discrimination (qualitative concepts)
Generalization: CR extends to stimuli similar to CS
Discrimination: Narrowing of generalization to the exact CS through differential training
Conditioning timing and predictors
Conditioned stimulus should precede the US for effective conditioning
Reinforcement schedules (types and effects)
FR: reinforcement after n responses
VR: reinforcement after an average of n responses
FI: reinforcement after the first response after a fixed interval
VI: reinforcement after the first response after a variable interval
Contingency principle in Pavlovian conditioning
A CS must provide information about the likelihood of the US to produce a CR
Learned helplessness and control
Perceived control influences motivation and learning outcomes; lack of control can lead to depressive-like states
Summary of key takeaways
Habituation and sensitization are the simplest forms of nonassociative learning, shaping how we respond to repeated or salient stimuli.
Classical conditioning demonstrates how neutral cues can acquire predictive power to elicit reflexive responses; timing, contingency, generalization, discrimination, and higher-order conditioning influence the strength and scope of learned responses.
The Little Albert study and Garcia–Koelling experiments illustrate both the power and limits of conditioning, highlighting ethical concerns and biological constraints.
Operant conditioning emphasizes learning through reinforcement and punishment, with partial reinforcement schedules often producing more persistent behavior than continuous reinforcement.
Cognitive insights (latent learning, cognitive maps, contingency awareness) challenge a purely stimulus–response view, showing how expectations, control, and internal representations influence learning.
Real-world implications span education, marketing, health, addiction, and parenting, while acknowledging the role of context, cues, and control in shaping behavior.
Suggested readings and further exploration
Classic experiments: Pavlov’s conditioning, Skinner’s operant conditioning with Skinner boxes, Garcia & Koelling taste aversion studies, Seligman’s learned helplessness, Tolman’s latent learning and cognitive maps, Watson & Rayner’s Little Albert study (for ethical critique).
Topics to review in the textbook: observational learning (Bandura), higher-order conditioning, extinction and spontaneous recovery mechanisms, and modern cognitive theories on conditioning and expectation.