Lecture Notes on Learning Theories: Nonassociative, Classical Conditioning, and Operant Conditioning

Overview and scope
- Today’s content centers on nonassociative learning and associative learning with two stimuli, i.e., classical conditioning. Observational learning is acknowledged but not covered in depth due to time; readers should consult textbook sections on observational learning.
- Historical context: Learning theory emerged in the early 20th century and was closely tied to behaviorism, which focused on stimuli and responses and treated the mind as a black box. The aim was to formulate universal learning principles that apply across species, emphasizing environment-driven behavior over internal mental states. This view is now known to have limits, but many of its principles remain valid.
- Three major types of learning discussed: nonassociative learning, classical conditioning (associative between two stimuli), and operant conditioning (behavior and its consequences). The two-stimulus associative form is the core focus of the lecture.
Nonassociative learning: habituation and sensitization
- Definition: Learning that occurs from repeated exposure to a single stimulus, altering the strength of the response.
- Habituation
- Core idea: Decrease in behavioral response to a repeatedly presented, non-threatening stimulus.
- Key distinction from sensory adaptation: You still perceive the stimulus, but you stop responding due to learning that it is not important or changing.
- Example: A constant air-conditioning hum may annoy at first, but after repeated exposure, you stop attending to it; attention re-emerges if the stimulus changes (a sudden stop can startle).
- Cognitive interpretation: The system learns that the stimulus is not informative about future events.
- Use in research: Habituation is used with infants to test perceptual discrimination by measuring response decrement to repeated stimuli.
- Sensitization
- Core idea: Increase in response to a stimulus, typically when the stimulus is threatening or painful.
- Opposite pattern to habituation; intensifies responsiveness when the stimulus is salient or harmful.
- Relationship to memory and attention
- Both are simple forms of learning deeply tied to memory of prior exposures and the predictive value of the stimulus.
- Practical implication
- Habituation allows focus on new information; constant, unchanging stimuli become less attended to over time.
Associative learning: introduction to classical conditioning
- Definition: A form of learning where a neutral stimulus becomes associated with a stimulus that already elicits a response, to produce a conditioned response.
- Key terms
- Unconditioned stimulus (US): Naturally elicits a response (e.g., meat powder causing salivation).
- Unconditioned response (UR): The natural reflex (salivation to US).
- Conditioned stimulus (CS): Neutral stimulus that, after pairing with US, elicits a response.
- Conditioned response (CR): The learned response to the CS.
- Classical conditioning basics (Pavlovian framework)
- Initial setup: A neutral stimulus (CS) is paired repeatedly with an unconditioned stimulus (US) that naturally elicits a response (UR).
- After acquisition, the CS alone elicits a conditioned response (CR).
- Example: Pavlov’s dogs salivate to meat (US → UR); a bell (CS) paired with meat eventually elicits salivation (CR).
- Key paradigmatic insights and consequences
- Second-order conditioning: Once a stimulus (CS1) has become an established CS, a new neutral stimulus (CS2) is paired with CS1 alone, without the US; CS2 can then evoke the CR on its own.
- Extinction: Repeated presentation of the CS without the US reduces the CR over time; the association weakens.
- Spontaneous recovery: After extinction, a rest period can lead to a reappearance of the CR upon CS re-presentation.
- Stimulus generalization: The CR transfers to stimuli similar to the CS (e.g., a dog conditioned to respond to 1000 Hz tone will also respond to nearby frequencies).
- Discrimination learning: With training, the generalization gradient becomes narrower, allowing the organism to distinguish between similar stimuli and respond mainly to the precise CS.
- Contingency and informativity: Conditioning is most robust when the CS provides information about the probability of the US. A CS with predictive value (P(US|CS) > P(US|not CS)) leads to stronger conditioning; a non-predictive CS yields weaker or no conditioning.
- Timing of CS–US pairing
  - Conditioning is strongest when the CS precedes the US; it is weaker when the two are presented simultaneously or when the US comes first.
  - This temporal arrangement makes the CS a warning signal for the upcoming US rather than a mere coincident event.
- Classical conditioning as a move beyond reflexes
- The mind is not simply a passive receiver of reflexes; expectations and preparation play a role in conditioned responses.
- Key examples and implications
- Fear and anxiety conditioning: A car accident (US) can condition fear responses to cues that accompanied it (CS), such as squeaky brakes or the scene itself.
- Hospital shots and fear: Medical contexts where neutral cues (nurse’s words like 'don’t worry, this won’t hurt') become predictors of pain, leading to conditioned fear responses.
- Music and emotion: A neutral musical cue followed by an aversive event can come to evoke anxious or fearful responses when heard later.
- Everyday life marketing and media: Conditioning concepts explain how brands create emotional associations with products; the CS predicts US-related feelings.
- The Little Albert study (Watson & Rayner)
- Description: A famous, ethically controversial demonstration of classical conditioning in a child: a neutral stimulus (white rat) paired with a loud noise to induce fear.
- Outcome: Albert developed fear responses to the rat and generalized fear to similar stimuli (e.g., a rabbit).
- Ethical concerns: Highly problematic by modern standards; taught as a cautionary example of research ethics and the potential for abuse in behaviorist approaches.
- Takeaway: Demonstrates how easily a fear response can be learned and generalized; highlights limitations of aggressive behaviorist methods.
- Biological constraints on conditioning (Garcia & Koelling, 1966)
- Taste aversion: Rats readily learned to avoid a taste that had been followed by illness (US), but did not learn to avoid a visual or auditory cue paired with illness.
- US-specific associations: Illness is readily associated with taste, whereas shock or pain is readily associated with visual and auditory cues.
- Implication: Not all stimuli can be equally conditioned with all USs; there are species-specific and stimulus-type predispositions that constrain learning.
- Contingency and predictive value in conditioning
- A CS must be informative about the likelihood of the US to produce conditioning.
- Example design: In a no-contingency baseline, shock follows with probability 0.4 whether or not the bell sounds, so P(Shock|Bell) = P(Shock|NoBell) and the bell carries no information about the shock. When the bell is predictive (P(Shock|Bell) > P(Shock|NoBell)), conditioning occurs and the CS becomes a predictor of the US.
- Summary: Conditioning is driven by the expectancy of the upcoming US, not merely by temporal contiguity.
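The contingency comparison above can be made concrete with a small calculation. The following is a minimal sketch, not a model from the lecture: the function name `contingency`, the trial format, and the trial counts are all hypothetical and chosen only for illustration. It estimates P(US|CS) and P(US|no CS) from a list of trials and reports their difference.

```python
from typing import Iterable, Tuple


def contingency(trials: Iterable[Tuple[bool, bool]]) -> Tuple[float, float, float]:
    """Estimate P(US|CS), P(US|no CS), and their difference.

    Each trial is a (cs_present, us_present) pair. The function name and
    the data format are illustrative assumptions, not part of the lecture.
    """
    cs_total = cs_with_us = no_cs_total = no_cs_with_us = 0
    for cs_present, us_present in trials:
        if cs_present:
            cs_total += 1
            cs_with_us += us_present
        else:
            no_cs_total += 1
            no_cs_with_us += us_present
    if cs_total == 0 or no_cs_total == 0:
        raise ValueError("need at least one trial with and one without the CS")
    p_us_given_cs = cs_with_us / cs_total
    p_us_given_no_cs = no_cs_with_us / no_cs_total
    return p_us_given_cs, p_us_given_no_cs, p_us_given_cs - p_us_given_no_cs


# Made-up data: shock follows 4 of 10 trials whether or not the bell sounds.
no_contingency = (
    [(True, True)] * 4 + [(True, False)] * 6
    + [(False, True)] * 4 + [(False, False)] * 6
)
# Made-up data: shock follows 4 of 10 bell trials but only 1 of 10 no-bell trials.
predictive_bell = (
    [(True, True)] * 4 + [(True, False)] * 6
    + [(False, True)] * 1 + [(False, False)] * 9
)

print(contingency(no_contingency))   # difference is zero: the bell is uninformative
print(contingency(predictive_bell))  # difference is positive: the bell predicts shock
```

In both data sets the bell is followed by shock equally often (0.4), so contiguity alone is identical; only the second set gives the bell predictive value.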
- Cognitive reinterpretation of conditioned responses
- The CR elicited by the CS is not simply a copy of the UR; it is a preparatory reaction that reflects anticipation of the US.
- Different CS–US pairings can yield distinct conditioned responses (e.g., fear conditioning often results in heart-rate deceleration and freezing rather than the UR pattern).
- Applications: Drug conditioning and tolerance
- Conditioned compensatory responses: Drug use often occurs in specific contexts; cues associated with drug use (syringe, paraphernalia, room) become conditioned stimuli that provoke conditioned reactions opposite to the drug’s unconditioned effects (e.g., dysphoria, increased sensitivity to pain).
- Consequences: This conditioning helps explain tolerance, craving, and increased overdose risk in unfamiliar environments lacking the conditioning cues.
Basic mechanisms that broaden beyond the lab: bridging to real-world scenarios
- Conditioning and advertising: Marketers use conditioning principles to attach positive feelings to brands; the conditioned stimulus (logo or jingle) is repeatedly paired with stimuli that already evoke pleasant emotional responses (pleasure, warmth).
- Contingency and consumer expectations: Advertisers aim to create informative cues that reliably predict positive outcomes to strengthen consumer associations.
Transition to operant conditioning (learning is about actions and consequences)
- Core idea: The relationship between a behavior and its consequences governs future likelihood of that behavior.
- Reinforcement vs punishment
- Reinforcement (increases probability of a response): positive reinforcement (adding something pleasant, e.g., food) or negative reinforcement (removing something unpleasant).
- Punishment (decreases probability of a response): positive punishment (adding something unpleasant) or negative punishment (removing something pleasant).
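The four combinations above follow from two yes/no questions: is something added or removed, and does the behavior become more or less likely? A minimal sketch of that two-by-two classification is shown here; the function name `classify_consequence` and its parameters are hypothetical, introduced only for illustration.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant procedure from two yes/no questions.

    stimulus_added: True if something is added after the response,
        False if something is removed.
    behavior_increases: True if the response becomes more likely afterwards.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"


# Food is delivered after a lever press; pressing becomes more frequent.
print(classify_consequence(stimulus_added=True, behavior_increases=True))
# An unpleasant noise stops after a lever press; pressing becomes more frequent.
print(classify_consequence(stimulus_added=False, behavior_increases=True))
```

Note that "positive" and "negative" refer only to adding or removing a stimulus, not to whether the outcome is pleasant.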
- The Law of Effect (Thorndike; popularized by Skinner)
- Core concept: Behavior is governed by its consequences; responses followed by rewards tend to increase in frequency.
- Classic demonstration: A hungry cat in a puzzle box must perform a behavior (pull a lever) to escape and obtain food. Across trials, the time to escape decreases gradually, indicating incremental trial-and-error learning rather than a sudden insight.
- B.F. Skinner and operant conditioning
- Skinner boxes: Experimental chambers where an animal can perform behaviors (e.g., press a lever) to receive reinforcement (food) or avoid punishment.
- Shaping and successive approximations: A process where gradually closer behaviors to the desired target are reinforced, shaping complex actions over time.
- Primary vs secondary reinforcement
  - Primary reinforcement: Innate rewards (e.g., food, warmth).
  - Secondary reinforcement: Conditioned rewards (e.g., a tone or 'well done' that has acquired value via association with primary rewards).
- Reinforcement schedules and their effects on behavior
- Continuous reinforcement (CRF): Reward after every correct response; often rapid initial learning but quicker extinction when rewards stop.
- Partial reinforcement schedules (PRF): Not rewarding every time, which often leads to more persistent responding after reinforcement stops.
  - Fixed ratio (FR): Reward after a fixed number of responses (e.g., FR-4 after four responses).
  - Variable ratio (VR): Reward after a variable number of responses around an average (e.g., VR-4; average four responses). This typically yields high, steady response rates and is highly resistant to extinction.
  - Fixed interval (FI): Reward after the first response following a fixed time interval (e.g., FI-4 minutes).
  - Variable interval (VI): Reward after the first response following a variable time interval (average around four minutes).
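The four partial schedules differ only in the rule that decides whether a given response is rewarded. The sketch below expresses each rule as a small Python class. It is an illustration under stated assumptions, not a standard implementation: the class names, the `respond` method, and the uniform distributions used for the variable schedules are all choices made here for simplicity.

```python
import random


class FixedRatio:
    """FR-n: reward every n-th response."""

    def __init__(self, n: int):
        self.n = n
        self.count = 0

    def respond(self, t: float = 0.0) -> bool:
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR-n: reward after a varying number of responses that averages n."""

    def __init__(self, n: int, rng: random.Random):
        self.n = n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self) -> int:
        # Assumption: requirement drawn uniformly from 1..(2n - 1), mean n.
        return self.rng.randint(1, 2 * self.n - 1)

    def respond(self, t: float = 0.0) -> bool:
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI-t: reward the first response at least `interval` after the last reward."""

    def __init__(self, interval: float):
        self.interval = interval
        self.last_reward = 0.0

    def respond(self, t: float) -> bool:
        if t - self.last_reward >= self.interval:
            self.last_reward = t
            return True
        return False


class VariableInterval:
    """VI-t: like FI, but the required wait varies around a mean."""

    def __init__(self, mean_interval: float, rng: random.Random):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last_reward = 0.0
        self.wait = self._draw()

    def _draw(self) -> float:
        # Assumption: wait drawn uniformly from 0..2 * mean, so the mean is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t: float) -> bool:
        if t - self.last_reward >= self.wait:
            self.last_reward = t
            self.wait = self._draw()
            return True
        return False


# FR-4: every fourth response is rewarded.
fr4 = FixedRatio(4)
print([fr4.respond() for _ in range(8)])
# -> [False, False, False, True, False, False, False, True]

# FI-4 minutes: responses at minutes 1, 3, 4, 5, 9; only the first response
# after each 4-minute interval has elapsed is rewarded.
fi4 = FixedInterval(4.0)
print([fi4.respond(t) for t in (1, 3, 4, 5, 9)])
# -> [False, False, True, False, True]
```

On ratio schedules the reward depends on how many responses are made, so faster responding earns rewards sooner; on interval schedules extra responses before the interval elapses earn nothing.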
- Consequences of reinforcement schedules
- Variable schedules (VR, VI) tend to produce steadier responding than fixed schedules (FR, FI), which typically show pauses after each reward; ratio schedules (FR, VR) generally produce higher response rates than interval schedules (FI, VI).
- Real-world parallel: Gambling often relies on a variable ratio schedule, where rewards are unpredictable, leading to sustained gambling behavior.
- Partial reinforcement extinction effect (PRE)
- With partial reinforcement, extinction (cessation of reward) happens more slowly than with continuous reinforcement; the behavior persists longer once rewards stop.
- Everyday example: Crying baby and parental response
  - If parents reinforce crying intermittently (partial reinforcement), crying tends to persist longer and is harder to extinguish than if the parents always responded immediately (continuous reinforcement).
  - Practical guidance: In parenting or therapy, responding consistently is important; responding on some occasions and not others can unintentionally prolong problematic behaviors.
- Cognitive perspectives and implications for learning
- Latent learning and cognitive maps: Organisms can acquire knowledge even without immediate reinforcement. For example, rats exploring a maze develop a cognitive map that aids later navigation even if they were not being reinforced at the time.
- These findings challenged strict behaviorist views, highlighting internal representations and planning.
- Contingency and control in human development
- Early studies show that a sense of control affects learning and motivation:
  - Babies who can control a mobile’s movement by their own actions show more positive affect and engagement than those who cannot.
  - Learned helplessness: When individuals experience lack of control over aversive events, they may develop passivity and depressive-like symptoms; later, when given the opportunity to control, some can relearn control but others persist in a passive state.
- Learned helplessness (classic experiment by Seligman and colleagues): Dogs exposed to unavoidable shocks later fail to escape when possible, suggesting that perceived lack of control can create enduring motivational deficits and depressive-like states in both animals and humans.
- Practical implications and ethical considerations
- Behavioral shaping and reinforcement strategies are powerful tools in education, therapy, animal training, and behavior modification.
- Ethical constraints: Historical experiments (e.g., Little Albert) illustrate the ethical concerns in psychological research; current standards emphasize minimizing harm, informed consent, and welfare.
Connections to foundational principles and real-world relevance
- The lecture emphasizes the evolution from strict behaviorism toward a more nuanced view incorporating cognitive processes, expectations, and contingency information.
- Practical relevance spans education (feedback schedules), marketing (conditioning of affective responses to brands), clinical psychology (fear conditioning, phobias), addiction science (cue-induced craving and relapse), and parenting strategies (reinforcement patterns).
Formulas and key notation (for quick reference)
- Classical conditioning basics
- US → UR
- During acquisition: CS + US → UR; after acquisition: CS → CR
- Extinction: CS alone reduces CR
- Spontaneous recovery: after rest, CS elicits CR again
- Contingency and information content
- Conditioning is stronger when: P(US|CS) > P(US|not CS)
- Generalization and discrimination (qualitative concepts)
- Generalization: CR extends to stimuli similar to CS
- Discrimination: Narrowing of generalization to the exact CS through differential training
- Conditioning timing and predictors
- Conditioned stimulus should precede the US for effective conditioning
- Reinforcement schedules (types and effects)
- FR: reinforcement after n responses
- VR: reinforcement after an average of n responses
- FI: reinforcement after the first response after a fixed interval
- VI: reinforcement after the first response after a variable interval
- Contingency principle in Pavlovian conditioning
- A CS must provide information about the likelihood of the US to produce a CR
- Learned helplessness and control
- Perceived control influences motivation and learning outcomes; lack of control can lead to depressive-like states
Summary of key takeaways
- Habituation and sensitization are the simplest forms of nonassociative learning, shaping how we respond to repeated or salient stimuli.
- Classical conditioning demonstrates how neutral cues can acquire predictive power to elicit reflexive responses; timing, contingency, generalization, discrimination, and higher-order conditioning influence the strength and scope of learned responses.
- The Little Albert study and Garcia–Koelling experiments illustrate both the power and limits of conditioning, highlighting ethical concerns and biological constraints.
- Operant conditioning emphasizes learning through reinforcement and punishment, with partial (intermittent) reinforcement schedules often producing more persistent behavior than continuous reinforcement.
- Cognitive insights (latent learning, cognitive maps, contingency awareness) challenge a purely stimulus–response view, showing how expectations, control, and internal representations influence learning.
- Real-world implications span education, marketing, health, addiction, and parenting, while acknowledging the role of context, cues, and control in shaping behavior.
Suggested readings and further exploration
- Classic experiments: Pavlov’s conditioning, Skinner’s operant conditioning with Skinner boxes, Garcia & Koelling taste aversion studies, Seligman’s learned helplessness, Tolman’s latent learning and cognitive maps, Watson & Rayner’s Little Albert study (for ethical critique).
- Topics to review in the textbook: observational learning (Bandura), higher-order conditioning, extinction and spontaneous recovery mechanisms, and modern cognitive theories on conditioning and expectation.