Lecture Notes on Learning Theories: Nonassociative, Classical Conditioning, and Operant Conditioning

  • Overview and scope

    • Today’s content centers on nonassociative learning and associative learning with two stimuli, i.e., classical conditioning. Observational learning is acknowledged but not covered in depth due to time; readers should consult textbook sections on observational learning.

    • Historical context: Learning theory emerged in the early 20th century, strongly tied to behaviorism, which focused on stimuli and responses and treated mind as a black box. The aim was to formulate universal learning principles applicable across species, emphasizing environment-driven behavior over internal mental states. Current understanding recognizes limits to this view, but many principles remain valid.

    • Three major types of learning discussed: nonassociative learning, classical conditioning (association between two stimuli), and operant conditioning (association between a behavior and its consequences). The two-stimulus form is the core focus of the lecture.

  • Nonassociative learning: habituation and sensitization

    • Definition: Learning that occurs from repeated exposure to a single stimulus, altering the strength of the response.

    • Habituation

    • Core idea: Decrease in behavioral response to a repeatedly presented, non-threatening stimulus.

    • Key distinction from sensory adaptation: You still perceive the stimulus, but you stop responding due to learning that it is not important or changing.

    • Example: A constant air-conditioning hum may annoy at first, but after repeated exposure, you stop attending to it; attention re-emerges if the stimulus changes (a sudden stop can startle).

    • Cognitive interpretation: The system learns that the stimulus is not informative about future events.

    • Use in research: Habituation is used with infants to test perceptual discrimination by measuring response decrement to repeated stimuli.

    • Sensitization

    • Core idea: Increase in response to a stimulus, typically when the stimulus is threatening or painful.

    • Opposite pattern to habituation; intensifies responsiveness when the stimulus is salient or harmful.

    • Relationship to memory and attention

    • Both are simple forms of learning deeply tied to memory of prior exposures and the predictive value of the stimulus.

    • Practical implication

    • Habituation allows focus on new information; constant, unchanging stimuli become less attended to over time.

  • Associative learning: introduction to classical conditioning

    • Definition: A form of learning where a neutral stimulus becomes associated with a stimulus that already elicits a response, to produce a conditioned response.

    • Key terms

    • Unconditioned stimulus (US): Naturally elicits a response (e.g., meat powder causing salivation).

    • Unconditioned response (UR): The natural reflex (salivation to US).

    • Conditioned stimulus (CS): Neutral stimulus that, after pairing with US, elicits a response.

    • Conditioned response (CR): The learned response to the CS.

    • Classical conditioning basics (Pavlovian framework)

    • Initial setup: A neutral stimulus (CS) is paired repeatedly with an unconditioned stimulus (US) that naturally elicits a response (UR).

    • After acquisition, the CS alone elicits a conditioned response (CR).

    • Example: Pavlov’s dogs salivate to meat (US → UR); a bell (CS) paired with meat eventually elicits salivation (CR).

    • Key paradigmatic insights and consequences

    • Second-order conditioning: A new neutral stimulus (CS2) is paired with an already established CS (CS1), without the US; CS2 can then evoke the CR on its own.

    • Extinction: Repeated presentation of the CS without the US reduces the CR over time; the association weakens.

    • Spontaneous recovery: After extinction, a rest period can lead to a reappearance of the CR upon CS re-presentation.

    • Stimulus generalization: The CR transfers to stimuli similar to the CS (e.g., a dog conditioned to respond to 1000 Hz tone will also respond to nearby frequencies).

    • Discrimination learning: With training, the generalization gradient becomes narrower, allowing the organism to distinguish between similar stimuli and respond mainly to the precise CS.

    • Contingency and informativity: Conditioning is most robust when the CS provides information about the probability of the US. A CS with predictive value (P(US|CS) > P(US|not CS)) leads to stronger conditioning; a non-predictive CS yields weaker or no conditioning.

    • Timing of CS–US pairing

      • Conditioning is strongest when the CS precedes the US, rather than when the two are simultaneous or the US precedes the CS.

      • This temporal arrangement makes the CS a warning signal for the upcoming US rather than a mere coincident event.

    • Classical conditioning as a move beyond reflexes

    • The mind is not simply a passive receiver of reflexes; expectations and preparation play a role in conditioned responses.

    • Key examples and implications

    • Fear and anxiety conditioning: A car accident (US) that occurs together with particular cues (e.g., squeaky brakes or the scene) can condition fear responses to those cues (CS).

    • Hospital shots and fear: Medical contexts where neutral cues (nurse’s words like 'don’t worry, this won’t hurt') become predictors of pain, leading to conditioned fear responses.

    • Music and emotion: A neutral musical cue followed by an aversive event can come to evoke anxious or fearful responses when heard later.

    • Everyday life marketing and media: Conditioning concepts explain how brands create emotional associations with products; the CS predicts US-related feelings.

    • The Little Albert study (Watson & Rayner)

    • Description: A famous, ethically controversial demonstration of classical conditioning in a child: a neutral stimulus (white rat) paired with a loud noise to induce fear.

    • Outcome: Albert developed fear responses to the rat and generalized fear to similar stimuli (e.g., a rabbit).

    • Ethical concerns: Highly problematic by modern standards; taught as a cautionary example of research ethics and the potential for abuse in behaviorist approaches.

    • Takeaway: Demonstrates how easily a fear response can be learned and generalized; highlights limitations of aggressive behaviorist methods.

    • Biological constraints on conditioning (Garcia & Koelling, 1966)

    • Taste aversion: Rats readily learned to avoid a taste that had been paired with illness (US), but did not learn to avoid a visual or auditory cue paired with illness.

    • Selective associations: Illness is readily associated with taste, whereas shock or pain is readily associated with visual/auditory cues.

    • Implication: Not all stimuli can be equally conditioned with all USs; there are species-specific and stimulus-type predispositions that constrain learning.

    • Contingency and predictive value in conditioning

    • A CS must be informative about the likelihood of the US to produce conditioning.

    • Example design: In a baseline condition, a bell or no bell is followed by a shock with probability 0.4 in both cases (no contingency). When the bell is predictive (P(Shock|Bell) > P(Shock|NoBell)), conditioning occurs; the CS becomes a predictor of the US.

    • Summary: Conditioning is driven by the expectancy of the upcoming US, not merely by temporal contiguity.
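
The contingency comparison above can be sketched in a few lines of Python. This is a minimal illustration, not a description of the original experiment: the function name `contingency`, the trial counts, and the 0.1 shock rate without the bell in the predictive condition are assumptions; only the 0.4 baseline probability comes from the notes.

```python
def contingency(us_with_cs, cs_trials, us_without_cs, no_cs_trials):
    """Return (P(US|CS), P(US|not CS), difference) from trial counts."""
    p_us_given_cs = us_with_cs / cs_trials
    p_us_given_no_cs = us_without_cs / no_cs_trials
    return p_us_given_cs, p_us_given_no_cs, p_us_given_cs - p_us_given_no_cs

# Baseline from the notes: shock follows 40% of trials with or without the bell.
baseline = contingency(us_with_cs=40, cs_trials=100,
                       us_without_cs=40, no_cs_trials=100)

# Hypothetical predictive condition (the 10-in-100 figure is an assumption).
predictive = contingency(us_with_cs=40, cs_trials=100,
                         us_without_cs=10, no_cs_trials=100)

for label, (p_cs, p_no_cs, diff) in [("baseline", baseline),
                                     ("predictive", predictive)]:
    print(f"{label}: P(US|CS)={p_cs:.2f}  P(US|not CS)={p_no_cs:.2f}  "
          f"difference={diff:.2f}")
```

A difference of zero corresponds to the no-contingency baseline, where the bell carries no information; a positive difference corresponds to the case where the bell predicts the shock and conditioning is expected.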

    • Cognitive reinterpretation of conditioned responses

    • The CR elicited by the CS is not simply a copy of the UR; it is a preparatory reaction reflecting anticipation of the US.

    • Different CS–US pairings can yield distinct conditioned responses (e.g., fear conditioning often results in heart-rate deceleration and freezing rather than the UR pattern).

    • Applications: Drug conditioning and tolerance

    • Conditioned compensatory responses: Drug use often occurs in specific contexts; cues associated with drug use (syringe, paraphernalia, room) become conditioned stimuli that provoke conditioned reactions opposite to the drug’s unconditioned effects (e.g., dysphoria, increased sensitivity to pain).

    • Consequences: This conditioning helps explain tolerance, craving, and increased overdose risk in unfamiliar environments lacking the conditioning cues.

  • Basic mechanisms that broaden beyond the lab: bridging to real-world scenarios

    • Conditioning and advertising: Conditioning principles are used to attach positive feelings to brands; the conditioned stimulus (logo/jingle) is paired with marketing stimuli that already evoke emotional responses (pleasure, warmth), so that the brand comes to evoke those feelings.

    • Contingency and consumer expectations: Advertisers aim to create informative cues that reliably predict positive outcomes to strengthen consumer associations.

  • Transition to operant conditioning (learning is about actions and consequences)

    • Core idea: The relationship between a behavior and its consequences governs future likelihood of that behavior.

    • Reinforcement vs punishment

    • Reinforcement (increases probability of a response): positive reinforcement (adding something pleasant, e.g., food) or negative reinforcement (removing something unpleasant).

    • Punishment (decreases probability of a response): positive punishment (adding something unpleasant) or negative punishment (removing something pleasant).

    • The Law of Effect (Thorndike; popularized by Skinner)

    • Core concept: Behavior is governed by its consequences; responses followed by rewards tend to increase in frequency.

    • Classic demonstration: A hungry cat in a puzzle box must perform a behavior (pull a lever) to escape and obtain food. Across trials, the time to exit decreases, showing incremental learning without a clear insight jump.

    • B.F. Skinner and operant conditioning

    • Skinner boxes: Experimental chambers where an animal can perform behaviors (e.g., press a lever) to receive reinforcement (food) or avoid punishment.

    • Shaping and successive approximations: A process in which behaviors progressively closer to the desired target are reinforced, building complex actions over time.

    • Primary vs secondary reinforcement

      • Primary reinforcement: Innate rewards (e.g., food, warmth).

      • Secondary reinforcement: Conditioned rewards (e.g., a tone or 'well done' that has acquired value via association with primary rewards).

    • Reinforcement schedules and their effects on behavior

    • Continuous reinforcement (CRF): Reward after every correct response; often rapid initial learning but quicker extinction when rewards stop.

    • Partial reinforcement schedules (PRF): Not rewarding every time, which often leads to more persistent responding after reinforcement stops.

      • Fixed ratio (FR): Reward after a fixed number of responses (e.g., FR-4: reward after every fourth response).

      • Variable ratio (VR): Reward after a variable number of responses around an average (e.g., VR-4; average four responses). This typically yields high, steady response rates and is highly resistant to extinction.

      • Fixed interval (FI): Reward after the first response following a fixed time interval (e.g., FI-4 minutes).

      • Variable interval (VI): Reward after the first response following a variable time interval (average around four minutes).
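
The four partial schedules above can be sketched as small Python functions that decide whether a given response is rewarded. This is an illustrative sketch under stated assumptions: the function names, the uniform distributions used for the variable schedules, the fixed seed, and the simulated animal that responds once every half minute are all hypothetical choices, not part of the lecture.

```python
import random

def fixed_ratio(n):
    """FR-n: reward every n-th response."""
    count = 0
    def respond(_t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n, rng):
    """VR-n: reward after a random number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # uniform, mean = mean_n (assumption)
    def respond(_t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

def fixed_interval(minutes):
    """FI: reward the first response at least `minutes` after the last reward."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= minutes:
            last = t
            return True
        return False
    return respond

def variable_interval(mean_minutes, rng):
    """VI: like FI, but the required wait varies around mean_minutes."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_minutes)  # uniform, mean = mean_minutes (assumption)
    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_minutes)
            return True
        return False
    return respond

rng = random.Random(0)  # fixed seed so the sketch is repeatable
schedules = {
    "FR-4": fixed_ratio(4),
    "VR-4": variable_ratio(4, rng),
    "FI-4": fixed_interval(4.0),
    "VI-4": variable_interval(4.0, rng),
}

# Hypothetical animal: one response every 0.5 minutes for 240 minutes.
times = [i * 0.5 for i in range(1, 481)]
rewards = {name: sum(schedule(t) for t in times)
           for name, schedule in schedules.items()}
print(rewards)
```

Under these assumptions, the ratio schedules pay out in proportion to the number of responses, while the interval schedules pay out roughly once per interval no matter how often the animal responds.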

    • Consequences of reinforcement schedules

    • Variable schedules (VR, VI) tend to produce steadier responding than fixed schedules (FR, FI), which typically show pauses after each reward; ratio schedules generally produce higher response rates than interval schedules.

    • Real-world parallel: Gambling often relies on a variable ratio schedule, where rewards are unpredictable, leading to sustained gambling behavior.

    • Partial reinforcement extinction effect (PRE)

    • With partial reinforcement, extinction (cessation of reward) happens more slowly than with continuous reinforcement; the behavior persists longer once rewards stop.

    • Everyday example: Crying baby and parental response

      • If parents reinforce crying intermittently (partial reinforcement), crying tends to persist longer and is harder to extinguish than if the parents always responded immediately (continuous reinforcement).

      • Practical guidance: In parenting or therapy, using consistent reinforcement strategies is important; alternating strategies can unintentionally prolong problematic behaviors.

    • Cognitive perspectives and implications for learning

    • Latent learning and cognitive maps: Organisms can acquire knowledge even without immediate reinforcement. For example, rats exploring a maze develop a cognitive map that aids later navigation even if they were not being reinforced at the time.

    • These findings challenged strict behaviorist views, highlighting internal representations and planning.

    • Contingency and control in human development

    • Early studies show that a sense of control affects learning and motivation:

      • Babies who can control a mobile’s movement by their own actions show more positive affect and engagement than those who cannot.

      • Learned helplessness: When individuals experience lack of control over aversive events, they may develop passivity and depressive-like symptoms; later, when given the opportunity to control, some can relearn control but others persist in a passive state.

    • Learned helplessness (classic experiment by Seligman and colleagues): Dogs exposed to unavoidable shocks later fail to escape when possible, suggesting that perceived lack of control can create enduring motivational deficits and depressive-like states in both animals and humans.

    • Practical implications and ethical considerations

    • Behavioral shaping and reinforcement strategies are powerful tools in education, therapy, animal training, and behavior modification.

    • Ethical constraints: Historical experiments (e.g., Little Albert) illustrate the ethical concerns in psychological research; current standards emphasize minimizing harm, informed consent, and welfare.

  • Connections to foundational principles and real-world relevance

    • The lecture emphasizes the evolution from strict behaviorism toward a more nuanced view incorporating cognitive processes, expectations, and contingency information.

    • Practical relevance spans education (feedback schedules), marketing (conditioning of affective responses to brands), clinical psychology (fear conditioning, phobias), addiction science (cue-induced craving and relapse), and parenting strategies (reinforcement patterns).

  • Formulas and key notation (for quick reference)

    • Classical conditioning basics

    • US → UR

    • During acquisition: CS + US → UR; after acquisition: CS alone → CR

    • Extinction: CS alone reduces CR

    • Spontaneous recovery: after rest, CS elicits CR again
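
As a rough illustration of acquisition followed by extinction, the sketch below uses a toy update rule in which the strength of the CR moves a fixed fraction toward 1 on trials with the US and toward 0 on trials with the CS alone. The rule, the learning rate of 0.3, the number of trials, and the function name `run_trials` are assumptions chosen for illustration; the notes do not specify a quantitative model.

```python
def run_trials(strength, us_present, n_trials, rate=0.3):
    """Toy rule (assumption): move strength a fixed fraction toward its target.

    The target is 1.0 when the US follows the CS and 0.0 when the CS is
    presented alone. Returns the strength after each trial.
    """
    target = 1.0 if us_present else 0.0
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(strength)
    return history

# Acquisition: CS paired with the US for 10 trials, starting from no response.
acquisition = run_trials(0.0, us_present=True, n_trials=10)

# Extinction: CS presented alone for 10 trials, starting where acquisition ended.
extinction = run_trials(acquisition[-1], us_present=False, n_trials=10)

print("acquisition:", [round(s, 2) for s in acquisition])
print("extinction: ", [round(s, 2) for s in extinction])
```

This toy rule reproduces only the rise and fall of the CR. It does not capture spontaneous recovery, since the strength here simply decays toward zero and has no mechanism for returning after a rest period.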

    • Contingency and information content

    • Conditioning is stronger when: P(US|CS) > P(US|not CS)

    • Generalization and discrimination (qualitative concepts)

    • Generalization: CR extends to stimuli similar to CS

    • Discrimination: Narrowing of generalization to the exact CS through differential training

    • Conditioning timing and predictors

    • Conditioned stimulus should precede the US for effective conditioning

    • Reinforcement schedules (types and effects)

    • FR: reinforcement after n responses

    • VR: reinforcement after an average of n responses

    • FI: reinforcement after the first response after a fixed interval

    • VI: reinforcement after the first response after a variable interval

    • Contingency principle in Pavlovian conditioning

    • A CS must provide information about the likelihood of the US to produce a CR

    • Learned helplessness and control

    • Perceived control influences motivation and learning outcomes; lack of control can lead to depressive-like states

  • Summary of key takeaways

    • Habituation and sensitization are the simplest forms of nonassociative learning, shaping how we respond to repeated or salient stimuli.

    • Classical conditioning demonstrates how neutral cues can acquire predictive power to elicit reflexive responses; timing, contingency, generalization, discrimination, and higher-order conditioning influence the strength and scope of learned responses.

    • The Little Albert study and Garcia–Koelling experiments illustrate both the power and limits of conditioning, highlighting ethical concerns and biological constraints.

    • Operant conditioning emphasizes learning through reinforcement and punishment, with partial reinforcement schedules often producing more persistent behavior than continuous reinforcement.

    • Cognitive insights (latent learning, cognitive maps, contingency awareness) challenge a purely stimulus–response view, showing how expectations, control, and internal representations influence learning.

    • Real-world implications span education, marketing, health, addiction, and parenting, while acknowledging the role of context, cues, and control in shaping behavior.

  • Suggested readings and further exploration

    • Classic experiments: Pavlov’s conditioning, Skinner’s operant conditioning with Skinner boxes, Garcia & Koelling taste aversion studies, Seligman’s learned helplessness, Tolman’s latent learning and cognitive maps, Watson & Rayner’s Little Albert study (for ethical critique).

    • Topics to review in the textbook: observational learning (Bandura), higher-order conditioning, extinction and spontaneous recovery mechanisms, and modern cognitive theories on conditioning and expectation.