Learning and Conditioning

Classical Conditioning

  • Definition: A learning process in which a previously neutral stimulus comes to elicit a response through repeated pairing with a stimulus that already elicits that response; also known as Pavlovian conditioning.

  • Key terms:

    • UCS (Unconditioned Stimulus): a stimulus that, without conditioning, will elicit a predictable response.

    • UCR (Unconditioned Response): a response that, without conditioning, results predictably from the UCS.

    • CS (Conditioned Stimulus): a stimulus that will elicit a predictable response because of its previous pairing with a UCS.

    • CR (Conditioned Response): a predictable response to a CS that has been learned through its pairing with a UCS.

  • Acquisition: the initial stage in classical conditioning; the phase during which a neutral stimulus becomes associated with a UCS and begins to elicit the CR.

  • Extinction: presenting the CS repeatedly without the UCS, so that the CR gradually diminishes; extinction is not forgetting but new learning that the CS no longer predicts the UCS.

  • Spontaneous Recovery: after extinction, the CR can reappear when the CS is presented again following a rest period; this suggests that extinction suppresses the CR rather than erasing the original CS–UCS association.

  • Generalization: a new stimulus similar to the original CS will elicit the CR, with strength increasing with similarity (e.g., in pitch, color, or shape).

  • Discrimination: the learned ability to distinguish between the CS and other stimuli that do not signal the UCS; training in which only one stimulus is paired with the UCS tunes the learner to respond only to that reinforced CS.

  • Non-associative learning:

    • Habituation: reduced response probability to a non-changing, inconsequential stimulus.

    • Sensitization: increased response probability after a strong or salient stimulus.

  • Related concepts:

    • Associative learning: learning about the relationship between multiple stimuli (two stimuli, or a response and its consequences).

    • Non-associative learning vs associative learning: non-associative involves a single stimulus; associative involves relationships between stimuli.

  • Does classical conditioning apply to humans as well as other animals? It does; examples include the classic Little Albert study (ethical issues discussed below).

  • Classic examples:

    • Pavlov’s dogs: neutral stimulus (light or bell) paired with food (UCS) leads to salivation (CR) in response to the CS alone.

    • Kramer Ambulance and tuna (example on slides): CS = can opener; UCS = tuna; UCR = coming running for the tuna; CR = coming running at the sound of the can opener after conditioning.

    • Little Albert: CS = white rat; UCS = loud noise; CR = fear and distress in response to the rat after conditioning; ethical concerns noted.

    • Mike and the Cocktail Shrimp: CS = cocktail shrimp; UCS = bacterial contamination in the bad shrimp; UCR = nausea; CR = nausea in response to cocktail shrimp afterward; illustrates aversive (taste-aversion) conditioning.

  • Conceptual notes:

    • The conditioning process demonstrates how reflexive responses can be elicited by previously neutral stimuli when paired with biologically relevant stimuli.

    • CS can be any stimulus that comes to elicit a CR after conditioning (e.g., light, tone, object).

    • Applications include advertising (pairing brands with stimuli that evoke positive responses), therapy (extinction of fear responses), and education (linking stimuli to positive outcomes).

  • Ethics and philosophy:

    • The “Twisted Story of Little Albert” is cited to discuss ethical concerns in early conditioning experiments (informed consent, potential distress, lasting harm).

    • Behaviorism as a framework emphasizes observable behavior and minimizes inner mental states; this view has evolved and is debated within modern psychology.

Operant Conditioning

  • Basic idea: operant conditioning involves behavior that operates on the environment to produce consequences; behaviors are strengthened if followed by a reinforcer and weakened if followed by a punishment.

  • Thorndike’s Law of Effect (historical foundation):

    • Quote (paraphrased): Of several responses in a given situation, those followed by satisfaction tend to recur; those followed by discomfort tend to be weakened; the greater the satisfaction or discomfort, the stronger the change in the bond between response and situation.

    • Source: Edward L. Thorndike (1911).

  • Skinner’s contribution: elaborated Thorndike’s Law of Effect and developed the behavioral technology of operant conditioning, including the Skinner box as a controlled environment for studying reinforcement and punishment.

  • The Skinner Box (operant chamber): typical components include

    • Pellet dispenser, lever, food cup, signal lights, dispenser tube, and an electrified shock grid in the floor for punishment or negative-reinforcement experiments.

  • Core terms:

    • Reinforcer: any event that strengthens the behavior it follows.

    • Punisher: any event that weakens or decreases the likelihood of the behavior it follows.

    • Shaping: an operant conditioning procedure in which reinforcers guide behavior toward closer and closer approximations of a desired goal (often used in animal training and human education).

  • Reinforcement and Punishment in detail:

    • Reinforcement increases the likelihood of the preceding behavior.

    • Punishment decreases the likelihood of the preceding behavior.

    • Positive reinforcement: presenting a desirable stimulus to increase a behavior.

    • Negative reinforcement: removing an aversive stimulus to increase a behavior.

    • Positive punishment: presenting an aversive stimulus to decrease a behavior.

    • Negative punishment: removing a desirable stimulus to decrease a behavior.

  • Notation and quick synthesis (a small decision-rule sketch in code follows this list):

    • Positive Reinforcement (PR): add something good to increase a behavior.

    • Negative Reinforcement (NR): remove something aversive to increase a behavior.

    • Positive Punishment (PP): add something aversive to decrease a behavior.

    • Negative Punishment (NP): remove something desirable to decrease a behavior.
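
A quick way to internalize the 2x2 above: reinforcement vs punishment is decided by the effect on behavior, positive vs negative by whether a stimulus is added or removed. The following Python sketch is purely illustrative (the function name and label strings are mine, not from the slides):

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Label an operant consequence using the standard 2x2 scheme.

    stimulus_added     -- True if a stimulus is presented, False if one is removed
    behavior_increases -- True if the preceding behavior becomes more likely
    """
    if behavior_increases:
        # Anything that strengthens the preceding behavior is reinforcement.
        return "positive reinforcement" if stimulus_added else "negative reinforcement"
    # Anything that weakens the preceding behavior is punishment.
    return "positive punishment" if stimulus_added else "negative punishment"


print(classify_consequence(True, True))    # treat for a trick -> positive reinforcement
print(classify_consequence(False, True))   # chores removed for good grades -> negative reinforcement
print(classify_consequence(True, False))   # speeding ticket -> positive punishment
print(classify_consequence(False, False))  # phone taken away -> negative punishment
```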

  • Schedules of reinforcement (rules for how often a response is reinforced):

    • Continuous reinforcement: reinforce the desired response every time it occurs.

    • Partial (intermittent) reinforcement: reinforce the response only part of the time; leads to slower acquisition but greater resistance to extinction.

    • Ratio vs Interval schedules:

      • Ratio schedules depend on the number of responses.

        • Fixed Ratio (FR): reinforcement after a fixed number of responses; produces high response rates, since faster responding yields more reinforcement (e.g., piecework pay).

        • Variable Ratio (VR): reinforcement after an unpredictable number of responses; produces high, steady response rates and is very resistant to extinction (e.g., gambling, fishing).

      • Interval schedules depend on the time elapsed between reinforcements.

        • Fixed Interval (FI): reinforcement for the first response after a fixed amount of time; responding increases as the time for reward nears (scalloped pattern).

        • Variable Interval (VI): reinforcement at unpredictable time intervals; produces slow, steady responding (e.g., occasional pop quizzes).

  • General implications of schedules:

    • Partial reinforcement tends to produce greater resistance to extinction than continuous reinforcement.

    • The type of schedule interacts with how quickly a behavior is learned and how robust it is to extinction (a minimal simulation of the four schedules follows this list).
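
As a minimal sketch of how the four partial schedules differ, each can be written as a rule that decides, at the moment of a response, whether reinforcement is delivered. The parameter values below (FR-5, 60-second intervals) are arbitrary illustrations, not values from the notes:

```python
import random

def fixed_ratio(n_responses, ratio=5):
    """FR: reinforce every `ratio`-th response."""
    return n_responses % ratio == 0

def variable_ratio(mean_ratio=5):
    """VR: reinforce after an unpredictable number of responses
    (on average one reinforcer per `mean_ratio` responses)."""
    return random.random() < 1.0 / mean_ratio

def fixed_interval(seconds_since_last_reward, interval=60):
    """FI: the first response after `interval` seconds is reinforced."""
    return seconds_since_last_reward >= interval

def variable_interval(seconds_since_last_reward, mean_interval=60):
    """VI: reinforcement becomes available after an unpredictable delay
    (roughly approximated here by a random threshold around `mean_interval`)."""
    return seconds_since_last_reward >= random.uniform(0.5, 1.5) * mean_interval

# Quick check: responses 1..100 on an FR-5 schedule earn one reinforcer per 5 responses.
print(sum(fixed_ratio(i, ratio=5) for i in range(1, 101)))  # 20
```

Note that the ratio rules depend only on counts of responses, while the interval rules depend only on elapsed time, which is exactly the distinction drawn above.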

  • Shaping and real-world applications:

    • Shaping is widely used in animal training and human behavior-modification therapies, where closer and closer approximations to a target behavior are repeatedly reinforced.

  • Conditioned reinforcers:

    • Primary reinforcers: biologically based rewards (e.g., food, water).

    • Conditioned (secondary) reinforcers: learned through association with primary reinforcers (e.g., money, good grades, praise).

    • Conditioned reinforcers show how initially neutral stimuli (like money or praise) can come to motivate behavior through association with primary rewards.

  • Generalization and Discrimination in operant contexts:

    • Generalization: a behavior occurs in similar contexts or in response to similar cues that resemble the trained stimulus.

    • Discrimination: learning to respond differently to different, but similar, stimuli based on differential reinforcement.

  • Practical and ethical considerations:

    • Behavior-modification therapies and education use shaping, reinforcement schedules, and discrimination training to influence behavior.

    • Historical notes include Skinner’s wartime Project Orcon (pigeon-guided missiles) and broader discussions about the role of behaviorist approaches in society (e.g., educational and social design, “Walden Two” discussions).

Deep Dive into Core Concepts and Examples

  • Non-associative vs associative learning:

    • Non-associative learning involves a single stimulus (habituation, sensitization).

    • Associative learning involves the relationship between stimuli and/or responses and their consequences (classical/operant conditioning).

  • Connections to foundational principles:

    • Classical conditioning demonstrates how reflexive responses can be elicited by neutral stimuli when paired with biologically relevant stimuli.

    • Operant conditioning extends learning to voluntary behaviors that affect the environment and become more or less likely based on consequences.

  • Real-world relevance and applications:

    • Advertising uses classical conditioning by pairing brand cues (CS) with stimuli that already evoke positive affect (UCS) so that the brand alone elicits a favorable response (CR).

    • Education and workplace training use operant conditioning concepts (reinforcement schedules, shaping) to encourage desirable behaviors and discourage undesired ones.

    • Therapy uses behavioral principles to modify maladaptive behaviors (e.g., shaping new coping responses, using reinforcement schedules to promote desirable habits).

  • Ethical considerations:

    • Early conditioning experiments (e.g., Little Albert) raised concerns about informed consent and harm to participants.

    • The behaviorist emphasis on observable behavior has shaped debates about the role of internal mental states, autonomy, and freedom in shaping human behavior.

  • Key figures and their contributions:

    • Ivan Pavlov: classical conditioning; foundational concepts; methodology with digestive secretions in dogs.

    • John B. Watson: originator of behaviorism; advocated studying observable behavior and minimizing reference to internal mental states.

    • Edward Thorndike: Law of Effect; laid groundwork for operant conditioning and behavioral psychology.

    • B.F. Skinner: elaborated operant conditioning; developed the Skinner box; emphasized reinforcement/punishment and applied behavioral analysis.

  • Notable demonstrations and anecdotes:

    • Pavlov’s dog paradigm; response to CS after pairing with UCS (food).

    • Kramer Ambulance and tuna: the CS (can opener) comes to elicit the CR (coming running) after pairing with the UCS (tuna).

    • Little Albert: CS (white rat) paired with loud noise (UCS) leading to CR (fear); highlighted ethical concerns.

    • Mike and cocktail shrimp: CS (cocktail shrimp) paired with an aversive UCS (bad shrimp) leading to a nauseated response (CR).

    • Advertising case (Meow #3): exercise in applying classical conditioning to marketing strategies; analyze CS, UCS, UCR, CR in real ads.

  • Equations and concise formulas to remember:

    • Classical conditioning framework:

      • UCS → UCR

      • During acquisition: CS (initially neutral) + UCS → UCR, over repeated pairings

      • After conditioning: CS → CR

    • Extinction concept:

      • CS presented without the UCS → CR diminishes over time

    • Spontaneous recovery:

      • After extinction and a rest period, CS → CR may reappear (a toy simulation of acquisition and extinction follows this list)

    • Schedules of reinforcement (conceptual):

      • Continuous reinforcement: reinforcement after every correct response.

      • Partial reinforcement: reinforcement after only some responses; slower acquisition but greater resistance to extinction.

    • Schedule definitions (summary):

      • Fixed Ratio (FR): reinforcement after every N responses.

      • Variable Ratio (VR): reinforcement after an unpredictable number of responses; high, steady responding.

      • Fixed Interval (FI): reinforcement for the first response after a fixed amount of time; responding increases as the reward time nears.

      • Variable Interval (VI): reinforcement at unpredictable time intervals; slow, steady responding.
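
The acquisition and extinction arrows above can be turned into a toy simulation: treat the CR as driven by an associative strength that moves toward a maximum on CS–UCS pairings and back toward zero on CS-alone trials. This simple update rule is an illustration I am adding, not a model presented in the notes, and the learning rate of 0.3 is arbitrary:

```python
def update_strength(v, ucs_present, learning_rate=0.3):
    """One trial: move associative strength `v` a fraction of the way toward 1.0
    when the UCS follows the CS, or toward 0.0 when the CS appears alone."""
    target = 1.0 if ucs_present else 0.0
    return v + learning_rate * (target - v)

v = 0.0  # associative strength of the CS (a proxy for CR magnitude)

# Acquisition: 10 CS-UCS pairings -- CR strength climbs toward its maximum.
for trial in range(10):
    v = update_strength(v, ucs_present=True)
print(f"after acquisition: {v:.2f}")  # ~0.97

# Extinction: 10 CS-alone presentations -- CR strength falls back toward zero.
for trial in range(10):
    v = update_strength(v, ucs_present=False)
print(f"after extinction:  {v:.2f}")  # ~0.03
```

One caveat: this toy update treats extinction as simple unlearning, whereas the notes stress that spontaneous recovery shows extinction suppresses rather than erases the original association, so a fuller account would need a separate inhibitory process.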

  • Summary takeaways:

    • Classical conditioning is about linking stimuli to evoke automatic responses; acquisition occurs via repeated pairings; extinction and spontaneous recovery illustrate that learning can decay and reemerge.

    • Operant conditioning focuses on voluntary behavior shaped by consequences; reinforcement strengthens behavior, punishment weakens it; shaping guides behavior toward desired goals.

    • Schedules of reinforcement critically shape how quickly learning occurs and how resistant it is to extinction; partial reinforcement often yields more durable behavior than continuous reinforcement.

    • Real-world applications span education, parenting, advertising, therapy, and organizational behavior, but ethical considerations must govern experiments and interventions.

Quick Reference: Key Terms List

  • UCS: Unconditioned Stimulus

  • UCR: Unconditioned Response

  • CS: Conditioned Stimulus

  • CR: Conditioned Response

  • Acquisition, Extinction, Spontaneous Recovery, Generalization, Discrimination (Classical Conditioning)

  • Reinforcer, Punisher, Shaping (Operant Conditioning)

  • Positive/Negative Reinforcement, Positive/Negative Punishment

  • Primary vs Conditioned (Secondary) Reinforcers

  • FR, VR, FI, VI (Reinforcement Schedules)

  • Partial Reinforcement, Continuous Reinforcement

  • Skinner Box, Thorndike’s Puzzle Box

  • Key examples: Pavlov’s dogs, Little Albert, Mike and the cocktail shrimp, the Kramer Ambulance story, the Meow #3 advertisement exercise

  • Ethical implications: early experiments vs modern standards; behaviorism’s stance on mental processes; real-world implications for manipulation and education