Learning and Behaviorism
Introduction to Learning
Human Learning: Unlike animals born with innate abilities (e.g., sea turtles knowing how to find the ocean and swim), humans must learn complex skills like swimming and surfing. Learning is a fundamental human ability.
What is Learning?
Unlearned Behaviors
Innate Behaviors: Organisms are born with these behaviors; they are unlearned and help organisms adapt to their environment.
Reflexes:
Simpler than instincts.
Involve the activity of specific body parts.
Involve primitive centers of the Central Nervous System (CNS), such as the spinal cord and medulla.
Definition: Motor or neural reactions to a specific stimulus.
Example: Human babies are born with a sucking reflex.
Instincts:
More complex than reflexes.
Involve the organism as a whole (e.g., sexual activity, bird migration), rather than just a specific body part.
Involve higher brain centers.
Definition: Behaviors triggered by a broader range of events, not merely a simple response to a specific stimulus.
Learned Behaviors
Definition of Learning: A relatively permanent change in behavior or knowledge that results from experience. It involves both conscious and unconscious processes.
Learning also helps organisms adapt to their environment, but unlike reflexes and instincts, learned behaviors involve change and result from experience.
Associative Learning: Occurs when an organism makes connections between stimuli or events that occur together in the environment.
Example (Operant Conditioning): A dog learns to associate performing a trick with receiving a treat.
Behaviorism: A school of thought focused on observable behaviors and how they are learned.
Approaches to Learning within Behaviorism:
Classical Conditioning
Operant Conditioning
Observational Learning
Classical Conditioning
Definition: A process by which we learn to associate stimuli and, consequently, to anticipate events.
Example: Salivation (a response) can be elicited by two types of stimuli: food (producing a natural, unlearned reaction) or a bell (producing a learned reaction).
Ivan Pavlov: Researched dogs' digestive systems and discovered that dogs were also learning to associate stimuli with food, leading to the study of classical conditioning.
Key Terms in Classical Conditioning
Before Conditioning:
Unconditioned Stimulus (UCS): A stimulus that naturally and automatically elicits a reflexive response.
Example: Food (naturally causes salivation).
Unconditioned Response (UCR): A natural, unlearned reaction to an unconditioned stimulus.
Example: Salivation (natural reaction to food).
During Conditioning:
Neutral Stimulus (NS): A stimulus that does not naturally elicit the response being studied.
Example: A bell sound (before conditioning, it does not naturally cause salivation).
The NS and UCS are paired repeatedly.
Process Example: Bell (NS) + Food (UCS) → Salivation (UCR)
After Conditioning:
Conditioned Stimulus (CS): A previously neutral stimulus that, after being repeatedly paired with an unconditioned stimulus, comes to elicit a conditioned response.
Example: Bell (after being paired with food).
Conditioned Response (CR): The learned behavior/response caused by the conditioned stimulus.
Example: Salivation (learned reaction to the bell).
Process Example: Bell (CS) → Salivation (CR)
Higher-Order Conditioning
Definition: Occurs when an already established conditioned stimulus is paired with a new neutral stimulus. This new stimulus (the second-order stimulus) eventually also elicits the conditioned response without the initial unconditioned stimulus being present.
Example:
A cat is initially conditioned to salivate (CR) when it hears an electric can opener (CS1) due to prior association with food (UCS).
A squeaky cabinet door (new NS / second-order stimulus) is then paired with the can opener (CS1).
Eventually, the cat salivates (CR) when it hears the squeaky cabinet door (CS2), even without the can opener or food being presented.
General Processes in Classical Conditioning
During Conditioning (a.k.a., Learning):
Acquisition: The initial period of learning when an organism acquires the association between a neutral stimulus (NS) and an unconditioned stimulus (UCS).
Typically requires multiple NS-UCS pairings with minimal time between stimuli.
Can occur with a single pairing and time intervals up to several hours in specific cases (e.g., taste aversion).
After Conditioning:
Extinction: The decrease in the conditioned response (CR) when the unconditioned stimulus (UCS) is no longer presented with the conditioned stimulus (CS).
Example: If food stops being presented with the sound of the bell, the dog will eventually stop salivating to the bell.
Spontaneous Recovery: The return of a previously extinguished conditioned response (CR) after a rest period, especially following longer rest periods.
The exact mechanisms behind spontaneous recovery are still under scientific investigation.
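The rise of the conditioned response during acquisition and its decline during extinction can be sketched with the Rescorla-Wagner model, a standard formal model of classical conditioning (the model and all names below are illustrative additions, not part of these notes). Here `v` is the associative strength of the CS, and the maximum strength is 1.0 while the UCS (food) is present and 0.0 once it is withheld:

```python
# Minimal Rescorla-Wagner sketch of acquisition followed by extinction.
# v: associative strength of the CS (bell); rate: learning rate.

def rescorla_wagner(trials_with_ucs, trials_without_ucs, rate=0.3):
    """Return CS associative strength after each trial."""
    v = 0.0
    history = []
    # Acquisition: bell (CS) paired with food (UCS); maximum strength = 1.0
    for _ in range(trials_with_ucs):
        v += rate * (1.0 - v)
        history.append(v)
    # Extinction: bell presented alone; maximum strength = 0.0
    for _ in range(trials_without_ucs):
        v += rate * (0.0 - v)
        history.append(v)
    return history

strengths = rescorla_wagner(10, 10)
print(round(strengths[9], 3))   # after 10 pairings: 0.972 (near maximum)
print(round(strengths[-1], 3))  # after 10 extinction trials: 0.027 (near zero)
```

Note that the model captures acquisition and extinction but not spontaneous recovery, consistent with the point above that recovery's mechanisms are still under investigation.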
Distinguishing Between Stimuli
Organisms need to differentiate between various stimuli to respond appropriately.
Stimulus Discrimination: When an organism learns to respond differently to various stimuli that are similar but not identical.
Example: A dog learns to salivate only to a specific bell sound, not to a similar chime sound.
Stimulus Generalization: When an organism demonstrates the conditioned response (CR) to stimuli that are similar to the conditioned stimulus (CS).
The more similar the stimuli, the more likely generalization will occur.
Example: One bad experience with a spider leads to a fear of all spiders.
Habituation: A form of learning where an organism learns not to respond to a stimulus that is presented repeatedly without change.
Example: Getting used to the ticking sound of a clock in a room.
Behaviorism and John B. Watson
Founder: John B. Watson.
Focus: Studied human emotion using classical conditioning principles.
Belief: All behavior could be explained as a stimulus-response (S-R) reaction.
The Little Albert Study (infamous experiment):
Steps:
Albert was presented with various neutral stimuli (e.g., rabbit, dog, cotton wool, white rat) and showed no fear.
When Albert touched these stimuli (specifically the white rat), a loud, startling sound (UCS) was made.
Albert eventually learned to fear the stimulus alone (e.g., the white rat) without the loud sound, demonstrating classical conditioning of fear.
Link: https://www.youtube.com/watch?v=FMnhyGozLyE
Operant Conditioning
Two Notable Figures
Edward Thorndike:
Proposed the Law of Effect:
Behaviors followed by pleasant consequences or desired results are more likely to occur again.
Behaviors followed by unpleasant consequences or undesired results are less likely to occur again.
B.F. Skinner:
Proposed principles of operant conditioning based on Thorndike's Law of Effect.
His work with pigeons and rats demonstrated that organisms associate a behavior with its consequences (reinforcement or punishment).
Key Terms in Operant Conditioning
Consequence's Function:
Reinforcement: A consequence that increases the likelihood of a behavior occurring again in the future.
Punishment: A consequence that decreases the likelihood of a behavior occurring again in the future.
Consequence's Nature:
Positive: Means something is added following the behavior.
Negative: Means something is removed following the behavior.
Types of Operant Conditioning
Positive Reinforcement (+R):
Mechanism: A desirable consequence is added following a behavior.
Effect: Makes the behavior more likely to occur in the future.
Example: A dog gets a treat for doing a trick (treat is added, behavior of doing trick increases).
Positive Punishment (+P):
Mechanism: An undesirable consequence is added following a behavior.
Effect: Makes the behavior less likely to occur in the future.
Example: Getting a ticket for speeding (ticket is added, behavior of speeding decreases).
Negative Reinforcement (-R):
Mechanism: An undesirable consequence is removed following a behavior.
Effect: Makes the behavior more likely to occur in the future.
Example: Fastening a seat belt silences the car's annoying beeping (beeping is removed, behavior of fastening seat belt increases).
Negative Punishment (-P):
Mechanism: A desirable consequence is removed following a behavior.
Effect: Makes the behavior less likely to occur in the future.
Example: Losing phone privileges after sneaking out (phone privileges are removed, behavior of sneaking out decreases).
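The four types above form a 2×2 grid: whether a consequence is added or removed (its nature) crossed with whether it strengthens or weakens the behavior (its function). A minimal sketch (the function name is illustrative, not from the notes):

```python
# Map the two dimensions of a consequence to its operant conditioning quadrant.

def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    nature = "Positive" if stimulus_added else "Negative"      # added vs. removed
    function = "Reinforcement" if behavior_increases else "Punishment"
    return f"{nature} {function}"

print(classify_consequence(True, True))    # treat for a trick -> Positive Reinforcement
print(classify_consequence(True, False))   # speeding ticket -> Positive Punishment
print(classify_consequence(False, True))   # seat belt silences beeping -> Negative Reinforcement
print(classify_consequence(False, False))  # losing phone privileges -> Negative Punishment
```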
Classical vs. Operant Conditioning: Compared
| Feature | Classical Conditioning | Operant Conditioning |
|---|---|---|
| Conditioning Approach | UCS + NS pairings lead to the NS becoming a CS, which produces a CR. | A target behavior is followed by a consequence (reinforcement or punishment); the learner associates the behavior with its consequence, making it more (R) or less (P) likely to continue. |
| Learner's Role | Passive; associates two stimuli. | Active; associates behavior with its consequence. |
| Behavior Type | Involuntary, reflexive responses. | Voluntary, deliberate behaviors. |
| Stimulus Timing | The stimulus (CS) occurs immediately before the response (CR). | The stimulus (reinforcement or punishment) occurs soon after the response (behavior). |
The Skinner Box (Operant Conditioning Chamber)
Purpose: A specialized chamber used to study operant conditioning. It contains a lever or button that an animal (e.g., rat, pigeon) can manipulate.
Mechanism: Pressing the lever typically dispenses food, acting as a reward, allowing researchers to observe and quantify learned behavior.
Shaping
Definition: A tool used in operant conditioning to reward successive approximations of a target behavior. It's useful for teaching complex behaviors.
Process: The target behavior is broken down into many small, achievable steps.
Reinforce any response that even slightly resembles the desired behavior.
Then, reinforce responses that more closely resemble the desired behavior, while no longer reinforcing previously reinforced, less accurate responses.
Continue this process, gradually reinforcing closer and closer approximations.
Finally, only the desired, complete behavior is reinforced.
Common Use: Widely used by animal trainers (e.g., with pigeons).
Primary & Secondary Reinforcers
Primary Reinforcers:
Have innate reinforcing qualities; their reinforcement value is unlearned.
Examples: Food, water, sex, sleep, pleasure, removal of pain.
Secondary Reinforcers:
Have no inherent reinforcing value on their own.
Their reinforcement value is learned through association with a primary reinforcer.
Examples: Money, gold stars, praise.
Token Economy: A system where secondary reinforcers (tokens) are earned for desired behaviors and can later be exchanged for primary reinforcers or other desired items/privileges. Used in various settings like schools and prisons.
Reinforcement Schedules
Optimal Teaching Method: Positive reinforcement is generally the best way to teach a new behavior.
Continuous Reinforcement:
Mechanism: An organism receives a reinforcer each time it displays the desired behavior.
Effect: The quickest way to teach a new behavior.
Importance of Timing: The reinforcer must be presented immediately after the behavior for the association to be formed.
Downside: If reinforcement stops, the behavior quickly undergoes extinction.
Example: A dog receives a treat every time it sits on command.
Partial (Intermittent) Reinforcement:
Mechanism: The organism does not get reinforced every time it displays the desired behavior; reinforcement occurs intermittently.
Effect: Once a behavior is learned, partial reinforcement schedules tend to lead to more persistent behaviors that are resistant to extinction (compared to continuous reinforcement).
Factors for Partial Reinforcement Schedules:
Consistency (Fixed vs. Variable):
Fixed: The amount (either number of responses or time) between reinforcements is consistent and unchanging.
Variable: The amount (number of responses or time) between reinforcements is inconsistent and unpredictable.
Basis (Interval vs. Ratio):
Interval: The schedule is based on the time that has passed between reinforcements.
Ratio: The schedule is based on the number of responses made between reinforcements.
Types of Partial Reinforcement Schedules:
Fixed Interval (FI): Reinforcement is delivered at predictable time intervals.
Effect: Behavior tends to increase as the reinforcement time approaches, then drops immediately after.
Example: Patients taking pain relief medication at set times; checking mail at a specific time daily.
Variable Interval (VI): Reinforcement is delivered at unpredictable time intervals.
Effect: Produces a moderate, steady rate of response. Highly resistant to extinction.
Example: Checking social media for notifications; fishing (waiting for a bite).
Fixed Ratio (FR): Reinforcement is delivered after a predictable number of responses.
Effect: High response rates with a brief pause after reinforcement.
Example: Factory workers being paid for every x number of items manufactured; earning a bonus after making 10 sales.
Variable Ratio (VR): Reinforcement is delivered after an unpredictable number of responses.
Effect: Produces high and steady response rates without predictable pauses. Extremely resistant to extinction.
Example: Gambling (slot machines, lottery tickets); sales commissions (unpredictable number of calls until a sale).
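The contrast between fixed and variable ratio schedules can be sketched in code (a simplified illustration; the function names and the randomization scheme are assumptions, not part of the notes). A fixed ratio reinforces every n-th response on a predictable count, while a variable ratio reinforces after an unpredictable number of responses that merely averages n:

```python
import random

def fixed_ratio(responses, n):
    """FR-n: reinforce every n-th response; returns reinforced response indices."""
    return [i for i in range(1, responses + 1) if i % n == 0]

def variable_ratio(responses, mean_n, rng):
    """VR-n: reinforce after an unpredictable count of responses (average mean_n)."""
    reinforced = []
    count, target = 0, rng.randint(1, 2 * mean_n - 1)
    for i in range(1, responses + 1):
        count += 1
        if count >= target:  # requirement met: deliver reinforcer, draw a new requirement
            reinforced.append(i)
            count, target = 0, rng.randint(1, 2 * mean_n - 1)
    return reinforced

print(fixed_ratio(12, 3))                          # [3, 6, 9, 12] -- predictable, like pay-per-item piecework
print(variable_ratio(12, 3, random.Random(0)))     # unpredictable spacing, like a slot machine
```

The predictable pauses after each fixed-ratio reinforcer, and the absence of any safe moment to stop responding under a variable ratio, mirror the response patterns described above.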
Cognition and Latent Learning
Skinner's View: B.F. Skinner was a strict behaviorist; he held that internal mental processes (cognition) were unnecessary for explaining learning.