Chapter 6: Learning

Unlearned Behaviors

Examples of Unlearned Behaviors:
- Birds build nests and migrate as winter approaches.
- Infants suckle for nourishment.
- Dogs shake water off their wet fur.
- Salmon swim upstream to spawn (lay eggs).
- Spiders spin intricate webs.
These behaviors are called unlearned because they don't have to be taught.

Reflexes vs. Instincts

Reflexes:
- Simple reactions to stimuli (things happening around us).
- Involve specific body parts (like a knee-jerk reaction).
- Managed by basic parts of the nervous system (spinal cord, medulla).
Instincts:
- More complex behaviors triggered by events like maturation or seasons.
- Involve the entire organism (like migration).
- Managed by higher parts of the brain.

Learning

Definition of Learning:
- A relatively permanent change in behavior or knowledge that results from experience.
- Unlike reflexes and instincts, it requires time and practice.
- Example: Learning to surf takes practice, not just instinct.

Associative Learning

Definition:
- Making connections between things that happen together in our environment.
Central to three basic learning processes:
- Classical Conditioning: learning to associate stimuli that repeatedly occur together (like flinching at a lightning flash because it predicts thunder).
- Operant Conditioning: learning to associate a behavior with its consequences, such as rewards or punishments (like a dog learning to sit for treats).
- Observational Learning: learning by watching and copying others (like Julian learning to surf by watching his dad).

Conclusion

All of these approaches belong to behaviorism, a tradition in psychology that focuses on observable behavior. They are not the whole study of learning: other areas of psychology, such as memory and cognition, also help explain how we learn.

Ivan Pavlov and Classical Conditioning

Who is Ivan Pavlov?
- A Russian scientist (1849–1936).
- Known for his work with dogs on classical conditioning.
What is Classical Conditioning?
- A process where we learn to connect two different things and expect events to happen together (like seeing food and feeling hungry).
Pavlov’s Discovery
- While researching digestion in dogs, he noticed that they salivated not only at food but also at things associated with it, such as an empty bowl or footsteps.
- Salivating in response to food is a natural reaction (an unconditioned response) that requires no learning; salivating at the bowl or the footsteps is learned.

Key Terms in Conditioning

Unconditioned Stimulus (UCS): Something that naturally triggers a response (e.g., meat powder).
Unconditioned Response (UCR): The natural reaction to the UCS (e.g., salivation when eating).
Neutral Stimulus (NS): Something that does not naturally cause a reaction (e.g., a bell).
Conditioned Stimulus (CS): The neutral stimulus that now triggers a reaction after being paired with the UCS (e.g., the bell after association with food).
Conditioned Response (CR): The learned reaction to the CS (e.g., salivation at the sound of the bell).

How Classical Conditioning Works

The Experiment:
1. Pavlov rang a bell (NS) and then presented meat powder (UCS) to the dogs.
2. The dogs salivated (UCR) in response to the meat powder.
3. After repeated pairings of the bell and meat powder, the bell alone (CS) made the dogs salivate (CR).
Real World Examples of Conditioning:
- Moisha's Reaction: After receiving chemotherapy and vomiting, she felt sick every time she visited the doctor’s office.
- Here, the chemotherapy drugs are the UCS and vomiting is the UCR; the doctor’s office has become the CS, and the nausea it triggers is the CR.
- Tiger the Cat:
- Tiger learns to associate the sound of an electric can opener with getting food.
- If she hears the can opener, she gets excited and runs to eat.

Key Concepts in Conditioning

Higher-Order Conditioning:
- This occurs when you use a previously conditioned stimulus (like the can opener) to create an association with another neutral stimulus (like a squeaky cabinet).
Taste Aversion:
- If someone eats something and later feels sick, they might associate the food with sickness even if they were not related.
- An example is Harry getting sick after eating cotton candy, causing him to feel sick at the taste of sugar later.

General Processes in Conditioning

Acquisition:
- The initial learning phase where the neutral stimulus starts to create a conditioned response.
Extinction:
- This occurs when the conditioned stimulus is repeatedly presented without the unconditioned stimulus, so the conditioned response weakens and eventually disappears.
Spontaneous Recovery:
- After a rest period, a previously extinguished conditioned response may return when the conditioned stimulus is presented again.

Discrimination and Generalization

Stimulus Discrimination:
- Learning to respond differently to similar stimuli, so that only the conditioned stimulus triggers the response (like responding to one specific bell sound but not to others).
Stimulus Generalization:
- Responding in the same way to stimuli that resemble the conditioned stimulus (e.g., a cat responding to sounds similar to the can opener).

Behaviorism

John B. Watson:
- The founder of behaviorism, which studies observable behavior and not internal thoughts or emotions.
- He believed that behavior can be conditioned and demonstrated this in his experiments with Little Albert, an infant who was conditioned to fear a white rat and whose fear then generalized to other furry objects.

Real-Life Applications of Classical Conditioning

Advertising:
- Advertisers use classical conditioning principles to associate products with attractive models or positive feelings to make you want to buy them.
- Example: Seeing an attractive model next to a car can make the car itself seem more desirable.

Tone (CS) → Salivation (CR)

Two illustrations are labeled “before conditioning” and show a dog salivating over a dish of food, and a dog not salivating while a bell is rung. An illustration labeled “during conditioning” shows a dog salivating over a bowl of food while a bell is rung. An illustration labeled “after conditioning” shows a dog salivating while a bell is rung.

Figure 6.4 Before conditioning, an unconditioned stimulus (food) produces an unconditioned response (salivation), and a neutral stimulus (bell) does not produce a response. During conditioning, the unconditioned stimulus (food) is presented repeatedly just after the presentation of the neutral stimulus (bell). After conditioning, the neutral stimulus alone produces a conditioned response (salivation), thus becoming a conditioned stimulus.

A diagram is labeled “Higher-Order / Second-Order Conditioning” and has three rows. The first row shows an electric can opener labeled “conditioned stimulus” followed by a plus sign and then a dish of food labeled “unconditioned stimulus,” followed by an equal sign and a picture of a salivating cat labeled “unconditioned response.” The second row shows a squeaky cabinet door labeled “second-order stimulus” followed by a plus sign and then an electric can opener labeled “conditioned stimulus,” followed by an equal sign and a picture of a salivating cat labeled “conditioned response.” The third row shows a squeaky cabinet door labeled “second-order stimulus” followed by an equal sign and a picture of a salivating cat labeled “conditioned response.”

Figure 6.5 In higher-order conditioning, an established conditioned stimulus is paired with a new neutral stimulus (the second-order stimulus), so that eventually the new stimulus also elicits the conditioned response, without the initial conditioned stimulus being presented.

Everyday Connection

Classical Conditioning at Stingray City

A chart has an x-axis labeled “time” and a y-axis labeled “strength of CR”; there are four columns of graphed data. The first column is labeled “Acquisition (CS + UCS)” and the line rises steeply from the bottom to the top. The second column is labeled “Extinction (CS alone)” and the line drops rapidly from the top to the bottom. The third column is labeled “Pause” and has no line. The fourth column has a line that begins midway and drops sharply to the bottom. At the point where the line begins, it is labeled “Spontaneous recovery of CR”; the halfway point on the line is labeled “Extinction (CS alone).”

Figure 6.7 This is the curve of acquisition, extinction, and spontaneous recovery. The rising curve shows the conditioned response quickly getting stronger through the repeated pairing of the conditioned stimulus and the unconditioned stimulus (acquisition). Then the curve decreases, which shows how the conditioned response weakens when only the conditioned stimulus is presented (extinction). After a break or pause from conditioning, the conditioned response reappears (spontaneous recovery).
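
The rising and falling parts of this curve can be sketched with a toy simulation. This is only an illustration under stated assumptions, not a model taken from the chapter: the function name `simulate_cr_strength`, the learning rate of 0.3, and the update rule (strength moves a fixed fraction of the way toward 1 on paired trials and toward 0 on CS-alone trials) are all hypothetical choices. The sketch covers acquisition and extinction only; this simple rule does not reproduce spontaneous recovery.

```python
def simulate_cr_strength(acquisition_trials=10, extinction_trials=10, learning_rate=0.3):
    """Return the conditioned-response (CR) strength after each trial.

    Assumption: strength moves a fixed fraction (learning_rate) of the way
    toward 1.0 when the CS is paired with the UCS, and toward 0.0 when the
    CS is presented alone. The numbers are illustrative, not measured data.
    """
    strength = 0.0
    history = []

    # Acquisition: CS and UCS are presented together on every trial.
    for _ in range(acquisition_trials):
        strength += learning_rate * (1.0 - strength)
        history.append(round(strength, 3))

    # Extinction: the CS is presented alone, so strength decays toward zero.
    for _ in range(extinction_trials):
        strength += learning_rate * (0.0 - strength)
        history.append(round(strength, 3))

    return history


if __name__ == "__main__":
    curve = simulate_cr_strength()
    print("CR strength at end of acquisition:", curve[9])
    print("CR strength at end of extinction: ", curve[-1])
```

Plotting the returned list against trial number gives a steep rise followed by a steep fall, which matches the first two columns of the chart described above.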

Classical and Operant Conditioning Compared

	Classical Conditioning	Operant Conditioning
Conditioning approach	An unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation).	The target behavior is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behavior in the future.
Stimulus timing	The stimulus occurs immediately before the response.	The stimulus (either reinforcement or punishment) occurs soon after the response.

Table 6.1

Psychologist B. F. Skinner saw that classical conditioning is limited to existing behaviors that are reflexively elicited, and it doesn’t account for new behaviors such as riding a bike. He proposed a theory about how such behaviors come about. Skinner believed that behavior is motivated by the consequences we receive for the behavior: the reinforcements and punishments. His idea that learning is the result of consequences is based on the law of effect, which was first proposed by psychologist Edward Thorndike. According to the law of effect, behaviors that are followed by consequences that are satisfying to the organism are more likely to be repeated, and behaviors that are followed by unpleasant consequences are less likely to be repeated (Thorndike, 1911). Essentially, if an organism does something that brings about a desired result, the organism is more likely to do it again. If an organism does something that does not bring about a desired result, the organism is less likely to do it again. An example of the law of effect is in employment. One of the reasons (and often the main reason) we show up for work is because we get paid to do so. If we stop getting paid, we will likely stop showing up—even if we love our job.

Working with Thorndike’s law of effect as his foundation, Skinner began conducting scientific experiments on animals (mainly rats and pigeons) to determine how organisms learn through operant conditioning (Skinner, 1938). He placed these animals inside an operant conditioning chamber, which has come to be known as a “Skinner box” (Figure 6.10). A Skinner box contains a lever (for rats) or disk (for pigeons) that the animal can press or peck for a food reward via the dispenser. Speakers and lights can be associated with certain behaviors. A recorder counts the number of responses made by the animal.

A photograph shows B.F. Skinner. An illustration shows a rat in a Skinner box: a chamber with a speaker, lights, a lever, and a food dispenser.

Figure 6.10 (a) B. F. Skinner developed operant conditioning for systematic study of how behaviors are strengthened or weakened according to their consequences. (b) In a Skinner box, a rat presses a lever in an operant conditioning chamber to receive a food reward. (credit a: modification of work by "Silly rabbit"/Wikimedia Commons)

Positive and Negative Reinforcement and Punishment

	Reinforcement	Punishment
Positive	Something is added to increase the likelihood of a behavior.	Something is added to decrease the likelihood of a behavior.
Negative	Something is removed to increase the likelihood of a behavior.	Something is removed to decrease the likelihood of a behavior.
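
The four cells of this table follow from two yes/no questions: was something added or removed, and did the behavior become more or less likely? A minimal sketch of that lookup is below. The function name `classify_consequence` and its string arguments are hypothetical and exist only for illustration.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the operant conditioning term for a consequence.

    stimulus_change: "added" or "removed" (what happened to the stimulus)
    behavior_effect: "increase" or "decrease" (what happened to the behavior)
    Both argument names and values are illustrative assumptions.
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_effect not in ("increase", "decrease"):
        raise ValueError("behavior_effect must be 'increase' or 'decrease'")

    # "Positive" and "negative" describe adding or removing, not good or bad.
    sign = "positive" if stimulus_change == "added" else "negative"
    # Reinforcement increases a behavior; punishment decreases it.
    kind = "reinforcement" if behavior_effect == "increase" else "punishment"
    return f"{sign} {kind}"


if __name__ == "__main__":
    # A treat is added and sitting becomes more likely.
    print(classify_consequence("added", "increase"))
    # A preferred activity is removed and misbehavior becomes less likely.
    print(classify_consequence("removed", "decrease"))
```

Keeping the two questions separate helps avoid the common mistake of reading "negative reinforcement" as a kind of punishment.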

Reinforcement Schedules

Reinforcement Schedule	Description	Result	Example
Fixed interval	Reinforcement is delivered at predictable time intervals (e.g., after 5, 10, 15, and 20 minutes).	Moderate response rate with significant pauses after reinforcement	Hospital patient uses patient-controlled, doctor-timed pain relief
Variable interval	Reinforcement is delivered at unpredictable time intervals (e.g., after 5, 7, 10, and 20 minutes).	Moderate yet steady response rate	Checking social media
Fixed ratio	Reinforcement is delivered after a predictable number of responses (e.g., after 2, 4, 6, and 8 responses).	High response rate with pauses after reinforcement	Piecework—factory worker getting paid for every x number of items manufactured
Variable ratio	Reinforcement is delivered after an unpredictable number of responses (e.g., after 1, 4, 5, and 9 responses).	High and steady response rate	Gambling
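
The difference between ratio and interval schedules can be made concrete with a short sketch that decides which responses earn reinforcement. All function names, parameters, and the way the variable ratio is randomized (a uniform draw around an assumed mean) are illustrative assumptions rather than anything specified in the table. A variable-interval version would follow the same pattern as `fixed_interval` with a randomly drawn wait.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the response numbers reinforced when every `ratio`-th response pays off."""
    return [r for r in range(1, n_responses + 1) if r % ratio == 0]


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Return the reinforced response numbers when the required count varies.

    Assumption: the count needed for each reinforcement is drawn uniformly
    from 1 to 2 * mean_ratio - 1, so it averages mean_ratio.
    """
    rng = random.Random(seed)
    reinforced = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for r in range(1, n_responses + 1):
        count += 1
        if count >= needed:
            reinforced.append(r)
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Return the response times reinforced on a fixed-interval schedule.

    Only the first response made after `interval` time units have passed
    since the last reinforcement is reinforced.
    """
    reinforced = []
    available_at = interval
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval
    return reinforced


if __name__ == "__main__":
    print(fixed_ratio(8, 2))
    print(variable_ratio(40, 4))
    print(fixed_interval([1, 4, 6, 7, 12, 13], 5))
```

On the ratio schedules, responding faster earns reinforcement sooner; on the interval schedule, extra responses made before the interval has elapsed earn nothing, which is consistent with the lower response rates listed in the table.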

A graph has an x-axis labeled “Time” and a y-axis labeled “Cumulative number of responses.” Two lines labeled “Variable Ratio” and “Fixed Ratio” have similar, steep slopes. The variable ratio line remains straight and is marked in random points where reinforcement occurs. The fixed ratio line has consistently spaced marks indicating where reinforcement has occurred, but after each reinforcement, there is a small drop in the line before it resumes its overall slope. Two lines labeled “Variable Interval” and “Fixed Interval” have similar slopes at roughly a 45-degree angle. The variable interval line remains straight and is marked in random points where reinforcement occurs. The fixed interval line has consistently spaced marks indicating where reinforcement has occurred, but after each reinforcement, there is a drop in the line.

Operant Conditioning

Definition:
- A type of associative learning where an organism learns to associate a behavior with its consequences.
- A pleasant consequence makes that behavior more likely to be repeated in the future.
Example:
- Spirit the dolphin flips in the air to get a fish from her trainer after hearing a whistle.

Comparing Classical and Operant Conditioning

Classical Conditioning:
- Focuses on how organisms learn to respond reflexively to new stimuli.
Operant Conditioning:
- Focuses on how consequences (rewards or punishments) influence behavior.

B.F. Skinner's Contributions

Key Idea:
- Behavior is motivated by its consequences.
Law of Effect (first proposed by Edward Thorndike; the foundation for Skinner's work):
- Behaviors followed by satisfying outcomes are more likely to be repeated, while those followed by unpleasant outcomes are less likely to be repeated.

Skinner's Experiments

Skinner Box:
- A chamber used to study how animals learn through reinforcement.
- Animals can press levers or peck disks to receive food rewards.

Reinforcement and Punishment

Reinforcement:
- Increases the likelihood of a behavior.
- Can be Positive (adding a pleasant stimulus) or Negative (removing an unpleasant stimulus).
Punishment:
- Decreases the likelihood of a behavior.
- Can be Positive (adding an unpleasant stimulus) or Negative (removing a pleasant stimulus).

Shaping

Definition:
- A technique used to teach complex behaviors by rewarding successive steps towards the desired behavior.
Steps in Shaping:
1. Reward any behavior similar to the target behavior.
2. Gradually reward behaviors that are closer to the desired behavior until the actual target behavior is achieved.
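
The two steps above can be sketched as a loop in which the criterion for a reward tightens each time the learner gets closer to the target. This is a simplified illustration: representing behavior as a single number, the function name `shape`, and the rule "after each reward, require at least that level of closeness" are all assumptions made for the example.

```python
def shape(attempts, target, start_tolerance):
    """Return the attempts that would be rewarded under a tightening criterion.

    attempts: numeric stand-ins for successive behaviors (hypothetical scale)
    target: the value representing the desired behavior
    start_tolerance: how far from the target an attempt may be at first
    """
    tolerance = start_tolerance
    rewarded = []
    for attempt in attempts:
        distance = abs(attempt - target)
        if distance <= tolerance:
            # Close enough under the current criterion: reward it.
            rewarded.append(attempt)
            # Raise the bar so later rewards need at least this closeness.
            tolerance = distance
    return rewarded


if __name__ == "__main__":
    # Early rough approximations are rewarded; later only closer ones are.
    print(shape([2, 5, 4, 7, 9, 8, 10], target=10, start_tolerance=8))
```

In this run the attempts 4 and 8 go unrewarded because by then the learner has already shown behavior closer to the target, which mirrors how a trainer stops rewarding earlier approximations.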

Reinforcement Schedules

Definition:
- The timing and frequency of rewards given after a desired behavior occurs.
Types:
- Fixed Interval: Reward after a set time.
- Variable Interval: Reward at unpredictable times.
- Fixed Ratio: Reward after a set number of responses.
- Variable Ratio: Reward after an unpredictable number of responses (most resistant to extinction).

Everyday Applications

Behavior Modification:
- Used by parents and teachers to change children's behaviors.
- Sticker charts are a common tool.
Time-Out:
- A technique of negative punishment where a child is removed from a preferred activity to decrease misbehavior.
Cognitive Maps and Learning:
- Learning also involves understanding and internalizing information, like building a mental map of how to navigate through a familiar place.

Observational Learning

Definition:
- We learn by watching others and then imitating what they do or say.
- The people we copy are called models.
Mirror Neurons:
- A type of neuron thought to be involved in imitation.
- They fire both when we perform an action and when we watch someone else perform it, which may help us understand and learn from what we see others do.

Examples of Observational Learning

Chimpanzee Experiment:
- Two groups of chimpanzees were given juice boxes with straws.
- The first group drank the hard way (dipping the straw and sucking the small amount of juice off its end).
- The second group drank the easier way (sucking directly through the straw).
- After watching the second group, the first group switched to the direct method, which yields more juice.
Claire’s Story:
- Claire punished her son Jay to correct his behavior.
- Later, her younger daughter Anna copied Claire’s punishment style with her teddy bear, showing how children pick up behaviors from their parents.

Bandura’s Social Learning Theory

Key Ideas:
- Learning involves more than just imitation; we think about what we see.
- Observational learning can include learning new responses, choosing whether to imitate, or figuring out general rules from what we observe.
Types of Models:
- Live Model: Someone shows a behavior in person (like a teacher showing a dance move).
- Verbal Model: Someone explains a behavior without doing it (like a coach giving instructions).
- Symbolic Model: Characters or people from TV or books demonstrating behaviors that can be copied (like superheroes in movies).

Steps in the Modeling Process

Attention: Focus on what the model is doing.
Retention: Remember what you observed.
Reproduction: Be able to perform the behavior yourself.
Motivation: Want to copy the behavior, which depends on what happened to the model (e.g., whether the model was rewarded or punished).

Effects of Observational Learning

Prosocial Behavior:
- Observational learning can encourage positive behavior.
- Parents should model good behavior like reading or exercising.
Antisocial Behavior:
- Children may also model negative behavior, like aggression, especially if they see it frequently at home or in media.
- About 30% of abused children grow up to become abusive parents themselves, often repeating the behavior they experienced.

Media's Role

Violent Media Impact: Some studies suggest that watching violent TV shows and playing violent video games is linked to increased aggression in children, though whether the link is causal is still debated.
Repeated exposure to media violence may also desensitize people, leaving them numb to real violence and violent behavior.

Conclusion

Learning through observation is an important part of how we behave and interact with others.
It shows both the positive and negative aspects of what we learn from our surroundings.