Learning EXAM 2 Feb 17th
Day 8 (Feb. 3rd) — Learning
Supplemental Class Notes
Operant Conditioning
Target behavior is followed by reinforcement or punishment to strengthen or weaken it. the learner is then more or less likely to exhibit the behavior in the future
the stimulus happens after the response
Positive (add something)
Positive Reinforcement — adding something pleasant to increase behavior
Positive Punishment — Adding something unpleasant to decrease behavior
Negative (remove something)
Negative Reinforcement — Taking away something unpleasant to increase behavior
Negative Punishment — Taking away something pleasant to decrease behavior
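The four combinations above can be read as a two-by-two lookup: positive/negative says whether something is added or removed, and reinforcement/punishment says whether the behavior goes up or down. A minimal Python sketch of that lookup follows; the function name `classify_consequence` and its string arguments are made up for illustration.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the operant conditioning quadrant.

    stimulus_change: "add" or "remove" (positive vs. negative)
    behavior_effect: "increase" or "decrease" (reinforcement vs. punishment)
    Both argument vocabularies are hypothetical labels for this sketch.
    """
    sign = {"add": "positive", "remove": "negative"}[stimulus_change]
    kind = {"increase": "reinforcement", "decrease": "punishment"}[behavior_effect]
    return f"{sign} {kind}"


# Examples that follow the definitions in the notes
print(classify_consequence("add", "increase"))     # positive reinforcement
print(classify_consequence("remove", "decrease"))  # negative punishment
```

A quick self-check: "negative" never means "bad" here, it only means something was taken away.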
Three phases
Acquisition — learning phase where the response is established
Extinction — reduction/elimination of a response after the behavior occurs enough times without reinforcement
Spontaneous recovery — sudden emergence of extinguished response
Discriminative Stimulus — stimulus that signals the presence of reinforcement (kissy noises to call a dog, then she gets excited because that means a treat might follow)
Is reinforcement or punishment better?
Punishment tells us what not to do, instead of what to do. Leads to anxiety, encourages hiding the behavior, can model aggressive behavior (hitting a child, the child thinks that physical aggression is okay) → Reinforcement is better, and punishment should be used sparingly
Shaping — train more complex behavior by reinforcing gradually closer versions of target behavior
1. reinforce responses that resemble the desired behavior
2. then reinforce the response that more closely resembles the desired behavior
3. continue to reinforce closer and closer approximations of the desired behavior
4. then only reinforce the desired behavior
Primary and Secondary Reinforcers
primary — innate, natural, “built-in” reinforcing qualities (water, food, sleep, sex, shelter, touch)
secondary — no inherent value and only has reinforcing qualities when linked with primary reinforcers (money → leads to buying food or shelter, praise, stickers)
Token economies (appropriate behavior, receives tokens in positive reinforcement; inappropriate behavior, loses tokens in negative punishment)
schedules of reinforcement — four types
continuous reinforcement — reinforces after every desired behavior
partial reinforcement — reinforces only some of the time, behavior is slower to become extinct
Fixed Intervals
Description — reinforcement delivered at predictable times (after 5, 10, 15, and 20 minutes)
Result — moderate response rate with significant pauses after reinforcement
Example — hospital patient uses patient-controlled, doctor-timed pain relief
Variable Intervals
Description — Reinforcement is delivered at random time intervals (5, 7, 11, and 20 minutes)
Result — moderate but steady response rate
Example — checking social media; new posts show up at unpredictable times, so checking is reinforced (dopamine) unpredictably
Fixed Ratio
Description — reinforcement is delivered after a predictable number of responses (after 2, 4, 6, and 8 responses)
Result — high response rate with pauses after reinforcement
Example — factory worker getting paid for every 5 items they make
Variable Ratio
Description — reinforcement is delivered after an unpredictable number of responses (after 1, 4, 5, and 9 responses)
Result — high and steady response rate
Example — Gambling
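The schedules above differ only in the rule that decides which responses get reinforced. A rough Python sketch of three of them follows (variable interval would work like fixed interval with a randomly drawn wait). All function names are hypothetical, and the way the variable-ratio gaps are drawn is an assumption made for illustration, not something from class.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the response numbers that earn a reinforcer on a fixed-ratio schedule."""
    return [r for r in range(1, n_responses + 1) if r % ratio == 0]


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Reinforce after an unpredictable number of responses averaging mean_ratio.

    Each gap is drawn uniformly from 1..(2 * mean_ratio - 1), an assumption
    chosen only so the gaps average out to mean_ratio.
    """
    rng = random.Random(seed)
    reinforced = []
    next_at = rng.randint(1, 2 * mean_ratio - 1)
    for r in range(1, n_responses + 1):
        if r == next_at:
            reinforced.append(r)
            next_at = r + rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Reinforce the first response made once the interval has elapsed.

    response_times: sorted times (in minutes) at which the learner responds.
    The clock restarts from each reinforced response (an assumption).
    """
    reinforced = []
    available_at = interval
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval
    return reinforced


print(fixed_ratio(20, 5))                              # [5, 10, 15, 20]
print(fixed_interval([1, 2, 6, 7, 8, 12, 30], 5))      # [6, 12, 30]
print(variable_ratio(30, 5))                           # unpredictable gaps
```

Notice that on the fixed-interval schedule the responses at minutes 1, 2, 7, and 8 earn nothing, which fits the pause after reinforcement described above.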
Beyond Classical and Operant Conditioning
Latent learning — competence without performance. learning is taking place, but it isn’t demonstrated because there’s no motivation
Observational Learning — involves watching others, then imitating or modeling their behavior. those performing the behavior that is imitated are called models
Albert Bandura — proposed a brand of behaviorism called social learning theory
Four steps of the modeling process: attention, retention, reproduction, motivation
If you saw the model reinforced for their behavior, you are more motivated to copy them (vicarious reinforcement)
if you saw the model being punished, you are less motivated to copy them (vicarious punishment)
Does watching violent media or playing violent video games cause aggression?
Watch Bandura’s Bobo Doll experiment
Day 7 (Jan. 29th) — Learning
Supplemental Class Notes
sea turtles follow instincts to go to the ocean, but we have to learn how to ride a bike
Associative learning → classical conditioning (Pavlov, Watson) and operant conditioning (Thorndike, Skinner)
Social Learning Theory → Observational learning (Bandura)
Habituation — simplest form of learning. the process of responding less strongly over time to repeated stimuli
life would become super overwhelming if we didn’t know how to block out certain stimuli (what needs a response?)
Classical Conditioning
associative learning — associating one thing with another. Can also be referred to as Pavlovian conditioning
Ivan Pavlov — initially studying dog digestion
Phase 1 Before Conditioning — ringing the bell (neutral stimulus) causes no response. then bringing out food (unconditioned stimulus) causes the dog to salivate (unconditioned response)
Phase 2 During Conditioning — the bell and food are repeatedly paired together (neutral and unconditioned stimulus) to get the dog to salivate (unconditioned response)
Phase 3 After Conditioning — the bell (now a conditioned stimulus) causes the dog to salivate (a conditioned response)
Terminology
unconditioned stimulus UCS — stimulus that elicits an automatic response
unconditioned response UCR — automatic response to a non-neutral stimulus
conditioned stimulus CS — previously neutral stimulus that now elicits a response from an association with UCS
conditioned response CR — response to a previously neutral stimulus, produced by a learned association
The “ex” effect
Acquisition — gradually learning the CR as the CS and UCS are repeatedly paired
Extinction — CR decreases and eventually disappears, from presenting the CS alone and without the UCS
Spontaneous Recovery — a seemingly extinct CR reappears after a rest period when the CS is presented again
Caramel almond milk latte — unconditioned stimulus
Stimulus discrimination — organism learns to respond differently to various stimuli that are similar
a dog discriminating sounds of bells that are close to, but not exactly, the sound of the bell with food
stimulus generalization — organism demonstrates the conditioned response to stimuli that are similar to the conditioned stimulus, opposite of stimulus discrimination
a dog salivating at sounds that are similar to the original bell, or a dog thinking a plastic bag with crackers is the same as their treat bag
John B. Watson — Founder of behaviorism
“Little Albert” study (very unethical) — fears can be conditioned using classical conditioning
Application of Classical Conditioning
Treatment of phobias through exposure therapy (systematic desensitization)
1. teach the patient relaxation techniques
2. create a fear hierarchy of stressors that trigger a phobia from least to most scary
3. slowly expose the patient to the feared stimulus while ensuring they’re relaxed at each level
End Goal — replace the fear (conditioned response) with relaxation (new conditioned response) when encountering the phobic stimulus (conditioned stimulus)
Reading Notes Chapter 6, Learning
What Is Learning?
reflexes — motor or neural reaction to a stimulus in the environment. simpler than instincts
knee-jerk reaction or moving your hand off a hot stove
involve more primitive areas of the central nervous system
instincts — behaviors triggered by a broader range of events, like maturing and change of seasons. both reflexes and instincts help an organism adapt to the environment and don’t have to be learned
typically involves the whole or larger parts of the body
learning — relatively permanent change in behavior or knowledge that results from experience
Associative learning — an organism makes connections between stimuli or events that occur in the environment
Classical Conditioning — Pavlovian conditioning
(associating events that repeatedly happen together)
Pavlov (1849-1936) performed research on dogs and is known for classical conditioning
interest of study was the digestive system, tried to measure the amount of saliva a dog produced in response to different foods. the dogs began to salivate at the sound of a bell, the footsteps of the lab assistants, or the sight of an empty food bowl
an organism has an unconditioned response or a conditioned response
unconditioned stimulus (UCS) — stimulus that elicits a reflexive response in an organism
unconditioned response (UCR) — a natural reaction to a given stimulus
neutral stimulus (NS) — stimulus that doesn’t elicit a natural response
conditioned stimulus (CS) — stimulus that elicits a response after repeatedly being paired with an unconditioned stimulus
conditioned response (CR) — behavior caused by a conditioned stimulus. the organism has associated the stimulus with a reward/event and responds in anticipation of it
Real World Application of Classical Conditioning
Ex. Woman throws up after receiving her first chemotherapy treatment, continues to vomit after every session. Now, after not being on chemo, she still gets nauseous when visiting the oncology office.
chemo drugs are the unconditioned stimulus (UCS), vomiting is the unconditioned response (UCR), doctor’s office is the conditioned stimulus (CS), and nausea is the conditioned response (CR)
higher-order conditioning or second-order conditioning — pairing a new neutral stimulus with the conditioned stimulus. it’s hard to condition any further than a second order
General Processes in Classical Conditioning
acquisition — when an organism learns to connect neutral stimulus and unconditioned stimulus
taste aversion — you get sick after eating a chicken dish. now every time someone orders that chicken dish, you feel nauseous
extinction — decrease in conditioned response when unconditioned stimulus is no longer presented with the conditioned stimulus
spontaneous recovery — the return of a previously lost conditioned response after a rest period
stimulus discrimination — organism learns to have different responses to similar stimuli
stimulus generalization — opposite of discrimination, responding to different but similar stimuli as if they were the one conditioned stimulus
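Acquisition and extinction can be pictured as an association strength that rises while the CS is paired with the UCS and falls when the CS is presented alone. The Python sketch below is a toy model only: the update rule, the `rate` value, and the function name `run_trials` are assumptions for illustration and do not come from the chapter. Spontaneous recovery is not modeled.

```python
def run_trials(strength, n_trials, ucs_present, rate=0.3):
    """Update a 0-to-1 association strength over n_trials presentations of the CS.

    ucs_present=True  -> CS paired with UCS (acquisition): strength moves toward 1
    ucs_present=False -> CS alone (extinction): strength moves toward 0
    The update rule and rate are illustrative assumptions, not from the notes.
    """
    target = 1.0 if ucs_present else 0.0
    history = []
    for _ in range(n_trials):
        strength += rate * (target - strength)
        history.append(round(strength, 3))
    return history


acquisition = run_trials(0.0, 10, ucs_present=True)
extinction = run_trials(acquisition[-1], 10, ucs_present=False)
print(acquisition)  # climbs toward 1 as bell and food are paired
print(extinction)   # falls toward 0 once the bell is presented alone
```

In this sketch the biggest changes happen in the first few trials and then level off.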
Behaviorism
John Watson (early 1900s) believed that human behavior is the result of conditioned responses
Operant Conditioning
(learning to associate a behavior with its consequence: reinforcement or punishment)
behavior is motivated by the consequence we receive from the behavior
law of effect — behaviors followed by satisfying consequences are more likely to be repeated, and behaviors followed by unpleasant consequences are less likely to be repeated
ex. a job. we like getting paid, so we keep showing up
Skinner box
Reinforcement — increases a behavior
positive reinforcement — desirable stimulus added to increase behavior
clean your room → you get a toy
negative reinforcement — undesirable stimulus removed to increase a behavior
a car beeps until you fasten your seatbelt (the annoying beeping stops once you buckle up)
Punishment
positive punishment — adding an undesirable stimulus to decrease a behavior
scold a student so they stop texting in class
negative punishment — remove a pleasant stimulus to decrease a behavior
take away the child’s toy so they stop misbehaving
in the past, a lot of punishment was physical. it has many drawbacks: kids can come to fear the person giving the punishment and can become prone to the same aggressive and antisocial behavior
Shaping
reward successive approximations of the wanted behavior — people (animals) need to show the behavior first before they get a reward
commonly used by animal trainers today
Primary and Secondary Reinforcers
primary — innate reinforcing qualities
Water, food, sleep, shelter, sex, touch, affection, pleasure → things that organisms never lose the drive for
secondary — no inherent value
only reinforcing when linked with a primary reinforcer. Praise (secondary) is linked to affection (primary)
behavior management systems like sticker charts, tokens for rewards, etc.
Reinforcement Schedules
best way to teach/influence is to use positive reinforcement
continuous reinforcement — receiving a reward/reinforcer every time you display a behavior
partial reinforcement — person or animal doesn’t get the reward every time they perform the behavior
fixed interval schedule — behavior is reinforced after a set amount of time
painkillers on an IV drip, one dose per hour
variable interval schedule — behavior is reinforced after varying, unpredictable amounts of time
fixed ratio reinforcement schedule — set number of responses before the behavior is rewarded
variable ratio reinforcement schedule — number of responses varies before the behavior is rewarded
gambling → you never know when you’ll win next
Cognition and Latent Learning
radical behaviorism — Skinner’s belief that cognition didn’t matter
cognitive map — mental picture of a space
latent learning — learning occurs but isn’t observable in behavior until there’s a reason to demonstrate it
Observational Learning (Modeling)
(process of watching others and imitating what they do)
models — individuals performing the behavior that will be imitated. likely involves mirror neurons
several ways this happens
learning a new response — your coworker gets yelled at for being late, so you start getting there early so you won’t risk being late
watching someone suffer/benefit from a behavior — your dad gets burned on a hot stove, so you know not to touch the hot stove
general rule that can be applied to other situations
Steps in the Modeling Process
vicarious reinforcement — model is reinforced for their behavior, so you want to copy them
vicarious punishment — model is punished for behavior, so you avoid doing what they did
may help explain why some kids who were abused go on to become abusers themselves