Learning EXAM 2 Feb 17th

Day 8 (Feb. 3rd) — Learning

Supplemental Class Notes

Operant Conditioning
- Target behavior is followed by reinforcement or punishment to strengthen or weaken it. the learner is then more or less likely to exhibit the behavior in the future
  - the stimulus happens after the response
- Positive (add something)
  - Positive Reinforcement — adding something pleasant to increase behavior
  - Positive Punishment — Adding something unpleasant to decrease behavior
- Negative (remove something)
  - Negative Reinforcement — Taking away something unpleasant to increase behavior
  - Negative Punishment — Taking away something pleasant to decrease behavior
- Three phases
  - Acquisition — learning phase where the response is established
  - Extinction — reduction/elimination of a response after the behavior occurs enough times without reinforcement
  - Spontaneous recovery — sudden re-emergence of an extinguished response
- Discriminative Stimulus — stimulus that signals the presence of reinforcement (kissy noises to call a dog, then she gets excited because that means a treat might follow)
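The four positive/negative combinations listed above come down to two questions: is a stimulus added or removed, and does the behavior increase or decrease? A minimal Python sketch of that lookup follows. The function name `classify` and its string arguments are made up for illustration and are not from the class notes.

```python
def classify(change, effect):
    """Name the operant conditioning type.

    change: "add" or "remove" (what happens to the stimulus)
    effect: "increase" or "decrease" (what happens to the behavior)
    """
    # "positive" means something is added, "negative" means something is removed
    sign = {"add": "positive", "remove": "negative"}[change]
    # reinforcement increases a behavior, punishment decreases it
    kind = {"increase": "reinforcement", "decrease": "punishment"}[effect]
    return f"{sign} {kind}"


# Taking away something unpleasant so the behavior increases
print(classify("remove", "increase"))  # negative reinforcement
# Taking away something pleasant so the behavior decreases
print(classify("remove", "decrease"))  # negative punishment
```

Note that "negative" never means "bad" here; it only says that something was taken away.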
Is reinforcement or punishment better?
- Punishment tells us what not to do, instead of what to do. It leads to anxiety, encourages hiding the behavior, and can model aggressive behavior (a child who is hit may conclude that physical aggression is okay) → Reinforcement is better, and punishment should be used sparingly
Shaping — train more complex behavior by reinforcing gradually closer versions of target behavior
1. reinforce responses that resemble the desired behavior
2. reinforce the response that more closely resembles the desired behavior
3. continue to reinforce closer and closer approximations of the desired behavior
4. then only reinforce the desired behavior
Primary and Secondary Reinforcers
- primary — innate, natural, “built-in” reinforcing qualities (water, food, sleep, sex, shelter, touch)
- secondary — no inherent value and only has reinforcing qualities when linked with primary reinforcers (money → leads to buying food or shelter, praise, stickers)
  - Token economies (appropriate behavior earns tokens, which is positive reinforcement; inappropriate behavior loses tokens, which is negative punishment)
- schedules of reinforcement — continuous or partial; partial reinforcement has four types (fixed interval, variable interval, fixed ratio, variable ratio)
  - continuous reinforcement — reinforces after every desired behavior
  - partial reinforcement — reinforces only some of the time, so the behavior is slower to extinguish
Fixed Interval
- Description — reinforcement delivered at predictable times (after 5, 10, 15, and 20 minutes)
- Result — moderate response rate with significant pauses after reinforcement
- Example — hospital patient uses patient-controlled, doctor-timed pain relief
Variable Interval
- Description — reinforcement is delivered at unpredictable time intervals (after 5, 7, 11, and 20 minutes)
- Result — moderate but steady response rate
- Example — checking social media: new content appears at unpredictable times, so a check is only sometimes rewarded (with a hit of dopamine)
Fixed Ratio
- Description — reinforcement is delivered after a predictable number of responses (after 2, 4, 6, and 8 responses)
- Result — high response rate with pauses after reinforcement
- Example — factory worker getting paid for every 5 items they make
Variable Ratio
- Description — reinforcement is delivered after an unpredictable number of responses (after 1, 4, 5, and 9 responses)
- Result — high and steady response rate
- Example — Gambling
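The four partial schedules above differ only in the rule that decides whether a given response gets reinforced. Here is a rough Python sketch of those rules. All function names are hypothetical, and drawing the "variable" amounts from a uniform random range is an assumption made for illustration, not something stated in the notes.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses (average mean_n)."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # assumed distribution

    def respond(t):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(wait):
    """Reinforce the first response made at least `wait` minutes after the last reward."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= wait:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_wait, rng):
    """Like fixed_interval, but the required wait changes unpredictably."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_wait)  # assumed distribution

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_wait)
            return True
        return False

    return respond


# One response per minute for 20 minutes; t is the time of the response.
schedule = fixed_interval(5)
print([t for t in range(1, 21) if schedule(t)])  # [5, 10, 15, 20]
```

On the ratio schedules, responding faster earns more reinforcement, while on the interval schedules extra responses before the time is up earn nothing. That difference is one way to read the higher response rates listed for the ratio schedules.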
Beyond Classical and Operant Conditioning
- Latent learning — competence without performance. learning is taking place, but it isn’t demonstrated because there’s no motivation
Observational Learning — involves watching others, then imitating or modeling their behavior. those performing the behavior that is imitated are called models
Albert Bandura — proposed a brand of behaviorism called social learning theory
- Steps in the modeling process: attention, retention, reproduction, motivation
  - if you saw the model reinforced for their behavior, you are more motivated to copy them (vicarious reinforcement)
  - if you saw the model being punished, you are less motivated to copy them (vicarious punishment)
- Does watching violent media or playing violent video games cause aggression?
Watch Bandura’s Bobo Doll experiment

Day 7 (Jan. 29th) — Learning

Supplemental Class Notes

sea turtles follow instincts to go to the ocean, but we have to learn how to ride a bike
Associative learning → classical conditioning (Pavlov, Watson) and operant conditioning (Thorndike, Skinner)
Social Learning Theory → Observational learning (Bandura)
Habituation — simplest form of learning. the process of responding less strongly over time to repeated stimuli
- life would become super overwhelming if we didn’t know how to block out certain stimuli (what needs a response?)
Classical Conditioning
- associative learning — associating one thing with another. Can also be referred to as Pavlovian conditioning
- Ivan Pavlov — initially studying dog digestion
  - Phase 1 Before Conditioning — ringing the bell (neutral stimulus) causes no response. then bringing out food (unconditioned stimulus) causes the dog to salivate (unconditioned response)
  - Phase 2 During Conditioning — the bell and food are repeatedly paired together (neutral and unconditioned stimulus) to get the dog to salivate (unconditioned response)
  - Phase 3 After Conditioning — the bell (now a conditioned stimulus) causes the dog to salivate (a conditioned response)
- Terminology
  - unconditioned stimulus UCS — stimulus that elicits an automatic response
  - unconditioned response UCR — automatic response to an unconditioned stimulus
  - conditioned stimulus CS — previously neutral stimulus that now elicits a response because of its association with the UCS
  - conditioned response CR — response elicited by a previously neutral stimulus because of a learned association
- The “ex” effect
- Acquisition — gradually learning the CR as the CS is repeatedly paired with the UCS
- Extinction — CR decreases and eventually disappears, from presenting the CS alone and without the UCS
- Spontaneous Recovery — a seemingly extinguished CR reappears after a rest period when the CS is presented again
  - Caramel almond milk latte — unconditioned stimulus
- Stimulus discrimination — organism learns to respond differently to various stimuli that are similar
  - a dog discriminating sounds of bells that are close to, but not exactly, the sound of the bell with food
- stimulus generalization — organism demonstrates the conditioned response to stimuli that are similar to the conditioned stimulus, opposite of stimulus discrimination
  - a dog salivating at sounds that are similar to the original bell, or a dog thinking a plastic bag with crackers is the same as their treat bag
- John B. Watson — Founder of behaviorism
  - “Little Albert” study (very unethical) — fears can be conditioned using classical conditioning
- Application of Classical Conditioning
  - Treatment of phobias through exposure therapy (systematic desensitization)
    1. teach the patient relaxation techniques
    2. create a fear hierarchy of stressors that trigger the phobia, from least to most scary
    3. slowly expose the patient to the feared stimulus while ensuring they’re relaxed at each level
    End Goal — replace the fear (conditioned response) with relaxation (new conditioned response) when encountering the phobic stimulus (conditioned stimulus)

Reading Notes Chapter 6, Learning

What Is Learning?

reflexes — motor or neural reaction to a stimulus in the environment. simpler than instincts
- knee-jerk reaction or moving your hand off a hot stove
- involve more primitive areas of the central nervous system
instincts — behaviors triggered by a broader range of events, like maturing and the change of seasons. both reflexes and instincts help an organism adapt to the environment and don’t have to be learned
- typically involves the whole or larger parts of the body
learning — relatively permanent change in behavior or knowledge that results from experience
Associative learning — an organism makes connections between stimuli or events that occur in the environment

Classical Conditioning — Pavlovian conditioning

(associating events that repeatedly happen together)

Pavlov (1849-1936) performed research on dogs and is known for classical conditioning
- interest of study was the digestive system; he tried to measure the amount of saliva a dog produced in response to different foods. the dogs began to salivate at things that signaled food: the sound of a bell, the footsteps of the lab assistants, and the sight of an empty food bowl
an organism’s responses are either unconditioned (unlearned) or conditioned (learned)
unconditioned stimulus (UCS) — stimulus that elicits a reflexive response in an organism
unconditioned response (UCR) — a natural reaction to a given stimulus
neutral stimulus (NS) — stimulus that doesn’t elicit a natural response
conditioned stimulus (CS) — stimulus that elicits a response after repeatedly being paired with an unconditioned stimulus
conditioned response (CR) — behavior caused by a conditioned stimulus. the organism has associated the stimulus with the UCS and now responds in anticipation of it
Real World Application of Classical Conditioning
- Ex. Woman throws up after receiving her first chemotherapy treatment, continues to vomit after every session. Now, after not being on chemo, she still gets nauseous when visiting the oncology office.
- chemo drugs are the unconditioned stimulus (UCS), vomiting is the unconditioned response (UCR), doctor’s office is the conditioned stimulus (CS), and nausea is the conditioned response (CR)
- higher-order conditioning or second-order conditioning — pairing a new neutral stimulus with the conditioned stimulus. it’s hard to condition any further than a second order
General Processes in Classical Conditioning
- acquisition — when an organism learns to connect a neutral stimulus and an unconditioned stimulus
- taste aversion — you get sick after eating a chicken dish. now every time someone orders that chicken dish, you feel nauseous
- extinction — decrease in conditioned response when unconditioned stimulus is no longer presented with the conditioned stimulus
- spontaneous recovery — the return of a previously lost conditioned response after a rest period
- stimulus discrimination — organism learns to have different responses to similar stimuli
- stimulus generalization — opposite of discrimination; the organism responds to stimuli similar to the conditioned stimulus as if they were the conditioned stimulus
Behaviorism
- John Watson (early 1900s) believed that human behavior is the result of conditioned responses

Operant Conditioning

(learning to associate a behavior with its consequence, either reinforcement or punishment)

behavior is motivated by the consequence we receive from the behavior
law of effect — behaviors followed by satisfying consequences are more likely to be repeated, and behaviors followed by unpleasant consequences are less likely to be repeated
- ex. a job. we like getting paid, so we keep showing up
- Skinner box
Reinforcement — increases a behavior
- positive reinforcement — desirable stimulus added to increase behavior
  - clean your room → you get a toy
- negative reinforcement — undesirable stimulus removed to increase a behavior
  - cars beeping until you fasten your seatbelt
Punishment
- positive punishment — adding an undesirable stimulus to decrease a behavior
  - scold a student so they stop texting in class
- negative punishment — remove a pleasant stimulus to decrease a behavior
  - take away the child’s toy so they stop misbehaving
- a lot of punishment in the past was physical, which has many drawbacks: kids may fear the person giving the punishment and become prone to the same aggressive and antisocial behavior
- Shaping
  - reward successive approximations of the wanted behavior, because people (animals) need to show a behavior before it can be rewarded
  - commonly used by animal trainers today
Primary and Secondary Reinforcers
- primary — innate reinforcing qualities
  - Water, food, sleep, shelter, sex, touch, affection, pleasure → things that organisms never lose the drive for
- secondary — no inherent value
  - only reinforcing when linked with a primary reinforcer. Praise (secondary) is linked to affection (primary)
- behavior management systems like sticker charts, tokens for rewards, etc.
Reinforcement Schedules
- best way to teach/influence is to use positive reinforcement
- continuous reinforcement — receiving a reward/reinforcer every time you display a behavior
- partial reinforcement — person or animal doesn’t get the reward every time they perform the behavior
- fixed interval schedule — behavior is reinforced after a set amount of time
  - painkillers on an IV drip, one dose per hour
- variable interval schedule — behavior is reinforced after varying, unpredictable amounts of time
- fixed ratio reinforcement schedule — set number of responses before the behavior is rewarded
- variable ratio reinforcement schedule — number of responses varies before the behavior is rewarded
  - gambling → you never know when you’ll win next
Cognition and Latent Learning
- radical behaviorism — Skinner’s belief that cognition didn’t matter
- cognitive map — mental picture of a space
- latent learning — learning occurs but isn’t observable in behavior until there’s a reason to demonstrate it

Observational Learning (Modeling)

(process of watching others and imitating what they do)

models — individuals performing the behavior that will be imitated. likely involves mirror neurons
several ways this happens
1. learning a new response — your coworker gets yelled at for being late, so you start getting there early so you won’t risk being late
2. watching someone suffer/benefit from a behavior — your dad gets burned on a hot stove, so you know not to touch the hot stove
3. learning a general rule that can be applied to other situations
Steps in the Modeling Process
- vicarious reinforcement — model is reinforced for their behavior, so you want to copy them
- vicarious punishment — model is punished for behavior, so you avoid doing what they did
- may help explain why some kids who were abused go on to become abusers themselves