EXAM 4 - chapters 1-13 on 5/16/2024
CHAPTER 1: Learning to Change
Charles Darwin
breeding - artificial selection
collection and study of animals
scarcity of resources as the population increases
species changed over time
survival based on having features with an “advantage”
during reproduction, 2 parents with an advantage will pass the feature to the offspring
predates work on inheritance
Natural selection
the process by which living organisms adapt to their environment over time using preferred traits being passed on through reproduction
requires variations within a species
environmental change can affect the natural selection process
climate change
predatory patterns
disease
applies also to behavior
Evolved Behavior: innate and adaptive forms of behavior
Reflexes: the relationship between a specific event and a simple response to that event
present in all members of a species
protection from injury
aid in food consumption
adaptive equipment of the animals
stereotypic (similar in form, frequency, strength, and time during development)
not always useful to a species and may die out over time
Modal action patterns: series of related acts found in all members of a species
genetic bias
little variability between members of a species
little variability across time
reliably elicited by a releaser (a particular kind of event)
contribute to the survival of a species
protect individuals from the environment
more complex than reflexes
involve the entire organism, not just a few muscles or glands
long series of reflex-like acts
more variable than reflexes
not found in humans
General behavior traits: any general behavior tendency that is strongly influenced by genes
present in a variety of situations
does not require the presence of a releaser
less stereotypic / more variable than MAP
benefits to general behavior traits based on situation
example is being an easily angered person
Limits of Natural selection
slow process (occurs over generations)
previously valuable adaptations can become useless in a short period of time
not beneficial to the survival of the individual, but to survival of the species
mutations - abrupt changes in genes, may or may not be beneficial to person, unpredictable
hybridization - cross-breeding of closely related species, results in sterile offspring, takes one generation to see change
Learning
behavior and experience
behavior: anything that someone does that can be measured
firing of a neuron to running a marathon
private events: thoughts and feelings
experience: changes in environment
learning: change in behavior due to change in environment
stimuli: physical changes in the environment
exclusions to learning: changes due to drugs, injury, aging, and disease
Habituation: reduction in tendency or probability of a response with repeated exposure to a stimulus
sensitization: an increase in the intensity or probability of a response with repeated exposure to a stimulus
Chapter 2: the study of learning and behavior
The natural science approach
four assumptions about natural phenomena
all natural phenomena are caused
the causes precede their effects
the causes of natural events include only natural phenomena
the simplest explanation that fits the data is best
All natural phenomena are caused
things don’t “just happen”
determinism: the behavior of living organisms is based on cause and effects
the world is a lawful place
causes precede their effects
events cannot reach into the past to change behavior
experimentation: the act of controlling variables to determine the effect of one variable on phenomena
Natural causes of events include only natural phenomena
cannot attribute natural events to acts of God, spirits, etc.
empiricism: objective observation of phenomena
the simplest explanation that fits the data is the best
parsimony: The simplest and most logical explanation is often the correct explanation and least contrived
fewest assumptions and extraneous variables
Measures of learning: measuring changes in behavior
how do we measure learning?
reduction in errors
changes in topography (form of behavior)
changes in intensity (force)
changes in speed (fast or slowness of behavior)
reduced latency (time between the stimulus and the response)
changes in rate (# of occurrences per unit of time)
increase in fluency (correct responses per unit of time)
combo of error and rate
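The measures above can be sketched as simple calculations. This is a minimal illustration; the function names and the session numbers are hypothetical, not from the notes.

```python
def rate(n_responses, duration_s):
    """Rate: number of occurrences per unit of time (here, per minute)."""
    return n_responses / (duration_s / 60)

def latency(stimulus_t, response_t):
    """Latency: time elapsed between stimulus onset and the response."""
    return response_t - stimulus_t

def fluency(n_correct, duration_s):
    """Fluency: correct responses per minute (combines error and rate)."""
    return n_correct / (duration_s / 60)

# A hypothetical 120-second practice session: 40 responses, 30 correct,
# with one response beginning 0.6 s after its stimulus.
print(rate(40, 120))                  # 20.0 responses per minute
print(round(latency(2.5, 3.1), 2))    # 0.6 seconds
print(fluency(30, 120))               # 15.0 correct per minute
```

Comparing rate (20/min) with fluency (15/min) shows how fluency penalizes errors that raw rate ignores.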
Sources of data
Anecdotes: first- or second-hand accounts; generally less specific and more qualitative; can provide leads; reflect popular wisdom.
Case studies: provides more details than anecdotes.
lacks generalization (unique to the patient)
not representative of the entire group
takes a long time
can't determine cause/effect
self-reports (decreased validity)
Descriptive studies: review of group data
interviews
questionnaires
statistical analysis
can suggest but not test the hypothesis
Experimental studies
manipulation of one or more variables
contains a control
cause and effect
correlations
can be seen as artificial due to increased control
necessary to isolate the effects of IV
lab experiments provide better control
field experiments provide realistic approaches
Experimental components
Independent variable (IV) - manipulated with treatment
dependent variable (DV) - a variable that is measured
Experimental designs
Between subjects designs (group designs)
experimental group vs. control group
matched sampling
within-subject experiment (single subject design)
baseline
individuals as their own control
ABA reversal (treatment - withdrawal - treatment)
Animal research and human learning
PROS
control over heredity influences (breeding)
control over learning history (housed at birth)
APA guidelines for handling animals for research
CONS
generalization across species
practical vs theoretical value
animal rights
CHAPTER 3: Pavlovian Conditioning aka Classical Conditioning
Ivan Pavlov: physiologist (circulatory and digestive system)
shift to psychology
documenting reflexes (salivation) in response to changes in the environment (presentation of a stimulus)
Reflexes
unconditioned:
inborn
same for all members of a species
permanent
Conditioned
not present at birth
acquired through experiences
change over time
unique to individual
Unconditioned reflex
unconditional stimulus (US) —> unconditioned response (UR)
Meat powder —> salivation
typically IMPORTANT to survival
Conditioned reflex
conditional stimulus (CS) —> conditional response (CR)
food dish —> salivation
How does a neutral stimulus become a conditioned stimulus
pairing: a process by which a conditional stimulus regularly precedes an unconditional stimulus
conditional stimulus (CS) —> unconditional stimulus —> unconditional response
clap — meat — salivation
Pairing: after several trials, this chain becomes…
clap — salivation
Pavlovian conditioning - 2 key features
the behavior involves a reflex response
the conditional stimulus and unconditional stimulus pairing occurred regardless of what the individual response is
Pavlovs dogs
step one: unconditional stimulus — unconditional response
meat powder — salivation
step two: pair unconditional stimulus and neutral stimulus —> to get an unconditional response
step three: neutral stimulus = conditioned stimulus, after several trials the metronome (neutral) is a conditioned stimulus
step four: Conditioned stimulus leads to conditioned response
metronome —> salivation
Everyday examples of Classical conditioning
advertising
unconditional stimulus —> unconditional response
Colin Kaepernick —> inspiration
US (Kaepernick) + NS (Nike) = UR (Inspiration)
CS (NIKE) = CR (Inspiration)
pain
US (electric shock) —> UR (pain)
US (electric shock) + NS (cupcake) —> UR (pain)
CS (cupcake) —> CR (pain)
fear
US (drowning) — UR (fear)
US (drowning) + NS (rowboat) —> UR (fear)
CS (rowboat) = CR FEAR
Higher Order Conditioning
pairing a neutral stimulus with an established conditioned stimulus
Classical Conditioning
1 US is paired with 1 NS = CS
1 NS can be interchangeable with other neutral stimuli
only 1 NS can be presented at a time with the US
Higher order conditioning
established CS is paired with a new NS = CS
no need to pair a new neutral stimulus with an unconditional stimulus
multiple new neutral stimuli can be paired with an established CS to elicit a conditioned response
Examples of higher-order conditioning
US (light) —> UR (blink)
US (light) + NS (tap head) —> UR (blink)
CS (tap head) —> CR (blink)
higher order begins here — CS2 (SNAP) + CS1 (tap head) — blink
CS (snap) —> CR (blink)
We’ve added a second conditioned stimulus which is paired with the established CS of tap head.
Measuring Pavlovian Learning
Latency: the time between presentation of CS and CR
Test trials: intermittently present CS alone (no US), do you still get the CR? (an example is presenting “tap head” alone).
Intensity: the more established the CS —> CR the stronger the CR
Pseudoconditioning
occurs when an NS closely follows or is presented at the same time as a US, creating a perceived elicited CR.
test trial measures can determine if there is classical conditioning or pseudoconditioning present.
example: a rat is presented with various loud noises WHILE being presented with food and it may salivate at noise alone.
Variables affecting Pavlovian conditioning
pairing stimuli - how CS and US are paired
trace conditioning: CS begins and ends before US. There is a gap between CS and US
delay conditioning - CS is still present when US is introduced, OVERLAP.
simultaneous conditioning - CS is presented at the same time as US. NO time difference.
backward conditioning - CS is presented after US. generally ineffective.
Contingency - one event is dependent on another event. If X, then Y.
typically, the more certain or often the CS is presented with the US, the stronger the pairing, but not always.
Contiguity - closeness in time between 2 events
typically the closer in time the CS is presented to the US, the stronger the pairing, but not always.
interstimulus interval: ISI. the interval of time between CS and US
Compound features: the presentation of 2 or more CS at the same time. Will there be a greater effect with the presentation of both CS?
overshadowing: when 2 stimuli are presented at the same time. one will produce a more effective CR due to the intensity. intense stimuli will typically overshadow the weaker stimuli.
overall, the more intense CS produces more reliable CR
however, intense stimuli can interfere with learning
Previous experience
latent inhibition: prior appearances of an NS without the US interfere with that NS's ability to become a CS later. prior experience may undermine a new contingency; a novel stimulus is more likely to become a CS.
Blocking: one stimulus affects the ability of another stimulus to become a CS due to prior experience
sensory preconditioning: when two NS are often found together prior to the pairing of one NS with the US. Once one of the NS is paired with the US, the other NS will become a CS more easily
Timing: the more often the CS and US are paired together, the stronger the CR
with successive trials, the earlier CS-US pairings create a greater impact on CC
intertrial interval: the time between each CS-US pairing trial
longer intervals are more effective than shorter
shorter ISI more effective than longer
Extinction of Conditional Responses
extinction: a process by which a conditioned stimulus (CS) is repeatedly presented without the unconditioned stimulus (US), weakening the conditional response (CR).
CS (metronome) + US (food) = CR (salivation)
CS (metronome) without US (food) —> CR weakens and eventually disappears
Extinction is NOT forgetting
Extinction is decreased performance due to lack of pairing of two stimuli
forgetting is decreased performance due to lack of practice
Extinction is Learning
the pairing of a stimulus with the absence of a previously paired stimulus
learning NOT to do something
takes practice (repeated trials)
can experience spontaneous recovery
re-emergence of a previously extinguished conditioned response
Theories of Conditioning: why does CC work?
Stimulus Substitution Theory (Ivan Pavlov)
suggests that CR = UR (the conditioned response is the same as the unconditioned response)
the neurological connection between US + UR and the neurological connection between the CS and CR are the SAME
US and UR have innate neurological connections
CS and CR have acquired neuro connections through learning
CS serves as a substitute for the US to elicit a reflex
1973 Jenkins and Moore: pigeons pecked at a lighted key after the light was paired with food
1A. CONCERNS WITH STIMULUS SUBSTITUTION THEORY
CR does not equal UR
CR is weaker than UR
CR is less reliable than UR
cannot explain blocking/latent inhibition
CR can be the opposite of UR
Preparatory Response Theory: Gregory Kimble (1967)
UR is an innate response to deal with a US
CR is an acquired response to prepare for the US
a common explanation for drug tolerance
CS (environmental cue) —> CR (preparatory state to create homeostasis with the introduction of drugs)
user will need to increase the amount of drugs they take to get the same effect because the body prepares itself to self regulate with the addition of EXPECTED drug use.
Compensatory Response Theory: Siegel 1972
CR prepares the animal for US by compensating its effects
Common explanation for a fatal drug overdose with frequent users
CS (environmental cues) —> CR (body compensates the intro of drug)
when environmental cues are NOT present the body is not ready for the usual drug intake, which can lead to OVERDOSE
Example: drinking in a familiar bar vs an unfamiliar bar.
Rescorla-Wagner model - 1972
There is a limit to pairing in CC
contributing factors - nature of US, # of CS-US pairing trials, limit to the CR
each successive CS-US pairing yields less learning
the greatest percentage of learning occurs in 1st trial
this model ACCOUNTS FOR BLOCKING
one CS will “use-up” more learning, leaving less available for the second CS.
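The Rescorla-Wagner idea can be illustrated numerically. The model's update rule is usually written ΔV = αβ(λ − V); the sketch below folds the α and β rate parameters into a single learning rate, and the specific values (0.3, 0.95) are arbitrary choices for illustration, not from the text.

```python
def rw_update(V, alpha=0.3, lam=1.0):
    """One Rescorla-Wagner trial: the change in associative strength is
    proportional to the difference between the maximum learning the US
    supports (lam) and what has already been learned (V)."""
    return V + alpha * (lam - V)

# Acquisition: each successive CS-US pairing yields less new learning,
# and the greatest gain occurs on the first trial.
V = 0.0
for trial in range(1, 6):
    new_V = rw_update(V)
    print(trial, round(new_V - V, 3))  # the per-trial gain shrinks
    V = new_V

# Blocking: a pretrained CS1 (V1 near lam) has "used up" the learning,
# leaving almost nothing for a new CS2 presented in compound with it.
V1, V2 = 0.95, 0.0
gain_for_CS2 = 0.3 * (1.0 - (V1 + V2))
print(round(gain_for_CS2, 3))  # 0.015 - CS2 acquires little strength
```

The first loop shows why early pairings matter most; the last two lines show how the model accounts for blocking with the same equation.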
CHAPTER 4: Applications / Examples of Classical Conditioning
Fear-based
unconditional stimulus (dog bite) —> unconditioned response (fear)
neutral stimulus (dog) + unconditional stimulus (dog bite) —> fear
dog (conditioned stimulus) —> fear (conditioned response)
Drug addiction
unconditioned stim (fentanyl) —> unconditioned response (euphoria)
neutral stimulus (syringe, room) + unconditioned stimulus (fentanyl) —> unconditioned response (euphoria)
conditioned stimulus (syringe, room) —> conditioned response (euphoria)
Advertising
unconditioned stimulus (gym) —> unconditioned response (health, good body)
neutral stimulus (the rock) + unconditioned stimulus (gym) —> unconditioned response (health, good body)
CS (the rock) —> CR (health, good body)
Taste Aversion
unconditioned stimulus (maggots) —> unconditioned response (nausea)
neutral stimulus (reese’s cup) + unconditioned stimulus (maggots) —> unconditioned response (nausea)
CS (Reese’s cup) —> CR (nausea)
Chapter 5: Operant Learning and Reinforcement
Edward Lee Thorndike
studied animal learning
presented the same problem numerous times to see if performance improved
known for his puzzle box experiment with cats
Cats would try numerous inefficient maneuvers to escape an enclosed box to reach food
To open the box there would be a simple mechanism of pulling a loop or stepping on a treadle
each successive trial shorter in duration
two consequences: a satisfying state of affairs and an annoying state of affairs
Law of Effect: behavior is a function of its consequence
the relationship between behavior and its consequence
four key elements
behavior
environment
change in behavior due to its environment
change in environment due to behavior
B.F. Skinner
studied animal learning
known for his Skinner box experiment with rats
learned that rats increasingly pressed the lever for food
behavior operates on the environment
operant conditioning
behavior is strengthened or weakened by its consequences
TYPES OF OPERANT LEARNING
Strengthening Behavior - reinforcement
increase in the strength of behavior
behavior is more likely to occur in the future
positive reinforcement - stimulus is added, behavior increases in the future
negative reinforcement - stimulus is removed, behavior increases in the future
positive reinforcement
reward learning: adding a preferred stimulus that will increase the occurrence of behavior in the future
positive reinforcing: a stimulus that is preferred by an individual that increases the likelihood of a behavior occurring in the future, individualized to each person
negative reinforcement
escape avoidance learning: removing a non-preferred stimulus that will increase the occurrence of behavior in the future
negative reinforcer: a stimulus that an individual would typically avoid or try to escape, which the removal of will increase the likelihood of a behavior occurring in the future
Kinds of reinforcers
primary reinforcers: innately effective / no learning history required. examples are food, water, sex, sleep, shelter, social, love, control
satiation: reduction in reinforcing effects of a given reinforcer due to increased availability aka lack of need
food is not reinforcing when I am full
deprivation: increase in the reinforcing effects of a given reinforcer due to decreased availability aka increased need
food is reinforcing if I am hungry
secondary reinforcers: conditioned reinforcers. learning through experiences (pairing with other reinforcers). weaker than primary reinforcers and satiate slower than primary reinforcers. effectiveness relies on primary reinforcers (praise, preferred items/activities).
generalized reinforcers: paired with many different reinforcers. can be used in a wide variety of situations
money, token boards
natural reinforcers: automatic reinforcers. spontaneously follow behavior
a jacket when I am cold
contrived reinforcers: manipulated by someone for the purpose of modifying behavior
a sticker for completing homework
Variables Affecting Operant Learning
contingency - correlation between behavior and consequence - If X then Y
to receive a reinforcer, one must do the behavior
is the reinforcer worth the behavior?
contiguity - the time between the behavior and the reinforcing consequences
more immediate reinforcement, faster learning curve
reinforcer characteristics
magnitude
frequency
quality - is it worth my time
behavior characteristics
difficulty
biological makeup relative to the task
Motivating operations: changes the effectiveness of a reinforcer and the behavior reinforced by that given reinforcer at the moment
establishing operations - increase the value of a reinforcer, increase the frequency of behavior to get it
abolishing operations - decrease the value of a reinforcer, decrease the frequency of behavior to get it
Neuromechanics of Reinforcement
Olds and Milner - electrodes implanted in the septal region of rats' brains; rats were free to move and pressed a lever to deliver electrical stimulation to their own brains
pressed the lever constantly; did not eat, mate, or do anything else
electrical brain stimulation was very reinforcing - "wireheads"
brain’s reward center - located in the septal region of the brain between 2 hemispheres
dopamine production: neurotransmitter responsible for a natural “high”. Good experiences naturally produce dopamine. triggers found outside the body
amounts of dopamine vary due to different situations/substances
unexpected events produce more dopamine than expected ones (Rescorla Wagner model)
dopamine is converted to epinephrine (adrenaline)
Theories of Positive Reinforcement
drive reduction theory (Hull) - all behaviors are due to motivational states called drives (think MOs). works well to explain primary reinforcers
a reduction in physiological needs (drive: hunger —> reinforcer: food)
does NOT explain secondary reinforcers (Hull expressed this using associations between primary and secondary reinforcers)
relative value theory (Premack) - all behaviors have relative values; reinforcement does not depend on physiological needs.
no concern regarding primary vs secondary reinforcers
some relative values are greater than others (rats getting electrical stimulation vs food)
the more probable behavior will reinforce a less probable behavior
a given behavior can be more or less probable given circumstances
Premack principle: a more probable (preferred) behavior can reinforce a less probable one - require the less preferred behavior first, then allow the preferred behavior
Response deprivation theory (William Timberlake and James Allison) - behavior becomes reinforcing when it is available less than normal
Theories of avoidance
two-process theory: includes classical conditioning and operant conditioning
classical conditioning: CS signals need to escape
Operant conditioning: avoidance
if the CS is no longer aversive, avoidance behavior persists
avoidance behavior does not extinguish with weakening of CS
Sidman avoidance behavior - use of regular time intervals and no conditioned stimulus
there was no signal of aversive stimulus to avoid, however, on a time schedule, rats still engaged in avoidance
Douglas Anger - time is a conditioned stimulus
Herrnstein and Hineline - time is NOT a conditioned stimulus - avoidance was learned even when aversive events came at unpredictable (averaged) intervals, so elapsed time could not serve as a reliable signal
One process theory - only operant learning
escape and avoidance behavior is reinforced by the reduction of aversive stimulation
reduction in exposure to aversive stimuli is reinforcing
two-process theorists counter that something not occurring cannot be reinforcing
To have extinction of avoidance behavior, you have to block the occurrence of avoidance and teach that aversive is no longer present
Chapter 6 - Reinforcement - Beyond Habit
Teaching New Behaviors - you cannot reinforce a behavior that does not occur
Shaping - reinforcement of successive approximations of a desired behavior
Reinforcement of a close enough behavior as you move towards the desired behavior
With increased trials, reinforce ALL improvements on the display of desired behavior
As reinforcement is provided for each approximation, only reinforce behavior that is equal to or better than prior approximations
Shaping occurs naturally in both humans and animals
Example of Shaping Behavior
Child has to say “ball” - reinforce the child saying the “b” sound first, then the “ba” combination, and finally the full word “ball.”
Chaining - teaching a new skill by breaking down a complex task into smaller components, and teaching each component in successive order
Behavior chain - connected sequence of behavior
Task analysis - breaking down a task into its component elements
Each completed step reinforces the previous one and cues the next
Types of Chaining
Forward chaining: reinforcing each step of the chain in order starting from the first step
Backward chaining: reinforcing each step of the chain in order starting from the last step.
Example of Forward Chaining
Video of a girl writing her name. You ask the child to first write the first letter as the teacher finishes it and reward the child. Repeat until they reach the end of their name, reinforcing each step.
Example of Backward Chaining
In a child washing their hands, first you guide the child's hands through all steps except the last one, which they must perform themselves. Upon completing the last step you give them a reinforcer (candy). You then do the same with the second to last, and then third to last step in the chain.
Chapter 7 - Schedules of Reinforcement
Schedules of Reinforcement - a certain rule describing contingency between a behavior and reinforcement
The relationship between a desired behavior and the reinforcement (reward) received.
Schedule Effects - the distinctive rate and pattern of behavior associated with a particular reinforcement schedule
How much or how often we engage in a particular behavior is determined by how we receive reinforcement
Continuous reinforcement - each time you engage in the behavior, you receive reinforcement
The simplest schedule, known as FR1
Leads to rapid responding
Not always practical/realistic
Intermittent schedules of reinforcement - reinforcement is received on some occasions, but not on every display of behavior
Fixed Ratio - a behavior is reinforced when it has occurred a fixed number of times (high rates of responding, with pauses after reinforcement)
Pause duration increases as ratio increases
Example: every basket scored within the paint is 2 points
FR3 - a behavior is reinforced every 3 presses
Variable Ratio - a behavior is reinforced when it has occurred an average number of times (steady rates of responding)
Pauses are less frequent than in fixed ratio
Example: casino slot machine - every so often you may win, but you don’t know when.
VR 5 - may be reinforced after 10 presses, but on average, you will be rewarded for every 5.
RATIO - number of times
Fixed Interval - a behavior is reinforced when it has occurred after a fixed duration of time has elapsed (correct response must occur)
Scallop-shaped responding - pauses after reinforcement - why would you work at the 20-second mark if you are only reinforced at the 60-second mark?
Behavior increases in frequency closer to the interval that reinforcement is delivered
FI 5 second - for five seconds after reinforcement, a bird's peck does not produce food. Only a peck at or after the 5-second mark will produce food.
Baking a cake is an example. You must leave the cake in for 30 minutes. You may not start checking it until the 25 min mark.
Checking your watch when it gets closer to the end of class
Variable Interval - a behavior is reinforced when it has occurred after an average duration of time has elapsed
High, steady rates of responding compared to FI
VI 5 schedule - average interval between reinforced pecks is 5 seconds
Keeps people on their toes
Checking to see when someone has liked your picture. You do not know when it will happen.
Hunting a deer: sometimes the deer appears within seconds, sometimes you must wait hours
INTERVAL - duration of time
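The ratio schedules above can be sketched as simple decision rules. This is a toy illustration; the function names, ratio values, and seed are invented for the example.

```python
import random

def fixed_ratio(n):
    """FR n: reinforce every nth response, then reset the count."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

def variable_ratio(n, seed=0):
    """VR n: reinforce each response with probability 1/n, so the
    reinforcer arrives after n responses on average, unpredictably."""
    rng = random.Random(seed)
    def respond():
        return rng.random() < 1 / n
    return respond

fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])  # [False, False, True, False, False, True]
```

The FR rule makes reinforcement fully predictable (hence post-reinforcement pauses), while the VR rule makes every response a possible winner, which is why VR schedules like slot machines sustain steady responding.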
Extinction
The process by which behavior occurs but is not reinforced
Extinction Burst: sudden increase in the rate of behavior during the early stages of extinction, followed by a steady decline. An example is telling a parent to simply ignore the child crying for the iPad, causing the child to scream louder at first; with continued ignoring, the crying eventually stops.
Spontaneous Recovery - sudden reappearance of a behavior following extinction.
Resurgence - reappearance of a behavior that was previously reinforced during the extinction process.
Typically occurs when a replacement behavior (Y) is put on extinction and initial behavior (X) reappears in effort to make contact with reinforcement.
Continuous Time-Based Simple Schedules
Fixed Duration Schedule - a behavior is reinforced when it has continuously occurred for a fixed duration of time
Example: playing a sport for two hours and then having a snack
A child practices piano for a ½ hour and then receives reinforcement of milk and cookies given they practiced for the entire time.
Variable Duration Schedule - a behavior is reinforced when it has continuously occurred for an average duration of time
Example: when practicing a sport, you get a 5 minute water break on average every ½ hour.
A child practices piano - any session might end after 30, 45, 50 minutes, but on average, after ½ hour of practice they receive cookies and milk.
Concerns with Continuous Time-Based Schedule
How do you define and measure the continuous occurrence of the desired behavior?
Does the reinforcer increase behavior outcomes? Does giving them a snack make them like piano / keep up with practice?
Does the availability of reinforcement match the effort required to engage in the desired behavior?
Most often used
In conjunction with the Premack Principle - if eating cookies and drinking milk are reinforcing, and if this behavior is contingent on practicing the piano for some time, then playing piano should become reinforcing
When behavior leads to “natural reinforcement” - aka practice makes perfect logic
Noncontingent Time-Based Schedules - schedules of reinforcement INDEPENDENT of behavior
Fixed Time Schedule - a reinforcer is delivered after a given time regardless of behavior
Does not occur naturally
Used to create a state of satiation and reduce desire for a particular reinforcer
E.g. - when a child keeps sneaking Halloween candy, you schedule 2 pieces of candy of their choice after dinner.
A pigeon receives food on an FT 10 schedule EVERY 10 SECONDS regardless of disc pecks or not.
Variable Time Schedule - a reinforcer is delivered at irregular time intervals regardless of behavior
Does not occur naturally
Used to create a state of satiation and reduce desire for a particular reinforcer
An aunt gives you money every time you see them, which is variable.
Checking on your child periodically while cooking so they do not come in harm's way
TIME SCHEDULES - depending on time
Fixed time - after 10 minutes
Variable time - irregular points in time
Progressive Schedules - systematically changing contingencies that describe the availability of reinforcer
Progressive schedules - can be applied to all simple schedules of reinforcement
Contingencies change with each trial
Amount of food might become smaller, the requirements for food might become larger, etc
Considerations - Progressive Schedules
Break Point - while using a progressive schedule, this is when a desired behavior dramatically stops or declines
Ratio Strain - stretching the ratio/interval of reinforcement too thin or too quickly. The demands are too strenuous. Example is workers who are overworked and underpaid.
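A progressive schedule and its break point can be illustrated with a short sketch. The starting requirement, step size, and the subject's maximum effort are made up for the example.

```python
def progressive_ratio(start=5, step=5):
    """PR schedule: the response requirement grows by `step` after
    each earned reinforcer (5, 10, 15, ...)."""
    requirement = start
    while True:
        yield requirement
        requirement += step

# A hypothetical subject willing to emit at most 22 responses for one
# reinforcer: the break point is the last requirement it completes.
max_effort = 22
completed = []
for req in progressive_ratio():
    if req > max_effort:
        break             # responding stops - the break point
    completed.append(req)
print(completed)       # [5, 10, 15, 20]
print(completed[-1])   # break point: 20
```

Stepping the requirement up too quickly relative to the subject's effort is exactly the ratio strain described above.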
Compound Schedules of Reinforcement
Multiple Schedules - one behavior, 2 or more simple schedules, with a known stimulus.
Example: a pigeon that has learned to peck a disc for grain may be put on an FR 10 schedule when a red light is on but on a VR 10 schedule when a yellow light is on. Schedule changes are signaled by the color change.
Mixed Schedules - one behavior, 2 or more simple schedules, no stimulus
Example: same pigeon example, except there is no light, so the subject does not know when the change is occurring.
Chain Schedule: multiple simple schedules run consecutively.
Must complete all schedules in order
Reinforcer delivered after ALL schedules are completed.
Known stimulus to signal transition to diff schedule.
Example: Pigeon may be placed on a FR 10 FI 15 sec VR 20 schedule, with changing lights signaling each change. The pigeon must complete all of these in order correctly to get food.
Tandem Schedule - multiple simple schedules run consecutively
Must complete ALL schedules IN ORDER
Reinforcer delivered after ALL schedules were completed.
No stimulus to signal transition
Concurrent schedules - two or more different simple schedules are available at the same time
2 or more different behaviors can be reinforced
Choice between schedules
Example: a pigeon may have the option of pecking a red disc on a VR 10 schedule or pecking a yellow disc on a VR 50 schedule.
Matching Law - given a choice between 2 behaviors, each with their own schedule, distribution in choice between behaviors matches availability of reinforcement.
Example: given the choice between stuffing envelopes on a FR10 schedule vs. a FR15 schedule for 5 dollars, the employee may choose the FR10 schedule, as it is less work for the same amount of money.
Example 2: given the choice between doing multiplication drills on an FR5 schedule for an M&M, or on an FR20 schedule for a snickers bar, a child may choose the snickers bar reward even if it means studying for longer.
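The matching law is often written as B1/(B1+B2) = r1/(r1+r2): the proportion of behavior allocated to an option matches the proportion of reinforcement obtained from it. A minimal sketch, with hypothetical reinforcement rates:

```python
def matching_proportion(r1, r2):
    """Matching law: the proportion of responding devoted to option 1
    matches the proportion of reinforcement that option 1 provides."""
    return r1 / (r1 + r2)

# Hypothetical rates: a rich key pays off about 6 times a minute, a
# lean key about 1.2 times a minute. Matching predicts ~83% of pecks
# go to the richer key.
print(round(matching_proportion(6.0, 1.2), 2))  # 0.83
```

With equal reinforcement rates the prediction is an even 50/50 split, which is why the envelope-stuffing example above tips toward the FR 10 option: same pay, more frequent reinforcement.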
Chapter 8 - Operant Learning: Punishment
Edward Lee Thorndike
Two consequences of behavior
Satisfying state of affairs - positive consequences increase behavior
Annoying state of affairs - negative consequences decrease behavior
College students presented with uncommon English and Spanish words, needed to choose synonym
Correct responses increase behavior
Wrong responses had no change in behavior
B.F. Skinner
Known for his skinner box experiment with rats
Rats increasingly pressed lever for food
During extinction, some rats also received a slap when pressing lever
Rats experiencing punishment had markedly decreased lever pressing
When the slaps (punishment) ended, rats returned to lever pressing
Behavior is strengthened or weakened by its consequence
The Power of Punishment
Thorndike and Skinner underestimated effects of punishment on learning process
Thorndike and Skinners research primarily focused on reinforcement
There are times when “not doing something” is the desired behavior.
Weakening of Behavior
Punishment: decrease in strength of behavior. Less likely to occur in the future.
Positive Punishment: stimulus is added, behavior decreases in the future
Negative Punishment: stimulus is removed, behavior decreases in the future
Positive Punishment
A stimulus that is disliked by the individual is added, decreasing likelihood of behavior occurring in the future
Individualized to each person
Examples: reprimands, corporal punishment, electric shock
Negative Punishment - AKA penalty training
A stimulus that is preferred by an individual is removed and will decrease the likelihood of a behavior occurring in the future
Penalty training
Examples: loss of privileges, fines (loss of money), time out
Variables affecting Operant Learning
Contingency - correlation between behavior and its consequences
The more CONSISTENT and RELIABLE punishment is, the more effective punishment procedure will be at reducing behavior
THINK: continuous schedules of reinforcement
Contiguity - time between behavior and delivery of its consequences
The more immediate the punishment occurs following the behavior, the faster the learning curve/reduction of behavior.
Punishment Characteristics
Intensity: the greater the intensity of the punishment, the greater the reduction of behavior
Introductory level of punishment: the stronger the initial level of punishment, the faster and more permanent the reduction of behavior
Starting with a weak, non-effective punisher leads to risk of INCREASING tolerance to it.
Reinforcement and Punishment
Reinforcement of punished behavior: must consider the natural reinforcement of the behavior we look to reduce.
Meaning, the behavior was reinforced ALREADY, because it wouldn't occur otherwise.
Alternative sources of reinforcement: to increase the effectiveness of punishment procedure, offer an acceptable alternative form of reinforcement that will replace the unwanted behavior
Giving a rat an alternative way to find food. Punishment can suppress behavior when there is an alternative
Motivating Operations: levels of deprivation and satiation change how effective a punishment procedure is
Social isolation works more as a punisher if the person is socially “hungry”.
Theories of Punishment
Two process theory: includes classical conditioning and operant conditioning
Classical conditioning: stimuli paired with punishment become aversive CSs
Operant conditioning: escaping or avoiding those aversive stimuli is reinforced
This is NOT supported by evidence as there are cases where one “catches” themselves engaging in behavior and does not make contact with the punisher
A child may begin to call out an answer but stops themselves before ever contacting the punisher.
One process theory: operant learning only
Supported by research
When punishment is effective, it mirrors the effects of reinforcement: the same variables (contingency, contiguity, intensity) operate in the same way, just in the opposite direction
Concerns with Punishment
Inadvertent reinforcement for punisher:
successful punishment procedures can become reinforcing to the punisher and they use more than necessary to decrease behaviors
Example: teacher uses time out to grade papers or create lesson plans in absence of disruptive children.
Side Effects of physical punishment
escape/avoidance behaviors, suicide, aggression, apathy, abuse by punisher, and imitation of punishment to others
Alternatives to Punishment
Response prevention: instead of punishing undesirable behavior, prevent the behavior from occurring in the first place.
Examples: Limiting access, modifying the environment, block attempts
Extinction: withhold ALL reinforcement
Not always possible outside of a controlled environment
Can be dangerous → extinction bursts.
Differential Reinforcement: a procedure that combines extinction and reinforcement of another (preferred behavior)
Differential Reinforcement of Alternative Behavior (DRA): teach a more desirable replacement behavior that serves the same purpose as an undesired behavior
Example: reinforce a rat for pressing lever B while withholding food for lever A; pressing A decreases as pressing B replaces it.
Differential Reinforcement of Incompatible Behavior (DRI): teach a different behavior that cannot happen at the same time as the behavior you would like to reduce
Teaching a child to use their quiet voice. They cannot yell and speak softly at the same time.
Differential Reinforcement of Low Rate (DRL): reinforce behavior when it occurs less often. Used to reduce, not eliminate behavior.
Example: praising a frequently disruptive child when disruptions occur only a few times per class, rather than expecting zero.
CHAPTER 9: Operant Applications
HOME
Reinforcement at home
providing attention to a crying baby
shaping a child’s language development
teaching delayed gratification
Punishment at home
time out: can be implemented poorly; sending a teenager to their room rarely works, because teenagers love being in their room
Differential reinforcement of incompatible behavior: telling a child to use their inside voices because they cannot yell and speak softly at the same time
Differential reinforcement of low rates: when it is not reasonable to expect every single homework assignment to be done, parents can reinforce progressively greater completion, decreasing non-compliance
Corporal Punishment
SCHOOL
Reinforcement at school
providing praise and social attention for good behaviors
immediate feedback
using NATURAL REINFORCEMENT - a correct response means moving on to a new lesson
Punishment at school
ignoring poor behaviors
Differential reinforcement of low rates: reinforce gradually lower rates of attention-seeking problem behavior, e.g., a student who runs around the classroom for attention
praise the good behavior, and ignore the bad = changes in children’s behavior
CLINIC
Reinforcement at a clinic:
self-injurious behavior
Differential reinforcement of incompatible behavior: put a splint on their arm or problem area, so they cannot use it or mess with it
Differential reinforcement of low rates: reinforce them doing it LESS
Delusions
Differential reinforcement of alternative behavior: provide ALTERNATIVES for their delusion. What OTHER possibilities are there to your delusion?
Paralysis
Constraint-induced movement therapy: training a person to use the paralyzed body part by restraining the able side.
Punishment at a clinic
Self-injurious behavior:
electric shock: last resort
physical restraints: can reduce self-injury and can themselves become punishing. Last resort, done ONLY in clinical settings.
WORK
Reinforcement at work
Positive feedback: a supervisor telling you what you are doing right. Leads employees to make better decisions more often
Bonuses: at no loss to the company
Time off: days off in general. Extra days off due to good performance. this makes the employee want to make good decisions more often
Punishment at work
negative feedback: goal must be to increase productivity. Telling an employee what they are doing wrong
ZOO
Reinforcement at the zoo
Clicker training: common in animal training. The sound of a clicker is paired with a reinforcer (e.g., food) and then used to mark the behavior you wish to increase
Shaping: teaching skills for the first time
Immediate reinforcement
Elephant example: the clicker was paired with getting the elephant closer to the carrot and hole in the wall. The tester reinforced lifting a foot off the ground, then progressive behaviors
Natural reinforcement: via food searches to mimic the animal’s natural environment.
introducing maze-like features in the zoo = more motivation and helps with the animal’s instincts in captivity
Chapter 10: Observational Learning
Edward Lee Thorndike
Puzzle box experiment with cats
concluded that cats did not learn by observing other cats; learning required individual operant experience
Social observational learning
O (Mb —> S+/-)
O = observer
M = model
S+ = positive consequence
S- = negative consequence
Vicariously reinforced: the consequence of a model's behavior strengthens the observer's tendency to behave similarly
Vicariously punished: the consequence of a model's behavior weakens the observer's tendency to behave similarly
Example: you watch your classmate get scolded for writing on the exam booklet. You will likely not write on the booklet because it was vicariously punished.
Asocial Observational Learning
O(E —> S+/-)
O = observer
E = event
S+ = positive consequence
S- = negative consequence
No vicarious reinforcement or punishment, as there is no model
Example: you enter a room with a box of money and a sign that says to take the money and run. You will likely take the money and run.
Imitation: performance of modeled behavior
Imitation and reinforcement: we often imitate behaviors even when imitation yields no reinforcement, though imitated behaviors can have reinforcing qualities
over-imitation: imitating irrelevant modeled behaviors. Increases with AGE in humans; not seen in other primates.
example: when models demonstrated unnecessary “extra” steps for completing a task, children imitated those steps anyway
Generalized imitation: reinforcement of general tendency to imitate
reinforcing the imitation of some acts increases the general tendency to imitate; e.g., the tendency to imitate lever pressing increased even though imitating it was never reinforced
imitation of a model, even without direct reinforcement, leads to learning
Variables affecting observational learning
Difficulty of task: the more difficult the task, the less is learned during observation
however, observing modeled behavior increases the observer's future success in learning the task
Skilled vs Unskilled models
benefits of a skilled model: observing the correct response every time
benefits of an unskilled model: observing correct AND incorrect responses, allowing for better evaluation of the “ideal” response
Characteristics of the model
Observers learn better from models that are attractive, likeable, prestigious, powerful, and popular
Characteristics of the observer
Characteristics that increase learning: language, learning history, age (young people imitate older people), gender (females imitate their mothers), older observers retain more of the model’s behavior than younger ones
characteristics that may limit learning: developmental/intellectual disabilities, e.g., autism
Consequences of observed acts
imitation INCREASES when a model’s behavior is reinforced
imitation DECREASES when a model’s behavior is punished
Consequences of observed behavior
imitation increases when the observers behavior is reinforced upon imitation
imitation decreases when the observers behavior is punished upon imitation
Theories of Observational Learning
Albert Bandura - Social Cognitive Theory: cognitive processes account for learning from models. Attentional, retentional, motor reproductive, and motivational.
ATTENTIONAL: attending to models behavior and consequences
involves SELF-DIRECTED exploration within the observer
construction of meaningful perception from ongoing modeled events
less about the product of the model's behavior and more about the onlooker's understanding of it
What we derive from what we look at
RETENTIONAL
Encoding: find a way to store information for future use
Retrieval: find a way to grab this information. When should it be brought up?
Reproduction: once the information is retrieved, how do I perform the behavior myself?
MOTOR REPRODUCTIVE: taking retained information and putting it into action in attempts to perform model’s behavior
using images from the retentional process to guide us as we attempt to perform the action
Motivational: expectation of consequences. Not the ACTUAL consequences, just the observer's perception of them.
when favorable incentives are produced, observational learning emerges.
Operant Learning Model: modeled behavior and consequences serve as cues for reinforcement and or punishment of observers’ behavior
ATTENTION: overt attending behavior (eye contact, shifts in gaze, tracking of objects/behavior, etc.), influenced by the environment
RETENTION: things people do (e.g., rehearsal) that aid later performance of the observed behavior
Motor reproduction: imitation, overt performance
Motivation: based on actual reinforcement of observers’ behavior when performing modeled tasks
we may expect to obtain a reward for our act, but both the expectation and our imitation are products of prior events.
CHAPTER 11 - Generalization, Discrimination, and Stimulus Control
Generalization - the tendency for the effects of learning experiences to spread
Types of Generalization
generalization across people (vicarious generalization)
generalization across time (maintenance)
generalization across behaviors (response generalization)
generalization across situations (stimulus generalization)
Generalization across people (vicarious generalization)
the tendency for a behavior modeled by one person to spread to those who observe it
observational learning: equivalent to this. For example, a son observes his father shaving, and then imitates what he does
Generalization across time (maintenance)
generalization of behavior over time. As long as we maintain behaviors, we can access skills we have learned in the past (like bike riding).
Generalization across behaviors (response generalization)
The tendency for changes in one’s behavior to spread to other behaviors, such as how to behave at a soccer game
Generalization across situations (stimulus generalization)
the tendency for changes in behavior in one situation to spread to other situations
e.g: rotary phones and smartphones: both involve dialing, and you can take your experience with rotary phones and apply it to smartphones
Stimulus Generalization
Research including stimulus generalization
Pavlovian conditioning: dogs salivated in response to different tones and different decibels of the same tone
Little Albert: Albert was conditioned to fear rats, and without prior exposure, was fearful of other white furry stimuli (rabbits, Santa Claus)
Thorndike puzzle box: cats performed the same behavior (clawing, pulling on a lever, etc) to escape each new box.
Generalization gradient: a curve showing how response strength varies with how similar a test stimulus is to the original conditioned (training) stimulus
Flat: no discrimination, high generalization
Broad: some discrimination, some generalization
Narrow: high discrimination, low generalization
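One way to make the flat/broad/narrow distinction concrete is to model response strength as falling off with distance from the training stimulus. A toy sketch (the Gaussian shape, the 550 nm training wavelength, and the width values are assumptions for illustration, not the textbook's math):

```python
import math

def response_strength(test_nm: float, trained_nm: float = 550.0,
                      width: float = 30.0) -> float:
    """Hypothetical generalization gradient: responding is strongest at the
    trained stimulus and falls off with distance from it. A small `width`
    gives a narrow gradient (high discrimination, low generalization);
    a large `width` gives a flat gradient (low discrimination)."""
    return math.exp(-((test_nm - trained_nm) ** 2) / (2 * width ** 2))

# Test a key light 40 nm away from the trained 550 nm light:
narrow = response_strength(590, width=10)  # near-zero responding
flat = response_strength(590, width=200)   # responding barely drops
print(f"narrow: {narrow:.3f}, flat: {flat:.3f}")
```

The same test stimulus produces almost no responding under the narrow gradient but nearly full responding under the flat one, which is exactly the discrimination/generalization trade-off in the list above.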
Extinction, Punishment and Reinforcement
Stimulus generalization: applied to extinction, punishment, reinforcement
How to increase generalization
provide training in a variety of different settings
e.g: teaching children to sit still in class, music, and art so that they know that there is an expectation that sitting is a school behavior
provide many examples
provide a variety of different consequences
vary schedules of reinforcement, type of reinforcer
reinforce generalization when it occurs
Stimulus generalization - pros and cons
Pros: increases learning of new material, setting, etc, decrease the need for many specific trainings, increase the independence of learners
Cons: behavior may not be appropriate in all settings, resources may not be available in all settings, can be taken for granted by instructors, and overgeneralization can be harmful (e.g., the stereotyping behind hate crimes)
Discrimination: the tendency of behavior to occur in certain situations but not in others. the opposite of generalization.
discrimination training
classical conditioning: conditioned stimulus (CS+) is paired with its unconditioned stimulus (US), while another (CS-) is presented alone
operant conditioning: discriminative stimuli. (SD signals reinforcing consequences, S∆ signals lack of reinforcing consequences)
Simultaneous discrimination training
both SD and S∆ are presented at the same time, where SD yields reinforcing consequences and S∆ yields no reinforcing consequences.
Successive discrimination training
the SD and S∆ are presented individually and alternate randomly
Matching to sample (MTS)
given two or more alternates, the learner is presented with the SD and must match it to the SAME image/ item in an array of alternatives
Oddity matching or mismatching
given two or more alternates, the learner is presented with the SD and must match it to the DIFFERENT item/ image in the array of alternates
Errorless discrimination training
in the training phase, the instructor PROMPTS the correct response before any error can be made by the learner. an example would be using hand-over-hand guidance.
reduces negative emotional responses
increases the rate of learning
Differential outcomes effect (DOE)
when teaching multiple discriminations simultaneously, giving each correct response its own distinct outcome (e.g., immediate reinforcement for one and delayed reinforcement for another) increases the rate of learning for both
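The matching-to-sample and oddity-matching procedures above can be sketched as two small selection rules (hypothetical helper names; in the real procedures the learner is reinforced for making these selections):

```python
def matching_to_sample(sample, comparisons):
    """MTS: the correct choice is the comparison identical to the sample."""
    return comparisons.index(sample)

def oddity_matching(sample, comparisons):
    """Oddity matching: the correct choice is the comparison that DIFFERS
    from the sample."""
    return next(i for i, c in enumerate(comparisons) if c != sample)

print(matching_to_sample("red", ["green", "red"]))  # correct choice: index 1
print(oddity_matching("red", ["red", "green"]))     # correct choice: index 1
```

Note the two procedures reward opposite selections from the same array, which is why they are taught as mirror images of each other.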
Stimulus Control: when discrimination training brings behavior under the influence of discriminative stimuli
if someone always eats food in the kitchen, the sight of a kitchen may make them hungry!
Concept: any class the members of which share one or more defining features
a Yorkie, a Cocker Spaniel, and an Italian Greyhound are all different but still represent dogs in general.
CHAPTER 12: Forgetting
What is Forgetting?: the deterioration in performance of a learned behavior following a period in which learning or practice does not occur.
Forgetting and Stimulus Control
all behavior can be said to fall under some degree of stimulus control, because behavior occurs more or less often in the presence of particular environmental stimuli
forgetting could be a shift in stimulus control due to a change in the current environment in comparison to the original environment where initial learning took place
Measuring Forgetting
free recall
giving an opportunity to perform a previously learned behavior.
the traditional measure of forgetting
does not account for partial retention of behavior or skill
prompted/cued recall
give a hint or prompt when providing an opportunity to perform a previously learned behavior
this allows for the display of partial retention of the behavior
relearning method/saving method
measuring the amount of training required to reach a previous level of performance
recognition
identifying material that was previously learned
different than prompted recall as there is no hint, only the correct and incorrect responses are presented
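The relearning (savings) method above lends itself to a one-line computation. A sketch with hypothetical numbers:

```python
def savings_score(original_trials: int, relearning_trials: int) -> float:
    """Savings method: the proportion of original training 'saved'.
    1.0 means nothing was forgotten (instant re-mastery); 0.0 means
    relearning took as many trials as the original learning."""
    return (original_trials - relearning_trials) / original_trials

# A list mastered in 20 trials originally takes 5 trials to re-master
# a month later: 75% savings, i.e., modest forgetting.
print(f"savings: {savings_score(20, 5):.0%}")  # savings: 75%
```

Because it measures retraining effort rather than all-or-nothing recall, the savings score captures partial retention that free recall would miss.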
Measurements Used in Animal Research
Delayed matching to sample: give a sample briefly, then matching is expected after a “retention interval” has elapsed
Extinction method: after a retention interval, put the behavior on extinction (NO REINFORCER); the faster responding extinguishes, the greater the forgetting
Gradient degradation: a flattening of the generalization gradient (increased generalization, decreased discrimination) indicates greater forgetting
Sources of Forgetting
degree of learning: the better something is learned, the more slowly it is forgotten. OVERLEARNING is learning beyond the mastery criteria.
Prior learning: the more meaningful the material, the easier it is to retain over time
prior experience creates “meaning”
prior experience can interfere with recall (proactive interference)
subsequent learning: we forget less when learning is followed by periods of sleep rather than activity
learning new material increases forgetting for previous learning (retroactive interference)
changes in context: there is an increase in forgetting when a learned behavior is expected in a new environment
cue-dependent forgetting: decreased performance of a previously learned behavior in the absence of stimuli that were present at the initial time of learning
How to decrease forgetting
overlearning: training a new skill beyond the mastery criteria
practice with feedback: perform the skill and get feedback
positive feedback reinforces correct performance
constructive feedback allows the learner to correct errors and increase future performance
distribute practice: perform the skill over time aka distributed or spaced practice
avoid massed practice: repetitious practice in a short period
test yourself: periodic testing yields greater retention than studying alone
mnemonics: a device used for aiding recall (ROY G BIV)
context clues: learning in different environments yields greater retention of skills in multiple settings.
CHAPTER 13: The limits of learning
Learning is not inherited
behavior acquired through learning is not passed from one generation to the next
reflexes and modal action patterns are inherited and consistent across a species
the benefit to individual learning is that we can adapt and change to our environment in real-time and have the ability to be innovative
Learning ability and Heredity
the differences in learning abilities between similar species (domesticated dogs vs wild wolves)
the difference in learning abilities within a species (the offspring of an artist vs of a scientist)
Heredity is not the ONLY factor; enriched environments are important too.
Critical Periods
a period in development of an individual when they are more likely to learn a particular behavior
example: bonding between a mother and infant shortly following birth
imprinting: the tendency of some animals to follow the first moving object they see after birth; not always their mother.
Harlow’s experiments with surrogate mothers - monkeys
monkeys chose warmth/comfort > food
monkeys relied on surrogate mothers for comfort in new environments, protection when afraid, and confidence to defend themselves or explore something new
monkeys lacked social skills that could not be taught by surrogate mother (interaction with peers/mating)
critical periods are not clearly defined in humans
Harlow’s experiments changed how we provide services in orphanages/human services
Evidence of critical periods for empathy in infancy/early childhood
Evidence of critical period for language development in the first 12 years of life
Preparedness and Learning
learning occurs differently in different situations
instinctive drift: the tendency of an animal to revert from trained behavior to innate fixed action patterns
autoshaping: the innate tendency to engage in food-related behavior toward a stimulus paired with food, even though reinforcement is not contingent on that behavior
learning occurs on a continuum of preparedness
some things are learned with ease, while others are difficult to learn
animals that come to learning situations genetically prepared: learning proceeds quickly; for example, humans learn to fear snakes more readily than flowers
animals that come to learning situations unprepared: learning proceeds slowly and steadily (no prior knowledge, not genetically prepared)
animals that come to learning situations contraprepared: learning proceeds slowly and irregularly
CHAPTER 1: Learning to Change
Charles Darwin
breeding - artificial selection
collection and study of animals
scarcity of resources as the population increases
species changed over time
survival based on having features with an “advantage”
during reproduction, 2 parents with an advantage will pass the feature to the offspring
predates work on inheritance
Natural selection
the process by which living organisms adapt to their environment over time using preferred traits being passed on through reproduction
requires variations within a species
environmental change can affect the natural selection process
climate change
predatory patterns
disease
applies also to behavior
Evolved Behavior: innate and adaptive forms of behavior
Reflexes: the relationship between a specific event and a simple response to that event
present in all members of a species
protection from injury
aid in food consumption
adaptive equipment of the animals
stereotypic (similar in form, frequency, strength, and time during development)
not always useful to a species; reflexes that lose their usefulness die out over time
Modal action patterns: series of related acts found in all members of a species
genetic bias
little variability between members of a species
little variability across time
reliably elicited by a releaser (a particular kind of event)
contribute to the survival of a species
protect individuals from the environment
more complex than reflexes
involve the entire organism, not just a few muscles or glands
long series of reflex-like acts
more variable than reflexes
not found in humans
General behavior traits: any general behavior tendency that is strongly influenced by genes
present in a variety of situations
does not require the presence of a releaser
less stereotypic / more variable than MAP
benefits to general behavior traits based on situation
example is being an easily angered person
Limits of Natural selection
slow process (occurs over generations)
previously valuable adaptations can become useless in a short period of time
not beneficial to the survival of the individual, but to survival of the species
mutations - abrupt changes in genes, may or may not be beneficial to person, unpredictable
hybridization - cross-breeding of closely related species, results in sterile offspring, takes one generation to see change
Learning
behavior and experience
behavior: anything someone does that can be measured
firing of a neuron to running a marathon
private events: thoughts and feelings
experience: changes in environment
learning: change in behavior due to change in environment
stimuli: physical changes in the environment
exclusions to learning: changes due to drugs, injury, aging, and disease
Habituation: reduction in tendency or probability of a response with repeated exposure to a stimulus
sensitization: an increase in the intensity or probability of a response with repeated exposure to a stimulus
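Habituation and sensitization above are opposite changes under the same operation (repeated exposure). A toy sketch, with arbitrary multiplicative factors chosen only for illustration:

```python
def repeated_exposure(initial: float, factor: float, trials: int) -> float:
    """Response strength after repeated exposures to the same stimulus.
    factor < 1 models habituation (response shrinks each exposure);
    factor > 1 models sensitization (response grows each exposure)."""
    r = initial
    for _ in range(trials):
        r *= factor
    return r

print(round(repeated_exposure(1.0, 0.7, 10), 3))  # habituation: 0.028
print(round(repeated_exposure(1.0, 1.2, 10), 3))  # sensitization: 6.192
```

The point of the sketch is that the stimulus and the exposure schedule are identical in both cases; only the direction of the change in responding differs.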
Chapter 2: the study of learning and behavior
The natural science approach
four assumptions about natural phenomena
all natural phenomena are caused
the causes precede their effects
the causes of natural events include only natural phenomena
the simplest explanation that fits the data is best
All natural phenomena are caused
things don’t “just happen”
determinism: the behavior of living organisms is based on cause and effect
the world is a lawful place
causes precede their effects
events cannot reach into the past to change behavior
experimentation: the act of controlling variables to determine the effect of one variable on phenomena
Natural causes of events include only natural phenomena
cannot attribute natural events to acts of God, spirits, etc
empiricism: objective observation of phenomena
the simplest explanation that fits the data is the best
parsimony: The simplest and most logical explanation is often the correct explanation and least contrived
fewest assumptions and extraneous variables
Measures of learning: measuring changes in behavior
how do we measure learning?
reduction in errors
changes in topography (form of behavior)
changes in intensity (force)
changes in speed (fast or slowness of behavior)
reduced latency (time between a stimulus and the response)
changes in rate (# of occurrences per unit of time)
increase in fluency (correct responses per unit of time)
combo of error and rate
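Rate and fluency from the list above are simple ratios; a sketch with made-up session numbers:

```python
def rate(total_responses: int, minutes: float) -> float:
    """Rate: number of occurrences per unit of time."""
    return total_responses / minutes

def fluency(correct_responses: int, minutes: float) -> float:
    """Fluency: correct responses per unit of time (the combination of the
    error measure and the rate measure)."""
    return correct_responses / minutes

# 30 responses in a 2-minute session, 27 of them correct:
print(rate(30, 2))     # 15.0 responses per minute
print(fluency(27, 2))  # 13.5 correct responses per minute
```

The gap between the two numbers reflects the learner's errors, which is why fluency is listed as a combined error-and-rate measure.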
Sources of data
Anecdotes: first or second-hand accounts, generally less specific. more qualitative, can provide leads. popular wisdom.
Case studies: provides more details than anecdotes.
lacks generalization (unique to the patient)
not representative of the entire group
takes a long time
cannot determine cause and effect
self-reports (decreased validity)
Descriptive studies: review of group data
interviews
questionnaires
statistical analysis
can suggest but not test the hypothesis
Experimental studies
manipulation of one or more variables
contains a control
cause and effect
correlations
can be seen as artificial due to increased control
necessary to isolate the effects of IV
lab experiments provide better control
field experiments provide realistic approaches
Experimental components
Independent variable (IV) - manipulated with treatment
dependent variable (DV) - the variable that is measured
Experimental designs
Between subjects designs (group designs)
experimental group vs. control group
matched sampling
within-subject experiment (single subject design)
baseline
individuals as their own control
ABA reversal (treatment - withdrawal - treatment)
Animal research and human learning
PROS
control over heredity influences (breeding)
control over learning history (housed at birth)
APA guidelines for handling animals for research
CONS
generalization across species
practical vs theoretical value
animal rights
CHAPTER 3: Pavlovian Conditioning aka Classical Conditioning
Ivan Pavlov: physiologist (circulatory and digestive system)
shift to psychology
documenting reflexes (salivation) in response to changes in the environment (presentation of a stimulus)
Reflexes
unconditioned:
inborn
same for all members of a species
permanent
Conditioned
not present at birth
acquired through experiences
change over time
unique to individual
Unconditioned reflex
unconditional stimulus (US) —> unconditioned response (UR)
Meat powder —> salivation
typically IMPORTANT to survival
Conditioned reflex
conditional stimulus (CS) —> conditional response (CR)
food dish —> salivation
How does a neutral stimulus become a conditioned stimulus
pairing: a process in which a conditional stimulus regularly precedes an unconditional stimulus
conditional stimulus (CS) —> unconditional stimulus —> unconditional response
clap — meat — salivation
Pairing: after several trials, this chain becomes…
clap — salivation
Pavlovian conditioning - 2 key features
the behavior involves a reflex response
the conditional stimulus and unconditional stimulus pairing occurred regardless of what the individual response is
Pavlovs dogs
step one: unconditional stimulus — unconditional response
meat powder — salivation
step two: pair unconditional stimulus and neutral stimulus —> to get an unconditional response
step three: neutral stimulus = conditioned stimulus, after several trials the metronome (neutral) is a conditioned stimulus
step four: Conditioned stimulus leads to conditioned response
metronome —> salivation
Everyday examples of Classical conditioning
advertising
unconditional stimulus —> unconditional response
Colin Kaepernick —> inspiration
US (Kaepernick) + NS (Nike) = UR (Inspiration)
CS (NIKE) = CR (Inspiration)
pain
US (electric shock) —> UR (pain)
US (electric shock) + NS (cake) —> UR (pain)
CS (cake) —> CR (pain)
fear
US (drowning) — UR (fear)
US (drowning) + NS (rowboat) —> UR (fear)
CS (rowboat) = CR FEAR
Higher Order Conditioning
pairing a neutral stimulus with an established conditioned stimulus
Classical Conditioning
1 US is paired with 1 NS = CS
1 NS can be interchangeable with other neutral stimuli
only 1 NS can be presented at a time with the US
Higher order conditioning
established CS is paired with a new NS = CS
no need to pair a new neutral stimulus with an unconditional stimulus
multiple new neutral stimuli can be paired with an established CS to elicit a conditioned response
Examples of higher-order conditioning
US (light) —> UR (blink)
US (light) + NS (tap head) —> UR (blink)
CS (tap head) —> CR (blink)
higher order begins here — CS2 (SNAP) + CS1 (tap head) — blink
CS (snap) —> CR (blink)
We’ve added a second conditioned stimulus which is paired with the established CS of tap head.
Measuring Pavlovian Learning
Latency: the time between presentation of CS and CR
Test trials: intermittently present CS alone (no US), do you still get the CR? (an example is presenting “tap head” alone).
Intensity: the more established the CS —> CR the stronger the CR
Pseudoconditioning
occurs when an NS presented at the same time as (or just after) a US appears to elicit a CR even though no real conditioning has taken place.
test trial measures can determine if there is classical conditioning or pseudoconditioning present.
example: a rat is presented with various loud noises WHILE being presented with food and it may salivate at noise alone.
Variables affecting Pavlovian conditioning
pairing stimuli - how CS and US are paired
trace conditioning: CS begins and ends before US. There is a gap between CS and US
delay conditioning - CS is still present when US is introduced, OVERLAP.
simultaneous conditioning - CS is presented at the same time as US. NO time difference.
backward conditioning - CS is presented after US. generally ineffective.
Contingency - one event is dependent on another event. If X, then Y.
typically, the more reliably or often the CS is presented with the US, the stronger the pairing, but not always.
Contiguity - closeness in time between 2 events
typically the closer in time the CS is presented to the US, the stronger the pairing, but not always.
interstimulus interval: ISI. the interval of time between CS and US
Compound stimuli: the presentation of 2 or more CSs at the same time. Will there be a greater effect with the presentation of both CSs?
overshadowing: when 2 stimuli are presented at the same time, the more intense stimulus produces the more effective CR; intense stimuli typically overshadow weaker stimuli.
overall, the more intense CS produces more reliable CR
however, intense stimuli can interfere with learning
Previous experience
latent inhibition: prior appearances of an NS without the US interfere with that NS's ability to become a CS later. Prior experience can undermine a new contingency, so a novel stimulus is more likely to become a CS.
Blocking: prior conditioning to one stimulus prevents another stimulus from becoming a CS when the two are presented together
sensory preconditioning: when two NSs are often encountered together before either is paired with a US. Once one of them is paired with the US, the other becomes a CS more easily.
Number and timing of pairings: the more often the CS and US are paired together, the stronger the CR
with successive trials, the earlier CS-US pairings create a greater impact on CC
intertrial interval: the time between each CS-US pairing trial
longer intervals are more effective than shorter
shorter ISI more effective than longer
Extinction of Conditional Responses
extinction: a process in which a conditioned stimulus (CS) is repeatedly presented without the unconditioned stimulus (US), weakening the conditioned response (CR).
CS (metronome) + US (food) = CR (salivation)
CS (metronome) presented without US (food) —> CR (salivation) gradually weakens
Extinction is NOT forgetting
Extinction is decreased performance due to lack of pairing of two stimuli
forgetting is decreased performance due to lack of practice
Extinction is Learning
the pairing of a stimulus with the absence of a previously paired stimulus
learning NOT to do something
takes practice (repeated trials)
can experience spontaneous recovery
re-emergence of a previously extinguished conditioned response
Theories of Conditioning: why does CC work?
Stimulus Substitution Theory (Ivan Pavlov)
suggests that CR = UR (the conditioned response is the same response as the unconditioned response)
the neurological connection between US + UR and the neurological connection between the CS and CR are the SAME
US and UR have innate neurological connections
CS and CR have acquired neuro connections through learning
CS serves as a substitute for the US to elicit a reflex
Jenkins and Moore (1973): pigeons pecked at a lighted key after the light was paired with food
1A. CONCERNS WITH STIMULUS SUBSTITUTION THEORY
CR does not equal UR
CR is weaker than UR
CR is less reliable than the UR
can not explain blocking/latent inhibition
CR can be the opposite of UR
Preparatory Response Theory: Gregory Kimble (1967)
UR is an innate response to deal with a US
CR is an acquired response to prepare for the US
a common explanation for drug tolerance
CS (environmental cue) —> CR (preparatory state to create homeostasis with the introduction of drugs)
user will need to increase the amount of drugs they take to get the same effect because the body prepares itself to self regulate with the addition of EXPECTED drug use.
Compensatory Response Theory: Siegel (1972)
CR prepares the animal for US by compensating its effects
Common explanation for a fatal drug overdose with frequent users
CS (environmental cues) —> CR (body compensates the intro of drug)
when environmental cues are NOT present the body is not ready for the usual drug intake, which can lead to OVERDOSE
Example: drinking in a familiar bar vs an unfamiliar bar.
Rescorla-Wagner model - 1972
There is a limit to pairing in CC
contributing factors - nature of US, # of CS-US pairing trials, limit to the CR
each successive CS-US pairing yields less learning
the greatest percentage of learning occurs in 1st trial
this model ACCOUNTS FOR BLOCKING
one CS will “use-up” more learning, leaving less available for the second CS.
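The diminishing-returns idea in the Rescorla-Wagner model can be sketched numerically. This is a minimal illustration, not from the notes; the learning-rate and asymptote values (alpha, beta, lambda) are assumptions chosen for demonstration.

```python
# Minimal numerical sketch of the Rescorla-Wagner learning rule.
# Parameter values (alpha, beta, lambda) are illustrative assumptions.
# On each CS-US pairing: delta_V = alpha * beta * (lambda - V)

def rescorla_wagner(trials, alpha=0.3, beta=1.0, lam=1.0):
    """Associative strength V after each of `trials` CS-US pairings."""
    v = 0.0
    history = []
    for _ in range(trials):
        v += alpha * beta * (lam - v)   # less is left to learn each trial
        history.append(round(v, 3))
    return history

gains = rescorla_wagner(5)
print(gains)   # -> [0.3, 0.51, 0.657, 0.76, 0.832]
# The first pairing produces the biggest jump; each later pairing adds
# less, because (lambda - V) shrinks as V approaches the asymptote.
# Blocking follows because compound CSs share one total V: an established
# CS has already "used up" most of lambda, leaving little for a new CS.
```

Note how the output captures both claims in the notes: the greatest percentage of learning occurs on the first trial, and there is a limit (lambda) that total conditioning cannot exceed.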
CHAPTER 4: Applications / Examples of Classical Conditioning
Fear-based
unconditioned stimulus (dog bite) —> unconditioned response (fear)
neutral stimulus (dog) + unconditioned stimulus (dog bite) —> unconditioned response (fear)
conditioned stimulus (dog) —> conditioned response (fear)
Drug addiction
unconditioned stim (fentanyl) —> unconditioned response (euphoria)
neutral stimulus (syringe, room) + unconditioned stimulus (fentanyl) —> unconditioned response (euphoria)
conditioned stimulus (syringe, room) —> conditioned response (euphoria)
Advertising
unconditioned stimulus (gym) —> unconditioned response (health, good body)
neutral stimulus (the rock) + unconditioned stimulus (gym) —> unconditioned response (health, good body)
CS (the rock) —> CR (health, good body)
Taste Aversion
unconditioned stimulus (maggots) —> unconditioned response (nausea)
neutral stimulus (reese’s cup) + unconditioned stimulus (maggots) —> unconditioned response (nausea)
CS (Reese’s cup) —> CR (nausea)
Chapter 5: Operant Learning and Reinforcement
Edward Lee Thorndike
studied animal learning
presented the same problem numerous times to see if performance improved
known for his puzzle box experiment with cats
Cats would try numerous inefficient maneuvers to escape an enclosed box to reach food
To open the box there would be a simple mechanism of pulling a loop or stepping on a treadle
each successive trial shorter in duration
two consequences: satisfying state of affairs and annoying state of affairs
Law of Effect: behavior is a function of its consequence
the relationship between behavior and its consequence
four key elements
behavior
environment
change in behavior due to its environment
change in environment due to behavior
B.F. Skinner
studied animal learning
known for his Skinner box experiment with rats
learned that rats increasingly pressed the lever for food
behavior operates on the environment
operant conditioning
behavior is strengthened or weakened by its consequences
TYPES OF OPERANT LEARNING
Strengthening Behavior - reinforcement
increase in the strength of behavior
behavior is more likely to occur in the future
positive reinforcement - stimulus is added, behavior increases in the future
negative reinforcement - stimulus is removed, behavior increases in the future
positive reinforcement
reward learning: adding a preferred stimulus that will increase the occurrence of behavior in the future
positive reinforcing: a stimulus that is preferred by an individual that increases the likelihood of a behavior occurring in the future, individualized to each person
negative reinforcement
escape avoidance learning: removing a non-preferred stimulus that will increase the occurrence of behavior in the future
negative reinforcer: a stimulus that an individual would typically avoid or try to escape, which the removal of will increase the likelihood of a behavior occurring in the future
Kinds of reinforcers
primary reinforcers: innately effective / no learning history required. examples are food, water, sex, sleep, shelter, social, love, control
satiation: reduction in reinforcing effects of a given reinforcer due to increased availability aka lack of need
food is not reinforcing when I am full
deprivation: increase in the reinforcing effects of a given reinforcer due to decreased availability aka increased need
food is reinforcing if i am hungry
secondary reinforcers: conditioned reinforcers. learning through experiences (pairing with other reinforcers). weaker than primary reinforcers and satiate slower than primary reinforcers. effectiveness relies on primary reinforcers (praise, preferred items/activities).
generalized reinforcers: paired with many different reinforcers. can be used in a wide variety of situations
money, token boards
natural reinforcers: automatic reinforcers. spontaneously follow behavior
a jacket when I am cold
contrived reinforcers: manipulated by someone for the purpose of modifying behavior
a sticker for completing homework
Variables Affecting Operant Learning
contingency - correlation between behavior and consequence - If X then Y
to receive a reinforcer, one must do the behavior
is the reinforcer worth the behavior?
contiguity - the time between the behavior and the reinforcing consequences
more immediate reinforcement, faster learning curve
reinforcer characteristics
magnitude
frequency
quality - is it worth my time
behavior characteristics
difficulty
biological makeup relative to the task
Motivating operations: changes the effectiveness of a reinforcer and the behavior reinforced by that given reinforcer at the moment
establishing operations - increase the value of a reinforcer, increase the frequency of behavior to get it
abolishing operations - decrease the value of a reinforcer, decrease the frequency of behavior to get it
Neuromechanics of Reinforcement
Olds and Milner - electrodes implanted in rats' brains in the septal region; the rats were free to move and pressed a lever to deliver electrical stimulation to the brain
pushed the lever constantly and did not eat, mate, or do anything else
electric stimulation was very reinforcing - wireheads
brain’s reward center - located in the septal region of the brain between 2 hemispheres
dopamine production: neurotransmitter responsible for a natural “high”. Good experiences naturally produce dopamine. triggers found outside the body
amounts of dopamine vary due to different situations/substances
unexpected events produce more dopamine than expected ones (Rescorla Wagner model)
dopamine is converted (via norepinephrine) into epinephrine (adrenaline)
Theories of Positive Reinforcement
drive reduction theory (Hull) - all behaviors are due to motivational states called drives (think MOs). works well to explain primary reinforcers
a reduction in physiological needs (drive: hunger —> reinforcer: food)
does NOT explain secondary reinforcers (Hull expressed this using associations between primary and secondary reinforcers)
relative value theory (Premack) - all behaviors have relative values. There are NO physiological needs involved.
no concern regarding primary vs secondary reinforcers
some relative values are greater than others (rats getting electrical stimulation vs food)
the more probable behavior will reinforce a less probable behavior
a given behavior can be more or less probable given circumstances
Premack principle: access to a more preferred (more probable) behavior can reinforce a less preferred behavior — first the less desired behavior, then the preferred behavior
Response deprivation theory (William Timberlake and James Allison) - behavior becomes reinforcing when it is available less than normal
Theories of avoidance
two-process theory: includes classical conditioning and operant conditioning
classical conditioning: CS signals need to escape
Operant conditioning: avoidance
if the CS is no longer aversive, avoidance behavior persists
avoidance behavior does not extinguish with weakening of CS
Sidman avoidance behavior - use of regular time intervals and no conditioned stimulus
there was no signal of aversive stimulus to avoid, however, on a time schedule, rats still engaged in avoidance
Douglas Anger - time is a conditioned stimulus
Herrnstein and Hineline - time is NOT a conditioned stimulus - average time will not get consistent behavioral outcomes
One process theory - only operant learning
escape and avoidance behavior is reinforced by the reduction of aversive stimulation
reduction in exposure to aversive stimuli is reinforcing
two-process theorists argue that something not occurring cannot be reinforcing
To have extinction of avoidance behavior, you have to block the occurrence of avoidance and teach that aversive is no longer present
Chapter 6 - Reinforcement - Beyond Habit
Teaching New Behaviors - you cannot reinforce a behavior that does not occur
Shaping - reinforcement of successive approximations of a desired behavior
Reinforcement of a close enough behavior as you move towards the desired behavior
With increased trials, reinforce ALL improvements on the display of desired behavior
As reinforcement is provided for each approximation, only reinforce behavior that is equal to or better than prior approximations
Shaping occurs naturally in both humans and animals
Example of Shaping Behavior
Child has to say “ball” - reinforce the child saying the “b” sound first. Then reinforce combining “b” and “a”, and finally the full “ball”.
Chaining - teaching a new skill by breaking down a complex task into smaller components, and teaching each component in successive order
Behavior chain - connected sequence of behavior
Task analysis - breaking down a task into its component elements
Each step reinforces the following step
Types of Chaining
Forward chaining: reinforcing each step of the chain in order starting from the first step
Backward chaining: reinforcing each step of the chain in order starting from the last step.
Example of Forward Chaining
Video of a girl writing her name. You first ask the child to write the first letter, the teacher finishes the rest, and the child is rewarded. Repeat, adding one letter at a time, until they reach the end of their name, reinforcing each step.
Example of Backward Chaining
In a child washing their hands, first you guide the child's hands through all steps except the last one, which they must perform themselves. Upon completing the last step you give them a reinforcer (candy). You then do the same with the second to last, and then third to last step in the chain.
Chapter 7 - Schedules of Reinforcement
Schedules of Reinforcement - a certain rule describing contingency between a behavior and reinforcement
The relationship between a desired behavior and the reinforcement (reward) received.
Schedule Effects - the distinctive rate and pattern of behavior associated with a particular reinforcement schedule
How much or how often we engage in a particular behavior is determined by how we receive reinforcement
Continuous reinforcement - each time you engage in the behavior, you receive reinforcement
The simplest schedule, known as FR1
Leads to rapid responding
Not always practical/realistic
Intermittent schedules of reinforcement - reinforcement is received on some occasions, but not on every display of behavior
Fixed Ratio - a behavior is reinforced when it has occurred a fixed number of times (high rates of performance, with pauses after reinforcement)
Pause duration increases as ratio increases
Example: every basket scored within the paint is 2 points
FR 3 - a behavior is reinforced on every 3rd press
Variable Ratio - a behavior is reinforced when it has occurred an average number of times (steady rates of responding)
Pauses are less frequent than in fixed ratio
Example: casino slot machine - every so often you may win, but you don’t know when.
VR 5 - may be reinforced after 10 presses, but on average, you will be rewarded every 5th press.
RATIO - number of times
Fixed Interval - a behavior is reinforced when it has occurred after a fixed duration of time has elapsed (correct response must occur)
Scallop-shaped responding - pauses after reinforcement - why would you work at the 20-second mark if you are only reinforced at the 60-second mark?
Behavior increases in frequency closer to the interval that reinforcement is delivered
FI 5 sec - for the five seconds after reinforcement, a bird's peck does not produce food. Only a peck at or after the 5-second mark will deliver food.
Baking a cake is an example. You must leave the cake in for 30 minutes. You may not start checking it until the 25 min mark.
Checking your watch when it gets closer to the end of class
Variable Interval - a behavior is reinforced when it has occurred after an average duration of time has elapsed
High, steady rates of responding compared to FI
VI 5 schedule - average interval between reinforced pecks is 5 seconds
Keeps people on their toes
Checking to see when someone has liked your picture. You do not know when it will happen.
Hunting a deer: sometimes the deer appears within seconds, sometimes you must wait hours
INTERVAL - duration of time
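The ratio schedules above can be sketched as tiny simulators that decide whether a given response earns reinforcement. This is an illustrative sketch only; the schedule sizes (FR 3, VR 5) and function names are assumptions chosen for demonstration.

```python
import random

# Hypothetical mini-simulators for two ratio schedules of reinforcement.
# The schedule sizes below (FR 3, VR 5) are illustrative assumptions.

def fixed_ratio(n):
    """FR n: reinforce every nth response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:          # the nth response earns the reinforcer
            count = 0
            return True
        return False
    return respond

def variable_ratio(mean_n):
    """VR mean_n: reinforce after a varying number of responses that
    averages mean_n (drawn uniformly from 1 to 2*mean_n - 1)."""
    count = 0
    target = random.randint(1, 2 * mean_n - 1)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:     # requirement met: deliver reinforcer
            count = 0
            target = random.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

press = fixed_ratio(3)          # FR 3: every 3rd press reinforced
results = [press() for _ in range(9)]
# -> [False, False, True, False, False, True, False, False, True]
```

On the FR schedule the payoff is perfectly predictable (hence post-reinforcement pauses); on the VR schedule any press might pay off, which is why VR supports the steady responding the notes describe.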
Extinction
The process by which behavior occurs but is not reinforced
Extinction Burst: a sudden increase in the rate of behavior during the early stages of extinction, followed by a steady decline. An example is telling a parent to simply ignore a child crying for the iPad; the child at first screams louder, but with continued ignoring the crying eventually stops.
Spontaneous Recovery - sudden reappearance of a behavior following extinction.
Resurgence - reappearance of a behavior that was previously reinforced during the extinction process.
Typically occurs when a replacement behavior (Y) is put on extinction and initial behavior (X) reappears in effort to make contact with reinforcement.
Continuous Time-Based Simple Schedules
Fixed Duration Schedule - a behavior is reinforced when it has continuously occurred for a fixed duration of time
Example: playing a sport for two hours and then having a snack
A child practices piano for a ½ hour and then receives reinforcement of milk and cookies given they practiced for the entire time.
Variable Duration Schedule - a behavior is reinforced when it has continuously occurred for an average duration of time
Example: when practicing a sport, you get a 5 minute water break on average every ½ hour.
A child practices piano - any session might end after 30, 45, 50 minutes, but on average, after ½ hour of practice they receive cookies and milk.
Concerns with Continuous Time-Based Schedule
How do you define and measure the continuous occurrence of the desired behavior
Does the reinforcer increase behavior outcomes? Does giving them a snack make them like piano / keep up with practice?
Does the availability of reinforcement match the effort to engage in desired behavior
Most often used
In conjunction with the Premack Principle - if eating cookies and drinking milk are reinforcing, and if this behavior is contingent on practicing the piano for some time, then playing piano should become reinforcing
When behavior leads to “natural reinforcement” - aka practice makes perfect logic
Noncontingent Time-Based Schedules - schedules of reinforcement INDEPENDENT of behavior
Fixed Time Schedule - a reinforcer is delivered after a given time regardless of behavior
Does not occur naturally
Used to create a state of satiation and reduce desire for a particular reinforcer
E.g. - when a child keeps stealing Halloween candy, you schedule 2 pieces of candy of their choice after dinner.
A pigeon receives food on an FT 10 schedule EVERY 10 SECONDS regardless of disc pecks or not.
Variable Time Schedule - a reinforcer is delivered at irregular time intervals regardless of behavior
Does not occur naturally
Used to create a state of satiation and reduce desire for a particular reinforcer
An aunt gives you money every time you see them, which is variable.
Checking on your child periodically while cooking so they do not come in harm's way
TIME SCHEDULES - depending on time
Fixed time - after 10 minutes
Variable time - irregular points in time
Progressive Schedules - systematically changing contingencies that describe the availability of reinforcer
Progressive schedules - can be applied to all simple schedules of reinforcement
Contingencies change with each trial
Amount of food might become smaller, the requirements for food might become larger, etc
Considerations - Progressive Schedules
Break Point - while using a progressive schedule, this is when a desired behavior dramatically stops or declines
Ratio Strain - stretching the ratio/interval of reinforcement too thin or too quickly. The demands are too strenuous. Example is workers who are overworked and underpaid.
Compound Schedules of Reinforcement
Multiple Schedules - one behavior, 2 or more simple schedules, with a known stimulus.
Example: a pigeon that has learned to peck a disc for grain may be put on an FR 10 schedule when a red light is on but on a VR 10 schedule when a yellow light is on. Schedule changes are indicated by the color change.
Mixed Schedules - one behavior, 2 or more simple schedules, no stimulus
Example: same pigeon example, except there is no light, so the subject does not know when the change is occurring.
Chain Schedule: multiple simple schedules ran consecutively.
Must complete all schedules in order
Reinforcer delivered after ALL schedules are completed.
Known stimulus to signal transition to diff schedule.
Example: Pigeon may be placed on a FR 10 FI 15 sec VR 20 schedule, with changing lights signaling each change. The pigeon must complete all of these in order correctly to get food.
Tandem Schedule - multiple simple schedules ran consecutively
Must complete ALL schedules IN ORDER
Reinforcer delivered after ALL schedules were completed.
No stimulus to signal transition
Concurrent schedules - two or more different simple schedules are available at the same time
2 or more different behaviors can be reinforced
Choice between schedules
Example: a pigeon may have the option of pecking a red disc on a VR 10 schedule or pecking a yellow disc on a VR 50 schedule.
Matching Law - given a choice between 2 behaviors, each with their own schedule, distribution in choice between behaviors matches availability of reinforcement.
Example: given the choice between stuffing envelopes on a FR10 schedule vs. a FR15 schedule for 5 dollars, the employee may choose the FR10 schedule, as it is less work for the same amount of money.
Example 2: given the choice between doing multiplication drills on an FR5 schedule for an M&M, or on an FR20 schedule for a snickers bar, a child may choose the snickers bar reward even if it means studying for longer.
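The matching law can be written as a simple proportion: B1 / (B1 + B2) = r1 / (r1 + r2), where B is the rate of each behavior and r is the rate of reinforcement it earns. A minimal sketch, with illustrative VR values assumed (not from the notes):

```python
# Sketch of the matching law: relative response rate matches relative
# reinforcement rate. The VR sizes below are illustrative assumptions.

def matching_proportion(r1, r2):
    """Predicted share of behavior allocated to option 1:
    B1 / (B1 + B2) = r1 / (r1 + r2)."""
    return r1 / (r1 + r2)

# A pigeon choosing between a VR 10 key and a VR 50 key.
# Reinforcers per response: 1/10 on the rich key, 1/50 on the lean key.
share_rich = matching_proportion(1 / 10, 1 / 50)
print(share_rich)   # about 0.833 — roughly 5/6 of pecks go to the VR 10 key
```

This mirrors the envelope-stuffing example: the schedule that pays more reinforcement per unit of work draws a proportionally larger share of the behavior.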
Chapter 8 - Operant Learning: Punishment
Edward Lee Thorndike
Two consequences of behavior
Satisfying state of affairs - positive consequences increase behavior
Annoying state of affairs - negative consequences decrease behavior
College students presented with uncommon English and Spanish words, needed to choose synonym
Correct responses increase behavior
Wrong responses had no change in behavior
B.F. Skinner
Known for his skinner box experiment with rats
Rats increasingly pressed lever for food
During extinction, some rats also received a slap when pressing lever
Rats experiencing punishment had markedly decreased lever pressing
When the punishment (slaps) ended, rats returned to lever pressing
Behavior is strengthened or weakened by its consequence
The Power of Punishment
Thorndike and Skinner underestimated effects of punishment on learning process
Thorndike's and Skinner's research primarily focused on reinforcement
There are times when “not doing something” is the desired behavior.
Weakening of Behavior
Punishment: decrease in strength of behavior. Less likely to occur in the future.
Positive Punishment: stimulus is added, behavior decreases in the future
Negative Punishment: stimulus is removed, behavior decreases in the future
Positive Punishment
A stimulus that is disliked by the individual is added, decreasing likelihood of behavior occurring in the future
Individualized to each person
Examples: reprimands, corporal punishment, electric shock
Negative Punishment - AKA penalty training
A stimulus that is preferred by an individual is removed and will decrease the likelihood of a behavior occurring in the future
Penalty training
Examples: loss of privileges, fines (loss of money), time out
Variables affecting Operant Learning
Contingency - correlation between behavior and its consequences
The more CONSISTENT and RELIABLE punishment is, the more effective punishment procedure will be at reducing behavior
THINK: continuous schedules of reinforcement
Contiguity - time between behavior and delivery of its consequences
The more immediate the punishment occurs following the behavior, the faster the learning curve/reduction of behavior.
Punishment Characteristics
Intensity: the greater the intensity of the punishment, the greater the reduction of behavior
Introductory level of punishment: the stronger the initial level of punishment, the faster and more permanent the reduction of behavior
Starting with a weak, non-effective punisher leads to risk of INCREASING tolerance to it.
Reinforcement and Punishment
Reinforcement of punished behavior: must consider the natural reinforcement of the behavior we look to reduce.
Meaning, the behavior was reinforced ALREADY, because it wouldn't occur otherwise.
Alternative sources of reinforcement: to increase the effectiveness of punishment procedure, offer an acceptable alternative form of reinforcement that will replace the unwanted behavior
Giving a rat an alternative way to find food. Punishment can suppress behavior when there is an alternative
Motivating Operations: reduction in deprivation will increase effectiveness of punishment procedure
Social isolation works more as a punisher if the person is socially “hungry”.
Theories of Punishment
Two process theory: includes classical conditioning and operant conditioning
Classical conditioning: CS signals need to avoid
Operant conditioning: avoidance
This is NOT supported by evidence, as there are cases where a person “catches” themselves beginning a behavior and stops without ever making contact with the punisher
A child may begin to call out an answer and be disruptive, but stops themself.
One process theory: operant learning only
Supported by research
When punishment is effective, it mirrors effects of reinforcement - same effects of behavior
Concerns with Punishment
Inadvertent reinforcement for punisher:
successful punishment procedures can become reinforcing to the punisher and they use more than necessary to decrease behaviors
Example: teacher uses time out to grade papers or create lesson plans in absence of disruptive children.
Side Effects of physical punishment
escape/avoidance behaviors, suicide, aggression, apathy, abuse by punisher, and imitation of punishment to others
Alternatives to Punishment
Response prevention: instead of punishing undesirable behavior, prevent the behavior from occurring in the first place.
Examples: Limiting access, modifying the environment, block attempts
Extinction: withhold ALL reinforcement
Not always possible outside of a controlled environment
Can be dangerous → extinction bursts.
Differential Reinforcement: a procedure that combines extinction and reinforcement of another (preferred behavior)
Differential Reinforcement of Alternative Behavior (DRA): teach a more desirable replacement behavior that serves the same purpose as an undesired behavior
Providing a rat reinforcement for pushing lever B while withholding food for lever A; pressing lever A will be reduced.
Differential Reinforcement of Incompatible Behavior (DRI): teach a different behavior that cannot happen at the same time as the behavior you would like to reduce
Teaching a child to use their quiet voice. They cannot yell and speak softly at the same time.
Differential Reinforcement of Low Rate (DRL): reinforce behavior when it occurs less often. Used to reduce, not eliminate behavior.
Praising a disruptive child when they are sitting and on task.
CHAPTER 9: Operant Applications
HOME
Reinforcement at home
providing attention to a crying baby
shaping a child’s language development
teaching delayed gratification
Punishment at home
time out: can be implemented poorly - telling children to go to their room does not work because teenagers love to be in their room
Differential reinforcement of incompatible behavior: telling a child to use their inside voices because they cannot yell and speak softly at the same time
Differential reinforcement of low rates: it is not reasonable to expect a child to do every single homework assignment, so parents can reinforce progressively greater completion as non-compliance decreases
Corporal Punishment
SCHOOL
Reinforcement at school
providing praise and social attention for good behaviors
immediate feedback
using NATURAL REINFORCEMENT - a correct response means moving on to a new lesson
Punishment at school
ignoring poor behaviors
Differential reinforcement of low rates: used to reduce attention-seeking behavior. If a student runs around the classroom for attention, reinforce progressively lower rates of the problem behavior
praise the good behavior, and ignore the bad = changes in children’s behavior
CLINIC
Reinforcement at a clinic:
self-injurious behavior
Differential reinforcement of incompatible behavior: put a splint on their arm or problem area, so they cannot use it or mess with it
Differential reinforcement of low rates: reinforce them doing it LESS
Delusions
Differential reinforcement of alternative behavior: provide ALTERNATIVES for their delusion. What OTHER possibilities are there to your delusion?
Paralysis
Constraint-induced movement therapy: training a person to use the paralyzed body part by binding their able side.
Punishment at a clinic
Self-injurious behavior:
electric shock: last resort
physical restraints: can work as punishment by reducing self-injury, but can themselves become punishing. Last resort, done ONLY in clinical settings.
WORK
Reinforcement at work
Positive feedback: a supervisor telling you what you are doing right. Leads employees to make better decisions more often
Bonuses: at no loss to the company
Time off: days off in general. Extra days off due to good performance. this makes the employee want to make good decisions more often
Punishment at work
negative feedback: goal must be to increase productivity. Telling an employee what they are doing wrong
ZOO
Reinforcement at the zoo
Clicker training: common with any form of animal training. Sound of a clicker is paired with behavior you wish to increase
Shaping: teaching skills for the first time
Immediate reinforcement
Elephant example: the clicker was paired with getting the elephant closer to the carrot and hole in the wall. The tester reinforced lifting a foot off the ground, then progressive behaviors
Natural reinforcement: via food searches to mimic the animal’s natural environment.
introducing maze-like features in the zoo = more motivation and helps with the animal’s instincts in captivity
Chapter 10: Observational Learning
Edward Lee Thorndike
Puzzle box experiment with cats
concluded that we do not learn from observation, only from individual operant learning
Social observational learning
O (Mb —> S+/-)
O = observer
M = model
S+ = positive consequence
S- = negative consequence
Vicariously reinforced: the consequence of a model's behavior strengthens the observer's tendency to behave similarly
Vicariously punished: the consequence of a model's behavior weakens the observer's tendency to behave similarly
Example: you watch your classmate get scolded for writing on the exam booklet. You will likely not write on the booklet because it was vicariously punished.
Asocial Observational Learning
O(E —> S+/-)
O = observer
E = event
S+ = positive consequence
S- = negative consequence
No vicarious reinforcement or punishment, as there is no model
Example: you enter a room with a box of money and a sign that says to take the money and run. You will likely take the money and run.
Imitation: performance of modeled behavior
Imitation and reinforcement: we often imitate behaviors even when imitation yields no obvious reinforcement, though imitation itself can have reinforcing qualities
over imitation: we imitate irrelevant behaviors. Increases with AGE in humans. not seen in other primates.
example: when a model demonstrated irrelevant “extra” steps in completing a task, children still imitated those steps
Generalized imitation: reinforcement of general tendency to imitate
tendency to imitate acts increased —> tendency to push a lever increased even though lever pressing was not reinforced
requires a model and observer to follow exactly as is. Imitation of a model leads to learning
Variables affecting observational learning
Difficulty of task: the more difficult the task, the less is learned during observation
however, observing the modeled behavior increases the observer's future success in learning it.
Skilled vs Unskilled models
benefits of a skilled model: observing the correct response every time
benefits of an unskilled model: observing correct AND incorrect responses, allowing better evaluation of the "ideal" response
Characteristics of the model
Observers learn better from models that are attractive, likeable, prestigious, powerful, and popular
Characteristics of the observer
Characteristics that increase learning: language ability, learning history, age (young people imitate older people), gender (females tend to imitate their mothers); older observers retain more of the model's behavior than younger ones
characteristics that may limit learning: developmental/intellectual disabilities, e.g., autism
Consequences of observed acts
imitation INCREASES when a model’s behavior is reinforced
imitation DECREASES when a model’s behavior is punished
Consequences of observed behavior
imitation increases when the observers behavior is reinforced upon imitation
imitation decreases when the observers behavior is punished upon imitation
Theories of Observational Learning
Albert Bandura - Social Cognitive Theory: cognitive processes account for learning from models. Attentional, retentional, motor reproductive, and motivational.
ATTENTIONAL: attending to models behavior and consequences
involves SELF-DIRECTED exploration within the observer
construction of meaningful perception from ongoing modeled events
less about the product of the model's behavior and more about the onlooker's understanding of it
What we derive from what we look at
RETENTIONAL
Encoding: find a way to store information for future use
Retrieval: find a way to access this information; when should it be brought up?
Reproduction: once it is brought up, how do I do it myself?
MOTOR REPRODUCTIVE: taking retained information and putting it into action in attempts to perform model’s behavior
using images from the retentional process and allowing them to guide us as we attempt to perform the action
Motivational: expectation of consequences; not the ACTUAL consequences, just the observer's perceptions of them.
when favorable incentives are produced, observational learning emerges.
Operant Learning Model: modeled behavior and consequences serve as cues for reinforcement and or punishment of observers’ behavior
ATTENTION: overt behavior (eye contact, shifts in gaze, tracking of objects/behavior, etc.); influenced by the environment
RETENTION: overtly practicing or rehearsing the observed behavior
things people do to help their later performance
Motor reproduction: imitation, overt performance
Motivation: based on actual reinforcement of observers’ behavior when performing modeled tasks
we may expect to obtain a reward for our act, but both the expectation and our imitation are products of prior events.
CHAPTER 11 - Generalization, Discrimination, and Stimulus Control
Generalization - the tendency for the effects of learning experiences to spread
Types of Generalization
generalization across people (vicarious generalization)
generalization across time (maintenance)
generalization across behaviors (response generalization)
generalization across situations (stimulus generalization)
Generalization across people (vicarious generalization)
the spread of a behavior from a model to an observer
observational learning is equivalent to this. For example, a son observes his father shaving and then imitates what he does
Generalization across time (maintenance)
generalization of behavior over time. As long as we maintain behaviors, we can access skills we have learned in the past (like bike riding).
Generalization across behaviors (response generalization)
The tendency for changes in one behavior to spread to other, related behaviors (e.g., learning how to behave at one kind of sporting event transfers to a soccer game)
Generalization across situations (stimulus generalization)
the tendency for changes in behavior in one situation to spread to other situations
e.g: rotary phones and smartphones: knowing how to dial a number on a rotary phone lets you apply that experience to a smartphone
Stimulus Generalization
Research including stimulus generalization
Pavlovian conditioning: dogs salivated in response to different tones and different decibels of the same tone
Little Albert: Albert was conditioned to fear rats, and without prior exposure, was fearful of other white furry stimuli (rabbits, Santa Claus)
Thorndike puzzle box: cats performed the same behavior (clawing, pulling on a lever, etc) to escape each new box.
Generalization gradient: a curve showing how the strength of responding changes as test stimuli become more or less similar to the training stimulus
Flat: no discrimination, high generalization
Broad: some discrimination, some generalization
Narrow: high discrimination, low generalization
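The three gradient shapes can be sketched with a toy calculation; the tones, response counts, and cutoffs below are illustrative assumptions, not data from the notes:

```python
# Toy illustration of generalization gradients (hypothetical numbers).
# Training stimulus is a 1000 Hz tone; responding is tallied at nearby tones.

def gradient_shape(responses):
    """Classify a gradient by how sharply responding falls off
    away from the peak (training) stimulus."""
    peak = max(responses)
    spread = min(responses) / peak  # 1.0 means no drop-off at all
    if spread > 0.9:
        return "flat (no discrimination, high generalization)"
    if spread > 0.3:
        return "broad (some discrimination, some generalization)"
    return "narrow (high discrimination, low generalization)"

tones = [800, 900, 1000, 1100, 1200]   # Hz; 1000 Hz was trained
flat_r = [48, 50, 50, 49, 50]          # responds to every tone alike
broad_r = [25, 40, 50, 38, 22]         # responding falls off gradually
narrow_r = [2, 10, 50, 12, 3]          # responds mostly to 1000 Hz

for name, r in [("flat", flat_r), ("broad", broad_r), ("narrow", narrow_r)]:
    print(name, "->", gradient_shape(r))
```

The cutoff values (0.9, 0.3) are arbitrary; the point is that the same response data can show anything from no discrimination (flat) to tight stimulus control (narrow).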
Extinction, Punishment and Reinforcement
the effects of extinction, punishment, and reinforcement all generalize to stimuli that resemble the training stimulus
How to increase generalization
provide training in a variety of different settings
e.g: teaching children to sit still in class, in music, and in art so that they learn sitting is expected across school settings
provide many examples
provide a variety of different consequences
vary schedules of reinforcement, type of reinforcer
reinforce generalization when it occurs
Stimulus generalization - pros and cons
Pros: increases learning of new material, setting, etc, decrease the need for many specific trainings, increase the independence of learners
Cons: behavior may not be appropriate in all settings; resources may not be available in all settings; can be taken for granted by the instructor; overgeneralization can fuel prejudice (e.g., hate crimes)
Discrimination: the tendency of behavior to occur in certain situations but not in others. the opposite of generalization.
discrimination training
classical conditioning: conditioned stimulus (CS+) is paired with its unconditioned stimulus (US), while another (CS-) is presented alone
operant conditioning: discriminative stimuli. (SD signals reinforcing consequences, S∆ signals lack of reinforcing consequences)
Simultaneous discrimination training
both SD and S∆ are presented at the same time, where SD yields reinforcing consequences and S∆ yields no reinforcing consequences.
Successive discrimination training
the SD and S∆ are presented individually and alternate randomly
Matching to sample (MTS)
given two or more alternatives, the learner is presented with the SD and must match it to the SAME image/item in an array of alternatives
Oddity matching or mismatching
given two or more alternatives, the learner is presented with the SD and must match it to the DIFFERENT item/image in the array of alternatives
Errorless discrimination training
in the training phase, the instructor PROMPTS the correct response before any error can be made by the learner. an example would be using hand-over-hand guidance.
reduces negative emotional responses
increases the rate of learning
Differential outcomes effect (DOE)
when teaching multiple discriminations simultaneously, giving a different outcome for each correct response (e.g., immediate reinforcement for one, delayed reinforcement for another) increases the rate of learning for both correct responses.
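Matching to sample and oddity matching differ only in the selection rule (pick the SAME item vs. the DIFFERENT item). A minimal sketch; the function name and stimulus labels are my own, not from the notes:

```python
# Sketch of the selection rules in matching-to-sample (MTS) vs oddity matching.

def correct_choice(sample, alternatives, oddity=False):
    """Return the alternative the learner should pick.
    MTS: pick the item that matches the sample.
    Oddity matching: pick the item that differs from the sample."""
    for item in alternatives:
        if (item != sample) if oddity else (item == sample):
            return item
    raise ValueError("no valid alternative in the array")

# MTS: sample is a red card; the correct pick matches it
assert correct_choice("red", ["blue", "red"]) == "red"
# Oddity: the correct pick is the card that does NOT match the sample
assert correct_choice("red", ["red", "blue"], oddity=True) == "blue"
```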
Stimulus Control: when discrimination training brings behavior under the influence of discriminative stimuli
if someone always eats food in the kitchen, the sight of a kitchen may make them hungry!
Concept: any class the members of which share one or more defining features
a Yorkie, a Cocker Spaniel, and an Italian Greyhound are all different but still represent dogs in general.
CHAPTER 12: Forgetting
What is Forgetting?: the deterioration in performance of a learned behavior following a period in which learning or practice does not occur.
Forgetting and Stimulus Control
all behavior can be said to fall under some degree of stimulus control because behavior varies with the presence or absence of environmental stimuli
forgetting may be a shift in stimulus control: the current environment differs from the environment where the initial learning took place
Measuring Forgetting
free recall
giving an opportunity to perform a previously learned behavior.
the traditional measure of forgetting
does not account for partial retention of behavior or skill
prompted/cued recall
give a hint or prompt when providing an opportunity to perform a previously learned behavior
this allows for the display of partial retention of the behavior
relearning method/saving method
measuring the amount of training required to reach a previous level of performance
recognition
identifying material that was previously learned
different from prompted recall, as there is no hint; only correct and incorrect responses are presented
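The relearning method has a standard quantitative form, the savings score: the percentage of the original training "saved" at relearning. A small sketch with hypothetical trial counts:

```python
# Savings score for the relearning (saving) method.
# 100% = nothing forgotten; 0% = relearning took as long as original learning.

def savings_score(original_trials, relearning_trials):
    """Percent of original training saved at relearning."""
    return (original_trials - relearning_trials) / original_trials * 100

# Hypothetical: 20 trials to master a list originally, 5 trials to relearn it
print(savings_score(20, 5))   # -> 75.0 (most of the learning was retained)
print(savings_score(20, 20))  # -> 0.0  (no savings; complete forgetting)
```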
Measurements Used in Animal Research
Delayed matching to sample: give a sample briefly, then matching is expected after a “retention interval” has elapsed
Extinction method: put the behavior on extinction after a retention interval; the faster responding extinguishes, the greater the forgetting. NO REINFORCER
Gradient degradation: a flattening of the generalization gradient after the retention interval (increased generalization, decreased discrimination) indicates greater forgetting
Sources of Forgetting
degree of learning: the better something is learned, the more slowly it is forgotten. OVERLEARNING is learning beyond the mastery criteria.
Prior learning: the more meaningful the material, the easier it is to retain over time
prior experience creates “meaning”
prior experience can interfere with recall (proactive interference)
subsequent learning: we forget less when learning is followed by periods of sleep rather than activity
learning new material increases forgetting for previous learning (retroactive interference)
changes in context: there is an increase in forgetting when a learned behavior is expected in a new environment
cue-dependent forgetting: decreased performance of a previously learned behavior in the absence of stimuli that were present at the time of initial learning
How to decrease forgetting
overlearning: training a new skill beyond the mastery criteria
practice with feedback: perform the skill and get feedback
positive feedback reinforces correct performance
constructive feedback allows the learner to correct errors and increase future performance
distribute practice: perform the skill over time aka distributed or spaced practice
avoid massed practice: repetitious practice in a short period
test yourself: periodic testing yields greater retention than restudying
mnemonics: a device used for aiding recall (ROY G BIV)
context cues: learning in different environments yields greater retention of skills across settings.
CHAPTER 13: The limits of learning
Learning is not inherited
behavior acquired through learning is not passed from one generation to the next
reflexes and modal action patterns are inherited and consistent across a species
the benefit to individual learning is that we can adapt and change to our environment in real-time and have the ability to be innovative
Learning ability and Heredity
the differences in learning abilities between similar species (domesticated dogs vs wild wolves)
the difference in learning abilities within a species (the offspring of an artist vs of a scientist)
Heredity is not the ONLY factor; enriched environments are important too.
Critical Periods
a period in the development of an individual when they are especially likely to learn a particular behavior
example: bonding between a mother and infant shortly following birth
imprinting: the tendency of some animals to follow the first moving object they see after birth; not always their mother.
Harlow’s experiments with surrogate mothers - monkeys
monkeys chose warmth/comfort > food
monkeys relied on surrogate mothers for comfort in new environments, protection when afraid, and confidence to defend themselves or explore something new
monkeys lacked social skills that the surrogate mother could not teach (interaction with peers, mating)
critical periods are not clearly defined in humans
Harlow’s experiments changed how we provide services in orphanages/human services
Evidence of critical periods for empathy in infancy/early childhood
Evidence of critical period for language development in the first 12 years of life
Preparedness and Learning
learning occurs differently in different situations
instinctive drift: the tendency of an animal to revert to innate, fixed action patterns despite operant training
autoshaping: the innate tendency to engage in food-related behavior even when reinforcement is not contingent on that behavior
learning occurs on a continuum of preparedness
some things are learned with ease, while others are difficult to learn
animals that come to learning situations genetically prepared learn rapidly; for example, humans acquire a fear of snakes more readily than a fear of flowers
animals that come to learning situations unprepared: learning proceeds slowly and steadily (no prior knowledge, not genetically prepared)
animals that come to learning situations contraprepared: learning proceeds slowly and irregularly