Lecture 8: operant conditioning

33 Terms

action-outcome associations

-operant conditioning

-animal has to perform an action to produce a certain outcome → the action is not a hardwired response

operant conditioning

-unconditioned stimulus is contingent on behaviour of an animal

-learning of action-outcome associations

-action → more general than the responses in classical conditioning, e.g., pressing a lever

-operant behaviour → under stimulus control, so that the action can be a response to a certain stimulus

-outcome can be a reinforcement or punishment

Thorndike Law of Effect

-responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation

-responses that produce a typically unpleasant outcome are less likely to occur again in the situation

Skinner box

-environment in which an animal can learn stimulus-response outcomes

-can contain lights, speakers, and a lever for responses

-food dispenser → appetitive stimuli; allows the rate of reward to be controlled

-electrified grid → aversive stimuli

reinforcer

-an event that increases the likelihood of the action

punishers

-an event that decreases the likelihood of the action

positive reinforcer

-add or increase a pleasant stimulus to strengthen a behaviour

negative reinforcer

-reduce or remove an unpleasant stimulus to strengthen a behaviour

positive punishment

-present or add an unpleasant stimulus to weaken a behaviour

negative punishment

-remove a pleasant stimulus to weaken a behaviour
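
The four cards above form a two-by-two grid: whether a stimulus is added or removed, and whether the behaviour is strengthened or weakened. A minimal sketch of that grid follows; the function name and its boolean parameters are illustrative assumptions, not terms from the lecture.

```python
def classify_consequence(stimulus_added: bool, behaviour_increases: bool) -> str:
    """Name the operant contingency from the two questions on the cards.

    stimulus_added: True if a stimulus is presented, False if one is removed.
    behaviour_increases: True if the behaviour is strengthened, False if weakened.
    """
    sign = "positive" if stimulus_added else "negative"
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{sign} {kind}"


# Print all four combinations of the grid.
for added in (True, False):
    for increases in (True, False):
        label = classify_consequence(added, increases)
        print(f"added={added}, behaviour increases={increases}: {label}")
```

"Positive" and "negative" here refer only to adding or removing a stimulus, not to whether the outcome is pleasant.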

reinforcement vs punishment

-reinforcement increases behaviour and is more beneficial than punishment

-reinforcement is more likely to produce long-term changes in behaviour → punishment causes only temporary changes because it is based on coercion

-reinforcement creates a positive relationship with the person providing it → punishment creates an adversarial relationship

-when the punisher leaves, the unwanted behaviour returns

continuous reinforcement

-rewarding the behaviour every time

-very quick acquisition and learning

-but rapid extinction when the reward is no longer present

partial reinforcement

-intersperse trials where the CS is not followed by the US

-done randomly so that the CS is followed by the US with a certain probability

-slows down acquisition and extinction learning

reinforcement schedules (partial reinforcement)

-partial reinforcement schedules → responses are sometimes reinforced and sometimes not

-slower initial learning but greater resistance to extinction

-reinforcement does not follow every behaviour → it takes longer for the learner to detect that the reward has stopped

fixed ratio (reinforcement schedule)

-behaviour is reinforced after a specific number of responses

variable ratio (reinforcement schedule)

-behaviour is reinforced after an unpredictable number of responses that varies around an average

fixed interval (reinforcement schedule)

-behaviour is reinforced for the first response after a specific amount of time has passed

variable interval (reinforcement schedule)

-behaviour is reinforced for the first response after an unpredictable amount of time, which varies around an average, has passed
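
The four schedules differ only in the rule that decides whether a given response is reinforced. A minimal sketch of those rules is below. All function names are illustrative, and the way the variable requirements are drawn (uniformly around the mean) is an assumption made for the example, not something specified in the lecture.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    def draw():
        # Assumption: requirement drawn uniformly from 1 .. 2*mean_n - 1.
        return rng.randint(1, 2 * mean_n - 1)

    count, required = 0, draw()

    def respond(t):
        nonlocal count, required
        count += 1
        if count >= required:
            count, required = 0, draw()
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response made at least `seconds` after the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Reinforce the first response after an unpredictable wait averaging mean_seconds."""
    def draw():
        # Assumption: wait drawn uniformly from 0 .. 2*mean_seconds.
        return rng.uniform(0, 2 * mean_seconds)

    last, wait = 0.0, draw()

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last, wait = t, draw()
            return True
        return False

    return respond


# One response per second for 30 s; list the times at which each schedule pays out.
rng = random.Random(0)
schedules = {
    "fixed ratio 5": fixed_ratio(5),
    "variable ratio 5": variable_ratio(5, rng),
    "fixed interval 10 s": fixed_interval(10),
    "variable interval 10 s": variable_interval(10, rng),
}
for name, respond in schedules.items():
    print(name, [t for t in range(1, 31) if respond(t)])
```

Each `respond(t)` call represents one response at time `t` in seconds. The ratio schedules ignore `t` and count responses, while the interval schedules ignore the count and look only at elapsed time.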

pattern and number of responses (fixed ratio schedule)

-number of responses required for reinforcement describes the schedule

-probability of reinforcement increases with successive responses

-brief pause in responses after each reinforcement before responses begin again

-stair-step pattern

pattern and number of responses (variable ratio schedule)

-responding reinforced after a randomly determined number of responses have been emitted

-rate of responding is typically faster than fixed ratio

-response rates relatively constant over time

pattern and number of responses (fixed interval schedule)

-first response after a designated amount of time is followed by reinforcement

-e.g., reinforcement becomes available every 60 s

-produces a characteristic pattern of responding observable across species

-reinforcement is followed by slow rates of responding, with high rates of responding towards the end of the interval

pattern and number of responses (variable interval)

-responding reinforced after a randomly determined amount of time

-e.g., an average of 60 s between reinforcements, but individual intervals differ from one another

-response rate is relatively constant

-most commonly used schedule → produces steady, predictable performance

shaping

-process of guiding behaviour to the desired outcome through the use of intermediate stages

process of shaping

-dividing the learning goal into subgoals/smaller steps

-reinforcing individual steps rather than the complete goal

-takes time → allows learning of complex sequences

Skinner - shaping

-successive approximations to create new behaviour

-start with definition of target behaviour and approximations are systematically reinforced

-widely used in applied settings

-reinforcers must quickly follow the desired response

conditioned reinforcers/secondary reinforcers (shaping)

-decrease delay between behaviour and the delivery of primary reinforcer

-a neutral stimulus becomes a reinforcer after being paired with a primary reinforcer

-use social reinforcers in applied settings with people

Rose - condition 1 (effect of reward magnitude on learning)

-pigeons pecking one key will provide a large reward

-pecking the other key will provide a small reward

Rose - condition 2 (effect of reward magnitude on learning)

-pigeons must peck the correct key to get a reward

-receive a small reward for pecking the blue stimulus

-pecking the wrong key will result in the lights in the box going out

Rose - condition 3 (effect of reward magnitude on learning)

-pigeon must peck the green key to receive a small reward

-selecting the red key provides no reward → lights go out

Rose - results (effects of reward magnitude on learning)

-big rewards lead to faster learning across sessions

-small rewards lead to slower learning

-both end up at approximately the same level of performance by the 10th session

neural basis of classical conditioning

-dopamine

-dopamine neurons respond in relation to learned (conditioned) responses

Schultz - method (dopamine and classical conditioning)

-direct electrode measures of dopamine during classical conditioning

-each dot represents firing of a neuron over multiple trials

-spikes indicate heavy activity or good response rate to observed stimulus

-R indicates when reward has been delivered

-CS lets monkey know when reward is coming

Schultz - results (dopamine and classical conditioning)

-before conditioning when presented with US there is increased activity in response to reward

-after consistent pairing of CS and US → response occurs at CS and not at reward

-when the reward is omitted after the CS, activity decreases at the time the reward should have occurred
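
The three results above are commonly summarised as a reward prediction error: the difference between the reward received and the reward expected. That framing is not stated on these cards, so the sketch below is an interpretation, and the function name and the 0/1 reward values are illustrative assumptions. It covers only the activity at the time of the reward, not the response to the CS.

```python
def prediction_error(reward_received: float, reward_expected: float) -> float:
    """Difference between what was received and what was expected."""
    return reward_received - reward_expected


# Assumed coding: 1.0 = reward delivered or fully expected, 0.0 = none.
cases = {
    "before conditioning, reward delivered": prediction_error(1.0, 0.0),
    "after conditioning, reward delivered": prediction_error(1.0, 1.0),
    "after conditioning, reward omitted": prediction_error(0.0, 1.0),
}

for name, delta in cases.items():
    print(f"{name}: {delta:+.1f}")
```

A positive value corresponds to increased activity at the reward, zero to no response at the reward, and a negative value to the decrease in activity when an expected reward does not arrive.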