action-outcome associations
-operant conditioning
-animal has to do something to generate a certain response → not a hardwired response
operant conditioning
-unconditioned stimulus is contingent on behaviour of an animal
-learning of action-outcome associations
-action → more general than the responses in classical conditioning, e.g., pressing a lever
-operant behaviour → under stimulus control, so that the action can be a response to a certain stimulus
-outcome can be a reinforcement or punishment
Thorndike's Law of Effect
-responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation
-responses that produce a typically unpleasant outcome are less likely to occur again in the situation
Skinner box
-environment in which an animal can learn stimulus-response outcomes
-can have lights, speakers, lever for responses
-food dispenser → appetitive stimuli; can control rates of reward
-electrified grid → aversive stimuli
reinforcer
-an event that increases the likelihood of the action
punishers
-an event that decreases the likelihood of the action
positive reinforcer
-add or increase a pleasant stimulus to strengthen a behaviour
negative reinforcer
-reduce or remove an unpleasant stimulus to strengthen a behaviour
positive punishment
-present or add an unpleasant stimulus to weaken a behaviour
negative punishment
-remove a pleasant stimulus to weaken a behaviour
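The four procedures above differ on two dimensions: whether a stimulus is added or removed, and whether that stimulus is pleasant or unpleasant. A minimal sketch of that lookup follows; the function names `classify_procedure` and `effect_on_behaviour` are hypothetical and chosen only for illustration.

```python
def classify_procedure(stimulus_change: str, stimulus_type: str) -> str:
    """Name the operant procedure for a given stimulus manipulation.

    stimulus_change: "add" or "remove"
    stimulus_type:   "pleasant" or "unpleasant"
    (Both argument vocabularies are assumptions made for this sketch.)
    """
    procedures = {
        ("add", "pleasant"): "positive reinforcement",
        ("remove", "unpleasant"): "negative reinforcement",
        ("add", "unpleasant"): "positive punishment",
        ("remove", "pleasant"): "negative punishment",
    }
    return procedures[(stimulus_change, stimulus_type)]


def effect_on_behaviour(procedure: str) -> str:
    """Reinforcement strengthens a behaviour; punishment weakens it."""
    return "strengthens" if procedure.endswith("reinforcement") else "weakens"


if __name__ == "__main__":
    for change in ("add", "remove"):
        for kind in ("pleasant", "unpleasant"):
            name = classify_procedure(change, kind)
            print(f"{change} {kind} stimulus: {name} ({effect_on_behaviour(name)} behaviour)")
```

"Positive" and "negative" refer only to adding or removing a stimulus, not to whether the outcome is good or bad for the animal.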
reinforcement vs punishment
-reinforcement increases behaviour and is more beneficial than punishment
-reinforcement is more likely to result in long-term changes to behaviour → punishment causes temporary changes as it is based on coercion
-reinforcement creates a positive relationship with the person providing the reinforcement → punishment creates an adversarial relationship
-when the punisher leaves, the unwanted behaviour returns
continuous reinforcement
-rewarding the behaviour every time
-very quick acquisition and learning
-but rapid extinction when the reward is no longer present
partial reinforcement
-intersperse trials where the CS is not followed by the US
-done randomly so that the CS is followed by the US with a certain probability
-slows down acquisition and extinction learning
reinforcement schedules (partial reinforcement)
-partial reinforcement schedules → responses are sometimes reinforced and sometimes not
-slower initial learning but greater resistance to extinction
-reinforcement does not appear after every behaviour → takes longer for the learner to determine a lack of reward
fixed ratio (reinforcement schedule)
-behaviour is reinforced after a specific number of responses
variable ratio (reinforcement schedule)
-behaviour is reinforced after an unpredictable number of responses that varies around an average
fixed interval (reinforcement schedule)
-behaviour is reinforced for the first response after a specific amount of time has passed
variable interval (reinforcement schedule)
-behaviour is reinforced for the first response after an unpredictable amount of time, varying around an average, has passed
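The four schedules above can be sketched as simple rules that decide whether a given response earns a reinforcer. This is an illustrative sketch only: the class names, the `respond(t)` method, and the use of uniform distributions for the variable schedules are all assumptions made here, not part of the original notes.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1..2*mean_n-1, which has mean mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response once `interval` seconds have passed."""
    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable delay averaging mean_interval."""
    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self):
        # Assumption: uniform over 0..2*mean_interval, which has mean mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Count how many of the given responses are reinforced."""
    return sum(schedule.respond(t) for t in response_times)


if __name__ == "__main__":
    one_per_second = range(1, 61)  # one response per second for 60 s
    print("FR 5:", count_reinforcers(FixedRatio(5), one_per_second))
    print("FI 10 s:", count_reinforcers(FixedInterval(10), one_per_second))
    print("VR 5:", count_reinforcers(VariableRatio(5, random.Random(0)), one_per_second))
    print("VI 10 s:", count_reinforcers(VariableInterval(10, random.Random(0)), one_per_second))
```

On ratio schedules the number of reinforcers depends on how many responses are made, while on interval schedules extra responses before the interval has elapsed earn nothing.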
pattern and number of responses (fixed ratio schedule)
-number of responses required for reinforcement describes the schedule
-probability of reinforcement increases with successive responses
-brief pause in responses after each reinforcement before responses begin again
-stair-step pattern
pattern and number of responses (variable ratio schedule)
-responding reinforced after a randomly determined number of responses have been emitted
-rate of responding is typically faster than fixed ratio
-response rates relatively constant over time
pattern and number of responses (fixed interval schedule)
-first response after a designated amount of time is followed by reinforcement
-every 60s give reinforcement
-produces a characteristic pattern of responding observable across species
-reinforcement is followed by slow rates of responding, then high rates of responding towards the end of the interval
pattern and number of responses (variable interval)
-responding reinforced after a randomly determined amount of time
-average of 60s between reinforcements but individual intervals will differ from one another
-response rates relatively constant
-most commonly used schedule → produces steady, predictable performance
shaping
-process of guiding behaviour to the desired outcome through the use of intermediate stages
process of shaping
-dividing the learning goal into subgoals/smaller steps
-reinforcing individual steps rather than the complete goal
-takes time → allows learning of complex sequences
Skinner - shaping
-successive approximations to create new behaviour
-start with definition of target behaviour and approximations are systematically reinforced
-widely used in applied settings
-reinforcers must quickly follow the desired response
conditioned reinforcers/secondary reinforcers (shaping)
-decrease delay between behaviour and the delivery of primary reinforcer
-a neutral stimulus becomes a reinforcer after being paired with a primary reinforcer
-use social reinforcers in applied settings with people
Rose - condition 1 (effect of reward magnitude on learning)
-pigeons pecking one key will provide a large reward
-pecking the other key will provide a small reward
Rose - condition 2 (effect of reward magnitude on learning)
-pigeons must peck the correct key to get a reward
-receive a small reward for pecking the blue stimulus
-pecking the wrong key will result in the lights in the box going out
Rose - condition 3 (effect of reward magnitude on learning)
-pigeon must peck the green key to receive a small reward
-selecting the red key provides no reward → lights go out
Rose - results (effects of reward magnitude on learning)
-big rewards lead to faster learning across sessions
-small rewards lead to slower learning
-both end up at approximately the same level of performance by the 10th session
neural basis of classical conditioning
-dopamine
-dopamine activity is responsive to learned and conditioned responses
Schultz - method (dopamine and classical conditioning)
-direct electrode measures of dopamine during classical conditioning
-each dot represents firing of a neuron over multiple trials
-spikes indicate heavy activity or good response rate to observed stimulus
-R indicates when reward has been delivered
-CS lets monkey know when reward is coming
Schultz - results (dopamine and classical conditioning)
-before conditioning, presenting the US produces increased activity in response to the reward
-after consistent pairing of CS and US → response occurs at the CS and not at the reward
-when no reward follows the CS, activity decreases at the time the reward should have occurred