action-outcome associations
-operant conditioning
-animal has to do something to generate a certain response → not a hardwired response
operant conditioning
-unconditioned stimulus is contingent on behaviour of an animal
-learning of action-outcome associations
-action → more general than the responses in classical conditioning, e.g., pressing a lever
-operant behaviour → under stimulus control, so that the action can be a response to a certain stimulus
-outcome can be a reinforcement or punishment
Thorndike Law of Effect
-responses that create a typically pleasant outcome in a particular situation are more likely to occur again in a similar situation
-responses that produce a typically unpleasant outcome are less likely to occur again in the situation
Skinner box
-environment in which an animal can learn stimulus-response outcomes
-can have lights, speakers, lever for responses
-food dispenser → appetitive stimuli; can control rates of rewards
-electrified grid → aversive stimuli
reinforcer
-an event that increases the likelihood of the action
punishers
-an event that decreases the likelihood of the action
positive reinforcer
-add or increase a pleasant stimulus to strengthen a behaviour
negative reinforcer
-reduce or remove an unpleasant stimulus to strengthen a behaviour
positive punishment
-present or add an unpleasant stimulus to weaken a behaviour
negative punishment
-remove a pleasant stimulus to weaken a behaviour
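The four terms above differ on two points: whether a stimulus is added or removed, and whether that stimulus is pleasant or unpleasant. A minimal sketch of that lookup follows; the function name and the string labels are illustrative assumptions, not terms from the cards.

```python
def classify_consequence(stimulus_change: str, stimulus_kind: str) -> str:
    """Name the operant procedure for a consequence.

    stimulus_change: "add" or "remove" (hypothetical labels)
    stimulus_kind:   "pleasant" or "unpleasant" (hypothetical labels)
    """
    table = {
        ("add", "pleasant"): "positive reinforcement",       # strengthens behaviour
        ("remove", "unpleasant"): "negative reinforcement",  # strengthens behaviour
        ("add", "unpleasant"): "positive punishment",        # weakens behaviour
        ("remove", "pleasant"): "negative punishment",       # weakens behaviour
    }
    return table[(stimulus_change, stimulus_kind)]


print(classify_consequence("remove", "unpleasant"))  # negative reinforcement
```

"Positive" and "negative" refer only to adding or removing the stimulus, not to whether the outcome is good or bad for the animal.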
reinforcement vs punishment
-reinforcement increases behaviour and is more beneficial than punishment
-reinforcement is more likely to result in long-term changes to behaviour → punishment causes temporary changes as it is based on coercion
-reinforcement creates a positive relationship with the person providing the reinforcement → punishment creates an adversarial relationship
-when the punisher leaves, the unwanted behaviour returns
continuous reinforcement
-rewarding the behaviour every time
-very quick acquisition and learning
-but rapid extinction when the reward is no longer present
partial reinforcement
-intersperse trials where the CS is not followed by the US
-done randomly so that the CS is followed by the US with a certain probability
-slows down acquisition and extinction learning
reinforcement schedules (partial reinforcement)
-partial reinforcement schedules → responses are sometimes reinforced and sometimes not
-slower initial learning but greater resistance to extinction
-reinforcement does not appear after every behaviour → takes longer for learner to determine a lack of reward
fixed ratio (reinforcement schedule)
-behaviour is reinforced after a specific number of responses
variable ratio (reinforcement schedule)
-behaviour is reinforced after an average, but unpredictable, number of responses
fixed interval (reinforcement schedule)
-behaviour is reinforced for the first response after a specific amount of time has passed
variable interval (reinforcement schedule)
-behaviour is reinforced for the first response after an average, but unpredictable, amount of time has passed
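The four schedules can be sketched as rules that decide which responses earn a reinforcer. This is a rough illustration under stated assumptions: the function names are hypothetical, the variable schedules draw from a uniform distribution centred on the mean (other distributions are equally plausible), and intervals are timed from the previous reinforcement.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Return the 1-based indices of reinforced responses: every `ratio`-th one."""
    return [i for i in range(1, n_responses + 1) if i % ratio == 0]


def variable_ratio(n_responses, mean_ratio, rng):
    """Reinforce after an unpredictable number of responses averaging `mean_ratio`."""
    reinforced = []
    required = rng.randint(1, 2 * mean_ratio - 1)  # uniform draw, mean = mean_ratio
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= required:
            reinforced.append(i)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` seconds have passed.

    `response_times` must be sorted in ascending order (seconds).
    """
    reinforced = []
    available_at = interval
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + interval  # timer restarts at reinforcement
    return reinforced


def variable_interval(response_times, mean_interval, rng):
    """Like fixed_interval, but each wait is unpredictable with mean `mean_interval`."""
    reinforced = []
    available_at = rng.uniform(0, 2 * mean_interval)
    for t in response_times:
        if t >= available_at:
            reinforced.append(t)
            available_at = t + rng.uniform(0, 2 * mean_interval)
    return reinforced


times = [5, 30, 61, 62, 100, 125, 130]
print(fixed_ratio(10, 5))          # [5, 10]
print(fixed_interval(times, 60))   # [61, 125]
rng = random.Random(0)             # seeded so the variable schedules are repeatable
print(variable_ratio(20, 5, rng))
print(variable_interval(times, 60, rng))
```

On ratio schedules, reinforcement depends only on how many responses are made; on interval schedules, extra responses before the interval has elapsed earn nothing.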
pattern and number of responses (fixed ratio schedule)
-number of responses required for reinforcement describes the schedule
-probability of reinforcement increases with successive responses
-brief pause in responses after each reinforcement before responses begin again
-stair-step pattern
pattern and number of responses (variable ratio schedule)
-responding reinforced after a randomly determined number of responses have been emitted
-rate of responding is typically faster than fixed ratio
-response rates relatively constant over time
pattern and number of responses (fixed interval schedule)
-first response after a designated amount of time is followed by reinforcement
-every 60s give reinforcement
-produces a characteristic pattern of responding observable across species
-reinforcement is followed by slow rates of responding, with high rates of responding towards the end of the interval
pattern and number of responses (variable interval)
-responding reinforced after a randomly determined amount of time
-average of 60s between reinforcements but individual intervals will differ from one another
-response rates relatively constant
-most commonly used schedule → produces steady, predictable performance
shaping
-process of guiding behaviour to the desired outcome through the use of intermediate stages
process of shaping
-dividing the learning goal into subgoals/smaller steps
-reinforcing individual steps rather than the complete goal
-takes time → allows learning of complex sequences
Skinner - shaping
-successive approximations to create new behaviour
-start with definition of target behaviour and approximations are systematically reinforced
-widely used in applied settings
-reinforcers must quickly follow the desired response
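Reinforcing successive approximations can be sketched as a criterion that tightens each time a response is reinforced. This is only an illustration: the numeric response scale, the function name `shape`, and the `criterion` and `step` parameters are assumptions introduced here, not part of Skinner's procedure as described on the cards.

```python
def shape(responses, target, criterion, step):
    """Reinforce successive approximations to `target`.

    A response is reinforced when it falls within `criterion` of the target.
    Each reinforcement tightens the criterion by `step`, so later reinforcers
    require a closer approximation to the target behaviour.
    """
    log = []
    for r in responses:
        reinforced = abs(target - r) <= criterion
        log.append((r, criterion, reinforced))
        if reinforced:
            criterion = max(0, criterion - step)
    return log


# Each tuple is (response, criterion in force, reinforced?)
for entry in shape([2, 5, 4, 7, 9, 10], target=10, criterion=6, step=2):
    print(entry)
# (2, 6, False)
# (5, 6, True)
# (4, 4, False)
# (7, 4, True)
# (9, 2, True)
# (10, 0, True)
```

A response that earned a reinforcer early on (here, 5) would no longer earn one once the criterion has tightened.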
conditioned reinforcers/secondary reinforcers (shaping)
-decrease delay between behaviour and the delivery of primary reinforcer
-a neutral stimulus becomes a reinforcer after being paired with a primary reinforcer
-use social reinforcers in applied settings with people
Rose - condition 1 (effect of reward magnitude on learning)
-pigeons pecking one key will provide a large reward
-pecking the other key will provide a small reward
Rose - condition 2 (effect of reward magnitude on learning)
-pigeons must peck the correct key to get a reward
-receive a small reward for pecking the blue stimulus
-pecking the wrong key will result in the lights in the box going out
Rose - condition 3 (effect of reward magnitude on learning)
-pigeon must peck the green key to receive a small reward
-selecting the red key provides no reward → lights go out
Rose - results (effects of reward magnitude on learning)
-big rewards lead to faster learning across sessions
-small rewards lead to slower learning
-both end up at approximately the same rate of performance by the 10th session
neural basis of classical conditioning
-dopamine
-responsive in terms of learned and conditioned responses
Schultz - method (dopamine and classical conditioning)
-direct electrode measures of dopamine during classical conditioning
-each dot represents firing of a neuron over multiple trials
-spikes indicate heavy activity or good response rate to observed stimulus
-R indicates when reward has been delivered
-CS lets monkey know when reward is coming
Schultz - results (dopamine and classical conditioning)
-before conditioning when presented with US there is increased activity in response to reward
-after consistent pairing of CS and US → response occurs at CS and not at reward
-when the reward is omitted after the CS, activity decreases at the time the reward should have occurred
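One common reading of these recordings, which the cards themselves do not state, is that dopamine activity at the time of reward tracks the difference between the reward received and the reward expected. A minimal sketch of that arithmetic follows, with reward coded as 1 or 0 for illustration; it covers only the activity at the time of reward, not the response to the CS.

```python
def prediction_error(reward_received: float, reward_expected: float) -> float:
    """Difference between the reward received and the reward expected."""
    return reward_received - reward_expected


# (received, expected) for the three cases in the results above
cases = {
    "before conditioning, reward delivered": (1.0, 0.0),
    "after conditioning, reward delivered": (1.0, 1.0),
    "after conditioning, reward omitted": (0.0, 1.0),
}

for label, (received, expected) in cases.items():
    print(label, prediction_error(received, expected))
# before conditioning, reward delivered 1.0
# after conditioning, reward delivered 0.0
# after conditioning, reward omitted -1.0
```

The signs match the three results: increased activity to an unexpected reward, no change to a fully predicted one, and decreased activity when a predicted reward is omitted.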