Operant conditioning
Behaviour is shaped by the learner’s history of experiencing rewards and punishments for their actions
According to Skinner, our behaviours are shaped by our history of experiencing rewards and punishments as consequences of those behaviours
Reinforcement
A behaviour is reinforced (strengthened) whenever a desirable outcome is the consequence.
Behaviours that are reinforced are more likely to be repeated.
Reinforcer
Any consequence of a behaviour that makes that behaviour more likely to recur in future.
Reinforcers can be either positive (+) or negative (-)
Positive reinforcer for positive reinforcement
Something pleasant that is added to increase behaviour
Eg. food, lollies, treats
Negative reinforcer for Negative reinforcement
Something unpleasant that is removed to increase behaviour
Eg. an electric shock that stops when the behaviour is produced
Positive reinforcement
Learn to reproduce a behaviour if the consequence is receiving something pleasant
Eg. a rat presses a lever and receives a rewarding consequence, such as food
Negative reinforcement
Learn to reproduce a behaviour if the consequence is that something unpleasant will stop
Eg. a rat presses a lever to terminate an unpleasant stimulus, such as an electric shock
Continuous reinforcement
Every occurrence of the behaviour is reinforced. This rarely occurs in natural environments and leads to rapid extinction once the reinforcer is withheld
Partial reinforcement
Leads to more persistent learning because the learner becomes accustomed to reinforcement occurring on some occasions and not others.
Extinction of reinforcement
Extinction of an operantly conditioned behaviour occurs when reinforcement is withheld.
Extinction is not immediate: sometimes there is a brief increase in responding, referred to as an extinction burst, followed by a decrease in the trained behaviour.
Shaping of complex behaviours
Reinforces successive approximations to the desired behaviour (reinforcing small steps)
Start by reinforcing a high-frequency component of the desired response
Then drop this reinforcement - behaviour becomes more variable again
Await a response that is still closer to the desired response - then reintroduce the reinforcer
Keep cycling through as closer and closer approximations to the desired behaviour are achieved
Enables the moulding of a response that is not normally part of an animal’s repertoire
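The shaping cycle above can be sketched as a toy simulation. This is only an illustration under strong assumptions that are not part of the notes: behaviour is reduced to a single number, responses vary randomly (Gaussian) around what the learner currently tends to do, and one reinforcement is enough to shift the learner to the reinforced response. The function name `shape` and all of its parameters are hypothetical.

```python
import random


def shape(target, start, spread=1.0, tolerance=0.5, seed=0, max_trials=10_000):
    """Toy model of shaping by successive approximations (illustrative only).

    Returns the list of reinforced responses, beginning with the starting
    response. Each later entry is closer to `target` than the one before it.
    """
    rng = random.Random(seed)   # seeded so repeated runs give the same result
    typical = start             # what the learner currently tends to do
    reinforced = [start]
    for _ in range(max_trials):
        if abs(target - typical) <= tolerance:
            break               # close enough to the desired behaviour
        # Behaviour is variable: the emitted response scatters around `typical`.
        response = rng.gauss(typical, spread)
        # Reinforce only responses that beat the current best approximation.
        # Responses that merely match the old approximation earn nothing,
        # which plays the role of "dropping" the earlier reinforcement.
        if abs(target - response) < abs(target - typical):
            typical = response
            reinforced.append(response)
    return reinforced


steps = shape(target=10.0, start=0.0)
print(len(steps) - 1, "reinforced approximations; final response:", round(steps[-1], 2))
```

The criterion for reinforcement tightens after every reinforced response, so the learner is moved step by step towards a response it would almost never have produced at the start.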
Punishment
A behaviour is punished (weakened) whenever the learner experiences an undesirable consequence for that behaviour
Behaviours that are followed by punishment are less likely to be repeated
Punisher
Any consequence of a behaviour that makes that behaviour less likely to recur in future.
Punishers can also be either positive (+) or negative (-)
Positive punisher for positive punishment
An inherently unpleasant stimulus (UCS) that weakens a behaviour when added as a consequence of that behaviour
Eg. being shocked or spanked
Negative punisher for negative punishment
A pleasant stimulus that weakens behaviour when removed as a consequence of the behaviour
Eg. having a phone taken away
Positive punishment
An animal will stop producing a behaviour if the consequence is the presentation of an unpleasant stimulus
Eg. being shocked
Negative punishment
An animal will stop producing a behaviour if the consequence is that something desirable (UCS) is taken away
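The four terms above come from two independent questions: is a stimulus added or removed, and does the behaviour become more or less likely? A minimal sketch of that 2 x 2 grid follows. The function name `classify_consequence` and its string labels are made up for illustration.

```python
def classify_consequence(stimulus_change, behaviour_change):
    """Name the operant procedure for a consequence.

    stimulus_change:  "added" or "removed"
    behaviour_change: "increases" or "decreases"
    """
    # Positive/negative describes what happens to the stimulus,
    # not whether the outcome is good or bad for the learner.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement/punishment describes what happens to the behaviour.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


# Examples taken from the cards above.
print(classify_consequence("added", "increases"))    # food after a lever press
print(classify_consequence("removed", "increases"))  # shock stops after a lever press
print(classify_consequence("added", "decreases"))    # shock follows the behaviour
print(classify_consequence("removed", "decreases"))  # phone taken away
```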
Antecedent stimuli
‘Cue’ that signals the availability of a reinforcer.
Antecedent-reinforcer relationship is based on a classically conditioned CS-UCS association
Classically conditioned associations become cues for operant behaviours
Eg. The sight of my mobile phone is associated with the rewarding consequences of scrolling through social media. The phone becomes a cue (antecedent) for the voluntary behaviour of scrolling social media and its attendant rewards
Antecedent stimuli drive habitual behaviours:
The sight of my favourite cafe is associated with the rewarding consequences of my morning coffee
The cafe is an antecedent for the behaviour of buying a coffee
The sign for the pokies is an antecedent for gambling behaviour. It is associated with the rewards of winning
“ABC” model of operant conditioning
Antecedent → Behaviour → Consequence
Discriminative stimuli
An antecedent becomes a discriminative stimulus when it signals which of two or more behaviours will be rewarded in a particular context.
Eg. Swearing is punished in some contexts but is associated with rewarding outcomes in others - the context allows us to discriminate between situations associated with rewards or punishments for a particular behaviour
In a Skinner box, a green light may signal food availability whereas a red light may signal impending foot-shock
Receiving the food or avoiding the foot-shock may then be contingent on pressing a lever OR moving to the opposite side of the cage, respectively
NOTE: the relationship between the discriminative stimulus and the reward or punisher is based on a classically conditioned CS-UCS association (related to the process of stimulus discrimination in CC)
Other examples:
Animal training involves learning discriminative signals for different behaviours
Different hand signals and/or verbal commands signal which behaviour to produce for a reward
Skinner taught pigeons to turn circles counter-clockwise to receive a reward in one box, and clockwise to receive a reward in another box
The pigeons learned that each box provided a distinct discriminative stimulus for each behaviour
Studying operant conditioning: The Skinner Box
Skinner could control the animal’s experience of reinforcement and punishment.
Reward: pellet dispenser releases food
Punishment: electric grid shocks the mouse
Example: the mouse might receive a food pellet each time it presses the lever (positive reinforcement). Or pressing the lever might terminate an unpleasant stimulus, such as the shock (negative reinforcement)
When is punishment effective? The three Cs
Contingency, Contiguity and Consistency
Contingency
The relationship between the behaviour and the punisher must be clear
Contiguity
The punisher must follow the behaviour swiftly
Consistency
The punisher needs to occur for every occurrence of the behaviour
Drawbacks of punishment
Positive punishment rarely works for long-term behaviour change - tends to only suppress behaviour
Does not teach a more desirable outcome
Produces negative feelings in the learner, which do not promote new learning
Harsh punishment may teach the learner to use such behaviour towards others (social learning)
If the threat of punishment is removed, the behaviour returns. Why?
Alternatives to punishment
Stop reinforcing the problem behaviour (extinction)
Reinforce an alternative behaviour that is both constructive and incompatible with the undesirable behaviour
Reinforce the non-occurrence of the undesirable behaviour