Operant Conditioning:
Definition: A type of associative learning where behaviors are influenced by the consequences that follow them.
Key Concepts:
Reinforcement: Increases the likelihood of a behavior. Can be positive (adding a pleasant stimulus) or negative (removing an unpleasant stimulus).
Punishment: Decreases the likelihood of a behavior. Can be positive (adding an unpleasant stimulus) or negative (removing a pleasant stimulus).
Examples: A rat pressing a lever to receive food (positive reinforcement) or avoiding a shock by pressing a lever (negative reinforcement).
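The two-by-two scheme above (stimulus added or removed, behavior increases or decreases) can be sketched in a few lines of Python. This is only an illustration: the function name `classify_consequence` and its string arguments are assumptions made for this sketch, not part of the course material.

```python
def classify_consequence(stimulus_change: str, behavior_change: str) -> str:
    """Name the operant procedure from two observations.

    stimulus_change: "added" or "removed" (what happened to the stimulus)
    behavior_change: "increases" or "decreases" (what happened to the behavior)
    """
    # "Positive" only means something was added; "negative" means removed.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement increases behavior; punishment decreases it.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# The two rat examples above:
print(classify_consequence("added", "increases"))    # food follows the lever press
print(classify_consequence("removed", "increases"))  # shock is avoided by the lever press
```

Note that the sketch never asks whether the stimulus is pleasant or unpleasant: that follows from the other two facts (for example, something removed that makes a behavior increase must have been unpleasant).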
1. What can a reinforcer be?
A reinforcer is something that increases the likelihood of a behavior happening again. There are two main types of reinforcers:
Primary Reinforcers: These are naturally rewarding because they satisfy basic needs, like food, water, or sleep.
Secondary Reinforcers: These are things that we learn to value, like money, praise, or tokens. They don't satisfy basic needs directly but can be used to get primary reinforcers.
2. The effect of reinforcement schedules on behaviour
Reinforcement schedules are rules about how often and when reinforcement is given. They can affect how strong and lasting a behavior is. There are two main types of schedules:
Ratio Schedules: Reinforcement is given after a certain number of responses. For example:
Fixed Ratio (FR): Reinforcement after a set number of responses (e.g., every 5th response).
Variable Ratio (VR): Reinforcement after a varying number of responses, but it averages out to a certain number (e.g., on average every 5th response, but it could be after 3, 7, etc.).
Interval Schedules: Reinforcement is given for the first response made after a certain amount of time has passed. For example:
Fixed Interval (FI): Reinforcement after a set amount of time (e.g., every 5 minutes).
Variable Interval (VI): Reinforcement after varying amounts of time, but it averages out to a certain period (e.g., on average every 5 minutes, but it could be after 3 minutes, 7 minutes, etc.).
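The four schedule rules above can be sketched as small Python functions. This is an illustration under stated assumptions, not a standard implementation: the function names are hypothetical, the variable schedules draw their requirement uniformly around the mean (real experiments may use other distributions), and the interval schedules reinforce the first response made once the interval has elapsed.

```python
import random


def fixed_ratio(n):
    """FR-n: call respond() once per response; True means 'reinforce'."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """VR-mean: the required number of responses changes after every reinforcer."""
    # Assumption: requirement is uniform on 1..(2*mean - 1), which averages to `mean`.
    required = rng.randint(1, 2 * mean - 1)
    count = 0

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """FI: call respond(t) with the time of each response (in seconds)."""
    last = 0.0  # time of the last reinforcer (session starts at 0)

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the wait is redrawn after every reinforcer."""
    # Assumption: wait is uniform on 0..(2*mean_seconds), which averages to the mean.
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])  # reinforced on the 5th and 10th response

fi5min = fixed_interval(300)
print([fi5min(t) for t in (100, 299, 300, 310, 600)])
```

On the ratio schedules, responding faster earns reinforcers sooner. On the interval schedules, extra responses before the interval has elapsed earn nothing.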
3. How does it change behaviour?
Reinforcement can change behavior by making it more likely to happen again. Here are some key points:
Positive Reinforcement: Adding something pleasant to increase a behavior (e.g., giving a treat for doing homework).
Negative Reinforcement: Removing something unpleasant to increase a behavior (e.g., turning off a loud noise when a task is completed).
Differential Reinforcement: This involves reinforcing a specific behavior while not reinforcing (or reinforcing less) other behaviors.
Differential Reinforcement Procedures
Differential reinforcement involves reinforcing a specific behavior while not reinforcing (or reinforcing less) other behaviors. There are four main types:
Differential Reinforcement of Low rates of responding (DRL):
This procedure is used to reduce the frequency of a behavior but not eliminate it. For example, a teacher might reward a child for washing their hands only once before lunch, rather than multiple times.
Differential Reinforcement of Incompatible behavior (DRI):
This involves reinforcing a behavior that is incompatible with the unwanted behavior. For instance, a teacher might reward a child for staying seated, which is incompatible with wandering around the classroom.
Differential Reinforcement of Alternative behavior (DRA):
This procedure reinforces an alternative behavior that is not necessarily incompatible with the unwanted behavior. For example, parents might ignore a child's demands but respond positively when the child asks politely.
Differential Reinforcement of Other behavior (DRO):
This involves reinforcing the absence of the unwanted behavior for a specific period. For example, a child might be rewarded for not engaging in disruptive behavior during a class period.
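One way to picture the DRO rule is as a check over fixed time intervals: an interval earns a reinforcer only if the unwanted behavior did not occur during it. The Python sketch below is a hedged illustration. The function name, the 60-second interval, and the choice not to restart the timer when the behavior occurs are all assumptions (some DRO variants do reset the timer).

```python
def dro_reinforced_intervals(behavior_times, interval, session_length):
    """Return the start times of the intervals that earn a reinforcer under DRO.

    behavior_times: times (in seconds) at which the unwanted behavior occurred
    interval: length of each DRO interval in seconds
    session_length: total session length in seconds
    """
    earned = []
    start = 0
    while start + interval <= session_length:
        end = start + interval
        # Reinforce only if the unwanted behavior was absent for the whole interval.
        occurred = any(start <= t < end for t in behavior_times)
        if not occurred:
            earned.append(start)
        start = end
    return earned


# Unwanted behavior at 70 s and 130 s in a 5-minute session with 60 s intervals.
print(dro_reinforced_intervals([70, 130], interval=60, session_length=300))
```

Here the intervals starting at 60 s and 120 s contain the unwanted behavior, so only the other three intervals are reinforced.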
Operationalising the Use of Reinforcers and Discrimination Training
Operationalising the use of reinforcers involves specifying how and when reinforcers will be used to modify behavior. Key factors include:
Magnitude of Reinforcement: The size or amount of the reinforcer.
Delay of Reinforcement: The time between the behavior and the delivery of the reinforcer.
Contingency: The relationship between the behavior and the reinforcer, ensuring that the reinforcer is given only if the desired behavior occurs.
Premack's Principle, proposed by David Premack in 1965, suggests that more probable behaviors (those that an organism frequently engages in when given the opportunity) can reinforce less probable behaviors (those that the organism is less likely to perform on its own). This principle is often summarized as "high probability behavior reinforces low probability behavior." For example, if a child enjoys playing video games (a high probability behavior) more than doing homework (a low probability behavior), allowing the child to play video games after completing their homework can reinforce the homework behavior.
PASS PSYU2236 Week 10 Worksheet
Terminology
Q: Define the following terms
Term | Definition |
Shaping | Reinforcing successive approximations of a behaviour, with each step getting closer to the desired behaviour |
Reinforcement | A consequence to a response that increases the likelihood of engaging in that behaviour |
Punishment | A consequence to a response that decreases the likelihood of engaging in that behaviour |
Establishing Operations | Factors that affect the effectiveness of a reinforcer |
Theories and Principles
Q: Describe the following theories and principles
Theory/Principle | Explanation |
Thorndike’s Law of Effect | A response that is followed by a satisfying consequence becomes more likely to recur, while a response that is followed by an unpleasant consequence becomes less likely to recur |
Goal Gradient Hypothesis | Organisms engage in a desired behaviour more vigorously (e.g., respond faster) the closer they are to achieving a goal |
Premack Principle | More preferred (high-probability) activities can be used to reinforce less preferred (low-probability) activities |
Reinforcement and Punishment
Q: Explain how each of the following reinforcement/punishment types differ and the effect that each has on behaviour
Type | Explanation and effect on behaviour |
Positive Reinforcement | Involves adding a pleasant stimulus to the environment which leads to an increase in behaviour |
Negative Reinforcement | Involves removing an unpleasant stimulus from the environment which leads to an increase in behaviour |
Positive Punishment | Involves adding an unpleasant stimulus to the environment which leads to a decrease in behaviour |
Negative Punishment | Involves removing a pleasant stimulus from the environment which leads to a decrease in behaviour |
Q: What is the difference between primary and secondary reinforcers?
A: Primary reinforcers have a direct biological/survival value (similar to an unconditioned stimulus), whereas secondary reinforcers are stimuli that have become associated with something of biological significance (similar to a conditioned stimulus).
Q: What effect can the following situations have on the effectiveness of a reinforcer?
Situations | Effect |
Increasing the magnitude of a reinforcer | Increases the effectiveness of a reinforcer |
Introducing a delay before reinforcement | Decreases the effectiveness of a reinforcer |
Decreasing the consistency of reinforcement | Decreases the effectiveness of a reinforcer |
Q: What can mitigate the effects of delay of reinforcement on reinforcer effectiveness?
A: Using a bridging cue, such as a tone that signals that the response has been registered.
Reinforcement Schedule
Q: Describe a continuous reinforcement schedule. What are the strengths and limitations of using continuous reinforcement?
A: A reinforcement schedule where a behaviour is reinforced every time it is performed. While it is very effective at establishing new behaviours, it is not very resistant to extinction.
Q: Explain the difference between ratio and interval schedules
A: Ratio schedules are based on the number of times that an organism engages in the desired behaviour, while interval schedules are based on the amount of time that has passed since the last reinforcement (the first desired behaviour after that time is reinforced).
Q: Explain the difference between fixed and variable schedules
A: Fixed schedules have a ratio or time interval that does not change, while variable schedules have a ratio or time interval that changes from one reinforcement to the next, varying around an average.
Q: Explain the effectiveness of the following reinforcement schedules regarding acquisition and extinction
Schedule | Effect on Acquisition | Effect on Extinction |
Fixed Ratio | Relatively good acquisition | Low resistance to extinction |
Fixed Interval | Has the lowest rate of responding. Responding is scalloped: slow just after reinforcement and very fast as the interval comes to an end | Low resistance to extinction |
Variable Ratio | Very fast acquisition | Very resistant to extinction |
Variable Interval | Very slow rate of responding | High resistance to extinction |
Q: Describe the following phenomena as they relate to reinforcement schedules in operant conditioning
Phenomena | Description |
Ratio Strain | A decrease in engagement in the desired behavior as a result of an abrupt increase in the number of required responses |
Ratio Run | An increase in engagement in the desired behaviour as the organism gets close to completing the required number of responses |
Differential Reinforcement
Q: Describe the following differential reinforcement (DR) schedules
Schedule | Description |
DR of Other Behaviour | Reinforcing the absence of the undesired behaviour |
DR of Low Rates | Reinforcing lower rates of the undesired behaviour |
DR of Incompatible Behaviour | Reinforcing a behaviour that is incompatible with the undesired behaviour and serves a similar function |
DR of Alternative Behaviour | Reinforcing an alternative to the undesired behaviour that serves the same function |
W11: Operant Conditioning
Test Your Knowledge Quiz |
Q1: True or False? Classical conditioning involves voluntary responses whereas operant conditioning involves involuntary responses. (CC: reflexive responses, e.g., bell → salivation; OC: reward/punishment, behaviour associated with consequences) | A1: a) True b) False |
Q2: When something is removed from the environment that causes the behaviour to increase in frequency, that something must have been… *negative reinforcement, taking away something bad, e.g., Panadol for headache | A2: a) pleasant b) unpleasant c) neutral |
Q3: Which of the following is the core principle of operant conditioning? Behaviour is associated with: (a: "contingency" is the dependence of the consequence on the behaviour; b: "consistency" is about the application of consequences rather than a direct relationship between behaviour and consequence) | A3: a) contingency b) consistency c) consequences |
Q4: Within the operant conditioning paradigm, ‘positive’ means: *positive reinforcement *positive punishment | A4: a) something desirable is added to the environment b) something undesirable is added to the environment c) something is added to the environment |
Q5: Which operant conditioning technique involves reinforcing successive approximations of a desired behaviour? (E.g., cats with bell; b: linking a series of behaviours together to form a complex behaviour; c: gradually removing or reducing prompts or cues) | A5: a) shaping b) chaining c) fading |
Q6: A teacher wants to encourage students to raise their hands before speaking in class. Which of the following is an example of Differential Reinforcement of Incompatible behaviour (DRI)? (a: DRA, reinforcing an alternative behaviour, raising hands, rather than an incompatible one; b: DRL, as it attempts to reduce the frequency of speaking out without raising hands; c: DRI, keeping hands in laps is incompatible with speaking out of turn, and reinforcing this behaviour would indirectly reduce the undesired behaviour) | A6: a) The teacher ignores students who speak out of turn but calls on those who raise their hands b) The teacher rewards students who speak less than three times without raising their hand during the lesson c) The teacher gives praise to students who keep their hands in their laps when not speaking |
Q7: What is the behavioural outcome of positive punishment? Eg., something unpleasant is added to the environment, adding chores. | A7: a) decrease in desirable behaviour b) decrease in undesirable behaviour c) increase in undesirable behaviour |
Q8: Which of the following operant conditioning consequences is often centred around avoidance learning principles? e.g., sunscreen, medicine, etc | A8: a) positive punishment b) negative punishment c) negative reinforcement |
Q9: Regarding negative punishment: Something is ______ from the environment, that causes the behaviour to decrease in frequency ∴ that something must have been _______ | A9: a) removed, unpleasant b) removed, pleasant c) added, unpleasant |
Q10: Which emotion is primarily associated with positive punishment? (b: negative punishment; c: negative reinforcement) | A10: a) fear b) anger c) relief |
Q11: Which schedule of reinforcement is excellent for starting a new behaviour but stops quickly when reinforcement stops? | A11: a) partial schedule b) continuous schedule c) fixed ratio schedule |
Q12: Fixed-interval schedule has _____ extinction resistance whereas variable-ratio schedule has _____ | A12: a) neutral, progressively lower b) high, low c) low, high |
Q13: Would moderate increases in the fixed-ratio schedule result in a faster or slower response rate? (Incremental increase between sessions. Caveat: ratio strain; if the number of required responses is too high, responding is disrupted or decreases) | A13: a) faster b) slower |
Q14: Which schedule has the lowest rate of responding? | A14: a) fixed-ratio b) fixed-interval c) variable-interval |
Q15: Are ratio or interval responses typically fastest? (a: reinforcement is tied directly to the number of responses, which encourages rapid responding; b: reinforcement is tied to the passage of time rather than the number of responses) | A15: a) ratio b) interval |
Q16: Which reinforcer are the rats more likely to run faster for? (Christopher, 1988) | A16: a) same amount compiled into one large piece b) same amount broken into smaller pieces c) same amount broken in half |
Q17: How might the deleterious effects of delay be prevented? E.g. clicker | A17: a) by providing a progressively greater reward b) by providing a signal that the reward is coming c) by providing small rewards during each delay |
Q18: Which factor makes internet porn, modern pokies, and social media so addictive? | A18: a) magnitude b) speed c) contiguity |
Q19: Why is money reinforcing? | A19: a) its reinforcing properties extend to all organisms b) it is considered a primary reinforcer c) it serves as a means for primary reinforcers |
Q20: Which of the following best describes the Premack Principle? (E.g., a child loves playing video games, a high probability behaviour, but doesn't like doing homework, a low probability behaviour. The principle suggests they can play video games after they finish their homework.) | A20: a) high probability behaviour reinforces low probability behaviour b) low probability behaviour reinforces high probability behaviour c) high probability behaviour is only reinforced by primary reinforcers d) high probability behaviour is only reinforced by secondary reinforcers |
STUDY TIPS:
• Start collating notes, summarise and simplify them.
• Tables, diagrams, mind maps can be useful visual mnemonics.
• Use acronyms and number associations. For example, remembering that there are 4 brain lobes can help you recall their names.
