Operant Conditioning:

Definition: A type of associative learning where behaviors are influenced by the consequences that follow them.
Key Concepts:
- Reinforcement: Increases the likelihood of a behavior. Can be positive (adding a pleasant stimulus) or negative (removing an unpleasant stimulus).
- Punishment: Decreases the likelihood of a behavior. Can be positive (adding an unpleasant stimulus) or negative (removing a pleasant stimulus).
Examples: A rat pressing a lever to receive food (positive reinforcement) or avoiding a shock by pressing a lever (negative reinforcement).

1. What can a reinforcer be?

A reinforcer is something that increases the likelihood of a behavior happening again. There are two main types of reinforcers:

Primary Reinforcers: These are naturally rewarding because they satisfy basic needs, like food, water, or sleep.
Secondary Reinforcers: These are things that we learn to value, like money, praise, or tokens. They don't satisfy basic needs directly but can be used to get primary reinforcers.

2. The effect of reinforcement schedules on behaviour

Reinforcement schedules are rules about how often and when reinforcement is given. They can affect how strong and lasting a behavior is. There are two main types of schedules:

Ratio Schedules: Reinforcement is given after a certain number of responses. For example:
- Fixed Ratio (FR): Reinforcement after a set number of responses (e.g., every 5th response).
- Variable Ratio (VR): Reinforcement after a varying number of responses, but it averages out to a certain number (e.g., on average every 5th response, but it could be after 3, 7, etc.).
Interval Schedules: Reinforcement is given after a certain amount of time has passed. For example:
- Fixed Interval (FI): Reinforcement after a set amount of time (e.g., every 5 minutes).
- Variable Interval (VI): Reinforcement after varying amounts of time, but it averages out to a certain period (e.g., on average every 5 minutes, but it could be after 3 minutes, 7 minutes, etc.).
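
The four schedules above can be sketched as small Python classes. This is a minimal illustration, not a standard implementation: the class names, the `respond(t)` method, and the uniform distributions used for the variable schedules are all assumptions made for the example. The interval schedules follow the usual laboratory convention that the reinforcer goes to the first response made after the interval has elapsed.

```python
import random


class FixedRatio:
    """FR: reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, t):
        # t (the time of the response) is ignored by ratio schedules
        self.count += 1
        if self.count == self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR: reinforce after a varying number of responses averaging mean_n."""

    def __init__(self, mean_n, rng):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: requirement is uniform on 1 .. 2*mean_n - 1 (mean = mean_n)
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """FI: reinforce the first response made once `interval` has elapsed."""

    def __init__(self, interval):
        self.interval = interval
        self.last_reinforced = 0.0

    def respond(self, t):
        if t - self.last_reinforced >= self.interval:
            self.last_reinforced = t
            return True
        return False


class VariableInterval:
    """VI: like FI, but the required wait varies around mean_interval."""

    def __init__(self, mean_interval, rng):
        self.mean_interval = mean_interval
        self.rng = rng
        self.last_reinforced = 0.0
        self.required = self._draw()

    def _draw(self):
        # Assumption: wait is uniform on 0 .. 2*mean_interval (mean = mean_interval)
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        if t - self.last_reinforced >= self.required:
            self.last_reinforced = t
            self.required = self._draw()
            return True
        return False


# One response per minute for 10 minutes on FR 5 and FI 5
fr5, fi5 = FixedRatio(5), FixedInterval(5)
fr_hits = [t for t in range(1, 11) if fr5.respond(t)]
fi_hits = [t for t in range(1, 11) if fi5.respond(t)]
print(fr_hits, fi_hits)
```

At one response per minute both schedules pay out on the 5th and 10th response. Responding twice as fast would double the reinforcers earned on FR 5 but leave FI 5 unchanged, which is one way to see why ratio schedules tend to produce faster responding.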

3. How does it change behaviour?

Reinforcement can change behavior by making it more likely to happen again. Here are some key points:

Positive Reinforcement: Adding something pleasant to increase a behavior (e.g., giving a treat for doing homework).
Negative Reinforcement: Removing something unpleasant to increase a behavior (e.g., turning off a loud noise when a task is completed).
Differential Reinforcement: This involves reinforcing a specific behavior while not reinforcing (or reinforcing less) other behaviors.

Differential Reinforcement Procedures

Differential reinforcement involves reinforcing a specific behavior while not reinforcing (or reinforcing less) other behaviors. There are four main types:

Differential Reinforcement of Low rates of responding (DRL):
- This procedure is used to reduce the frequency of a behavior but not eliminate it. For example, a teacher might reward a child for washing their hands only once before lunch, rather than multiple times.
Differential Reinforcement of Incompatible behavior (DRI):
- This involves reinforcing a behavior that is incompatible with the unwanted behavior. For instance, a teacher might reward a child for staying seated, which is incompatible with the behavior of wandering around the classroom.
Differential Reinforcement of Alternative behavior (DRA):
- This procedure reinforces an alternative behavior that is not necessarily incompatible with the unwanted behavior. For example, parents might ignore a child's demands but respond positively when the child asks politely.
Differential Reinforcement of Other behavior (DRO):
- This involves reinforcing the absence of the unwanted behavior for a specific period. For example, a child might be rewarded for not engaging in disruptive behavior during a class period.
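
The four procedures can be read as decision rules applied at the end of an observation period. The sketch below is only an illustration of that reading: the function name `should_reinforce`, its parameters, and the behaviour labels are hypothetical, and real programs involve far more judgement than a single rule.

```python
def should_reinforce(procedure, observed, unwanted, target=None, max_count=0):
    """Decide whether to deliver a reinforcer at the end of an observation period.

    observed  -- list of behaviour labels recorded during the period
    unwanted  -- label of the behaviour being reduced
    target    -- label of the behaviour to strengthen (DRI and DRA only)
    max_count -- highest acceptable count of the unwanted behaviour (DRL only)
    """
    unwanted_count = observed.count(unwanted)
    if procedure == "DRO":
        # Reinforce the absence of the unwanted behaviour
        return unwanted_count == 0
    if procedure == "DRL":
        # Reinforce low (not necessarily zero) rates of the behaviour
        return unwanted_count <= max_count
    if procedure in ("DRI", "DRA"):
        # Reinforce the chosen replacement behaviour when it occurs.
        # DRI and DRA share this rule; they differ in whether the target
        # is physically incompatible with the unwanted behaviour.
        return target in observed
    raise ValueError(f"unknown procedure: {procedure}")


# DRL: washing hands once before lunch earns the reward, three times does not
print(should_reinforce("DRL", ["hand_washing"], "hand_washing", max_count=1))
print(should_reinforce("DRL", ["hand_washing"] * 3, "hand_washing", max_count=1))
```

Written this way, DRI and DRA turn out to be the same rule with a different choice of `target`, while DRO and DRL depend only on how often the unwanted behaviour occurred.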

Operationalising the Use of Reinforcers and Discrimination Training

Operationalising the use of reinforcers involves specifying how and when reinforcers will be used to modify behavior. Key factors include:

Magnitude of Reinforcement: The size or amount of the reinforcer.
Delay of Reinforcement: The time between the behavior and the delivery of the reinforcer.
Contingency: The relationship between the behavior and the reinforcer, ensuring that the reinforcer is given only if the desired behavior occurs.

Premack's Principle, proposed by David Premack in 1965, suggests that more probable behaviors (those that an organism frequently engages in when given the opportunity) can reinforce less probable behaviors (those that the organism is less likely to do on their own). This principle is often summarized as "high probability behavior reinforces low probability behavior." For example, if a child enjoys playing video games (a high probability behavior) more than doing homework (a low probability behavior), allowing the child to play video games after completing their homework can reinforce the homework behavior.
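
As a rough sketch of how this principle could be applied, the Python below compares baseline probabilities to decide which activity can reinforce which. The function names and the baseline figures (share of free time spent on each activity) are made up for illustration and do not come from Premack's work.

```python
def can_reinforce(baseline, reinforcer, target):
    """True if `reinforcer` is the more probable behaviour at baseline."""
    return baseline[reinforcer] > baseline[target]


def plan_contingency(baseline, target):
    """Return the most probable activity that could reinforce `target`, or None."""
    candidates = [a for a in baseline if can_reinforce(baseline, a, target)]
    return max(candidates, key=baseline.get) if candidates else None


# Hypothetical share of free time a child spends on each activity
baseline = {"video_games": 0.6, "reading": 0.3, "homework": 0.1}

print(can_reinforce(baseline, "video_games", "homework"))  # True
print(can_reinforce(baseline, "homework", "video_games"))  # False
print(plan_contingency(baseline, "homework"))              # video_games
```

Nothing in this baseline is more probable than video games, so `plan_contingency(baseline, "video_games")` returns `None`: the principle only runs from high probability to low probability behaviour.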

PASS PSYU2236 Week 10 Worksheet

Terminology

Q: Define the following terms

Term	Definition
Shaping	Reinforcing successive approximations of the desired behaviour, with each reinforced step closer to the desired behaviour than the last
Reinforcement	A consequence to a response that increases the likelihood of engaging in that behaviour
Punishment	A consequence to a response that decreases the likelihood of engaging in that behaviour
Establishing Operations	Factors that affect the effectiveness of a reinforcer

Theories and Principles

Q: Describe the following theories and principles

Theory/Principle	Explanation
Thorndike’s Law of Effect	A response that is followed by a reinforcer will be associated with that reinforcer, altering the likelihood of engaging in that behaviour
Goal Gradient Hypothesis	Organisms will engage in a desired behaviour more vigorously the closer they are to achieving a goal
Premack Principle	Desirable activities can be used to reinforce undesirable activities

Reinforcement and Punishment

Q: Explain how each of the following reinforcement/punishment types differ and the effect that each has on behaviour

Type	Explanation and effect on behaviour
Positive Reinforcement	Involves adding a pleasant stimulus to the environment which leads to an increase in behaviour
Negative Reinforcement	Involves removing an unpleasant stimulus from the environment which leads to an increase in behaviour
Positive Punishment	Involves adding an unpleasant stimulus to the environment which leads to a decrease in behaviour
Negative Punishment	Involves removing a pleasant stimulus from the environment which leads to a decrease in behaviour

Q: What is the difference between primary and secondary reinforcers?

A: Primary reinforcers have a direct biological/survival value (similar to an unconditioned stimulus), whereas secondary reinforcers are stimuli that are associated with something of biological significance (similar to a conditioned stimulus).

Q: What effect can the following situations have on the effectiveness of a reinforcer?

Situations	Effect
Increasing the magnitude of a reinforcer	Increases the effectiveness of a reinforcer
Introducing a delay before reinforcement	Decreases the effectiveness of a reinforcer
Decreasing the consistency of reinforcement	Decreases the effectiveness of a reinforcer

Q: What can mitigate the effects of delay of reinforcement on reinforcer effectiveness?

A: Using a bridging cue, such as a tone that signals that the response has been registered

Reinforcement Schedule

Q: Describe a continuous reinforcement schedule. What are the strengths and limitations of using continuous reinforcement?

A: A reinforcement schedule where a behaviour is reinforced every time the desired behaviour is performed. While it is great at establishing new behaviours, it is not very resistant to extinction

Q: Explain the difference between ratio and interval schedules

A: Ratio schedules are based on the number of times that an organism engages in the desired behaviour, while interval schedules are based on the amount of time that has passed since the last reinforcer

Q: Explain the difference between fixed and variable schedules

A: Fixed schedules have a ratio or time interval that does not change, while variable schedules have a ratio or time interval that varies around an average

Q: Explain the effectiveness of the following reinforcement schedules regarding acquisition and extinction

Schedule	Effect on Acquisition	Effect on Extinction
Fixed Ratio	Relatively good acquisition	Low resistance to extinction
Fixed Interval	Has the lowest rate of responding. Responding is scalloped, being very fast when the interval is coming to an end and slower just after the reinforcer has been delivered	Low resistance to extinction
Variable Ratio	Very fast acquisition	Very resistant to extinction
Variable Interval	Very slow rate of responding	High resistance to extinction

Q: Describe the following phenomena as they relate to reinforcement schedules in operant conditioning

Phenomena	Description
Ratio Strain	A decrease in engagement in the desired behavior as a result of an abrupt increase in the number of required responses
Ratio Run	An increase in engagement in the desired behaviour as the organism gets close to completing the required number of responses

Differential Reinforcement

Q: Describe the following differential reinforcement (DR) schedules

Schedule	Description
DR of Other Behaviour	Reinforcing the absence of the undesired behaviour
DR of Low Rates	Reinforcing lower rates of the undesired behaviour
DR of Incompatible Behaviour	Reinforcing a behaviour that is incompatible with the undesired behaviour and serves a similar function
DR of Alternative Behaviour	Reinforcing an alternative to the undesired behaviour that serves the same function

W11: Operant Conditioning

Test Your Knowledge Quiz

Q1: True or False? Classical conditioning involves voluntary responses whereas operant conditioning involves involuntary responses.

Cc: e.g. reflexive, bell = salivate

Oc: reward/punishment, associated w/consequences

A1:

a) True

b) False

Q2: When something is removed from the environment that causes the behaviour to increase in frequency, that something must have been…

*negative reinforcement, taking away something bad, e.g., Panadol for headache

A2:

a) pleasant

b) unpleasant

c) neutral

Q3: Which of the following is the core principle of operant conditioning? Behaviour is associated with:

a- "Contingency" is the dependence of the consequence on the behaviour

b-"Consistency" is about the application of consequences rather than a direct relationship between behaviour and consequence.

A3:

a) contingency

b) consistency

c) consequences

Q4: Within the operant conditioning paradigm, ‘positive’ means:

*positive reinforcement

*positive punishment

A4:

a) something desirable is added to the environment

b) something undesirable is added to the environment

c) something is added to the environment

Q5: Which operant conditioning technique involves reinforcing successive approximations of a desired behaviour?

(E.g., cats w/ bell)

b-linking a series of behaviours together to form a complex behaviour

c-gradually removing or reducing prompts or cues

A5:

a) shaping

b) chaining

c) fading

Q6: A teacher wants to encourage students to raise their hands before speaking in class. Which of the following is an example of Differential Reinforcement of Incompatible behaviour (DRI)?

a) DRA, reinforcing an alternative behaviour (raising hands) rather than an incompatible one

b) DRL, as it's attempting to reduce the frequency of speaking out without raising hands

c) DRI, keeping hands in laps is incompatible with speaking out of turn, and reinforcing this behaviour would indirectly reduce the undesired behaviour

A6:

a) The teacher ignores students who speak out of turn but calls on those who raise their hands

b) The teacher rewards students who speak less than three times without raising their hand during the lesson

c) The teacher gives praise to students who keep their hands in their laps when not speaking

Q7: What is the behavioural outcome of positive punishment?

E.g., something unpleasant is added to the environment, adding chores.

A7:

a) decrease in desirable behaviour

b) decrease in undesirable behaviour

c) increase in undesirable behaviour

Q8: Which of the following operant conditioning consequences is often centred around avoidance learning principles?

e.g., sunscreen, medicine, etc

A8:

a) positive punishment

b) negative punishment

c) negative reinforcement

Q9: Regarding negative punishment: Something is ______ from the environment, that causes the behaviour to decrease in frequency ∴ that something must have been _______

A9:

a) removed, unpleasant

b) removed, pleasant

c) added, unpleasant

Q10: Which emotion is primarily associated with positive punishment?

b- negative punishment

c- negative reinforcement

A10:

a) fear

b) anger

c) relief

Q11: Which schedule of reinforcement is excellent for starting a new behaviour but stops quickly when reinforcement stops?

A11:

a) partial schedule

b) continuous schedule

c) fixed ratio schedule

Q12: Fixed-interval schedule has _____ extinction resistance whereas variable-ratio schedule has _____

A12:

a) neutral, progressively lower

b) high, low

c) low, high

Q13: Would moderate increases in the fixed-ratio schedule result in a faster or slower response rate?

*incremental increase between sessions

*Caveat: ratio strain; no. of responses too high → a disruption/decrease

A13:

a) faster

b) slower

Q14: Which schedule has the lowest rate of responding?

A14:

a) fixed-ratio

b) fixed-interval

c) variable-interval

Q15: Are ratio or interval responses typically fastest?

a-reinforcement is tied directly to the number of responses, which encourages rapid responding.

b-reinforcement is tied to the passage of time rather than the number of responses

A15:

a) ratio

b) interval

Q16: Which reinforcer are the rats more likely to run faster for? (Christopher, 1988)

A16:

a) same amount compiled into one large piece

b) same amount broken into smaller pieces

c) same amount broken in half

Q17: How might the deleterious effects of delay be prevented?

E.g. clicker

A17:

a) by providing a progressively greater reward

b) by providing a signal that the reward is coming

c) by providing small rewards during each delay

Q18: Which factor makes internet porn, modern pokies, and social media so addictive?

A18:

a) magnitude

b) speed

c) contiguity

Q19: Why is money reinforcing?

A19:

a) its reinforcing properties extend to all organisms

b) it is considered a primary reinforcer

c) it serves as a means for primary reinforcers

Q20: Which of the following best describes the Premack Principle?

E.g. A child who loves playing video games (high probability) but doesn't like doing homework (low probability).

The principle suggests: They can play video games (what they like) after they finish their homework (what they don't like).

A20:

a) high probability behaviour reinforces low probability behaviour

b) low probability behaviour reinforces high probability behaviour

c) high probability behaviour is only reinforced by primary reinforcers

d) high probability behaviour is only reinforced by secondary reinforcers

STUDY TIPS:

- Start collating notes, summarise and simplify them.

- Tables, diagrams, mind maps can be useful visual mnemonics.

- Use acronyms and number associations. For example, remembering that there are 4 brain lobes can help you recall their names.