Companion Animal Behaviour: Behaviour Modification

Upon completing this material, you should be able to:

Explain and distinguish between positive reinforcement, negative reinforcement, positive punishment, and negative punishment.
Explain why punishment is considered an indirect means of learning and why it is considered temporary.
Describe how to apply successive approximation as a form of trial and error learning and explain how it relates to selective reinforcement.
Explain how a behaviour can be shaped through chaining.
Correctly identify a heterogeneous and homogeneous chain.
Explain and distinguish between fixed ratio, fixed interval, variable ratio, and variable interval schedules of reinforcement.
Explain why the schedule of reinforcement during the training phase often differs from that during the maintenance phase.
Summarize what is meant by instinctive drift and explain why it matters when the goal is to train a behaviour that an animal will maintain over the long term.

Definition: Operant conditioning is a process in which a particular class of responses becomes more (or less) frequent as a function of the consequences it produces.
B.F. Skinner: Conducted animal behaviour research using the “Skinner Box” to study these principles.

Respondents: Skinner referred to behaviours elicited by specific stimuli as respondents, in contrast to operants, which the animal emits and which are governed by their consequences.
Influence of Consequences: Behaviour is profoundly influenced by the consequences it produces.
Manipulation: We can influence the rate of occurrence of behaviours by systematically manipulating these consequences.
This approach works as long as orderly changes occur as we vary the consequences of behaviour.
This includes the use of various schedules of reinforcement.

In operant learning, a response (R) is rewarded (e.g., with food) when it occurs in the presence of a discriminative stimulus (S) (e.g., a light).
- Antecedent: Light (Discriminative Stimulus, S)
- Behaviour: Bar pressing or key pecking (Response, R)
- Consequence: Food reward (or any other stimulus chosen as the reward)
This demonstrates the relationship between an antecedent, a behaviour, and its consequence in influencing future behaviour.
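The antecedent, behaviour, and consequence relationship can be sketched as a simple rule. This is only an illustration: the function name `deliver_consequence` and the string labels are hypothetical, and the rule assumes the reward is available only while the light is on.

```python
def deliver_consequence(light_on: bool, response: str) -> str:
    """Return the consequence of a response under a simple three-term contingency.

    Assumption for illustration: food follows a bar press or key peck
    only when the discriminative stimulus (the light) is present.
    """
    rewarded_responses = {"bar press", "key peck"}
    if light_on and response in rewarded_responses:
        return "food"
    return "nothing"


if __name__ == "__main__":
    for light_on in (True, False):
        for response in ("bar press", "grooming"):
            print(light_on, response, "->", deliver_consequence(light_on, response))
```

Because the same response earns food with the light on and nothing with the light off, the light comes to signal when responding will pay off.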

Instrumental (operant) learning allows for connecting multiple responses (Rs) to teach animals more complex behaviours.
Example: Teaching a dolphin to jump through a hoop above the water.
- This involves a sequence, starting with simpler behaviours and building up:
  1. Touching a wand
  2. Touching a hoop
  3. Jumping through a hoop
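
One way to read the dolphin sequence above is as a rising reward criterion: early on, any step earns a reward, and later only the more advanced steps do. The sketch below is a minimal illustration under stated assumptions. The class name `ShapingSession` and the rule "advance after three rewarded trials in a row" are hypothetical choices, not part of the source material.

```python
from dataclasses import dataclass

# Steps from the dolphin example, ordered from simplest to most complex.
STEPS = ("touch wand", "touch hoop", "jump through hoop")


@dataclass
class ShapingSession:
    steps: tuple = STEPS
    successes_to_advance: int = 3  # assumed criterion, chosen for illustration
    criterion: int = 0             # index of the least advanced step still rewarded
    streak: int = 0                # consecutive rewarded trials at this criterion

    def observe(self, behaviour: str) -> bool:
        """Return True if the behaviour earns a reward under the current criterion."""
        rewarded = (
            behaviour in self.steps
            and self.steps.index(behaviour) >= self.criterion
        )
        if not rewarded:
            self.streak = 0
            return False
        self.streak += 1
        at_final_step = self.criterion == len(self.steps) - 1
        if self.streak >= self.successes_to_advance and not at_final_step:
            # Raise the bar: simpler behaviours are no longer reinforced.
            self.criterion += 1
            self.streak = 0
        return True


if __name__ == "__main__":
    session = ShapingSession()
    for trial in ["touch wand"] * 3 + ["touch wand", "touch hoop"]:
        print(trial, "->", "reward" if session.observe(trial) else "no reward")
```

After three rewarded wand touches the criterion moves up, so a fourth wand touch goes unrewarded while touching the hoop is reinforced.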

| Effect on responding | Consequence: stimulus added | Consequence: stimulus removed |
|---|---|---|
| Increase | Positive Reinforcement | Negative Reinforcement |
| Decrease | Positive Punishment | Negative Punishment (Omission) |
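
The grid above amounts to a lookup on two questions: is a stimulus added or removed, and does responding increase or decrease? The following sketch encodes that lookup. The function name `classify_consequence` and the argument labels are hypothetical and used only for illustration.

```python
def classify_consequence(stimulus_change: str, effect_on_responding: str) -> str:
    """Name the operant quadrant for a consequence.

    stimulus_change: "add" or "remove" (what happens to the stimulus)
    effect_on_responding: "increase" or "decrease" (what happens to the behaviour)
    """
    quadrants = {
        ("add", "increase"): "positive reinforcement",
        ("remove", "increase"): "negative reinforcement",
        ("add", "decrease"): "positive punishment",
        ("remove", "decrease"): "negative punishment (omission)",
    }
    key = (stimulus_change.strip().lower(), effect_on_responding.strip().lower())
    if key not in quadrants:
        raise ValueError(f"Unrecognised combination: {key}")
    return quadrants[key]


if __name__ == "__main__":
    # A bar press that turns off a shock: stimulus removed, responding increases.
    print(classify_consequence("remove", "increase"))
    # Social interaction withdrawn after a behaviour, which then declines.
    print(classify_consequence("remove", "decrease"))
```

Note that "positive" and "negative" describe only whether a stimulus is added or removed, while "reinforcement" and "punishment" describe the direction of the change in behaviour.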

Definition (positive reinforcement): An increase in the frequency of a desired behaviour produced by adding a desirable stimulus after the behaviour occurs.
Process: Stimulus (command 'sit') $\rightarrow$ Response (sitting position) $\rightarrow$ Reinforcement (food, petting, praise).
Examples: Food for bar pressing; praise for high test scores to increase studying.
Bridging Stimulus: Due to potential delays between response and reinforcement, a bridging stimulus (e.g., a clicker) can signal that reinforcement is coming.
- The effect is anticipatory and follows a trace conditioning paradigm: the bridging stimulus is presented and ends shortly before the reinforcer is delivered.
- Through repeated pairings (classical conditioning), the bridging stimulus acquires properties of the reinforcing treat and becomes a secondary reinforcer.

Definition (negative reinforcement): The strengthening of a behaviour because it removes an aversive or undesirable stimulus, thereby increasing the probability of the response.
Process: Stimulus (person and onset of fear) $\rightarrow$ Response (growl) $\rightarrow$ Reinforcement (termination of fear).
Termination of pain or reduction of fear are examples of negative reinforcers.
Animals learn tasks, such as growling, if these behaviours result in the fear-inducing stimulus going away. This leads to the repetition of the threatening behaviour.
Examples: Withdrawal from a hot stovetop is reinforced by the cessation of discomfort; a rat's bar press is reinforced because it turns off an electric shock.

The distinction between a punisher and punishment is often a source of confusion.
Punisher: Any consequence that decreases the frequency of the behaviour that produces it.
- Example: Delivery of electric shock to decrease the frequency of a bar press.
Punishment: The presentation of an aversive stimulus or removal of a pleasurable stimulus after an undesirable behaviour has occurred. It aims to stop or reduce the likelihood of the behaviour occurring in the future.
- Positive Punishment: Presentation of an aversive stimulus (e.g., electric shock).
- Negative Punishment (Omission): Removal of a pleasurable stimulus (e.g., removing social interaction).

Skinner believed punishers do not directly affect behaviour in the same way reinforcers do.
Temporary Effects: The effects of punishment are often temporary.