Lecture 2: Learning Theory

Impact of Training on Animal Welfare

Training can impact animal welfare in two main directions: negatively or positively, depending on how it is used.
- Negative impacts occur when unethical or aversive methods are used, triggering anxiety, fear, or pain in the patient, or when training (even if ethical) is too difficult and causes frustration.
- Training can also be dangerous: if an expected reinforcer is missed, the unmet expectation can lead to frustration and aggression (e.g., toward marine mammal trainers). This can weaken the human–animal relationship.
Positive welfare outcomes from training when done ethically:
- Uses methods that trigger positive emotions and minimize frustration.
- Provides predictability and a sense of control over the environment, reducing stress.
- Aligns with welfare principles (e.g., the direct model) that communicate owner expectations of behavior, increasing predictability and control for both owner and animal.
- Can be enriching and facilitate learning, as animals are highly motivated to learn to adapt.
- Strengthens the human–animal relationship.
Practical implications for household and veterinary contexts:
- Clear rules in a household are not inherently bad; predictable expectations can reduce stress.
- When done well, training can improve welfare and the quality of interactions between people and animals.

Training in the Veterinary Context

As veterinary professionals, we are trusted sources of advice on animal training and training problems.
- Owners will seek guidance online or from friends; high-quality, evidence-informed advice from veterinarians is highly valued.
Training can be applied to facilitate veterinary procedures and improve welfare for patients and staff:
- Improves connection with patients, reducing risk of injuries from aggression.
- Improves client satisfaction as owners see their animals are more comfortable, potentially increasing preventative and early intervention care.
Traditional restraint in practice can be emotionally negative for animals (fear, anxiety, pain) and is often justified as being "for the animal's own good". There are better, increasingly used approaches:
- Cooperative care and preparation for procedures to minimize restraint and stress.
- These approaches are part of broader low-stress or fear-free veterinary practice, not the sole solution.
Example video (Dartmoor Zoo, UK): tiger training to facilitate veterinary procedures without general anesthesia for routine checks and dental work
- Tigers learn to touch and position themselves to allow abdomen checks, mouth checks, and injections.
- Training improves staff welfare and animal welfare by reducing stress and risk of injury.
- Techniques demonstrated include positive reinforcement and cooperative handling.
Fear-Free / Low-Stress Accreditation:
- Fear Free (USA) is well-known internationally.
- In Europe, Dog Friendly Clinic (Dogs Trust) and Cat Friendly Clinic (International Cat Care / ISFM) are recognized accreditations.
Learning theory definitions:
- Learning: the process by which an animal modifies its behavior as a result of experience.
- Any species can learn if training is appropriate to the species and the individual.
- Adapting training to the individual: e.g., juvenile dogs (short, playful sessions); arthritic older dogs may require gentler, longer sessions; cats may have different motivators than dogs; rabbits may require different tasks.
Training as enrichment and survival adaptation:
- Learning is a survival strategy; animals are motivated to learn.
- Properly implemented training can be enriching and support welfare.

Learning and Conditioning: Key Concepts

Two broad types of associative learning:
- Classical conditioning (Pavlovian): association forms between a neutral stimulus and an unconditioned stimulus, leading to a conditioned response.
- Operant conditioning (instrumental): learning is driven by consequences of behavior (trial and error).

Classical Conditioning: Overview

Basic sequence:
- Unconditioned stimulus (US) naturally elicits an unconditioned response (UR).
- A neutral stimulus (NS) is paired with the US until it becomes a conditioned stimulus (CS) that elicits a conditioned response (CR).
Terminology (example with Pavlov’s dogs):
- US = food (unconditioned stimulus). $US = \text{food}$
- UR = salivation/happy response. $UR = \text{salivation}$
- NS = a bell (initially neutral, no response).
- After conditioning, CS = bell; CR = salivation/happiness in response to bell. $CS = \text{bell}, \ CR = \text{salivation/happiness}$
Key requirements for conditioning:
- The CS must precede the US (predicts the arrival of the US).
- Contiguity and contingency: the CS and US must be paired closely in time (contiguity) and consistently, so that the CS reliably predicts the US (contingency).
Important properties:
- Emotions are central in veterinary practice (emotional conditioning matters for welfare).
- Conditioning can occur quickly; sometimes, fear or anxiety can be classically conditioned after a single pairing.
- Stimuli can generalize to similar contexts or situations.
Everyday examples:
- Carriage/travel: a child’s fear associated with a car ride after a painful vaccination event.
- Shoes: a dog becomes excited when owner puts on shoes because it predicts a walk.
- Rustle of a bag: in a cat or dog, the sound becomes a cue for food or a meal.
Practical notes:
- Avoid creating negative conditioning; be mindful of the emotional state when introducing cues.
Video example: goldfish training using lights to indicate correct behavior and deliver a food reward
- The fish learns to associate light cues with a reward.
- Observed indicators of welfare: engaged, anticipating, and approaching the target cue for reward.
Role of conditioning in practice:
- Classical conditioning is foundational for behavior modification, desensitization, and counterconditioning.

Operant Conditioning: Overview

Definition: trial-and-error learning where the animal makes a conscious choice about behavior; consequences determine future repetitions.
Consequences and behavior:
- If a behavior is followed by a desirable outcome, it is more likely to be repeated; if followed by an undesirable outcome, it is less likely to be repeated.
The four quadrants (reinforcement vs punishment; positive vs negative):
- Positive reinforcement (PR): add something to strengthen the behavior (e.g., treat for sitting).
- Negative punishment (NP): remove something desirable to reduce the undesired behavior (e.g., withhold a treat when the dog does not sit).
- Positive punishment (PP): add something aversive to reduce the undesired behavior (e.g., shock collar; added discomfort for barking or pulling).
- Negative reinforcement (NR): remove something aversive to strengthen the desired behavior (e.g., release pressure after the dog stops pulling).
Important clarifications about terms:
- Positive/negative do not imply good/bad; they indicate addition or removal of a stimulus.
- Punishment is a consequence that reduces a behavior; it is not inherently bad, but positive punishment is generally discouraged due to welfare and learning concerns.
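The naming rule for the four quadrants can be sketched as a small lookup. This is only an illustration of the terminology: the function name `classify_quadrant` and its two boolean parameters are hypothetical, not part of the lecture.

```python
from enum import Enum


class Quadrant(Enum):
    POSITIVE_REINFORCEMENT = "PR"
    NEGATIVE_REINFORCEMENT = "NR"
    POSITIVE_PUNISHMENT = "PP"
    NEGATIVE_PUNISHMENT = "NP"


def classify_quadrant(stimulus_added: bool, behavior_increases: bool) -> Quadrant:
    """Name the quadrant from two yes/no questions (hypothetical helper).

    stimulus_added:     True if something was added, False if something was removed.
    behavior_increases: True if the behavior becomes more likely afterwards.
    """
    if behavior_increases:
        # Reinforcement: the behavior is strengthened.
        if stimulus_added:
            return Quadrant.POSITIVE_REINFORCEMENT
        return Quadrant.NEGATIVE_REINFORCEMENT
    # Punishment: the behavior is reduced.
    if stimulus_added:
        return Quadrant.POSITIVE_PUNISHMENT
    return Quadrant.NEGATIVE_PUNISHMENT


# Treat given for sitting: something added, sitting becomes more likely.
print(classify_quadrant(stimulus_added=True, behavior_increases=True).value)
# Leash pressure released when pulling stops: something removed, loose-leash walking becomes more likely.
print(classify_quadrant(stimulus_added=False, behavior_increases=True).value)
```

Note that the second argument describes the effect on the behavior, not whether the experience is pleasant for the animal, which is why "positive" and "negative" never map onto "good" and "bad".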
Practical cautions:
- Positive punishment is generally avoided because it can cause pain, fear, anxiety, weaken the human–animal bond, and may provoke aggression.
- Negative punishment should be used in combination with positive reinforcement; using punishment alone can hinder learning and increase frustration.
Practical examples:
- Positive reinforcement: giving a treat when a dog sits.
- Negative punishment: withholding a treat when the dog does not sit.
- Positive punishment: applying a shock to stop barking or too much pulling.
- Negative reinforcement: releasing pressure when the dog reduces pulling, thereby increasing the likelihood of walking nicely.
Connection to practice:
- Reinforcement schedules modulate learning speed and persistence.
- The timing of reinforcement is critical for strengthening the desired behavior.

Reinforcers and Schedules

Types of reinforcers:
- Primary reinforcers (unlearned, biologically important): include food, water, play, exploration. Some animals may find social interaction or play as primary reinforcers.
- Secondary reinforcers (learned, via association): praise, tapping, clicker, certain sounds; often require classical conditioning to be effective.
- The clicker is a common example of a secondary reinforcer (a conditioned cue that signals that a primary reinforcer is forthcoming).
- The Premack principle (lifestyle reward): the reward for doing the desired behavior is getting to do something the animal already wants to do (e.g., allowing a dog to go outside only after it has calmed down).
Individual differences in reinforcement:
- Animals have personal hierarchies of reinforcement; preferences vary (e.g., one dog may love popcorn, others may prefer cheese or a toy).
- Cats and rabbits may be motivated by different rewards than dogs; always determine species- and individual-specific preferences.
- Stress can alter reinforcement effectiveness: in a stressed animal, appetite for food may be suppressed; other forms of reassurance or trust-building may be needed.
Practical considerations:
- For obesity concerns, portion control and timing matter; smaller, chopped rewards can be used to maximize value per unit of food.
- When animals are about to undergo anesthesia or procedures, alternative methods such as water rewards or non-food cues can be used to maintain motivation without overeating.
Reinforcement schedules:
- Continuous reinforcement (every correct response) helps establish a new behavior quickly.
- Intermittent reinforcement strengthens behavior more persistently (e.g., after a variable number of correct responses; similar to a slot machine effect).
- Jackpot rewards can be used as a rare, exceptionally large reward for exceptional performance, but should be used sparingly to avoid loss of focus.
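The difference between continuous and intermittent (variable-ratio) reinforcement can be sketched as two functions that decide, for each correct response, whether a reward is delivered. This is a minimal illustration under stated assumptions: the function names, the `mean_ratio` parameter, and the choice to draw each gap uniformly between 1 and `2 * mean_ratio - 1` are all hypothetical and not taken from the lecture.

```python
import random


def continuous_schedule(n_correct: int) -> list[bool]:
    """Continuous reinforcement: every correct response is rewarded."""
    return [True] * n_correct


def variable_ratio_schedule(n_correct: int, mean_ratio: int, seed: int = 0) -> list[bool]:
    """Variable-ratio reinforcement: reward after an unpredictable number of
    correct responses that averages `mean_ratio` (assumed uniform gaps)."""
    rng = random.Random(seed)
    max_gap = 2 * mean_ratio - 1  # uniform on 1..max_gap has mean `mean_ratio`
    rewards = []
    remaining = rng.randint(1, max_gap)
    for _ in range(n_correct):
        remaining -= 1
        if remaining == 0:
            rewards.append(True)
            remaining = rng.randint(1, max_gap)  # draw the next unpredictable gap
        else:
            rewards.append(False)
    return rewards


crf = continuous_schedule(30)
vr3 = variable_ratio_schedule(30, mean_ratio=3)
print(f"continuous: {sum(crf)} rewards for {len(crf)} correct responses")
print(f"variable ratio (mean 3): {sum(vr3)} rewards for {len(vr3)} correct responses")
```

With `mean_ratio=1` the variable-ratio function reduces to continuous reinforcement, which mirrors the usual progression of starting with every response rewarded and then thinning the schedule.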
Practical considerations for using reinforcement:
- The timing of reinforcement should be immediate and properly aligned with the target behavior; use marking tools (e.g., a clicker or a marker word) to mark the exact moment the desired behavior occurs.
- If you reinforce too late or for incorrect behaviors (e.g., rewarding eye contact while the dog is not in the desired position), you can mislead the animal about what you’re rewarding.
- A continuous reinforcement schedule can be followed by a gradual shift to partial reinforcement to strengthen behavior over time.

Tools and Techniques in Practice

Marking and bridging tools:
- Clicker or audible marker (e.g., words like "yes" or "good"): used to precisely mark the desired behavior so you can deliver reinforcement promptly.
- A target stick can be used to guide the animal to touch it with nose or paw, enabling trainers to maneuver the animal into new positions.
Training sequences and shaping:
- Shaping: break a complex task into smaller, manageable steps and reinforce gradually as each step is performed correctly.
- Capturing: reward a behavior as it occurs naturally, without prompting.
- Chaining: link together a series of behaviors to complete a more complex task.
Cueing hierarchy:
- Start with a body language cue (e.g., a hand gesture) since animals often respond better to visual cues than verbal ones.
- Then add a verbal cue once the body cue is established.
- The order is body language cue first, then verbal cue; allow the cue to stay distinct from the behavior before adding more cues.
Training plan and criteria:
- Define the goal behavior and the steps required to achieve it.
- Set training criteria (e.g., 80% success or 4/5 trials, or 8/10 trials) to determine when to advance to the next step.
- If performance is around 60%, maintain the current step and practice more; if 50% or less, make the task easier.
- Trainers should be flexible and adaptive and adjust training to different locations due to varying distractions.
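The criteria above can be written as a simple decision rule. This is a sketch, not a prescribed protocol: the function name `next_training_step` is hypothetical, and treating every success rate between 50% and 80% as "repeat the current step" is an assumption that extends the "around 60%" guidance.

```python
def next_training_step(successes: int, trials: int) -> str:
    """Suggest what to do after a block of trials (hypothetical helper).

    >= 80% success      -> advance to the next step
    <= 50% success      -> make the task easier
    anything in between -> repeat the current step (assumed reading of "around 60%")
    """
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    # Integer arithmetic avoids floating-point edge cases at exactly 80% and 50%.
    if successes * 10 >= trials * 8:
        return "advance"
    if successes * 2 <= trials:
        return "make easier"
    return "repeat"


print(next_training_step(4, 5))   # 80% success
print(next_training_step(6, 10))  # 60% success
print(next_training_step(2, 5))   # 40% success
```

In practice the same rule would be re-applied in each new location, since a behavior that meets the criterion at home may fall below it in a more distracting environment.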
Training environment and generalization:
- Train in multiple locations to ensure the behavior generalizes across contexts and is not limited to a single environment.
Application in complex behaviors:
- For complex tasks, plan training with a clear hierarchy of steps and a plan to increase difficulty gradually.
Specific practice tips:
- Use a continuous reinforcement schedule early on for new behaviors, then gradually move to variable schedules for stability.
- Use jackpot rewards sparingly, late in the session, to avoid disrupting concentration.
Ethical considerations in training:
- Favor positive reinforcement and negative punishment over positive punishment or aversive methods.
- Avoid relying on negative punishment alone; combine with positive reinforcement to maintain motivation and reduce frustration.
- Remember that negative reinforcement and positive punishment often form a linked sequence and should be used with caution.

Behavioral Modification Techniques

Response substitution (differential reinforcement of incompatible behavior):
- Train the animal to perform an incompatible behavior that replaces the undesirable one (e.g., holding something in the mouth to replace barking).
Extinction:
- A previously reinforced behavior fades when reinforcement is no longer provided; can be accompanied by an extinction burst (temporary increase in the undesired behavior) and potential frustration unless a motivating alternative is taught.
Conditioning and counterconditioning:
- Classical conditioning underlies these techniques: the emotional response to a fear-evoking stimulus is shifted from negative to positive by pairing the stimulus with a positive outcome.
- Desensitization: gradually expose the animal to a stimulus at a low level the animal can tolerate, increasing exposure gradually.
- Counterconditioning: pair the stimulus with a positive outcome to change the emotional response.
Nonassociative learning (habituation and sensitization):
- Habituation: a decreased response to a stimulus after repeated exposure without any consequence.
- Desensitization takes a similar approach but increases exposure gradually and deliberately; sensitization is the opposite process (an increased response after repeated exposure) and is typically undesirable for welfare.
Other learning concepts:
- Nonassociative learning: changes in behavior due to the frequency or duration of exposure to a stimulus rather than the consequences.
- Observational learning: learning by watching others perform a behavior (common in primates; less common in dogs and cats).
- Insight learning and latent learning: behaviors appear after processing information learned earlier; often studied in research.

Ethical Framework and Best Practices

Ethical conditioning emphasizes a combination of reinforcement-based methods:
- Emphasize positive reinforcement paired with negative punishment to reduce unwanted behaviors.
- Avoid positive punishment due to risk of fear, pain, aggression, and weakening the human–animal bond.
Practical guidance for clinicians and students:
- Always prioritize welfare-first approaches and avoid methods that cause distress.
- Adapt training plans to individual animals, considering age, health, stress levels, and past experiences.
- Use environment and enrichment to support learning and welfare.
- Maintain clear communication with clients about training goals, expectations, and safety considerations.

Real-World Considerations and Takeaways

Training is not a one-size-fits-all approach; tailor to species, breed, age, health, and individual temperament.
The welfare benefits of training come from predictability, control, enrichment, and a strengthened human–animal bond when done with ethical practices.
In veterinary settings, cooperative care and fear-free practices can reduce restraint needs and improve patient and staff safety while encouraging preventative and early-intervention care.
The use of training tools (clickers, target sticks) and cue sequencing (body language first, then verbal) helps precision and reduces unintended reinforcement.
Consistent and well-planned training with appropriate reinforcement schedules builds durable skills and improves welfare for both animals and caregivers.

Quick Reference: Key Terms and Concepts

US, UR, CS, CR: foundational classical conditioning terms.
- $US = \text{food}$
- $UR = \text{salivation}$
- $CS = \text{bell}$ (after conditioning)
- $CR = \text{salivation/happiness}$ in response to the bell
Contingency and contiguity: predictability and timing of CS–US pairing.
Positive reinforcement (PR): add a reward to strengthen a behavior.
Negative reinforcement (NR): remove an aversive stimulus to strengthen a behavior.
Positive punishment (PP): add an aversive stimulus to reduce a behavior.
Negative punishment (NP): remove a desirable stimulus to reduce a behavior.
Primary reinforcers: food, water, play, exploration.
Secondary reinforcers: praise, clicker, conditioned cues.
Premack principle: reward the animal with something they want to do in return for the target behavior.
Continuous reinforcement (CRF) vs intermittent reinforcement (e.g., VR, VI schedules).
Jackpot: a large, one-off reward for exceptional performance.
Shaping, capturing, cueing, and task analysis for complex behaviors.
Desensitization and counterconditioning: increase tolerance to a stimulus and pair it with a positive outcome.
Habituation and sensitization: nonassociative learning processes.
Observational and insight/latent learning: learning from others and processing knowledge to act later.