Psych Lecture 5: Reinforcement and Punishment

Reinforcement and Punishment: Core Concepts

Reinforcement vs Punishment (overview)
- Reinforcement: aim is to increase the probability or occurrence of a behavior.
- Punishment: aim is to decrease the probability or occurrence of a behavior.
- Positive vs Negative terminology refers to adding or removing stimuli, not desirability.
- Positive reinforcement examples: giving something desirable after a behavior (e.g., a gold star for good behavior in school).
- Negative reinforcement examples: removing an aversive stimulus to increase a behavior (e.g., stopping an annoying alarm when you buckle up).
- Punishment by application (positive punishment): add an unpleasant outcome to decrease a behavior (e.g., spanking, nagging, shouting, criticism).
- Punishment by removal (negative punishment): remove a desirable stimulus to decrease a behavior (e.g., losing privileges, having a phone taken away, being given an earlier curfew).
- Punishment should be used carefully and ethically; the lecture generally favors reinforcement strategies over punishment.
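
The two distinctions above (add vs remove a stimulus, increase vs decrease a behavior) form a 2x2 grid. The following is a minimal sketch of that grid in Python; the function name `classify_consequence` and its string arguments are illustrative assumptions, not terminology from the lecture.

```python
def classify_consequence(stimulus_change: str, behavior_effect: str) -> str:
    """Name an operant consequence from the two distinctions in the notes.

    stimulus_change: "add" or "remove" (what happens to the stimulus)
    behavior_effect: "increase" or "decrease" (what happens to the behavior)
    """
    # "Positive"/"negative" describes adding or removing, not desirability.
    sign = {"add": "positive", "remove": "negative"}
    # Reinforcement increases a behavior; punishment decreases it.
    kind = {"increase": "reinforcement", "decrease": "punishment"}
    if stimulus_change not in sign or behavior_effect not in kind:
        raise ValueError("expected 'add'/'remove' and 'increase'/'decrease'")
    return f"{sign[stimulus_change]} {kind[behavior_effect]}"


# Examples taken from the notes above.
examples = {
    "gold star for good behavior": ("add", "increase"),
    "alarm stops when you buckle up": ("remove", "increase"),
    "shouting after misbehavior": ("add", "decrease"),
    "phone taken away": ("remove", "decrease"),
}
for description, (change, effect) in examples.items():
    print(f"{description}: {classify_consequence(change, effect)}")
```

Answering the two questions separately (was something added or removed? did the behavior go up or down?) is usually enough to avoid mixing up negative reinforcement with punishment.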

Schedules of Reinforcement (overview)

Schedules specify when reinforcement is delivered after a behavior.
Continuous reinforcement (CRF): reinforcement after every occurrence of the target behavior.
Intermittent (partial) reinforcement: reinforcement after some but not all occurrences of the behavior; tends to produce more persistent behavior over time.
Key point from lecture: intermittent schedules are typically more effective than continuous schedules for long-term behavior maintenance. Among intermittent schedules, fixed ones are predictable and can lead to complacency, whereas variable ones maintain response strength through unpredictability.

Fixed vs Variable Schedules (definitions with notation)

Fixed Ratio (FR): reinforcement after a fixed number of responses.
- Notation: $FR_{n}$ where n is the number of responses required.
- Example from transcript: reinforcement after every $n=5$ correct behaviors → after every 5 good behaviors, the subject receives a candy.
Variable Ratio (VR): reinforcement after an unpredictable number of responses.
- Notation: $VR_{r}$ where r indicates an average number of responses; the exact number varies around that average.
- Characteristic: highly resistant to extinction; behavior continues even if reinforcement is not immediately forthcoming.
Fixed Interval (FI): reinforcement after a fixed amount of time has passed.
- Notation: $FI_{t}$ where t is the fixed time interval (e.g., every 10 days).
- Example from transcript: every ten days you receive cookies.
- Characteristic: predictable timing; behavior tends to peak near the time of reinforcement.
Variable Interval (VI): reinforcement after a variable amount of time, unpredictable.
- Notation: $VI_{t}$ where t is the mean interval; the actual interval varies around that mean.
- Characteristic: more steady response rate; harder to extinguish because reinforcement is unpredictable.
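
The four schedules can be sketched as small Python functions that decide, response by response, whether reinforcement is delivered. This is an illustrative sketch under stated assumptions, not a model from the lecture: the function names are made up, the interval schedules follow the common laboratory convention that the first response after the interval has elapsed is reinforced, and the variable schedules draw their requirement from a uniform distribution centered on the mean.

```python
import random


def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0

    def respond(_time):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean, rng):
    """VR-mean: reinforce after a varying number of responses averaging `mean`."""
    # Assumption: requirement is uniform on 1..(2*mean - 1), which averages `mean`.
    count = 0
    required = rng.randint(1, 2 * mean - 1)

    def respond(_time):
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


def fixed_interval(t):
    """FI-t: reinforce the first response at least t time units after the last reinforcer."""
    last = 0.0

    def respond(time):
        nonlocal last
        if time - last >= t:
            last = time
            return True
        return False

    return respond


def variable_interval(mean, rng):
    """VI-mean: like FI, but the required wait varies around `mean`."""
    # Assumption: wait is uniform on [0, 2*mean], which averages `mean`.
    last = 0.0
    wait = rng.uniform(0, 2 * mean)

    def respond(time):
        nonlocal last, wait
        if time - last >= wait:
            last = time
            wait = rng.uniform(0, 2 * mean)
            return True
        return False

    return respond


# One response per time unit; list the time points at which reinforcement occurs.
fr5 = fixed_ratio(5)
print([t for t in range(1, 21) if fr5(t)])   # [5, 10, 15, 20]
fi10 = fixed_interval(10)
print([t for t in range(1, 31) if fi10(t)])  # [10, 20, 30]
vr5 = variable_ratio(5, random.Random(0))
print([t for t in range(1, 31) if vr5(t)])   # irregular spacing, about 5 apart on average
```

Under this sketch, continuous reinforcement (CRF) is simply `fixed_ratio(1)`, since every response meets the requirement.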

Continuous vs Intermittent Reinforcement (practical implications)

Continuous reinforcement (CRF): reinforcement after every target behavior.
- Pros: quick learning of the association.
- Cons: quick extinction if reinforcement stops; may lead to dependence on reinforcement.
Intermittent reinforcement (partial reinforcement): reinforcement on some occasions only.
- Pros: more robust against extinction; maintains behavior longer in the absence of reinforcement.
- Cons: acquisition may be slower than CRF.
Practical takeaway from lecture: intermittent schedules (FR, VR, FI, VI) are typically more effective than CRF for sustaining long-term behavior; schedule type (ratio vs interval) and predictability (fixed vs variable) influence how strong and persistent the behavior is.

Examples and Discussion from Transcript

Example: FR-5 (Fixed Ratio 5)
- After every 5 good behaviors, reinforcement is given (e.g., candy).
Example: FI-10 days (Fixed Interval)
- Reinforcement occurs at a fixed time point (every 10 days); reinforcement is predictable.
Example: variable schedules (VR and VI)
- Reinforcement could come after 3, 5, or 7 behaviors (variable ratio) or after irregular time points (variable interval); in both cases it is unpredictable.
Discussion on effectiveness for long-term behavior:
- Intermittent schedules are preferable for maintaining behavior; fixed schedules can reduce motivation to maintain behavior because reinforcement is predictable and may lead to complacency.
- The transcript suggests that, for both ratio and interval schedules, unpredictability helps maintain behavior.

Punishment: Types, Examples, and Nuances

Positive punishment (punishment by application): add something unpleasant to decrease a behavior.
- Examples mentioned: spanking, nagging, shouting, criticism.
- Clarification: the term “positive” refers to adding a stimulus, not whether it is good or bad.
Negative punishment (punishment by removal): remove something desirable to decrease a behavior.
- Examples: taking away a phone, removing privileges, imposing a curfew, or canceling a trip.
Negative reinforcement (not punishment): remove an aversive stimulus to increase a behavior.
- Example: buckling up to stop the alarm or to avoid scolding; the behavior increases because the unpleasant stimulus is removed.
Confusion between punishment and reinforcement in examples:
- In some cartoon scenarios, participants tried to classify actions as positive punishment, negative punishment, or reinforcement. The discussion highlighted two separate questions: is a stimulus added or removed, and does the behavior increase (reinforcement) or decrease (punishment)?
Ethical and practical considerations:
- The lecture treats spanking and harsh punishment as inappropriate and ethically problematic.
- Punishment can be effective if it clearly communicates which behavior is unacceptable, what the consequences are, and what alternative behaviors are expected. It is often less effective than reinforcement, however, and can produce negative attitudes, sneaky behavior, or avoidance.

Group Discussion Takeaways (behavioral outcomes under different reinforcement approaches)

Scenario: one person receives only positive reinforcement for good behavior; the other receives only removal punishment for bad behavior.
Predicted differences:
- Positive reinforcement group: more willing to repeat desirable behaviors, with better mood and attitudes.
- Removal punishment group: may develop negative attitudes, resentment, and avoidance; may become sneaky to avoid punishment; could become discouraged or feel unfairly treated.
The facilitator emphasized:
- Clarity about which behaviors are acceptable and what constitutes reinforcement/punishment.
- Consistency is crucial in applying discipline, especially for punishment.
- Punishment should be employed thoughtfully and ethically, and should come with an explanation: why the behavior is wrong, what the consequences are, and what behaviors are expected instead.
Implication for practice:
- Reinforcement strategies, particularly consistent positive reinforcement, are generally more effective and ethically sound for shaping long-term behavior, especially in educational or parenting contexts.

Practical Formulas and Notation (Study Reference)

Notation for reinforcement schedules:
- Fixed Ratio: $FR_{n}$
- Variable Ratio: $VR_{r}$
- Fixed Interval: $FI_{t}$
- Variable Interval: $VI_{t}$
Key concepts:
- Continuous Reinforcement: CRF (reinforcement after every response)
- Intermittent Reinforcement: partial reinforcement (FR, VR, FI, VI)
Examples from transcript:
- FR-5: reinforcement after every 5 responses → candy after 5 good behaviors
- FI-10: reinforcement every 10 days → cookies
- VI: reinforcement at unpredictable times/intervals
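
As a study aid, the labels above can be decoded mechanically: the first letter gives predictability (F or V), the second gives the basis (R for responses, I for time), and the number gives the requirement. The sketch below assumes the hyphenated label style used in these notes (e.g., FR-5); `Schedule` and `parse_schedule` are hypothetical names introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Schedule(NamedTuple):
    predictability: str    # "fixed" or "variable"
    basis: str             # "ratio" (responses) or "interval" (time)
    value: Optional[int]   # required count or time; None if the label gives no number


def parse_schedule(label: str) -> Schedule:
    """Decode labels such as 'FR-5', 'FI-10', 'VI', or 'CRF'."""
    text = label.strip().upper()
    if text == "CRF":
        # Continuous reinforcement is equivalent to a fixed ratio of 1.
        return Schedule("fixed", "ratio", 1)
    match = re.fullmatch(r"([FV])([RI])(?:-(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized schedule label: {label!r}")
    predictability, basis, value = match.groups()
    return Schedule(
        "fixed" if predictability == "F" else "variable",
        "ratio" if basis == "R" else "interval",
        int(value) if value is not None else None,
    )


for label in ["FR-5", "FI-10", "VI", "CRF"]:
    print(label, "->", parse_schedule(label))
```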

Connections to Foundations and Real-World Relevance

Grounding in operant conditioning theory: reinforcement strengthens behaviors; punishment weakens behaviors; the schedules determine how reinforcing events are delivered.
Real-world relevance:
- Education: use positive reinforcement to encourage study habits, timely homework, participation, etc.
- Parenting: reward compliance and positive behaviors; avoid excessive punishment; maintain consistency.
- Workplace training: use contingent rewards to shape performance; consider unpredictability to sustain motivation.
Ethical implications:
- Favor positive reinforcement due to ethical considerations and lower risk of negative side effects (anxiety, resentment).
- When punishment is used, ensure it is proportionate, clearly explained, and paired with guidance on acceptable alternatives; maintain consistency across contexts.

Quick Reference: Key Takeaways

Reinforcement increases behavior; punishment decreases it.
Positive reinforcement adds a desirable outcome; negative reinforcement removes an aversive stimulus to increase behavior.
Positive punishment adds an undesirable outcome; negative punishment removes a desirable stimulus to decrease behavior.
Continuous reinforcement yields fast learning but weaker long-term persistence; intermittent reinforcement yields better long-term maintenance.
In intermittent schedules, predictability matters: variable schedules tend to sustain behavior longer than fixed schedules.
For long-term behavior change, intermittent reinforcement (FR, VR, FI, VI) is typically more effective than continuous reinforcement, and reinforcement-based strategies are preferred over punishment.
Always consider ethical implications and aim to use reinforcement as the primary tool; use punishment sparingly, clearly, and consistently when necessary, with clear expectations and consequences.

End-of-Notes: Study Prompts

Define each schedule (FR, VR, FI, VI) and give one real-world example of each.
Explain why intermittent reinforcement often sustains behavior longer than continuous reinforcement.
Distinguish between positive punishment and negative punishment with an example for each.
Describe a practical approach a teacher or parent could use to shift from punishment-focused strategies to reinforcement-focused strategies while maintaining clear expectations.