Operant Conditioning and Reinforcement
Operant Conditioning Overview
- Definition: Operant conditioning is learning through associations between voluntary responses and the stimuli that precede and follow them (contexts and consequences).
- Key Components: Operant conditioning is characterized by two main concepts: contiguity (events occurring close together in time) and contingency (one event predicting another).
The Three-Term Contingency
- Discriminative Stimulus (Sd): Represents the context or situation in which a behavior occurs.
- Response (R): The voluntary action taken by an organism (unlike the reflexive unconditioned responses of classical conditioning).
- Consequence (S*): The outcome of the response, which can be either a reinforcer or a punisher.
Example of Three-Term Contingency
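- In a context (Sd), a response (R) produces a consequence (S*): Sd --> R --> S*. For instance, in a Skinner box (Sd), a rat presses the bar (R) and receives food (S*).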
Thorndike's Law of Effect
- Two Parts:
- A response followed by a satisfying consequence will increase in frequency.
- A response followed by an unsatisfying consequence will decrease in frequency.
- Reinforcer: Satisfying consequence (increases behavior).
- Punisher: Unsatisfying consequence (decreases behavior).
Types of Reinforcers and Punishers
- Positive Reinforcer: Adds something desirable to increase behavior.
- Example: Homework is finished before dinner, so a prize is given.
- Negative Punisher: Takes away something desirable to reduce behavior.
- Example: Homework is not done, so a privilege is taken away.
- Positive Punisher: Adds something undesirable to reduce behavior.
- Example: Homework is not done, so a penalty is imposed.
- Negative Reinforcer: Takes away something undesirable to increase behavior.
- Example: Homework is done, so restrictions are lifted.
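The four cases above form a two-by-two grid: whether a stimulus is added or removed, and whether the behavior increases or decreases. A minimal Python sketch of that grid follows. The function name `classify_consequence` and its string arguments are illustrative assumptions, not terminology from the notes.

```python
def classify_consequence(stimulus_change: str, behavior_effect: str) -> str:
    """Name a consequence from the two-by-two grid.

    stimulus_change: "add" (positive) or "remove" (negative)
    behavior_effect: "increase" (reinforcer) or "decrease" (punisher)
    """
    sign = {"add": "positive", "remove": "negative"}[stimulus_change]
    kind = {"increase": "reinforcer", "decrease": "punisher"}[behavior_effect]
    return f"{sign} {kind}"


# The four homework examples, encoded as (stimulus change, behavior effect).
homework_examples = {
    "prize given": ("add", "increase"),
    "privilege taken away": ("remove", "decrease"),
    "penalty imposed": ("add", "decrease"),
    "restrictions lifted": ("remove", "increase"),
}

for description, (change, effect) in homework_examples.items():
    print(f"{description}: {classify_consequence(change, effect)}")
```

Note that "positive" and "negative" describe only whether something is added or removed, not whether the outcome is pleasant.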
Behavior Modification Programs
- Reinforcer Hierarchies:
- Primary reinforcers are biological (e.g., food, safety); secondary reinforcers are learned and gain their value by giving access to primary ones.
- Response Hierarchies:
- Described by Hull's habit-family hierarchies and Skinner's functional response classes: sets of responses that lead to the same consequence.
- Shaping: Reinforcing successive approximations to a desired behavior, tightening the criterion at each step. Example with a rat in a Skinner box:
- Reinforce rearing anywhere in the box.
- Reinforce rearing only in one half of the box.
- Reinforce rearing only in one quarter of the box.
- Finally, reinforce only bar presses, with food or water.
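The shaping steps above can be sketched as a list of criteria that tighten over time. This is only an illustration under stated assumptions: the `Behavior` record, the `STAGES` list, measuring position as a distance from the bar (as a fraction of the box length), and the `hits_to_advance` rule are all hypothetical; the notes do not say how many reinforced responses are needed before the criterion tightens.

```python
from dataclasses import dataclass


@dataclass
class Behavior:
    kind: str        # "rear" or "bar_press"
    distance: float  # hypothetical: distance from the bar, 0.0 to 1.0


# Successive approximations, loosest criterion first:
# (required behavior, maximum allowed distance from the bar).
STAGES = [
    ("rear", 1.0),       # rearing anywhere in the box
    ("rear", 0.5),       # rearing in one half of the box
    ("rear", 0.25),      # rearing in one quarter of the box
    ("bar_press", 0.0),  # the target behavior itself
]


def meets_criterion(behavior: Behavior, stage: int) -> bool:
    kind, max_distance = STAGES[stage]
    return behavior.kind == kind and behavior.distance <= max_distance


def shape(behaviors, hits_to_advance=2):
    """Return (reinforcement decision per behavior, final stage index)."""
    stage, hits, reinforced = 0, 0, []
    for behavior in behaviors:
        hit = meets_criterion(behavior, stage)
        reinforced.append(hit)
        if hit:
            hits += 1
            # Assumption: tighten the criterion after a few reinforced responses.
            if hits >= hits_to_advance and stage < len(STAGES) - 1:
                stage, hits = stage + 1, 0
    return reinforced, stage


session = [
    Behavior("rear", 0.9), Behavior("rear", 0.8), Behavior("rear", 0.7),
    Behavior("rear", 0.4), Behavior("rear", 0.3), Behavior("rear", 0.2),
    Behavior("rear", 0.1), Behavior("bar_press", 0.0),
]
decisions, final_stage = shape(session)
print(decisions, final_stage)
```

In this sketch a behavior that would have been reinforced earlier (rearing at distance 0.7) goes unreinforced once the criterion has tightened, which is the core of shaping.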
Schedules of Reinforcement
- Types of Schedules: Ratio schedules deliver reinforcement based on the number of responses; interval schedules deliver it based on elapsed time.
- Ratio Schedules:
- Fixed Ratio: Reinforcement after a fixed number of responses; results in post-reinforcement pauses.
- Variable Ratio: Reinforcement after a number of responses that varies around an average; the unpredictability sustains responding (similar to gambling).
- Interval Schedules:
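- Fixed Interval: Reinforcement for the first response after a fixed amount of time; because the organism can learn when the next reinforcer is due, responding can become scalloped (slow after a reinforcer, faster as the interval ends).
- Variable Interval: Reinforcement for the first response after an unpredictable amount of time; produces steady responding with no pauses.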
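The rule each schedule uses to decide whether a response earns a reinforcer can be sketched in a few lines. The class names, the `respond(t)` method, and the uniform distributions used for the variable schedules are assumptions made for illustration; the sketch models only the delivery rule, not the response patterns (pauses, scallops) that organisms show.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""

    def __init__(self, n: int):
        self.n = n
        self.count = 0

    def respond(self, t: float) -> bool:
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a response count that varies around mean_n."""

    def __init__(self, mean_n: int, rng: random.Random):
        self.mean_n = mean_n
        self.rng = rng
        self.count = 0
        self.required = self._draw()

    def _draw(self) -> int:
        # Assumption: uniform on 1 .. 2*mean_n - 1, which averages mean_n.
        return self.rng.randint(1, 2 * self.mean_n - 1)

    def respond(self, t: float) -> bool:
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response once `interval` time units have passed."""

    def __init__(self, interval: float):
        self.interval = interval
        self.available_at = interval

    def respond(self, t: float) -> bool:
        if t >= self.available_at:
            self.available_at = t + self.interval
            return True
        return False


class VariableInterval:
    """Reinforce the first response after a wait that varies around a mean."""

    def __init__(self, mean_interval: float, rng: random.Random):
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()

    def _draw(self) -> float:
        # Assumption: uniform on 0 .. 2*mean_interval, which averages mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t: float) -> bool:
        if t >= self.available_at:
            self.available_at = t + self._draw()
            return True
        return False


def run(schedule, response_times):
    """Feed responses (given by their times) to a schedule; True = reinforced."""
    return [schedule.respond(t) for t in response_times]


times = list(range(1, 21))  # one response per time unit
print("FR 5: ", run(FixedRatio(5), times))
print("FI 10:", run(FixedInterval(10), times))
print("VR 5: ", run(VariableRatio(5, random.Random(0)), times))
print("VI 10:", run(VariableInterval(10, random.Random(0)), times))
```

On the ratio schedules, responding faster earns reinforcers sooner; on the interval schedules, extra responses before the interval has elapsed earn nothing.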
Commonalities Between Classical and Operant Conditioning
- Contiguity: Co-occurrence of events close together in time (fundamental for all learning).
- Contingency: Relationship where one event predicts another (in both Pavlovian and operant conditioning).
- Surprise: Learning occurs best with new or unexpected events.
- Phases of Learning:
- Conditioning: Initial learning phase.
- Extinction: The predictive relationship breaks down (the behavior is no longer reinforced), and responding declines.
- Spontaneous Recovery: The re-emergence of a previously extinguished behavior after a rest period, when the context or stimulus is reintroduced.
- Associative Networks: Neural connections formed through learning that strengthen predictions.
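The role of surprise, and the conditioning and extinction phases, can be illustrated with a simple prediction-error update of the Rescorla-Wagner kind. That model is not named in these notes, so treat this as one possible formalisation rather than the course's own. The function name `simulate_learning`, the `learning_rate` value, and the trial sequence are arbitrary illustrative choices. On each trial, the associative strength changes in proportion to the difference between what happened and what was predicted.

```python
def simulate_learning(outcomes, learning_rate=0.3):
    """Track associative strength across trials.

    outcomes: 1.0 if the predicted event (e.g., the reinforcer) occurred
              on that trial, 0.0 if it did not.
    """
    strength = 0.0
    history = []
    for outcome in outcomes:
        surprise = outcome - strength        # prediction error
        strength += learning_rate * surprise  # bigger surprise, bigger change
        history.append(strength)
    return history


# Ten conditioning trials (event occurs), then ten extinction trials (it does not).
trials = [1.0] * 10 + [0.0] * 10
history = simulate_learning(trials)

for number, strength in enumerate(history, start=1):
    phase = "conditioning" if number <= 10 else "extinction"
    print(f"trial {number:2d} ({phase}): strength = {strength:.3f}")
```

The largest changes occur on the first trials of each phase, when the outcome is most unexpected. This basic update treats extinction as unlearning, so it does not by itself reproduce spontaneous recovery.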
Discussion Topic
- Free Will in Operant Conditioning: What behavior control through conditioning implies for free will (discussed in class).