Learning Notes: Conditioning, Schedules, Shaping, Social Learning, and Memory Encoding
Conditioning and Schedules
- Operant (instrumental) conditioning focuses on voluntary behaviors shaped by consequences: reinforcement increases the strength/frequency of the behavior it follows; punishment decreases it.
- Continuous (or 1:1 reinforcement) schedule: every instance of the behavior is reinforced.
- Leads to fast learning but also rapid extinction when reinforcement stops.
- Partial (intermittent) schedules: reinforcement is provided only some of the time, not after every response; generally more resistant to extinction and better for long-term learning.
- The fixed/variable distinction applies to both ratio and interval schedules, yielding the four partial schedules (FR, VR, FI, VI).
Intermittent/Partial Schedules: Ratios vs Intervals
- Ratio schedules: focus on the number of behaviors before reinforcement.
- Fixed Ratio (FR): reinforcement after a fixed number of responses (e.g., FR-10: every 10th response is reinforced).
- Variable Ratio (VR): number of responses required varies around an average; reinforcement delivered after an unpredictable number of responses (e.g., averages 10, but could be 5, 15, 9, 11, …).
- Interval schedules: focus on time elapsed before reinforcement becomes available, regardless of how many responses occur in the meantime.
- Fixed Interval (FI): reinforcement becomes available after a fixed time interval; the first response after that interval yields reinforcement (e.g., FI-10s).
- Variable Interval (VI): the interval length varies around an average; reinforcement is delivered for the first response after an unpredictable interval (e.g., average 10s, but could be 1s, 5s, 19s, 11s, …).
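The four decision rules above can be sketched as tiny predicates. This is an illustrative toy model, not from the lecture (the function names and parameter values are my own): each schedule simply answers "does this response earn reinforcement?"

```python
import random

def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcement delivered
        return False
    return respond

def variable_ratio(avg, rng=random.Random(0)):
    """VR-avg: reinforce after an unpredictable number of responses (mean = avg)."""
    count, target = 0, rng.randint(1, 2 * avg - 1)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, rng.randint(1, 2 * avg - 1)
            return True
        return False
    return respond

def fixed_interval(t):
    """FI-t: the first response at least t seconds after the last payoff is reinforced."""
    last = 0.0
    def respond(now):
        nonlocal last
        if now - last >= t:
            last = now
            return True
        return False
    return respond

def variable_interval(avg_t, rng=random.Random(1)):
    """VI-avg_t: like FI, but the required interval varies around avg_t."""
    last, wait = 0.0, rng.uniform(0, 2 * avg_t)
    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last, wait = now, rng.uniform(0, 2 * avg_t)
            return True
        return False
    return respond

# FR-10: in 30 responses, only responses 10, 20, and 30 pay off.
fr = fixed_ratio(10)
print(sum(fr() for _ in range(30)))  # 3
```

Note how the ratio functions count responses while the interval functions only look at the clock: responding faster on FI/VI does not bring reinforcement sooner, which matches the slower response rates described below.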
Consequences and Response Patterns
- Fixed schedules tend to produce pauses after reinforcement and then a ramp-up in responding as the interval or ratio nears completion.
- Variable schedules produce high, steady rates of responding with fewer predictable pauses; they are more resistant to extinction than fixed schedules.
- Comparing schedules on a graph (responses vs time):
- Fixed Ratio (FR) yields high, rapid responding with abrupt pauses after reinforcement.
- Variable Ratio (VR) yields high, steady responding with less predictable pauses (greatest resistance to extinction).
- Fixed Interval (FI) yields post-reinforcement pauses followed by a gradual acceleration toward the end of the interval.
- Variable Interval (VI) yields steady, moderate responding with less pronounced acceleration toward the end.
- Key takeaway: ratio schedules (especially VR) produce higher response rates than interval schedules (FI/VI); interval schedules generally produce slower rates of responding; extinction patterns differ across schedules.
Real-World Examples of Schedules
- Fixed Ratio (FR): postal/customs data-entry shift example
- Workers must process a fixed number of packages (e.g., 600) per 8-hour shift to earn their paycheck.
- This creates a goal-driven, high-rate burst of work, followed by a pause after each payoff.
- Example in transcript: 600 packages per shift; work ramps up as the quota nears and the payoff arrives at the end of each 8-hour shift.
- Concept: fixed ratio → predictable payout after fixed number of actions.
- Variable Ratio (VR): gambling and sales
- Slot machines pay out on a variable-number-of-responses basis; you never know when the next win will come, so you keep playing.
- Sales: successful sales may occur infrequently, but the reward reinforces persistent effort.
- Example in transcript: fishing (catching a fish per varying number of casts) and other addictive-like behaviors follow VR patterns.
- Fixed Interval (FI): clock-watching and baking examples
- Clock-watching when a shift is near end: you become increasingly attentive to the clock as the fixed period ends.
- Baking cookies: you check the oven toward the end of a fixed baking time, e.g., after 12 minutes.
- Variable Interval (VI): surfing example
- Waves arrive at irregular intervals; the optimal strategy is to keep checking/attending and respond at a steady rate rather than pausing until reinforcement is guaranteed.
Continuous vs Partial Schedules: Learning and Extinction
- Continuous reinforcement leads to rapid acquisition but rapid extinction when reinforcement stops.
- Partial schedules promote longer-lasting learning, with VR showing the strongest resistance to extinction among the traditional schedules.
Shaping and the Role of Reinforcement
- Shaping: a process of reinforcing successive approximations toward a complex behavior.
- Example: training a dog to chase a Frisbee down a field, catch it, and return it.
- Stepwise progression: reinforce when the dog moves toward the Frisbee, then when it catches it, then when it brings it back, etc.
- Shaping leverages inborn tendencies and natural predispositions to speed up learning (conditioning is most effective when aligned with natural tendencies).
- Instinctive drift: conditioned organisms tend to revert to instinctive behaviors that interfere with the trained response (e.g., Breland & Breland's pig rooting coins instead of inserting them into a piggy bank; raccoons "washing"/rubbing coins together instead of depositing them).
- Conditioning may be ineffective or counterproductive if the response conflicts with inborn tendencies or when reinforcement encourages mediocrity (undermining intrinsic motivation).
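The "reinforce successive approximations" loop can be caricatured in a few lines of Python. This is a hypothetical toy model (the numbers and the names `habit` and `criterion` are my own, not from the lecture): behavior varies randomly around the learner's current habit, any behavior that meets the current criterion is "reinforced" and becomes the new habit, and the criterion ratchets toward the target.

```python
import random

def shape(target, steps=2000, rng=random.Random(42)):
    """Toy shaping loop: reinforce successive approximations toward `target`."""
    habit = 0.0        # the learner's current typical behavior (arbitrary units)
    criterion = 0.5    # easy first approximation; ratchets upward below
    for _ in range(steps):
        behavior = habit + rng.gauss(0, 0.5)          # natural variation around the habit
        if behavior >= criterion:                     # approximation met -> "reinforce" it
            habit = max(habit, behavior)              # reinforced behavior becomes the habit
            criterion = min(target, criterion + 0.5)  # raise the bar slightly
        if habit >= target:                           # full target behavior reached
            break
    return habit

final = shape(10.0)  # habit has been ratcheted up to the 10-unit target
```

Under this caricature, demanding the full target from the start (criterion fixed at 10) would almost never be met by chance variation alone, which is why reinforcing intermediate approximations speeds learning.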
Extrinsic vs Intrinsic Reinforcement and Motivation
- Extrinsic reinforcers originate outside the activity and are not inherently related to the task (e.g., money, ribbons).
- Intrinsic reinforcers originate inside the individual and are inherently related to the activity (e.g., pride, sense of accomplishment).
- The overjustification effect: extrinsic rewards can undermine intrinsic motivation by shifting motivation from internal satisfaction to external rewards (e.g., ribbons reducing spontaneous drawing time).
- Example from study: two classrooms—one received a blue ribbon for drawing; afterwards, the ribbon classroom drew far less often (8%) compared to baseline (60%), suggesting external rewards reduced intrinsic drawing enjoyment.
- Important caution: reinforcing mediocrity can decrease motivation to improve; reinforcement must be aligned with genuine interest or mastery goals.
The ABCs of Learning: Antecedents, Behaviors, Consequences
- Antecedents: stimuli or events that precede a behavior (classical conditioning focus).
- Behavior: the observable response.
- Consequences: reinforcement or punishment that follows the behavior.
- In operant conditioning, the order is: antecedent (stimulus), behavior, consequence.
- Fundamental contrast: classical conditioning involves stimuli before a reflexive response; operant conditioning involves a behavior followed by a consequence.
- The ABCs framework helps explain how learning occurs through experience and interaction with the environment.
Classical Conditioning vs Operant Conditioning: Cognitive Revolution Link
- Behaviorism emphasized stimulus-response relationships and largely ignored mental processes.
- Latent learning (Tolman & Honzik): learning can occur without reinforcement and may not be immediately expressed; when reinforcement appears, knowledge is revealed quickly—implies cognitive processes and mental representations in learning.
- Observations challenged behaviorism’s “black box” assumption; cognition and mental events matter.
- Cognitive revolution: mind as an information-processing system; the mind is not a general-purpose, equally efficient processor for all information; it is specialized and shaped by evolution.
- Basics of information processing: input (environmental stimuli), processing (attention, memory, encoding, retrieval), output (behavior).
- Early cognitive models assumed general-purpose processing; later work recognized specialized processing for social information (social cognition).
Social Learning and Social Cognition
- Social learning (Bandura): much of human learning occurs by observing others (modeling) and is safer and more efficient than trial-and-error learning.
- Bobo doll experiments (Bandura): children imitate aggressive behavior after observing an adult model; aggression increases with aggressive models (live, filmed, or cartoon), while nonaggressive models reduce aggression.
- Key concept: vicarious learning—learning through observing the consequences of others’ actions, even if the learner does not directly experience reinforcement or punishment.
- Observational learning applies to humans and other species (e.g., chimpanzees showing imitation and social learning; facial expressions, laughter, and communicative behaviors).
- In social learning, attention, memory, reproduction, and motivation are essential cognitive components for successfully imitating observed behaviors.
- Social learning supports cultural evolution: knowledge, skills, and technologies are transmitted across generations, often more efficiently than individual trial-and-error discovery.
- Example of cross-species social learning: chimpanzees learn a model’s actions (e.g., imitation of laughter or tongue protrusion) and imitate some actions under observation.
Memory Encoding: From Structural to Self-Referent Encoding
- Encoding techniques and recall:
- Structural encoding: based on physical structure (e.g., letters, font, surface features).
- Phonemic encoding: based on sound (phonology) of items.
- Semantic encoding: based on meaning of items.
- Research findings (combined across studies in lecture): semantic encoding tends to produce better recall than phonemic or structural encoding; self-referent encoding (an enriched form of semantic encoding) yields even greater recall.
- Self-referent encoding: relating information to oneself increases depth of processing and memory retention; widely used to improve study strategies.
- Enrichment strategies: consider how information relates to your own experiences or beliefs to improve retention.
Memory Storage and Capacity (Three-Box Model)
- Sensory memory (sensory register): initial, very brief storage of sensory information. Capacity is large but duration is very short (fraction of a second to a few seconds depending on modality).
- Short-term memory (working memory): holds a limited amount of information briefly: about 3-4 chunks in this course (other sources cite 5-9 items, aided by chunking), and only for seconds to a few minutes without rehearsal.
- Long-term memory: potentially infinite capacity; information can be stored for long durations and retrieved later.
- The typical classroom question: which has the smallest capacity? The answer in this lecture is short-term memory (3-4 chunks), with sensory memory having larger capacity but much shorter duration.
- Implication: not all sensory information is transferred to short-term memory; only a subset is encoded into working memory, and then potentially into long-term memory via encoding processes.
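The capacity point can be made concrete with chunking: the same 12 digits overflow a 3-4 chunk span when stored one by one, but fit comfortably once grouped into larger units. A minimal sketch (the digit string is an arbitrary example):

```python
def chunk(digits, size):
    """Group a digit string into fixed-size chunks, phone-number style."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]

number = "202555014398"           # 12 individual digits: beyond a 3-4 chunk span
grouped = chunk(number, 4)        # ['2025', '5501', '4398']: just 3 chunks
print(len(number), len(grouped))  # 12 3
```

The digits themselves are unchanged; only the unit of storage grows, which is why chunking raises effective short-term capacity without changing the 3-4 chunk limit.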
Retrospective vs Prospective Memory
- Retrospective memory: memory for past events, experiences, or information; can be transferred or discussed verbally (declarative memory).
- Prospective memory: memory for future intentions or tasks (e.g., remembering to perform an action in the future).
- Declarative (explicit) memory includes episodic (events) and semantic (facts) memory; these forms can be communicated verbally and taught to others.
- Nondeclarative (implicit) memory includes procedural memory (skills and tasks) and conditioned responses; it is not typically transferred by simple verbal instruction.
- Question example from lecture: retrospective memories that can be transferred by talking about them are typically declarative (episodic and semantic), so retrospective memory tasks are often verbalizable and teachable; procedural memories are not easily conveyed by description alone.
Cognitive and Social-Cognitive Integration
- Social cognitive theory emphasizes the integration of behavioral analysis with cognitive processes (attention, memory, reproduction, motivation) and attitudes/beliefs/expectations.
- The environment and social context shape what is paid attention to, remembered, and reproduced.
- Human learning is highly dependent on social information, imitation, and cultural transmission; cognition shapes the selection and valuation of models to imitate.
Illustrative Examples and Takeaways
- Everyday examples reinforce understanding of schedules: vending machines illustrate continuous reinforcement; fishing illustrates variable schedules (VR per cast, VI while waiting for a bite); gambling illustrates variable ratio.
- Shaping shows how complex behavior can be constructed from simple actions via reinforcement of successive approximations.
- Instinctive drift demonstrates limits of conditioning when reinforcement attempts clash with evolved predispositions.
- Extrinsic rewards can undermine intrinsic motivation; design reinforcement to support mastery and internal satisfaction when possible.
- Latent learning (Tolman & Honzik) highlights that knowledge can exist without immediate reinforcement and may be revealed later when incentive appears.
- The cognitive revolution introduced the mind as an information processor; social cognition and social learning account for why humans excel at learning from others and sharing knowledge across generations.
Quick Takeaways for Exam Preparation
- Distinguish between continuous and partial schedules; know the four partial schedules and which produce high vs low response rates:
- FR (fixed ratio) – high response rate with post-reinforcement pauses
- VR (variable ratio) – high, steady response; most resistant to extinction
- FI (fixed interval) – slow, rising towards end of interval
- VI (variable interval) – steady, moderate responding
- Shaping uses successive approximations; leverage natural predispositions to facilitate rapid learning; beware instinctive drift.
- Observational learning: Bandura’s Bobo doll shows vicarious learning; social learning is efficient and culturally relevant.
- Memory encoding: semantic encoding > phonemic/structural encoding; self-referent encoding yields the best recall.
- Three-box memory model: sensory memory (large capacity, very short duration), short-term memory (3-4 chunks; limited duration), long-term memory (large/infinite capacity).
- ABCs of learning: antecedent, behavior, consequence; classical conditioning vs operant conditioning differences.
- Extrinsic rewards can undermine intrinsic motivation; design reinforcement to support internal satisfaction and competence.
- Prospective vs retrospective memory: declarative memory (episodic/semantic) is typically retrospective; prospective memory concerns future tasks.
- Fixed Ratio: reinforcement after a fixed number of responses. Example: FR-10 means reinforcement after every 10 responses.
- Variable Ratio: reinforcement after an unpredictable number of responses; average around a value (e.g., VR-10).
- Fixed Interval: reinforcement after a fixed amount of time has passed since the last reinforcement (FI-t, where t is time in seconds/minutes).
- Variable Interval: reinforcement after an unpredictable interval around an average (VI-t).
- Encoding depth: semantic encoding often yields deeper processing than phonemic or structural encoding; self-referent encoding enhances semantic encoding: deeper processing → better recall.
- Short-term memory capacity (as discussed in lecture): about 3−4 chunks; the broader literature cites 5−9 items (7±2), and chunking lets each chunk pack in more information, raising effective capacity.
- Memory duration: sensory memory persists for a fraction of a second to a few seconds; short-term memory persists only seconds to a few minutes without rehearsal; long-term memory is potentially permanent.
Ethical, Philosophical, and Practical Implications
- Be mindful of the overjustification effect when using external rewards to motivate tasks that people inherently enjoy.
- When teaching or shaping behavior, consider natural predispositions to avoid instinctive drift and to maximize learning efficiency.
- Recognize the power of social learning and culture in shaping human behavior and technologies; emphasize safe, prosocial modeling.
- In education and workplace design, use reinforcement strategically to promote mastery, not just compliance.
Notes on Exam Preparation and Study Strategy
- Create mind maps to organize schedules and their properties (continuous vs partial; FR/VR/FI/VI; fixed vs variable; ratios vs intervals).
- Practice explaining concepts aloud with real-world examples to ensure depth of understanding.
- Review Tolman’s latent learning and Bandura’s social learning experiments to understand cognitive and social aspects of learning.
- Practice identifying encoding strategies and predicting recall outcomes for different encoding depths.
- Be ready to discuss the ABCs and how they apply to both classical and operant conditioning.
Quick Definitions (Glossary)
- Reinforcement: any consequence that increases the likelihood of a behavior.
- Punishment: any consequence that decreases the likelihood of a behavior.
- Shaping: reinforcing successive approximations toward a complex target behavior.
- Instinctive drift: tendency of an animal to revert to innate behaviors after conditioning efforts.
- Latent learning: learning that occurs without reinforcement but is not shown until reinforcement is available.
- Observational/Social learning: learning by observing others and modeling their behavior.
- Semantic encoding: encoding based on meaning; typically results in strong recall.
- Self-referent encoding: encoding that relates information to oneself; often yields the strongest recall.
- Declarative (explicit) memory: memory for facts and events (episodic and semantic).
- Nondeclarative (implicit) memory: memory for skills and conditioned responses (procedural, priming, etc.).
- Prospective memory: remembering to perform a planned action in the future.
- Sensory memory: brief sensory storage with large capacity but very short duration.
- Short-term (working) memory: temporary storage with limited capacity; duration is short without rehearsal.
- Long-term memory: relatively permanent storage with large/infinite capacity.