Learning Notes: Conditioning, Schedules, Shaping, Social Learning, and Memory Encoding
Conditioning and Schedules
- Operant (instrumental) conditioning focuses on voluntary behaviors shaped by consequences: reinforcement increases the strength/frequency of the behavior it follows; punishment decreases it.
- Continuous (or 1:1 reinforcement) schedule: every instance of the behavior is reinforced.
- Leads to fast learning but also rapid extinction when reinforcement stops.
- Partial (intermittent) schedules: reinforcement is provided only some of the time, not after every response; generally more resistant to extinction and better for long-term learning.
- The fixed/variable distinction applies to both ratio and interval schedules, yielding the four partial schedules (FR, VR, FI, VI).
Intermittent/Partial Schedules: Ratios vs Intervals
- Ratio schedules: focus on the number of behaviors before reinforcement.
- Fixed Ratio (FR): reinforcement after a fixed number of responses (e.g., FR-10: every 10th response is reinforced).
- Variable Ratio (VR): number of responses required varies around an average; reinforcement delivered after an unpredictable number of responses (e.g., averages 10, but could be 5, 15, 9, 11, …).
- Interval schedules: focus on time elapsed before reinforcement becomes available, regardless of how many responses occur in the meantime.
- Fixed Interval (FI): reinforcement becomes available after a fixed time interval; the first response after that interval yields reinforcement (e.g., FI-10s).
- Variable Interval (VI): the interval length varies around an average; reinforcement is delivered for the first response after an unpredictable interval (e.g., average 10s, but could be 1s, 5s, 19s, 11s, …).
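The four decision rules above can be sketched as tiny predicates. This is an illustrative toy model, not from the lecture (the function names and parameter values are my own): each schedule simply answers "does this response earn reinforcement?"

```python
import random

def fixed_ratio(n):
    """FR-n: reinforce every n-th response."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcement delivered
        return False
    return respond

def variable_ratio(avg, rng=random.Random(0)):
    """VR-avg: reinforce after an unpredictable number of responses (mean = avg)."""
    count, target = 0, rng.randint(1, 2 * avg - 1)
    def respond():
        nonlocal count, target
        count += 1
        if count >= target:
            count, target = 0, rng.randint(1, 2 * avg - 1)
            return True
        return False
    return respond

def fixed_interval(t):
    """FI-t: the first response at least t seconds after the last payoff is reinforced."""
    last = 0.0
    def respond(now):
        nonlocal last
        if now - last >= t:
            last = now
            return True
        return False
    return respond

def variable_interval(avg_t, rng=random.Random(1)):
    """VI-avg_t: like FI, but the required interval varies around avg_t."""
    last, wait = 0.0, rng.uniform(0, 2 * avg_t)
    def respond(now):
        nonlocal last, wait
        if now - last >= wait:
            last, wait = now, rng.uniform(0, 2 * avg_t)
            return True
        return False
    return respond

# FR-10: in 30 responses, only responses 10, 20, and 30 pay off.
fr = fixed_ratio(10)
print(sum(fr() for _ in range(30)))  # 3
```

Note how the ratio functions count responses while the interval functions only look at the clock: responding faster on FI/VI does not bring reinforcement sooner, which matches the slower response rates described below.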
Consequences and Response Patterns
- Fixed schedules tend to produce pauses after reinforcement and then a ramp-up in responding as the interval or ratio nears completion.
- Variable schedules produce high, steady rates of responding with fewer predictable pauses; they are more resistant to extinction than fixed schedules.
- Comparing schedules on a graph (responses vs time):
- Fixed Ratio (FR) yields high, rapid responding with abrupt pauses after reinforcement.
- Variable Ratio (VR) yields high, steady responding with less predictable pauses (greatest resistance to extinction).
- Fixed Interval (FI) yields post-reinforcement pauses followed by a gradual acceleration toward the end of the interval.
- Variable Interval (VI) yields steady, moderate responding with less pronounced acceleration toward the end.
- Key takeaway: ratio schedules (especially VR) produce higher response rates than interval schedules (FI/VI); interval schedules generally produce slower rates of responding; extinction patterns differ across schedules.
Real-World Examples of Schedules
- Fixed Ratio (FR): postal/customs data-entry shift example
- Workers must process a fixed number of packages (e.g., 600) per 8-hour shift to earn their paycheck.
- This creates a goal-driven, high-rate burst of work, followed by a pause after each payoff.
- Example in transcript: 600 packages per shift; work ramps up as the quota nears and the payoff arrives at the end of each 8-hour shift.
- Concept: fixed ratio → predictable payout after fixed number of actions.
- Variable Ratio (VR): gambling and sales
- Slot machines pay out on a variable-number-of-responses basis; you never know when the next win will come, so you keep playing.
- Sales: successful sales may occur infrequently, but the reward reinforces persistent effort.
- Example in transcript: fishing (catching a fish per varying number of casts) and other addictive-like behaviors follow VR patterns.
- Fixed Interval (FI): clock-watching and baking examples
- Clock-watching when a shift is near end: you become increasingly attentive to the clock as the fixed period ends.
- Baking cookies: you check the oven toward the end of a fixed baking time, e.g., after 12 minutes.
- Variable Interval (VI): surfing example
- Waves arrive at irregular intervals; the optimal strategy is to keep checking/attending and respond at a steady rate rather than pausing until reinforcement is guaranteed.
Continuous vs Partial Schedules: Learning and Extinction
- Continuous reinforcement leads to rapid acquisition but rapid extinction when reinforcement stops.
- Partial schedules promote longer-lasting learning, with VR showing the strongest resistance to extinction among the traditional schedules.
Shaping and the Role of Reinforcement
- Shaping: a process of reinforcing successive approximations toward a complex behavior.
- Example: training a dog to chase a Frisbee down a field, catch it, and return it.
- Stepwise progression: reinforce when the dog moves toward the Frisbee, then when it catches it, then when it brings it back, etc.
- Shaping leverages inborn tendencies and natural predispositions to speed up learning (conditioning is most effective when aligned with natural tendencies).
- Instinctive drift: conditioned organisms tend to revert to instinctive behaviors that interfere with the trained response (e.g., Breland & Breland's pig rooting coins instead of inserting them into a piggy bank; raccoons "washing"/rubbing coins together instead of depositing them).
- Conditioning may be ineffective or counterproductive if the response conflicts with inborn tendencies or when reinforcement encourages mediocrity (undermining intrinsic motivation).
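The "reinforce successive approximations" loop can be caricatured in a few lines of Python. This is a hypothetical toy model (the numbers and the names `habit` and `criterion` are my own, not from the lecture): behavior varies randomly around the learner's current habit, any behavior that meets the current criterion is "reinforced" and becomes the new habit, and the criterion ratchets toward the target.

```python
import random

def shape(target, steps=2000, rng=random.Random(42)):
    """Toy shaping loop: reinforce successive approximations toward `target`."""
    habit = 0.0        # the learner's current typical behavior (arbitrary units)
    criterion = 0.5    # easy first approximation; ratchets upward below
    for _ in range(steps):
        behavior = habit + rng.gauss(0, 0.5)          # natural variation around the habit
        if behavior >= criterion:                     # approximation met -> "reinforce" it
            habit = max(habit, behavior)              # reinforced behavior becomes the habit
            criterion = min(target, criterion + 0.5)  # raise the bar slightly
        if habit >= target:                           # full target behavior reached
            break
    return habit

final = shape(10.0)  # habit has been ratcheted up to the 10-unit target
```

Under this caricature, demanding the full target from the start (criterion fixed at 10) would almost never be met by chance variation alone, which is why reinforcing intermediate approximations speeds learning.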
Extrinsic vs Intrinsic Reinforcement and Motivation
- Extrinsic reinforcers originate outside the activity and are not inherently related to the task (e.g., money, ribbons).
- Intrinsic reinforcers originate inside the individual and are inherently related to the activity (e.g., pride, sense of accomplishment).
- The overjustification effect: extrinsic rewards can undermine intrinsic motivation by shifting motivation from internal satisfaction to external rewards (e.g., ribbons reducing spontaneous drawing time).
- Example from study: two classrooms—one received a blue ribbon for drawing; afterwards, the ribbon classroom drew far less often (8%) compared to baseline (60%), suggesting external rewards reduced intrinsic drawing enjoyment.
- Important caution: reinforcing mediocrity can decrease motivation to improve; reinforcement must be aligned with genuine interest or mastery goals.
The ABCs of Learning: Antecedents, Behaviors, Consequences
- Antecedents: stimuli or events that precede a behavior (classical conditioning focus).
- Behavior: the observable response.
- Consequences: reinforcement or punishment that follows the behavior.
- In operant conditioning, the order is: antecedent (stimulus), behavior, consequence.
- Fundamental contrast: classical conditioning involves stimuli before a reflexive response; operant conditioning involves a behavior followed by a consequence.
- The ABCs framework helps explain how learning occurs through experience and interaction with the environment.
Classical Conditioning vs Operant Conditioning: Cognitive Revolution Link
- Behaviorism emphasized stimulus-response relationships and largely ignored mental processes.
- Latent learning (Tolman & Honzik): learning can occur without reinforcement and may not be immediately expressed; when reinforcement appears, knowledge is revealed quickly—implies cognitive processes and mental representations in learning.
- Observations challenged behaviorism’s “black box” assumption; cognition and mental events matter.
- Cognitive revolution: mind as an information-processing system; the mind is not a general-purpose, equally efficient processor for all information; it is specialized and shaped by evolution.
- Basics of information processing: input (environmental stimuli), processing (attention, memory, encoding, retrieval), output (behavior).
- Early cognitive models assumed general-purpose processing; later work recognized specialized processing for social information (social cognition).
Social Learning and Social Cognition
- Social learning (Bandura): much of human learning occurs by observing others (modeling) and is safer and more efficient than trial-and-error learning.
- Bobo doll experiments (Bandura): children imitate aggressive behavior after observing an adult model; aggression increases with aggressive models (live, filmed, or cartoon), while nonaggressive models reduce aggression.
- Key concept: vicarious learning—learning through observing the consequences of others’ actions, even if the learner does not directly experience reinforcement or punishment.
- Observational learning applies to humans and other species (e.g., chimpanzees showing imitation and social learning; facial expressions, laughter, and communicative behaviors).
- In social learning, attention, memory, reproduction, and motivation are essential cognitive components for successfully imitating observed behaviors.
- Social learning supports cultural evolution: knowledge, skills, and technologies are transmitted across generations, often more efficiently than individual trial-and-error discovery.
- Example of cross-species social learning: chimpanzees learn a model’s actions (e.g., imitation of laughter or tongue protrusion) and imitate some actions under observation.
Memory Encoding: From Structural to Self-Referent Encoding
- Encoding techniques and recall:
- Structural encoding: based on physical structure (e.g., letters, font, surface features).
- Phonemic encoding: based on sound (phonology) of items.
- Semantic encoding: based on meaning of items.
- Research findings (combined across studies in lecture): semantic encoding tends to produce better recall than phonemic or structural encoding; self-referent encoding (an enriched form of semantic encoding) yields even greater recall.
- Self-referent encoding: relating information to oneself increases depth of processing and memory retention; widely used to improve study strategies.
- Enrichment strategies: consider how information relates to your own experiences or beliefs to improve retention.
Memory Storage and Capacity (Three-Box Model)
- Sensory memory (sensory register): initial, very brief storage of sensory information. Capacity is large but duration is very short (fraction of a second to a few seconds depending on modality).
- Short-term memory (working memory): holds a limited amount of information briefly: about 3-4 chunks in this course (other sources cite 5-9 items, aided by chunking), and only for seconds to a few minutes without rehearsal.
- Long-term memory: potentially infinite capacity; information can be stored for long durations and retrieved later.
- The typical classroom question: which has the smallest capacity? The answer in this lecture is short-term memory (3-4 chunks), with sensory memory having larger capacity but much shorter duration.
- Implication: not all sensory information is transferred to short-term memory; only a subset is encoded into working memory, and then potentially into long-term memory via encoding processes.
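The capacity point can be made concrete with chunking: the same 12 digits overflow a 3-4 chunk span when stored one by one, but fit comfortably once grouped into larger units. A minimal sketch (the digit string is an arbitrary example):

```python
def chunk(digits, size):
    """Group a digit string into fixed-size chunks, phone-number style."""
    return [digits[i:i + size] for i in range(0, len(digits), size)]

number = "202555014398"           # 12 individual digits: beyond a 3-4 chunk span
grouped = chunk(number, 4)        # ['2025', '5501', '4398']: just 3 chunks
print(len(number), len(grouped))  # 12 3
```

The digits themselves are unchanged; only the unit of storage grows, which is why chunking raises effective short-term capacity without changing the 3-4 chunk limit.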
Retrospective vs Prospective Memory
- Retrospective memory: memory for past events, experiences, or information; can be transferred or discussed verbally (declarative memory).
- Prospective memory: memory for future intentions or tasks (e.g., remembering to perform an action in the future).
- Declarative (explicit) memory includes episodic (events) and semantic (facts) memory; these forms can be communicated verbally and taught to others.
- Nondeclarative (implicit) memory includes procedural memory (skills and tasks) and conditioned responses; it is not typically transferred by simple verbal instruction.
- Question example from lecture: retrospective memories that can be transferred by talking about them are typically declarative (episodic and semantic), so retrospective memory tasks are often verbalizable and teachable; procedural memories are not easily conveyed by description alone.
Cognitive and Social-Cognitive Integration
- Social cognitive theory emphasizes the integration of behavioral analysis with cognitive processes (attention, memory, reproduction, motivation) and attitudes/beliefs/expectations.
- The environment and social context shape what is paid attention to, remembered, and reproduced.
- Human learning is highly dependent on social information, imitation, and cultural transmission; cognition shapes the selection and valuation of models to imitate.
Illustrative Examples and Takeaways
- Everyday examples reinforce understanding of schedules: vending machines illustrate continuous reinforcement; fishing illustrates variable schedules (VR per cast, VI while waiting for a bite); gambling illustrates variable ratio.
- Shaping shows how complex behavior can be constructed from simple actions via reinforcement of successive approximations.
- Instinctive drift demonstrates limits of conditioning when reinforcement attempts clash with evolved predispositions.
- Extrinsic rewards can undermine intrinsic motivation; design reinforcement to support mastery and internal satisfaction when possible.
- Latent learning (Tolman & Honzik) highlights that knowledge can exist without immediate reinforcement and may be revealed later when incentive appears.
- The cognitive revolution introduced the mind as an information processor; social cognition and social learning account for why humans excel at learning from others and sharing knowledge across generations.
Quick Takeaways for Exam Preparation
- Distinguish between continuous and partial schedules; know the four partial schedules and which produce high vs low response rates:
- FR (fixed ratio) – high response rate with post-reinforcement pauses
- VR (variable ratio) – high, steady response; most resistant to extinction
- FI (fixed interval) – slow, rising towards end of interval
- VI (variable interval) – steady, moderate responding
- Shaping uses successive approximations; leverage natural predispositions to facilitate rapid learning; beware instinctive drift.
- Observational learning: Bandura’s Bobo doll shows vicarious learning; social learning is efficient and culturally relevant.
- Memory encoding: semantic encoding > phonemic/structural encoding; self-referent encoding yields the best recall.
- Three-box memory model: sensory memory (large capacity, very short duration), short-term memory (3-4 chunks; limited duration), long-term memory (large/infinite capacity).
- ABCs of learning: antecedent, behavior, consequence; classical conditioning vs operant conditioning differences.
- Extrinsic rewards can undermine intrinsic motivation; design reinforcement to support internal satisfaction and competence.
- Prospective vs retrospective memory: declarative memory (episodic/semantic) is typically retrospective; prospective memory concerns future tasks.
- Fixed Ratio: reinforcement after a fixed number of responses. Example: FR-10 means reinforcement after every 10 responses.
- Variable Ratio: reinforcement after an unpredictable number of responses; average around a value (e.g., VR-10).
- Fixed Interval: reinforcement after a fixed amount of time has passed since the last reinforcement (FI-t, where t is time in seconds/minutes).
- Variable Interval: reinforcement after an unpredictable interval around an average (VI-t).
- Encoding depth: semantic encoding often yields deeper processing than phonemic or structural encoding; self-referent encoding enhances semantic encoding: deeper processing → better recall.
- Short-term memory capacity (as discussed in lecture): about 3−4 chunks; the broader literature cites 5−9 items (7±2), and chunking lets each chunk pack in more information, raising effective capacity.
- Memory duration: sensory memory persists for a fraction of a second to a few seconds; short-term memory persists only seconds to a few minutes without rehearsal; long-term memory is potentially permanent.
Ethical, Philosophical, and Practical Implications
- Be mindful of the overjustification effect when using external rewards to motivate tasks that people inherently enjoy.
- When teaching or shaping behavior, consider natural predispositions to avoid instinctive drift and to maximize learning efficiency.
- Recognize the power of social learning and culture in shaping human behavior and technologies; emphasize safe, prosocial modeling.
- In education and workplace design, use reinforcement strategically to promote mastery, not just compliance.
Notes on Exam Preparation and Study Strategy
- Create mind maps to organize schedules and their properties (continuous vs partial; FR/VR/FI/VI; fixed vs variable; ratios vs intervals).
- Practice explaining concepts aloud with real-world examples to ensure depth of understanding.
- Review Tolman’s latent learning and Bandura’s social learning experiments to understand cognitive and social aspects of learning.
- Practice identifying encoding strategies and predicting recall outcomes for different encoding depths.
- Be ready to discuss the ABCs and how they apply to both classical and operant conditioning.
Quick Definitions (Glossary)
- Reinforcement: any consequence that increases the likelihood of a behavior.
- Punishment: any consequence that decreases the likelihood of a behavior.
- Shaping: reinforcing successive approximations toward a complex target behavior.
- Instinctive drift: tendency of an animal to revert to innate behaviors after conditioning efforts.
- Latent learning: learning that occurs without reinforcement but is not shown until reinforcement is available.
- Observational/Social learning: learning by observing others and modeling their behavior.
- Semantic encoding: encoding based on meaning; typically results in strong recall.
- Self-referent encoding: encoding that relates information to oneself; often yields the strongest recall.
- Declarative (explicit) memory: memory for facts and events (episodic and semantic).
- Nondeclarative (implicit) memory: memory for skills and conditioned responses (procedural, priming, etc.).
- Prospective memory: remembering to perform a planned action in the future.
- Sensory memory: brief sensory storage with large capacity but very short duration.
- Short-term (working) memory: temporary storage with limited capacity; duration is short without rehearsal.
- Long-term memory: relatively permanent storage with large/infinite capacity.