PSYCH1X03 - Instrumental Conditioning

Unit/Week 3

52 Terms

1

Instrumental Conditioning (IC)

Learning the contingency between a voluntary behaviour and its consequence (also known as operant conditioning)

2

IC

Thorndike’s Law of Effect

Behaviours followed by positive consequences are stamped in; behaviours followed by negative consequences are stamped out

3

IC

Thorndike’s Puzzle Box

A cat was placed in a puzzle box that could be opened by pulling on a rope. The cat would perform random behaviours until the rope was pulled and the door opened.

Thorndike predicted that on subsequent trials the cat would escape immediately, as a human would. In reality, escape time decreased only gradually across trials.

He concluded that animals follow a simple stimulus-response process and lack the human “ah-ha!” moment.

4

IC

Skinner’s Box/Operant Chamber

Apparatus to study instrumental conditioning by rewarding/punishing an animal for doing something

5

IC

Skinner’s Pigeon Box

Free food was periodically provided to pigeons; the pigeons would repeat whatever behaviour they had been performing just before the food arrived, in a “superstitious” manner

6

IC

Reinforcer

Stimulus presented or removed after a behaviour occurs to influence its frequency. Four types of training:

  • Reward

  • Punishment

  • Escape

  • Omission

7

IC/Reinforcer

Reward Training

Presentation of positive reinforcer

(↑ frequency)

8

IC/Reinforcer

Punishment

Presentation of negative reinforcer

(↓ frequency)

Controversial because of the ethics of inflicting fear. Can also cause a classically conditioned fear of the authority figure

9

IC/Reinforcer

Escape

Removal of negative reinforcer

(↑ frequency)

10

IC/Reinforcer

Omission

Removal of positive reinforcer

(↓ frequency)
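
The four training types on the cards above form a two-by-two grid: the reinforcer is either presented or removed, and it is either positive or negative. A minimal sketch of that grid as a lookup table follows; the names `TRAINING_TYPES` and `classify_training` are hypothetical and only serve as a study aid.

```python
# Hypothetical lookup table for the two dimensions on the cards:
# (what happens to the reinforcer, kind of reinforcer) -> (training type, effect).
TRAINING_TYPES = {
    ("present", "positive"): ("reward", "increase"),
    ("present", "negative"): ("punishment", "decrease"),
    ("remove", "negative"): ("escape", "increase"),
    ("remove", "positive"): ("omission", "decrease"),
}


def classify_training(action, reinforcer):
    """Return (training type, expected change in behaviour frequency)."""
    return TRAINING_TYPES[(action, reinforcer)]


# Removing a positive reinforcer is omission training, which lowers frequency.
print(classify_training("remove", "positive"))
```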

11

IC

Reinforcer Timing

Timing of the reinforcer is critical: the shorter the delay between the response and the reinforcer, the more effective it is

12

IC

Acquisition of Conditioning

Visualized using a cumulative recorder

  • Flat horizontal line = no response

  • Increase = response

Pattern depends on the participant, behaviour complexity, and reinforcer used
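
A cumulative record is a running total of responses over time, which is why a flat stretch means no responding and a rise means responding. The sketch below illustrates this with a made-up session; `cumulative_record` and the `session` data are hypothetical, not taken from the course material.

```python
from itertools import accumulate


def cumulative_record(responses_per_step):
    """Running total of responses, one entry per time step."""
    return list(accumulate(responses_per_step))


# Hypothetical session: 0 = no response in that time step, 1 = one response.
session = [0, 0, 1, 1, 0, 1]

# Flat stretches (repeated values) are pauses; increases are responses.
print(cumulative_record(session))  # [0, 0, 1, 2, 2, 3]
```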

13

IC

Autoshaping

Learning a contingency without guidance; only works for simple behaviours

14

IC

Shaping

Learning a contingency with guidance through successive approximations: smaller behaviours are reinforced to gradually build up to the desired behaviour

15

IC

Chaining

Learning to link a series of actions together. Each action is reinforced by the opportunity to perform the next behaviour in the sequence, and a positive reinforcer is given after the final one

16

IC

Shaping vs Chaining

Shaping reinforces improvement

Chaining reinforces correct order

17

IC

Discriminative Stimulus

Signals validity of response-reinforcer contingency

18

IC/Discriminative Stimulus

SD/S+

Signals contingency is valid

e.g. being at parents’ house → eating vegetables = dessert

Can also be generalized

19

IC/Discriminative Stimulus

Sδ/S-

Signals contingency is invalid

e.g. being at grandparents’ house → eating vegetables ≠ dessert

20

IC/Discriminative Stimulus

SD/S+ Generalization Gradient

SD/S+ can be generalized; the similarity of a stimulus to the SD/S+ affects the rate of response. Stimuli must be in the same modality (existence)

e.g. a pigeon will peck the button when the light is green (SD/S+), but also sometimes when the light has a similar wavelength

21

IC/Discriminative Stimulus

Sδ/S- Stimulus Discrimination

Sδ/S- narrows the range of the generalization gradient. Training with Sδ/S- is better for fine-tuning. Stimuli must be in the same modality (existence)

e.g. a pigeon will not peck the button when the light is red (Sδ/S-), and will sometimes also withhold pecking when the light has a similar wavelength

22

CC & IC

CS+ vs SD/S+

CS+ elicits a reflexive, involuntary response

SD/S+ sets the occasion for a voluntary response

23

CC & IC

CS- vs Sδ/S-

CS- predicts the absence of the US

Sδ/S- signals that the response will not be reinforced

24

IC

Reinforcement Schedules

Rules that dictate when reinforcement will occur

  • Continuous reinforcement

  • Partial reinforcement

Partial reinforcement is more robust than continuous reinforcement, because it’s less obvious when reinforcement has ceased

  • Same reason why VR-# is more robust than FR-#

25

IC/Reinforcement Schedules

Continuous Reinforcement (CRS)

Every response leads to reinforcement, very rare in real world

26

IC/Reinforcement Schedules

Partial Reinforcement Schedule (PRS)

Responses are only reinforced some of the time, based on two dimensions:

  • Ratio vs Interval

  • Fixed vs Variable

27

IC/Partial Reinforcement Schedule

Ratio (R)

Based on number of responses

  • FR-1 is rewarded after every response

  • FR-10 is rewarded after every 10 responses

28

IC/Partial Reinforcement Schedule

Interval (I)

Based on time since last reinforcement

  • FI-1 is rewarded every 1 minute

  • FI-10 is rewarded every 10 minutes

29

IC/Partial Reinforcement Schedule

Fixed (F)

Requirement for reinforcement is constant across trials

  • FR-10 is rewarded after 10 pecks on every trial

30

IC/Partial Reinforcement Schedule

Variable (V)

Requirement for reinforcement is random but averaged across trials

  • VR-10 is rewarded after an average of 10 pecks across trials

    • Trial 1: 12 pecks

    • Trial 2: 8 pecks

    • Average: 10 pecks
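
The arithmetic behind a variable schedule can be sketched in a few lines: the requirement changes from trial to trial, but the mean stays at the schedule value. The helper `variable_requirements` below is hypothetical, and drawing requirements in mirrored pairs (such as 12 and 8) is only an illustrative assumption that keeps the average exact; it is not a claim about how a real operant chamber is programmed.

```python
import random
from statistics import mean


def variable_requirements(average, spread, n_pairs, seed=0):
    """Draw per-trial requirements in mirrored pairs around `average`.

    Each pair is (average + d, average - d), so the overall mean is exactly
    `average`. Assumes spread < average so every requirement stays positive.
    """
    rng = random.Random(seed)
    requirements = []
    for _ in range(n_pairs):
        d = rng.randint(0, spread)
        requirements += [average + d, average - d]
    return requirements


# The card's own example: 12 pecks, then 8 pecks, averages to 10 (VR-10).
print(mean([12, 8]))

# A longer hypothetical VR-10 sequence with the same average.
reqs = variable_requirements(average=10, spread=4, n_pairs=3)
print(reqs, mean(reqs))
```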

31

IC/Partial Reinforcement Schedule

Basic Partial Reinforcement Schedules

Four basic reinforcement schedules based on Ratio vs Interval & Fixed vs Variable

  • FR-#

  • VR-#

  • FI-#

  • VI-#
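
The difference between ratio and interval schedules can be made concrete with a small simulation. This is a sketch under simplifying assumptions: only the two fixed schedules are shown, time is a plain number supplied by the caller, and the function names `fixed_ratio` and `fixed_interval` are hypothetical.

```python
def fixed_ratio(n):
    """FR-n: reinforce every n-th response, regardless of time."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0  # start counting toward the next reinforcer
            return True
        return False

    return respond


def fixed_interval(interval):
    """FI: reinforce the first response made at least `interval` time units
    after the last reinforcement (or after the start of the session)."""
    last_reinforced = 0.0

    def respond(now):
        nonlocal last_reinforced
        if now - last_reinforced >= interval:
            last_reinforced = now
            return True
        return False

    return respond


# FR-3: only every third response is reinforced.
fr3 = fixed_ratio(3)
print([fr3() for _ in range(6)])
# -> [False, False, True, False, False, True]

# FI-10: responding early earns nothing; the first response after 10 units does.
fi10 = fixed_interval(10)
print([fi10(t) for t in (2, 9, 11, 15, 21)])
# -> [False, False, True, False, True]
```

A variable version of either schedule would replace the constant `n` or `interval` with a value that changes after each reinforcer but keeps the same average.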

32

IC/PRS/Basic Schedules

Fixed Ratio FR-#

Reward every # response(s)

  • Ratio Strain

  • Pause & Run Cumulative Pattern

33

IC/PRS/FR-#

Ratio Strain & Break Point

There is a limit to how stingy you can be with the required number of responses before responding stops (the break point)

e.g. FR-500: too many responses are required, so the responder will stop

34

IC/PRS/FR-#

FR-# Cumulative Pattern

Pause & Run

After reinforcement, participant will pause before resuming next run

35

IC/PRS/Basic Schedules

Variable Ratio VR-#

Reward on average every # responses

  • Linear Cumulative Pattern

↓ ratio for reward = ↑ response rate

36

IC/PRS/VR-#

VR-# Cumulative Pattern

Linear

Slope reflects the response rate, which depends on the average number of responses required before reinforcement. ↑ reinforcement frequency = ↑ response rate

  • VR-10 has steeper slope than VR-40

37

IC/PRS/Basic Schedules

Fixed Interval FI-#

Reward for the first response after every # units of time

  • Very rarely seen in real world

  • Scallop Cumulative Pattern

e.g. weekly quizzes: little stress at the start of the week, stress ramps up before the quiz, no stress right after the quiz

38

IC/PRS/FI-#

FI-# Cumulative Pattern

Scallop 

After reinforcement, there is a period where responding drops, then slowly ramps up and peaks just before the next reinforcement

  • Responder does not want to miss the reinforcement

  • Responding early does not bring the reward any sooner

39

IC/PRS/Basic Schedules

Variable Interval VI-#

Reward for the first response after an average of # units of time

  • Reinforcement can come at any time, but responders have a rough idea of when, so they respond at a steady rate to make sure they do not miss it

  • Dashed Linear Cumulative Pattern

↑ average reinforcement frequency = ↑ response rate

40

IC/PRS/VI-#

VI-# Cumulative Pattern

Dashed Linear

Responders usually respond at a steady rate to ensure that they don’t miss reinforcement

41

IC/Textbook

Primary Reinforcer

Reinforcer that satisfies biological need; unconditioned

42

IC/Textbook

Secondary Reinforcer

Reinforcer that gains value through association with primary reinforcer; conditioned

43

IC/Textbook

Delay of Gratification

Ability to tolerate a delay before reinforcement; increases as humans age

44

IC/Textbook

Contrast Effect

Change in reward value → change in response rate

45

IC/Textbook/Contrast Effect

Positive Contrast

↑ reward value = ↑ response rate

46

IC/Textbook/Contrast Effect

Negative Contrast

↓ reward value = ↓ response rate

47

IC/Textbook

Overjustification

Rewarding a previously unrewarded task alters the perception of that task. When the reward is removed, the response rate drops below the original unrewarded rate

48

IC/Textbook

Observational Learning

Children learn through imitating observed behaviour

49

IC/Textbook

Mirror Neurons

Neurons that activate the same way when performing, observing, or imagining an action

50

IC

Watson

Behaviourist: nurture over nature, children’s environment is more important

51

IC/Behaviourism

Biological

Predisposition to learn by imitating (e.g. a baby learns to stick its tongue out)

52

IC/Behaviourism

Cultural Transmission

Behaviour is socially transmitted through imitation (e.g. a macaque learns to wash a sweet potato by watching another macaque)