Development of Vision for Action

1. Introduction to Vision for Action

  • Importance and Complexity:

    • Vision for action involves sensorimotor transformation: integrating object positions in retinal space with limb positions in bodily space.

    • Degrees of Freedom Problem:

      • There are infinitely many motor solutions (joint positions, trajectories) for any goal.

    • Success requires evaluating one's own visuomotor capabilities accurately.

  • Neural Substrates:

    • Primary Driver: Dorsal Stream (“where?” and “how?” pathway), located in the posterior parietal cortex.

    • Contrast: Ventral Stream (“what?” and “who?” pathway for recognition), located in the inferior temporal cortex.

    • The dorsal stream is often considered developmentally vulnerable.
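
The degrees of freedom problem from the list above can be made concrete with a toy example. The sketch below assumes a hypothetical planar arm with three segments (the segment lengths, the target, and the function names are illustrative, not from these notes). It shows that several different sets of joint angles bring the fingertip to the same target.

```python
import math

# Hypothetical segment lengths (upper arm, forearm, hand) in metres; illustrative only.
LENGTHS = (0.30, 0.25, 0.10)

def fingertip(angles, lengths=LENGTHS):
    """Forward kinematics: joint angles (radians, each relative to the previous segment) -> (x, y)."""
    x = y = direction = 0.0
    for angle, length in zip(angles, lengths):
        direction += angle
        x += length * math.cos(direction)
        y += length * math.sin(direction)
    return x, y

def solve_arm(target, hand_direction, lengths=LENGTHS):
    """Joint angles that reach `target` with the hand pointing in `hand_direction` (absolute, radians).

    Returns None if the target cannot be reached with that hand direction.
    """
    l1, l2, l3 = lengths
    # Fixing the hand direction fixes the wrist position.
    wx = target[0] - l3 * math.cos(hand_direction)
    wy = target[1] - l3 * math.sin(hand_direction)
    cos_elbow = (wx * wx + wy * wy - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        return None
    # Standard two-link inverse kinematics for shoulder and elbow.
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(wy, wx) - math.atan2(l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
    wrist = hand_direction - shoulder - elbow
    return shoulder, elbow, wrist

target = (0.40, 0.20)
for hand_direction in (0.0, 0.5, 1.0):
    angles = solve_arm(target, hand_direction)
    print([round(a, 3) for a in angles], "->", tuple(round(c, 3) for c in fingertip(angles)))
```

Every hand direction that keeps the wrist within reach yields another valid solution, so even this simplified arm has a continuum of postures for one goal.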


2. Frames of Reference (FOR) and Localizing Touches

2.1 Coordinate Frames

  • Egocentric (Internal):

    • Retina-centered: Location relative to the eye.

    • Head-centered: Retina-centered + proprioceptive signals of eye-in-head.

    • Body-centered: Head-centered + signals of head-on-trunk.

  • Allocentric (World/Object-centered):

    • Objects relative to each other or the environment (e.g., teapot vs. cup).

    • Requires taking oneself “out of the equation.”
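
The chain of egocentric frames above can be sketched as a series of additions. This is a minimal sketch under a strong simplifying assumption: only horizontal direction is considered, and angles are treated as simply additive. The function names and example values are hypothetical.

```python
def head_centered(retinal_deg, eye_in_head_deg):
    """Retina-centered direction plus the eye-in-head signal."""
    return retinal_deg + eye_in_head_deg

def body_centered(retinal_deg, eye_in_head_deg, head_on_trunk_deg):
    """Head-centered direction plus the head-on-trunk signal."""
    return head_centered(retinal_deg, eye_in_head_deg) + head_on_trunk_deg

def allocentric_offset(object_a_deg, object_b_deg):
    """Direction of object A relative to object B; the observer's posture cancels out."""
    return object_a_deg - object_b_deg

# Hypothetical scene: teapot 5 deg right of fixation, cup 10 deg left of fixation.
teapot_retinal, cup_retinal = 5.0, -10.0
eye_in_head, head_on_trunk = 10.0, -20.0  # eyes turned right, head turned left

teapot_body = body_centered(teapot_retinal, eye_in_head, head_on_trunk)
cup_body = body_centered(cup_retinal, eye_in_head, head_on_trunk)

print(teapot_body, cup_body)                            # -5.0 -20.0
print(allocentric_offset(teapot_retinal, cup_retinal))  # 15.0
print(allocentric_offset(teapot_body, cup_body))        # 15.0
```

The teapot-to-cup offset is the same in the retinal and the body-centered frame, which is one way to read "taking oneself out of the equation".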

2.2 Developmental Progression of Touch Remapping

  • In Utero/Infancy:

    • Infants perceive tactile and auditory stimuli but cannot initially correlate them with visual input.

  • Buzzing Hands Study:

    • 6 months: Infants can only orient toward a stimulated hand if the arms are uncrossed.

    • 10 months: Infants can orient to the correct hand even if arms are crossed. This reflects the recoding of touch into an external frame of reference.

    • Neural Correlate: Differences in somatosensory cortex EEG signals appear at 10 months but not at 6.5 months, indicating enhanced modulation and remapping of S1.

2.3 Temporal Order Judgment Task

  • Task Procedure:

    • Hands are tapped (crossed vs. uncrossed) almost simultaneously with a slight difference in timing. The participant must determine which hand was tapped first.

    • It is consistently easier to judge temporal order when hands are uncrossed; this effect persists even when eyes are closed, indicating the process of bringing body-centered tactile input into visual space is highly automated.

  • Developmental Milestones:

    • Children only begin to consistently show the “crossed hands effect” around age 6.

    • This suggests a protracted development for remapping touch from body-centered to visual coordinates, potentially limited by the slow maturation of cross-hemisphere communication (e.g., the corpus callosum).

  • The Role of Visual Experience:

    • Congenitally Blind: These individuals do not show the “crossed hands effect,” suggesting sight is necessary for the initial development of remapping tactile inputs to external space.

    • Late Restored Sight (Case LM): Unlike sighted controls, LM is not impaired when hands are crossed, performing equally well in both conditions.

    • Data Sensitivity: A steeper psychometric curve indicates greater sensitivity to timing differences. At the 0 point (simultaneous touches), participants should not be able to distinguish order.

  • Only at 10 months do infants orient to the correct side when they feel a buzz on their hand while their arms are crossed.

  • This suggests that at 6.5 months, body posture information is not yet recoded into external/visual coordinates as it is in adults.

  • Between 6.5 and 10 months, infants learn to keep better track of where their body parts are in visual space.

  • This ability might continue to develop until at least ~age 5 years and may require visual experience.
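
The Data Sensitivity point in 2.3 (a steeper psychometric curve means greater sensitivity, with chance performance at the 0 point) can be illustrated with a cumulative Gaussian, a common choice for modelling temporal order judgments. The notes do not state which function was fitted, so the model, the sign convention, and the spread values below are assumptions for illustration only.

```python
import math

def p_right_first(soa_ms, sigma_ms):
    """Probability of reporting 'right hand first'.

    soa_ms: stimulus onset asynchrony; positive means the right hand was tapped first (assumed convention).
    sigma_ms: spread of the curve; a smaller sigma gives a steeper curve.
    """
    return 0.5 * (1.0 + math.erf(soa_ms / (sigma_ms * math.sqrt(2.0))))

def jnd_ms(sigma_ms):
    """SOA needed for 75% 'right first' reports; a smaller value means greater sensitivity."""
    # The 75% point of a cumulative Gaussian lies about 0.6745 standard deviations above its mean.
    return 0.6745 * sigma_ms

# Hypothetical spreads: uncrossed hands (steep curve) vs. crossed hands (shallow curve).
SIGMA_UNCROSSED_MS, SIGMA_CROSSED_MS = 40.0, 120.0

for soa in (-100, -50, 0, 50, 100):
    print(soa,
          round(p_right_first(soa, SIGMA_UNCROSSED_MS), 2),
          round(p_right_first(soa, SIGMA_CROSSED_MS), 2))
```

At an SOA of 0 both curves sit at 0.5, since order cannot be distinguished. At any other SOA the steeper (uncrossed) curve is further from chance. A flat difference between the two postures, as described for the congenitally blind and for LM, would correspond to equal spreads.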


3. Learning Which Objects are Within Reach

  • Binocular Depth Cues:

    • Stereo vision develops between 2.5 and 5 months.

    • 5 months: Infants begin reaching for the nearer toy based on binocular disparity, even if retinal size is equated (i.e., the closer toy is physically smaller).

  • Perspective (Pictorial) Cues:

    • Infants reach preferentially toward the “near” side of a pictorial illusion (e.g., the Ames window) only at 6 to 7 months.

  • Although disparity-based depth perception is available from around 14 weeks, infants do not yet seem to use it to guide their behavior (the perceptual and motor systems may not yet be linked up).

  • So 3D vision from disparity and perspective begins to constrain children’s reaching interactions with the world from around the middle of the first year of life.
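
The control in the binocular reaching study, where the nearer toy is made smaller so that both toys have the same retinal size, follows from simple geometry. A minimal sketch, assuming hypothetical toy sizes and distances (none of these values come from the study):

```python
import math

def visual_angle_deg(size_cm, distance_cm):
    """Angle subtended at the eye by an object of a given size at a given distance."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

def matched_size_cm(reference_size_cm, reference_distance_cm, new_distance_cm):
    """Size an object needs at new_distance_cm to subtend the same angle as the reference object."""
    return reference_size_cm * new_distance_cm / reference_distance_cm

# Hypothetical setup: far toy is 10 cm wide at 30 cm; near toy sits at 15 cm.
far_size, far_distance, near_distance = 10.0, 30.0, 15.0
near_size = matched_size_cm(far_size, far_distance, near_distance)

print(near_size)  # 5.0
print(round(visual_angle_deg(far_size, far_distance), 2),
      round(visual_angle_deg(near_size, near_distance), 2))
```

With retinal size matched in this way, the size of the image no longer signals which toy is nearer, so a preference for the nearer toy points to the use of binocular disparity.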


4. Graspability, Affordances, and Planning

4.1 Salience vs. Graspability

  • Faces: Highly salient for infants. Visual saliency often dominates over graspability in younger groups.

4.2 Affordances (Ecological Approach)

  • Definition: The world is perceived in terms of possible actions for the individual (e.g., a handle affords holding).

  • Gibson’s ecological approach (1979) is the source of this view:

    • Direct Recognition: The graspable element of an object is directly recognized without deliberate information processing.

    • Examples: A chair “affords” sitting; a handle “affords” holding.

  • Action Compatibility: Affordance is the part of the object that allows you to perform an action. It depends on the interaction between visual attributes and the observer's:

    1. Body (physical capabilities).

    2. Experience (learned associations).

    3. Goals (e.g., if we are tired, many surfaces start looking like a chair).

  • Affordance Processing — Sensory-Motor Theory of Conceptual Formation:

    • Adults activate grasping regions, specifically the Anterior Intraparietal Sulcus (AIP) and ventral Premotor Cortex (vPM), upon simply viewing a familiar utensil.

    • This activation occurs passively, even without the intent to act.

    • This implies that viewing objects automatically triggers a motor plan to grasp them, aligning with the recognition of Gibsonian affordances.

  • Modulation by Knowledge: Affordances are not just automatic but can be modulated by object knowledge.

    • Spatula Study: Subjects were asked to pick up a spatula (head toward them, handle away).

      • Even though the head is easier to grab, most subjects grabbed the handle (typical use).

      • Individuals performing a semantic distractor task were much less likely to grab the handle than those doing a spatial distractor task. This suggests semantic tasks interfere with access to the object knowledge required to recognize functional affordances.

  • Neural Circuitry:

    • Anterior Intraparietal Sulcus (AIP): Converts object shape into motor grasp responses. Contains motor, visual, and visuomotor neurons.

    • Premotor Cortex Area F5 (in the monkey; often considered a homologue of part of Broca’s area in humans): Receives projections from AIP for selecting action sequences and motor planning.


4.3 Development of Affordance Processing

  • 5 months: Infants show “pre-pincer” hand shaping for small objects despite lacking the coordination to execute the grasp, which they only manage at 8-9 months.

  • 6 years+: Passive viewing of tools activates AIP.

  • The Cup Task and Inhibition: Children struggle to inhibit the “potentiated grasp” when handles are task-irrelevant, requiring more frontal cortex recruitment for suppression.

  • Scale Errors: Toddlers attempt to perform correct actions on objects that are the wrong size (e.g., trying to sit in a tiny chair).


4.4 End-State Planning

  • Definition: Planning a movement by starting in an uncomfortable position (awkward grip) to ensure the movement ends in a comfortable terminal state (end-state comfort).

  • Developmental Milestones:

    • Children under age 3 struggle significantly with this, usually opting for immediate comfort.

    • The ability begins to emerge around age 3.5, though it is not fully developed and remains difficult for young children.

    • This specific motor planning capability is not considered very trainable.

4.5 Which objects should I reach for and how?

  • Dorsal Stream Circuitry: The dorsal stream contains dedicated neural circuitry specifically designed to transform visual inputs into appropriate motor actions.

  • Developmental Activation: Grasp-related and category-selective brain activation during the passive viewing of tool pictures reaches adult-like levels from age 6 onwards.

  • Vision-to-Grasp "Blue Path":

    • Correct pre-shaping of the hand in young infants solely based on visual information suggests that the transformation pathway (the "blue path") is present early.

    • While this neural path may exist, it might not always manifest consistently in functional behavior.

  • Conflict and Resource Recruitment:

    • Children get easily distracted by attractive affordances when they conflict with the task at hand.

    • Children under 3 years old struggle to adopt awkward initial grips to achieve end-state comfort.

    • Older children (under 7 years) still need to recruit additional neural resources to ignore the graspability of familiar objects when necessary.


5. Building Models of the Body in Action

5.1 Noise and Uncertainty

  • Sensory and motor systems are subject to “noise” (e.g., fog while judging the speed of a car). Optimal performance requires minimizing variance by combining sensory sources and judging risk.
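
One standard way to formalize "minimizing variance by combining sensory sources" is inverse-variance weighting. It gives the lowest-variance combined estimate when the cues carry independent Gaussian noise. The notes do not name a specific model, so this is an illustrative sketch; the second cue (engine sound) and all numbers are hypothetical.

```python
def combine_cues(estimates, variances):
    """Inverse-variance weighted average of independent noisy estimates.

    Returns (combined_estimate, combined_variance). Each cue is weighted by
    its precision (1 / variance), so noisier cues count for less.
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    combined = sum(p * e for p, e in zip(precisions, estimates)) / total_precision
    return combined, 1.0 / total_precision

# Hypothetical car-speed judgment in fog (km/h):
# vision says 50 but is noisy (variance 16); engine sound says 60 (variance 4).
speed, variance = combine_cues([50.0, 60.0], [16.0, 4.0])
print(speed, variance)  # 58.0 3.2
```

The combined estimate lies closer to the more reliable cue, and its variance (3.2) is lower than that of either cue alone.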

5.2 Visuomotor Decision-Making

  • Adults: Optimize performance by accounting for their own system uncertainty.

  • Children (under 11 years): Choose suboptimal strategies.

    • They are not necessarily just imprecise; they fail to compensate or optimize for their own noise.

    • Children tend to aim too close to penalty regions (“risk-taking”), playing for high stakes at a greater risk of losing.
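
What "accounting for one's own uncertainty" means here can be sketched with a one-dimensional aiming task. The sketch assumes a reward region next to a penalty region and Gaussian motor noise; the layout, the payoffs, and the noise levels are hypothetical and not taken from the studies. The aim point that maximizes expected gain moves further from the penalty region as motor noise grows.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def expected_gain(aim, sd, reward=100.0, penalty=-500.0, width=10.0):
    """Expected payoff when aiming at `aim` with motor noise `sd`.

    Assumed layout: reward region spans [0, width], penalty region spans [-width, 0).
    """
    p_reward = normal_cdf(width, aim, sd) - normal_cdf(0.0, aim, sd)
    p_penalty = normal_cdf(0.0, aim, sd) - normal_cdf(-width, aim, sd)
    return reward * p_reward + penalty * p_penalty

def best_aim(sd, step=0.1, width=10.0):
    """Grid search for the aim point with the highest expected gain."""
    n_steps = round(2.0 * width / step)
    candidates = [i * step for i in range(n_steps + 1)]
    return max(candidates, key=lambda aim: expected_gain(aim, sd, width=width))

for sd in (1.0, 2.0, 4.0):  # hypothetical motor noise levels
    aim = best_aim(sd)
    print(sd, round(aim, 1), round(expected_gain(aim, sd), 1))
```

In this sketch, a noisy mover who keeps aiming where a precise mover would aim lands in the penalty region more often. That is the pattern described above for children: the aim point is not shifted enough to compensate for their own noise.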

1. Developmental Milestones
  • 2.5 - 5 months: Development of binocular depth cues (stereo vision).

  • 5 months: Infants begin reaching for objects based on binocular disparity; appearance of “pre-pincer” hand shaping for small objects.

  • 6 months: Infants can only orient toward a tactile stimulus (buzzing) if their arms are uncrossed.

  • 6 - 7 months: Infants begin to reach preferentially toward the “near” side of pictorial illusions (e.g., Ames window).

  • 8 - 9 months: Infants develop the coordination required to execute a pincer grasp.

  • 10 months: Infants successfully orient to the correct limb even when arms are crossed, reflecting the remapping of touch into an external frame of reference.

  • Age 3 - 3.5: The ability to plan movements for end-state comfort (awkward initial grip for a comfortable finish) begins to emerge.

  • Age 6: Children consistently demonstrate the “crossed hands effect” in temporal order judgments; adult-like neural activation during passive viewing of tool pictures.

  • Under 11 years: Children utilize suboptimal visuomotor strategies, failing to account for their own system noise or uncertainty.

2. Assessment Paradigms
  • Buzzing Hands Study: Tests when infants can recode tactile input into an external frame of reference by comparing their orienting responses with arms crossed vs. uncrossed.

  • Temporal Order Judgment (TOJ) Task: Participants judge which of two hands was tapped first in crossed and uncrossed conditions to evaluate tactile-to-visual remapping.

  • Pictorial Illusion Reaching (Ames Window): Assessment of whether infants use perspective cues to guide their reaching behavior.

  • Spatula Study: Investigates functional affordances by asking subjects to pick up a tool with semantic or spatial distractors.

  • The Cup Task: Measures the ability to inhibit “potentiated grasps” when an object handle is task-irrelevant.

  • Scale Error Observations: Monitoring instances where toddlers attempt to perform actions on objects that are the incorrect size (e.g., sitting in a tiny chair).

  • End-State Planning Task: Evaluating if a child will adopt an uncomfortable initial posture to ensure a comfortable final position.

  • Visuomotor Decision-Making Tasks: Analyzing aiming strategies relative to penalty regions to determine if an individual accounts for sensory and motor noise.

3. Neural Substrates
  • Dorsal Stream: Located in the posterior parietal cortex; the “where/how” pathway that drives vision for action.

  • Ventral Stream: Located in the inferior temporal cortex; the “what/who” pathway for object recognition.

  • Somatosensory Cortex (S1): Shows signals of enhanced modulation and remapping at 10 months during tactile location tasks.

  • Anterior Intraparietal Sulcus (AIP): Converts object shape into motor grasp responses; contains visuomotor neurons.

  • Ventral Premotor Cortex (vPM / Area F5): Receives input from the AIP to select action sequences and facilitate motor planning.

  • Frontal Cortex: Recruited for the inhibitory suppression of automatic motor plans (affordances).

  • Corpus Callosum: Slow maturation limits the cross-hemisphere communication necessary for remapping body-centered input into visual space.