Object Recognition and Visual Processing

Visual Information Processing in the Brain

Object Recognition: Low, Intermediate, and High Level Stages

  • Low-Level Processing:

    • Extraction of basic visual features like orientation, color, contrast, disparity, and movement.
    • Early visual processing areas (e.g., LGN, V1) are involved.
    • Receptive fields help identify bars and edges (orientation selectivity in V1).
  • Intermediate-Level Processing:

    • Combination of low-level features to build more complex representations.
    • Identification of contours, surface properties, and shape discrimination.
    • Segregation of objects based on depth information.
  • High-Level Processing:

    • Identification of objects (e.g., a horse) in a viewpoint-invariant manner.
    • Requires making sense of all extracted information to recognize objects regardless of perspective.

Computational Model of Recognition

  • Goal: To identify a particular object and build a representation that corresponds only to that object.
  • Steps:
    • Edge detection: Early receptive fields (retina, LGN) with center-surround organization.
      • Photoreceptor signals combine to create excitatory (ON) or inhibitory (OFF) responses.
      • Example: A light bar stimulating the ON region results in a burst of action potentials.
    • Orientation selectivity: Combining LGN neuron responses in V1 to achieve orientation-specific responses.
      • Example: Three LGN neurons with receptive fields oriented vertically combine to create a vertically oriented response in a cortical cell.
    • Curvature detection:
      • V4 neurons code for complex image properties in terms of surface shape.
      • Neurons are tuned to specific curvatures, which is useful for detecting complex shapes.
      • Combining simple cells from V1 with different orientation preferences can build curvature-selective responses in V4.
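The edge-detection step above can be made concrete with a small numerical sketch. This is only an illustration under stated assumptions: it models an ON-center receptive field as a difference of Gaussians, which is a common textbook approximation and not something these notes specify. The function names (dog_weight, on_center_response, make_image) and the sigma values are hypothetical choices made for the example.

```python
import math

def dog_weight(dx, dy, sigma_center=1.0, sigma_surround=2.0):
    """Difference-of-Gaussians weight at offset (dx, dy) from the field centre.

    Positive near the centre (excitatory), negative in the surround (inhibitory).
    The sigma values are illustrative assumptions.
    """
    r2 = dx * dx + dy * dy

    def gauss(sigma):
        return math.exp(-r2 / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

    return gauss(sigma_center) - gauss(sigma_surround)

def on_center_response(image, cx, cy, radius=6):
    """Weighted sum of light intensities around (cx, cy), rectified at zero.

    Rectification stands in for the fact that a firing rate cannot be negative.
    """
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                total += dog_weight(dx, dy) * image[y][x]
    return max(total, 0.0)

def make_image(size, lit):
    """Build a size x size image; lit(x, y) decides which pixels are bright."""
    return [[1.0 if lit(x, y) else 0.0 for x in range(size)] for y in range(size)]

SIZE = 21
C = SIZE // 2

# Three stimuli centred on the receptive field.
small_spot = make_image(SIZE, lambda x, y: (x - C) ** 2 + (y - C) ** 2 <= 1)
full_field = make_image(SIZE, lambda x, y: True)
ring = make_image(SIZE, lambda x, y: 9 <= (x - C) ** 2 + (y - C) ** 2 <= 25)

spot_resp = on_center_response(small_spot, C, C)
full_resp = on_center_response(full_field, C, C)
ring_resp = on_center_response(ring, C, C)

print("spot in centre:", spot_resp)
print("uniform light: ", full_resp)
print("ring in surround:", ring_resp)
```

Under these assumptions, a small spot confined to the center drives the model cell far more strongly than uniform illumination, and light confined to the surround suppresses the response to zero. This is why such cells signal local changes in intensity (edges) instead of overall brightness.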

LGN, V1, and V4 Receptive Fields

  • LGN:
    • Circular center-surround receptive fields.
    • Respond to changes in light intensity.
      • Excitation -> action potentials
      • Inhibition -> reduction in action potentials
  • V1:
    • Orientation-selective cells (simple cells).
    • Combine LGN neuron responses.
    • Respond to bars and edges of specific orientations.
      • Vertically oriented light bar -> strong response
      • Change in orientation -> reduced response
  • V4:
    • Cells code for more complex image properties like surface shape and curvature.
    • Combining simple cells from V1 can build curvature-selective responses in V4.
      • Curve providing light stimulus -> excitatory response
      • Move curve to the left -> inhibition
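The way V1 orientation selectivity can arise from combined LGN responses can be sketched in a few lines. This is a simplified illustration, not a model given in these notes: it assumes three difference-of-Gaussians LGN subunits stacked vertically, each rectified at zero, whose outputs are simply summed by a hypothetical simple cell. All names (dog, lgn_response, simple_cell_response, bar_image) and parameter values are assumptions made for the example.

```python
import math

def dog(dx, dy, s_c=1.0, s_s=2.0):
    """Centre-surround (difference-of-Gaussians) weight; sigmas are assumed."""
    r2 = dx * dx + dy * dy

    def gauss(s):
        return math.exp(-r2 / (2 * s * s)) / (2 * math.pi * s * s)

    return gauss(s_c) - gauss(s_s)

def lgn_response(image, cx, cy, radius=6):
    """Rectified response of one ON-centre LGN-like unit centred at (cx, cy)."""
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            x, y = cx + dx, cy + dy
            if 0 <= y < len(image) and 0 <= x < len(image[0]):
                total += dog(dx, dy) * image[y][x]
    return max(total, 0.0)

def simple_cell_response(image, centers):
    """Sum of the LGN-like inputs whose receptive-field centres are listed."""
    return sum(lgn_response(image, cx, cy) for cx, cy in centers)

def bar_image(size, angle_deg, half_width=1.5):
    """Light bar through the image centre; angle is measured from vertical."""
    c = size // 2
    theta = math.radians(angle_deg)
    # Unit normal to the bar's long axis; a pixel is lit if it lies within
    # half_width of the bar's axis along this normal.
    nx, ny = math.cos(theta), math.sin(theta)
    return [[1.0 if abs((x - c) * nx + (y - c) * ny) <= half_width else 0.0
             for x in range(size)]
            for y in range(size)]

SIZE = 31
C = SIZE // 2

# Three LGN receptive fields aligned along a vertical line.
centers = [(C, C - 4), (C, C), (C, C + 4)]

tuning = {angle: simple_cell_response(bar_image(SIZE, angle), centers)
          for angle in (0, 30, 60, 90)}

for angle, resp in tuning.items():
    print(f"bar at {angle:2d} deg from vertical: response {resp:.3f}")
```

In this sketch a vertical bar covers the centers of all three subunits at once, so their responses add. A tilted or horizontal bar covers only the middle subunit's center and falls in the inhibitory surround of the other two, so the summed response drops. Real simple cells have additional structure (for example OFF subregions) that this sketch leaves out.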

Object Recognition in the Inferior Temporal Cortex (IT)

  • What pathway: Ventral stream.
  • IT cortex is critically involved in neural responses that correspond to familiar objects.
  • Hierarchy of processing: V1 -> V2 -> V4 -> IT.
  • Low-level structural representations in V1.
  • Mid-level structural representations (contours) in V2/V4.
  • High-level structural representations (object recognition) in IT.
  • Synthesis of information about form, color, and depth.
  • Neurons in IT respond poorly to simple stimuli (spots, lines).
  • Large receptive fields, integrating information from larger retinal regions.
  • Responses tend not to change when an object moves or changes size within the receptive field (position and size invariance, which contributes to viewpoint-invariant recognition).
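One way to see how a large receptive field could give a position-tolerant response is to pool over many local detectors. The sketch below is an assumption-laden illustration: taking the maximum over positions is a modeling choice borrowed from hierarchical models of the ventral stream, and these notes do not say that IT neurons compute it. The names match_score, invariant_response, and place, and the CROSS and SCRAMBLED patterns, are hypothetical. Size invariance is not modeled here.

```python
def match_score(image, template, top, left):
    """Fraction of template pixels that equal the image patch at (top, left)."""
    h, w = len(template), len(template[0])
    hits = sum(image[top + r][left + c] == template[r][c]
               for r in range(h) for c in range(w))
    return hits / (h * w)

def invariant_response(image, template):
    """Maximum local match over every position in the image.

    Stands in for a unit with a large receptive field that pools over
    many position-specific detectors (an assumed mechanism).
    """
    h, w = len(template), len(template[0])
    return max(match_score(image, template, top, left)
               for top in range(len(image) - h + 1)
               for left in range(len(image[0]) - w + 1))

def place(pattern, size, top, left):
    """Put a small pattern into an otherwise dark size x size image."""
    image = [[0] * size for _ in range(size)]
    for r, row in enumerate(pattern):
        for c, value in enumerate(row):
            image[top + r][left + c] = value
    return image

# Preferred "object" and a scrambled version with the same number of lit pixels.
CROSS = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
SCRAMBLED = [[1, 1, 0],
             [0, 0, 1],
             [1, 0, 1]]

resp_a = invariant_response(place(CROSS, 9, 1, 1), CROSS)
resp_b = invariant_response(place(CROSS, 9, 5, 4), CROSS)
resp_scr = invariant_response(place(SCRAMBLED, 9, 3, 3), CROSS)

print("cross, upper left: ", resp_a)
print("cross, lower right:", resp_b)
print("scrambled pattern: ", resp_scr)
```

The pooled unit gives the same full response wherever the cross appears, and a weaker response to the scrambled pattern, which loosely parallels the observation that IT responses depend on the object and not on its exact retinal position.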

Damage to the Inferior Temporal Cortex: Object Agnosia

  • Visual object agnosia: Loss of ability to recognize familiar objects through vision.
  • Inferotemporal cortex is more active when humans look at objects than when they look at scrambled images.
  • Pattern of activation in IT determines what objects are perceived.
  • Different types of agnosia depending on the location of damage.
    • Some patients can recognize and order shapes but cannot identify matching shapes.
    • Others are unable to copy or understand letters and how they relate to one another.
  • Recognition loss is modality-specific.
    • Can identify a key by touch but not by sight.

Types of Object Recognition and Agnosia

  • Structural Mechanisms:
    • Based on identifying features associated with an object (e.g., a chair has four legs and a back).
  • Holistic Processing:
    • Recognizing objects by processing all features simultaneously in the correct configuration (e.g., face recognition).
  • Two types of agnosia:
    • One for structural mechanisms.
    • One for holistic processing (face processing).
  • Faces as special stimuli:
    • Small changes in the configuration of facial features allow us to identify individual faces.
    • Even when faces are constructed from other objects (e.g., vegetables), the configuration can trigger face recognition.

Face Inversion Effect

  • Faces are typically processed in an upright configuration.
  • Observers are less sensitive to the overall configuration of features when faces are inverted.
  • The inversion effect suggests that faces are processed differently from other objects.

Face Cells in the Monkey Inferior Temporal Cortex

  • Recording from single cells in the inferior temporal cortex.
  • Neurons respond to images of faces.
  • Greatest responses to faces with the correct configuration of features (eyes, nose, mouth).
  • Less response to scrambled or obscured faces.
  • Responses are typically greatest for images that look most like faces.

fMRI Responses in the Human Fusiform Face Area (FFA)

  • fMRI shows greater response to faces compared to scrambled images in a region of the temporal cortex.
  • Fusiform face area (FFA) is particularly responsive to faces.
  • Some debate about whether FFA is specific to faces or is an expert object recognition area.
    • Some evidence suggests that the FFA is also activated by images of objects in which the viewer has expertise.
    • More recent work suggests the FFA is strongly associated with faces.
  • Adjacent brain regions respond to objects, faces, body parts, and scenes.

Damage to the FFA: Prosopagnosia (Face Blindness)

  • Results in the inability to recognize faces.
  • Normal visual recognition of other objects.
  • The individual can still recognize people through other modalities (e.g., voice).

Recognition Memory for Objects and Natural Scenes

  • Good ability to recognize and remember scenes.
  • Good at identifying different scene categories.
  • Able to distinguish between a large number of examples or contexts of a given scene category.
  • Scene recognition may work somewhat like face recognition.
  • Natural scenes can be recognized with high fidelity.

Scene Memory and Object Memory

  • Parts of the brain used for processing scenes and objects are not the same.
  • Scene perception operates independently of object perception.
  • Patient DF: Profound visual agnosia for objects but able to recognize scenes.
  • Parahippocampal place area (PPA): A brain region important for recognizing different places.