Foundations of Visual Perception – Colour, Objects, Depth & Size

The Eye and Visual Pathways

  • The Eye: components involved in vision

    • Structures: iris, pupil, lens, cornea, retina (including fovea and blind spot), sclera, vitreous humour, optic nerve, eye muscles, ciliary muscle.

    • The retina as sense organ for vision; light enters the eye and strikes the retina; photoreceptors respond; signals are sent via nerve fibers toward the brain through the optic nerve.

    • Rods and cones: receptor types in the retina. Rods are more sensitive in low light; cones support color vision.

  • Path of light and neural signals

    • 1) Light enters the eye and reaches the retina.

    • 2) Rods and cones are stimulated by light.

    • 3) Impulses travel via retinal ganglion cells to the optic nerve and then to the brain.

    • Pigmented cells and ganglion cells contribute to signal processing before transmission to the brain; L-cones (long-wavelength sensitive cones) are part of color processing.

  • Visual field transmission to brain

    • Right visual field (right hemifield, RHF) projects to the left hemisphere, reaching the lateral geniculate nucleus (LGN) and then the visual cortex.

    • Left visual field (left hemifield, LHF) projects to the right hemisphere, reaching the LGN and then the visual cortex.

  • Visual processing hierarchy

    • Preprocessing by ganglion cells and LGN occurs before the primary visual cortex does more complex processing.

    • After the primary visual cortex, information is sent to temporal and parietal lobes for further processing (e.g., object recognition and spatial processing).

  • Conceptual framing for perception

    • How do we perceive the world? Consider:

    • Sense organs (e.g., eyes) and what information they gather (e.g., binocular disparity cues for depth).

    • Brain mechanisms to exploit that information (e.g., neural networks sensitive to disparity).

Colour Perception: Wavelengths, Hue, Brightness, and Saturation

  • Physical properties of light

    • Wavelength: distance between peaks of a light wave; different wavelengths correspond to different perceived colors.

    • Amplitude: related to brightness of the light.

    • Purity: relates to saturation of color.

    • Light visible to humans spans approximately 400–700 nm.

  • Visual spectrum relationships

    • Short wavelengths (≈ 400–500 nm): perceived as violet/blue.

    • Medium wavelengths (≈ 500–590 nm): perceived as green/yellow.

    • Long wavelengths (≈ 590–700 nm): perceived as orange/red.
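
The three bands above can be written as a small lookup. This is only a sketch: the function name `wavelength_band` is hypothetical, and the sharp cut-offs at 500 nm and 590 nm are the approximate boundaries from these notes, whereas perceived hue actually changes gradually.

```python
def wavelength_band(nm: float) -> str:
    """Map a wavelength in nanometres to the coarse band named in the notes."""
    if not 400 <= nm <= 700:
        return "outside the visible range"
    if nm < 500:
        return "short (violet/blue)"
    if nm < 590:
        return "medium (green/yellow)"
    return "long (orange/red)"


for nm in (450, 550, 650, 800):
    print(nm, "nm ->", wavelength_band(nm))
```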

  • Colour perception concepts

    • Hue: the qualitative color (what color it is).

    • Brightness (intensity): how bright the color appears.

    • Saturation: how pure the color is (how much white light is mixed in).

  • Light interaction with objects

    • Objects appear colored because they reflect certain wavelengths and absorb others.

    • Reflected wavelengths determine hue; brightness depends on the light reaching our eyes; saturation depends on the mixture with white light.

  • Additive vs subtractive colour mixing

    • Additive mixing (lights): combining colors of light (e.g., RGB) to produce new colors.

    • Subtractive mixing (paints): mixing pigments to absorb more wavelengths, producing other colors.
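
The difference between the two kinds of mixing can be sketched with idealized three-band colors. The assumptions here are strong and purely illustrative: each light or pigment is reduced to (R, G, B) values between 0 and 1, lights add channel by channel, and mixed pigments reflect a band only to the extent that both pigments reflect it. The function names are hypothetical.

```python
def additive_mix(a, b):
    """Add two lights channel by channel (R, G, B), clipping at full intensity."""
    return tuple(min(1.0, x + y) for x, y in zip(a, b))


def subtractive_mix(a, b):
    """Multiply two pigment reflectances: a band survives only if both reflect it."""
    return tuple(x * y for x, y in zip(a, b))


RED, GREEN = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
YELLOW, CYAN = (1.0, 1.0, 0.0), (0.0, 1.0, 1.0)

print(additive_mix(RED, GREEN))       # red light plus green light
print(subtractive_mix(YELLOW, CYAN))  # yellow pigment mixed with cyan pigment
```

In this toy model, red and green lights combine to yellow, while yellow and cyan pigments leave only the green band, because each pigment absorbs a band the other would have reflected.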

  • Theories of color vision

    • Trichromatic Theory

    • Proposes three types of cones in the eye: long-wavelength (red), medium-wavelength (green), and short-wavelength (blue).

    • Perceived color is determined by the relative activity levels across these three cone types.

    • Evidence: any visible color can be matched by adjusting proportions of red, green, and blue lights in color-matching experiments.

    • Supporting idea: RGB color mixing corresponds to cone type sensitivities.

    • Opponent Process Theory

    • Proposes color vision is encoded by opponent channels: red–green, blue–yellow, and black–white.

    • Explains phenomena that Trichromatic Theory struggles with, such as certain color afterimages and color blindness patterns.

  • Coloured afterimages

    • After prolonged exposure to a color, looking away at a neutral surface can produce an illusory afterimage in the opponent color (e.g., green after red, yellow after blue).

    • This supports the idea of opponent channels in color processing.

  • Reconciling theories

    • Both theories are needed:

    • The eye has three cone types (Trichromatic) for initial chromatic coding.

    • Neural processing in the retina, LGN, and cortex includes opponent channel dynamics (Red–Green, Blue–Yellow, Black–White).

    • The brain combines trichromatic input with opponent processing to yield perceived colors.
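
A two-stage account of this kind can be sketched as cone activities recoded into opponent signals. The weights below are illustrative assumptions, not measured physiological values, and the function name `opponent_channels` is hypothetical.

```python
def opponent_channels(l: float, m: float, s: float) -> dict:
    """Recode L, M, S cone activity into three opponent signals (illustrative weights)."""
    return {
        "red_green": l - m,              # positive: reddish, negative: greenish
        "blue_yellow": s - (l + m) / 2,  # positive: bluish, negative: yellowish
        "black_white": l + m,            # luminance-like signal
    }


# Hypothetical cone activities for a long-wavelength light.
print(opponent_channels(l=0.9, m=0.3, s=0.05))
```

Under these assumptions, a light that drives L-cones more than M-cones pushes the red–green signal positive, and a signal cannot be positive and negative at once, which is one way to picture why a "reddish green" is not experienced.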

Object Perception: Perceiving Objects Against Backgrounds

  • Challenges for perception and artificial systems

    • Computers struggle with object detection/recognition in complex environments.

    • Problems include occlusion/partial views, varying distance/orientation, and the fact that the retina provides only a partial view at any moment.

  • Key perceptual problems

    • Problem 1: Partial views due to occlusion or overlap (edge ambiguity, artificial edges).

    • Problem 2: Distance and orientation cause the same object to look different across views.

    • Problem 3: Perception often requires inferring information not directly available on the retina.

  • Gestalt perspective on organization

    • Gestalt Psychology emphasized that perception is an achievement of the whole, not reducible to summing parts.

    • Core idea: the whole is different from the sum of its parts.

  • Gestalt grouping laws (principles for perceptual grouping)

    • Law of Proximity: elements close to each other tend to be perceived as a group.

    • Law of Similarity: similar elements are grouped together.

    • Law of Common Fate: elements moving in the same direction are grouped.

    • Law of Prägnanz (simplicity/good figure): perceptual organization tends toward simplicity.

    • Law of Good Continuation: the eye tends to follow smooth paths.

    • Law of Familiarity: familiar patterns are more likely to be grouped.
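
The Law of Proximity is the easiest of these to turn into a rule. The sketch below groups dots on a line whenever neighbouring dots are no farther apart than a threshold; the function name `group_by_proximity` and the `max_gap` parameter are hypothetical, and the Gestalt laws themselves do not specify any particular threshold.

```python
def group_by_proximity(positions, max_gap):
    """Group 1-D positions: neighbours no farther apart than max_gap share a group."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close enough to the previous dot: same group
        else:
            groups.append([x])     # gap too large: start a new group
    return groups


dots = [0, 1, 2, 10, 11, 20]
print(group_by_proximity(dots, max_gap=2))
```

With these example dots the rule yields three groups: [0, 1, 2], [10, 11] and [20].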

  • Figure–ground segregation

    • How we separate an object (figure) from its background (ground).

    • Cues include: symmetry, closure, area, orientation, and meaning.

    • Figures tend to have symmetry and closure; figures tend to occupy a smaller area relative to the ground; perceived meaning also influences status as figure.

  • Real-world relevance and critiques

    • Gestalt laws work well for many examples but not all; some situations reveal the limitations of these rules.

    • Criticisms include post-hoc explanations (they describe perception after the fact rather than predicting it) and difficulties in applying to new, unseen scenarios.

  • Examples in real imagery

    • Real-life images illustrate how Gestalt laws interact with figure–ground segmentation, sometimes producing perceptual ambiguities.

Depth Perception: How We Perceive Distance and Depth

  • The challenge of depth from 2D retinal images

    • The world is 3-D, but retinal images are 2-D; depth cues help recover distance and structure.

  • Depth cues: monocular and binocular

    • Binocular cues rely on the two eyes and include binocular disparity and convergence.

    • Monocular cues can be used with one eye and include accommodation, motion parallax, interposition, relative size, linear perspective, texture gradients, height in the plane, and more.

  • Binocular disparity

    • Objects project to slightly different locations on the left and right retinas.

    • The closer the object is to the fixation point, the smaller its retinal disparity.
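
The relationship between disparity and distance from fixation can be written out under simplifying assumptions: the fixation point and the object both lie straight ahead, angles are small, and the symbols I (separation between the eyes), D_fix (fixation distance) and D_obj (object distance) are introduced here for illustration only.

```latex
\alpha(D) \approx \frac{I}{D}, \qquad
\delta = \alpha(D_{obj}) - \alpha(D_{fix})
       \approx I \left( \frac{1}{D_{obj}} - \frac{1}{D_{fix}} \right)
```

Here α(D) is the angle between the two lines of sight to a point at distance D, and δ is the disparity of the object relative to fixation. As D_obj approaches D_fix, δ shrinks toward zero, and its sign indicates whether the object is nearer or farther than the fixation point.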

  • Convergence

    • A binocular distance cue based on the inward turning of the eyes when fixating near objects; the nearer the object, the more convergence is required.

    • As distance increases, the required convergence angle decreases.
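
The geometry behind this cue can be sketched for a target straight ahead of the observer. The eye separation of 0.063 m is an assumed typical adult value, and the function name `convergence_angle_deg` is hypothetical.

```python
import math


def convergence_angle_deg(distance_m: float, eye_separation_m: float = 0.063) -> float:
    """Angle between the two lines of sight when both eyes fixate a point straight ahead."""
    return math.degrees(2 * math.atan(eye_separation_m / (2 * distance_m)))


for d in (0.25, 0.5, 1.0, 5.0):
    print(f"{d:>5} m -> {convergence_angle_deg(d):.2f} deg")
```

The angle falls off quickly with distance, which is consistent with convergence being most informative for near objects.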

  • Accommodation

    • The curvature of the lens changes to focus on objects at different distances; more accommodation indicates nearer objects.

  • Motion parallax

    • When the head or observer moves side to side, nearer objects move more across the retina than distant objects.
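
A rough way to quantify this: for a stationary object directly to the side of an observer moving in a straight line, the object's angular speed across the visual field is approximately the observer's speed divided by the object's distance. The function name and the example speeds below are hypothetical.

```python
def retinal_angular_speed(observer_speed_m_s: float, distance_m: float) -> float:
    """Approximate angular speed (rad/s) of a stationary object directly to the side."""
    return observer_speed_m_s / distance_m


walking_speed = 1.5  # metres per second, an assumed value
for distance in (2.0, 20.0, 200.0):
    print(distance, "m ->", retinal_angular_speed(walking_speed, distance), "rad/s")
```

Each tenfold increase in distance reduces the angular speed tenfold, so the difference in image motion carries information about relative depth.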

  • Occlusion (interposition)

    • Near objects occlude parts of farther objects, providing depth information.

  • Relative size and linear perspective

    • Relative size: if two objects are the same size, the one that appears smaller is inferred to be farther away.

    • Linear perspective and converging lines provide cues to depth and distance.

  • Texture gradients and height in the plane

    • Texture gradients: the elements of a textured surface appear smaller and more densely packed with distance, which signals depth.

    • Height in the plane: objects higher in the visual field (up to the horizon) are usually perceived as farther away.

Size Perception: How We Judge Object Size

  • Visual angle as a determinant of perceived size

    • An object’s size on the retina (image size) relates to its real size and distance.

    • Visual angle θ is approximately given by $\theta \approx \frac{S}{D}$ for small angles, where S is the actual size and D is the distance to the object.
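
The small-angle formula can be compared with the exact geometry for an object viewed head-on. Both functions return radians; the function names and the 1.8 m example are hypothetical.

```python
import math


def visual_angle_rad(size: float, distance: float) -> float:
    """Exact visual angle of an object viewed head-on, in radians."""
    return 2 * math.atan(size / (2 * distance))


def visual_angle_small(size: float, distance: float) -> float:
    """Small-angle approximation theta ~ S / D, in radians."""
    return size / distance


# Hypothetical example: a 1.8 m tall person seen from 2 m, 20 m and 200 m.
for d in (2, 20, 200):
    print(d, "m:", visual_angle_rad(1.8, d), visual_angle_small(1.8, d))
```

The two values agree closely once the object is far away relative to its size, and diverge when it is close, which is why the approximation is stated for small angles only.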

  • Distance-size relationship

    • Two objects of equal physical size subtend different visual angles at different distances.

    • Two objects of different sizes can subtend the same visual angle if they are at different distances.

  • Eclipse example illustrating size constancy limits

    • During an eclipse, the Sun and Moon appear the same size in the sky even though they are vastly different in actual size and distance, because they subtend roughly equal visual angles: $\theta_{Sun} \approx \theta_{Moon} \approx 0.5^{\circ}$.

    • Distances involved (approximate values from the source):

    • Moon distance: $D_{Moon} \approx 2.45 \times 10^{5} \text{ miles}$

    • Sun distance: $D_{Sun} \approx 9.3 \times 10^{7} \text{ miles}$

    • Sun diameter: $D_{Sun,diam} \approx 8.654 \times 10^{5} \text{ miles}$

    • Moon diameter: $D_{Moon,diam} \approx 2.20 \times 10^{3} \text{ miles}$
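
Plugging the listed values into the small-angle relation (diameter divided by distance, then converted from radians to degrees) gives a quick check of the half-degree figure. The results are rounded and depend on the approximate values quoted above.

```latex
\theta_{Sun} \approx \frac{8.654 \times 10^{5}}{9.3 \times 10^{7}}
             \approx 0.0093 \text{ rad} \approx 0.53^{\circ}
\qquad
\theta_{Moon} \approx \frac{2.20 \times 10^{3}}{2.45 \times 10^{5}}
              \approx 0.0090 \text{ rad} \approx 0.51^{\circ}
```

The Sun is roughly 400 times wider than the Moon but also roughly 400 times farther away, so the two ratios nearly cancel.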

  • Law of Size Constancy

    • We tend to perceive the physical size of objects as constant, despite changes in retinal image size due to distance changes.

    • However, size constancy can be violated by certain illusions and contextual cues.

  • Illusions related to depth and size

    • Moon illusion: horizon moon appears larger than the moon high in the sky.

    • Apparent Distance Theory explains this by suggesting that objects at the horizon appear farther away; because the visual angle stays the same, a greater apparent distance yields a larger perceived size.

    • Evidence: Kaufman & Rock showed that the horizon moon looked about 1.3 times larger when viewed with terrain present than when terrain was masked.
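
The reasoning behind Apparent Distance Theory is often summarized as size-distance scaling: perceived size is proportional to visual angle multiplied by perceived distance. The sketch below only illustrates that logic. The function name is hypothetical, the distances are in arbitrary units, and the assumption that the horizon moon is registered as 1.3 times farther away is made up to match the 1.3 ratio in size, not a reported measurement.

```python
def perceived_size(visual_angle_rad: float, perceived_distance: float) -> float:
    """Size-distance scaling: at a fixed visual angle, perceived size grows with perceived distance."""
    return visual_angle_rad * perceived_distance


MOON_ANGLE = 0.009  # roughly 0.5 degrees, in radians

zenith = perceived_size(MOON_ANGLE, perceived_distance=1.0)   # arbitrary units
horizon = perceived_size(MOON_ANGLE, perceived_distance=1.3)  # hypothetical: registered as 1.3x farther
print(horizon / zenith)
```

The visual angle is identical in both calls; only the assumed distance differs, and the perceived size scales with it.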

  • Ames Room and perceptual distortions

    • Ames Room demonstrates how context and room geometry distort perceived size and position of people in the room when viewed through a viewing hole.

Summary and Connections

  • Core idea: We visually perceive colors, objects, depth, and size by combining information from retinal images with prior knowledge and multiple visual cues.

  • Perception is often a best-guess reconstruction rather than a direct readout of the world, integrating multiple cues and interpretations.

  • Theories across colour, object, depth, and size perception often require complementary explanations (e.g., Trichromatic and Opponent Process in colour vision; Gestalt tendencies in organization; depth cues combining monocular and binocular information).

  • Real-world relevance and implications

    • Understanding perceptual processes informs user interface design, safety, and the interpretation of visual information in real environments.

    • Illusions reveal the constructive nature of perception and the brain’s reliance on contextual cues to infer reality.

  • Philosophical/practical takeaway

    • Perception as interpretation highlights the brain’s role in constructing reality, with practical implications for how we design environments and interpret sensory information.