Lecture Notes on Visual Perception, Visual Search, and Perceptual Biases (Vocabulary)

Cone types and color vision

  • Humans have three cone types critical for normal color vision
    • S-cones peak at 420\,\text{nm} (blue)
    • M-cones peak at 534\,\text{nm} (green)
    • L-cones peak at 564\,\text{nm} (red)
  • The three-cone system explains color perception across the visible spectrum; missing a cone type leads to various forms of color blindness (referenced as discussed last week)
  • Raw spectral sensitivity vs. cone-driven sensitivity:
    • The raw sensitivity function peaks at 498\,\text{nm} (dotted line on the plot)
    • In practice, any given wavelength excites all three cone types to different extents; for example, light at around 450\,\text{nm} would evoke strong blue cone response, some green, and very little red
  • Key takeaway for exams (bare bones):
    • There are three cone types; normal color vision requires all three
    • A given wavelength excites the cones to varying degrees; you don’t need to perform detailed calculations of each cone’s contribution for course purposes
  • Practical implication: color blindness arises when one cone type is missing or nonfunctional
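The "any wavelength excites all three cones to different extents" point can be sketched numerically. A minimal toy model, approximating each cone's sensitivity as a Gaussian centred on the peak wavelengths above; the Gaussian shape and the 60 nm bandwidth are illustrative assumptions, not real cone fundamentals:

```python
import math

# Peak wavelengths from the notes; the Gaussian shape and bandwidth
# below are simplifying assumptions for illustration only.
CONE_PEAKS_NM = {"S": 420.0, "M": 534.0, "L": 564.0}
BANDWIDTH_NM = 60.0  # assumed standard deviation

def cone_responses(wavelength_nm):
    """Relative excitation of each cone type for a monochromatic light."""
    return {
        cone: math.exp(-((wavelength_nm - peak) ** 2) / (2 * BANDWIDTH_NM ** 2))
        for cone, peak in CONE_PEAKS_NM.items()
    }

# Light at ~450 nm: strong S (blue) response, moderate M (green),
# weak L (red) -- the qualitative pattern described above.
resp = cone_responses(450)
print({k: round(v, 2) for k, v in resp.items()})
```

As expected, the 450 nm light drives S most strongly, M moderately, and L only weakly; no detailed calculation of this kind is required for the course.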

Visual field, eccentricity, and field extent

  • Visual field is the spatial extent you can see, described in degrees of visual angle
  • Eccentricity = how far a location is from the fovea (the point you’re directly looking at)
  • Example: with a thumb held at arm’s length, flicking it over to the left by about 45{-}50^\circ gives a typical eccentricity reference
  • Horizontal field extent varies across people: roughly 180{-}190^\circ (subject to individual differences and glasses/contacts)
  • Vertical field extent is typically 130{-}140^\circ
  • Pupillary distance (interpupillary distance) varies widely and influences horizontal field extent:
    • Example PDs: 65\,\text{mm} (mid-range) vs 52\,\text{mm} (smaller)
    • Closer eyes (smaller PD) can reduce horizontal field; wider separation (larger PD) can increase it
  • Visual field measurements are size- and distance-independent (degrees of visual angle) rather than literal centimeters at a distance
  • Horizontal vs vertical field differences and individual anatomy mean there is no single perfect number for everyone
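Because the field is described in degrees of visual angle rather than centimeters, the relevant quantity is the angle an object subtends at the eye. A minimal sketch of that computation; the 2 cm thumb width and 57 cm arm's length are illustrative assumptions, not figures from the notes:

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size
    viewed at a given distance (same units for both)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Rule-of-thumb check: a ~2 cm wide thumb at ~57 cm (arm's length)
# subtends roughly 2 degrees of visual angle.
print(round(visual_angle_deg(2.0, 57.0), 2))
```

This is why visual field measurements are size- and distance-independent: doubling both the object's size and its distance leaves the angle (nearly) unchanged.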

Photoreceptor distribution and density across the retina

  • The retina contains a central region (fovea) with high cone density; cone density falls with eccentricity but never drops to zero across the retina
  • The fovea is the region of the retina with the highest cone density; there are no rods in the fovea
  • Rod distribution:
    • Rod density increases from the fovea and peaks around 20^\circ to 30^\circ away from the fovea
    • This supports better scotopic (low-light) vision and peripheral sensitivity
  • Peripheral photoreceptors and edge density:
    • There is a slight increase in photoreceptors at the far edge of the retina in some cases
  • Important concepts linked to limitations:
    • Photoreceptor density is not uniform; this nonuniformity underlies why performance varies across the visual field and drives certain perceptual biases
  • The optic disc (blind spot): where axons exit the retina; the brain fills in this gap so you don’t notice a blank spot
  • Practical takeaways for vision tasks: density differences explain why acuity is highest in the center and lower peripherally, and why low-light or motion tasks rely more on rods

Peripheral color, limits of strict template theory, and feature-based processing

  • Peripheral vision does carry color information; color processing is not limited to the fovea
  • Template theory vs. feature-based representation:
    • Template theory posits that the brain stores a large set of templates for every possible object/version, which is energetically costly and inefficient
    • Feature-based (or deconstructive) representation: early visual areas code features (e.g., orientation, color, terminations) rather than whole objects; objects are constructed from features in higher-level cortex
  • Why the brain favors feature-based representations:
    • It is computationally efficient and scalable; supports robust object recognition under occlusion and variation
  • Adaptation and after-effects:
    • Adaptation fatigues neurons representing a stimulus; the brain’s representation shifts, producing aftereffects
    • In this course, adaptation effects yield a predictable negative after-effect (a shift in the direction opposite to the adapting stimulus)
  • Implications for artificial vision:
    • Computer vision systems trained on biased data can inherit human-like biases; this can lead to misidentifications, especially across different racial groups
  • The “two streams” idea and integration with perception:
    • Visual processing involves dorsal (where/how) and ventral (what) streams; they interact rather than being completely separate
  • Relevance to questions and exams:
    • Expect discussion of why early visual processing uses features and how adaptation reveals feature-level encoding in early visual areas

Object recognition, agnosias, and the biology of perception

  • Agnosias: failures to identify objects, with partial or complete forms (e.g., prosopagnosia for faces, or other agnosias)
  • Genetic vs. acquired causes:
    • Agnosias can be caused by brain damage or by genetic variation/malformation; they exist on a spectrum of severity
  • Object reconstruction in the brain:
    • The brain combines early visual features into more complex representations in higher-level cortex
    • The basic premise for exams: object reconstruction is generally correct in everyday function, but there are known failures
  • Caveat and future topic:
    • The full, nuanced account is provided by feature integration theory (anticipate coverage in next week’s recorded lecture)
    • If curious about failures in feature conjunctions or integration, reach out for more readings
  • Real-world relevance:
    • Case discussions in class illustrate how perception can differ across individuals and contexts
    • The potential for bias and misinterpretation in both humans and AI systems is a practical concern in real-world applications

Visual search and attention: pop-out vs conjunction search

  • Visual search is a classic method to study attention and cognition
  • Pop-out (feature) search:
    • Target defined by a single diagnostic feature (e.g., color)
    • Independent of set size: reaction time does not scale with the number of distractors
    • Example: find the red item among a set of non-red distractors
    • Interpretation: parallel search; the brain can extract the feature across the entire scene without serial inspection
  • Conjunction search (multiple features):
    • Target defined by a conjunction of features (e.g., color and shape)
    • Reaction time increases with set size; serial processing is required to combine features
  • Set size effects and what they imply about representation:
    • A flat reaction-time slope with set size in pop-out suggests pre-attentive, feature-level processing across the visual field
    • Positive slopes in conjunction search imply serial, attention-demanding processing and integration of multiple features
  • Practical relevance for driving and real-world tasks:
    • Real-world scenes require combining multiple cues (color, orientation, motion) to locate objects of interest
    • The efficiency of feature search supports the idea that some information can be processed rapidly without eye movements
  • A brief digression on task demands and information economy:
    • The brain often represents only the information necessary to complete a task (minimize information and energy expenditure)
    • Task demands modulate how long you need to look and what information is essential
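The set-size effects above are usually summarized as reaction-time slopes. A minimal linear sketch; the slope values (0 ms/item for pop-out, 40 ms/item for conjunction) and the 450 ms baseline are illustrative assumptions in the range often reported, not course data:

```python
# Toy linear model of search reaction time: RT = baseline + slope * set_size.
POPOUT_SLOPE_MS = 0.0        # flat slope: parallel, pre-attentive search
CONJUNCTION_SLOPE_MS = 40.0  # positive slope: serial, attention-demanding search
BASE_RT_MS = 450.0           # assumed baseline reaction time

def predicted_rt(set_size, slope_ms):
    """Predicted reaction time (ms) for a display with set_size items."""
    return BASE_RT_MS + slope_ms * set_size

# Pop-out RT stays flat as displays grow; conjunction RT climbs with set size.
for n in (4, 8, 16, 32):
    print(n, predicted_rt(n, POPOUT_SLOPE_MS), predicted_rt(n, CONJUNCTION_SLOPE_MS))
```

Plotting RT against set size and comparing the two slopes is exactly the logic used to infer parallel versus serial processing from search data.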

Perceptual biases, heuristics, and the psychology of everyday vision

  • Perceptual biases and heuristics are shortcuts the brain uses to cope with a complex world
  • They are neither inherently good nor bad; they are energy-saving strategies that can be beneficial or lead to errors
  • Satisficing and inference:
    • The brain tends to use the most likely interpretation given prior experience rather than reconstructing the world from first principles
    • Unconscious inference (Helmholtz) and the likelihood principle explain why you perceive certain interpretations as more probable
  • Common perceptual rules and when they fail:
    • Proximity, similarity, closure, good continuation, and common fate guide grouping and perception of coherent objects
    • Proximity: closer elements tend to be grouped as a unit
    • Similarity: elements with shared features (e.g., luminance) group together
    • Closure: we perceive complete shapes even when contours are incomplete
    • Good continuation: we expect contours to continue smoothly behind occluders
    • Common fate: objects moving together are perceived as a group
    • Prägnanz (simplicity): the brain prefers the simplest interpretation of a complex scene
  • Illusions that reveal bias in cue integration:
    • Ambiguous and impossible figures (e.g., the reversing Necker cube, Escher’s stairs) show how cues can mislead when structures are manipulated
    • Light-from-above assumption: we assume light sources come from above; this shapes interpretation of shadows and depth
    • Oblique effect: people are more sensitive to horizontal and vertical orientations; perception is less precise for oblique angles
  • Faces and upright bias:
    • We are particularly good at recognizing upright faces; upside-down faces disrupt usual recognition (e.g., Thatcher illusion)
    • Thatcher illusion shows how flipping internal facial features disrupts perception when the face is upright, but is less noticeable when inverted
  • Semantic vs syntactic violations (scene grammar):
    • Semantic violations: objects in contexts where they are not plausibly placed (e.g., toilet paper in a dishwasher) – still physically possible
    • Syntactic violations: objects in physically implausible positions (e.g., toilet paper floating in midair) – violates physical constraints
    • These violations reveal learned scene grammar and expectations about where objects belong
  • Scene grammar and environment expectations (Melissa Võ, scene semantics):
    • People have lifetime experience with natural scenes; some objects are semantically constrained to certain environments
    • Violations reveal the brain’s learned priors about scene structure and object placement

Scene grammar, environment plausibility, and real-world implications

  • Semantic violations describe objects in inappropriate environments (e.g., a toilet paper roll in a dishwasher) while keeping physical possibility
  • Syntactic violations describe physically implausible placements (e.g., rolled toilet paper floating in midair)
  • The point: scene grammar reflects learned expertise about typical environments and how objects should appear
  • Practical takeaway:
    • Our perception relies on long-term experience to infer plausible scene structure; violations help reveal the underlying priors
  • Everyday examples discussed:
    • A water bottle on a professor’s head is physically possible but contextually unlikely; because it occupies a physically plausible location in the scene, it is treated as a semantic violation rather than a syntactic impossibility
  • The role of context in perception:
    • Knowledge of scenes and semantics shapes interpretation even when features are ambiguous
    • Cats in unusual places can violate scene grammar more readily than humans; the brain uses prior knowledge to judge plausibility

Real-world and logistical notes pertinent to coursework

  • Visual Search Lab and relevance to research reports:
    • The visual search tasks (e.g., cat among owls) relate to attention and to what you will write about in Research Report 1
  • Course structure and assessment guidance:
    • WPQs (weekly practice questions) are open-book; they are meant to guide study, not to preview exam questions
    • Exam questions will focus on lecture content and slides; textbook content unrelated to lectures may not be tested
    • Key terms list released after week 6 provides guidance on important material
  • Study strategy guidance (for students):
    • Use lecture content and slides as primary sources for exams
    • If textbook material appears, treat it as supplementary context unless explicitly tied to exam content
  • Discovery Labs and research opportunities:
    • Discovery Labs are due by the 25th (approx. one week from the date of the talk); two labs are required for credit
    • Labs produce group data used for class-wide analyses; you can revisit labs after the due date for review
  • Research reports:
    • Two lab-style reports; the first due on October 1
    • Structure: describe the stimulus/task clearly, report findings, discuss patterns and how group data compare with personal data
    • No stats are required; focus on understanding and interpretation
    • An opportunity to revise and resubmit the first report in mid-to-late October for a fresh mark
  • Getting involved in research:
    • Begin with lab websites and lab posters to identify interest areas
    • When emailing a PI, show you’ve looked at their work and propose a thoughtful discussion rather than a generic request
    • Use the Research Opportunity Program (ROP) application window (opens in February) to join labs for summer and subsequent terms
    • The department uses SONA for participant recruitment; consider joining studies as a participant to gain firsthand research experience
  • Practical tips for communications with faculty:
    • Avoid casual chatty language; provide specifics about interests aligned with the lab’s work
    • Do not propose a project outright; express interest and readiness to engage in ongoing work
  • Additional notes on course logistics (brief):
    • Slides are provided in a single format (pre-lecture and post-lecture updates with WPQ answers)
    • The Discovery Labs guide contains red-highlighted critical instructions and troubleshooting tips; read it carefully
    • If you miss a break or arrive late, ask during the break to pick up a handout or resource
    • The instructor emphasizes a balance between curiosity, rigor, and the practicalities of research engagement

Quick recap of the week’s big ideas

  • Perceptual biases and heuristics: shortcuts the brain uses to manage the abundance of information
  • Unconscious inference and the likelihood principle: perceptual priors built from lifetime experience influence what we see
  • Gestalt grouping rules: proximity, similarity, closure, good continuation, common fate, and the Prägnanz principle guide how we perceive scenes as coherent wholes
  • Object perception is a two-stage process: early feature representations feed higher-level object representations
  • Visual search reveals how feature-based and conjunction-based processing work, and how task demands shape information needs
  • Scene grammar and semantic/syntactic violations reveal learned expectations about where objects belong and how scenes should be structured
  • The relationship between perception and action: speed-accuracy tradeoffs and the information needed to perform tasks (e.g., driving scenarios)
  • The importance of recognizing biases in both human and machine vision systems and the social implications (e.g., bias in AI and real-world applications)
  • Practical coursework and research pathways to deepen understanding through labs, reports, and active engagement with faculty and labs