Class Notes
1/22/26
World → distal stimulus
Retinal Image → proximal stimulus
What is the function of vision? To establish internal representations of the external world such that we can successfully interact with it (the world).
The function of any sensory system is to establish internal representations of the external world, based on some physical source of information that reflects some aspects of the world so that we can successfully interact with it.
Psychometric function is different from person to person (threshold)
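The idea that the psychometric function (and its threshold) differs across observers can be sketched with a logistic curve; the functional form and parameter values here are illustrative, not from lecture:

```python
import math

def psychometric(intensity, threshold, slope):
    """Logistic psychometric function: P(detect) rises from 0 to 1 with
    stimulus intensity; 'threshold' is the 50%-detection intensity."""
    return 1.0 / (1.0 + math.exp(-slope * (intensity - threshold)))

# Two hypothetical observers with different thresholds
p_a = psychometric(intensity=5.0, threshold=5.0, slope=2.0)  # at threshold
p_b = psychometric(intensity=5.0, threshold=7.0, slope=2.0)  # below threshold
```

The same stimulus intensity yields 50% detection for observer A but far less for observer B, whose threshold is higher.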
1/29/26
We tend to use rods for night vision.
The rod system has higher sensitivity
Works better in low light
Acuity → Ability to discriminate fine detail
Ability to distinguish fine detail, depending on how small it is
In bright light, the cone system outperforms the rod system (rods saturate)
Cones adapt pretty quickly and then bottom out
Oguchi’s disease (congenital) → No (functional) rods
Receptive Field → a property of the cell, but it is defined by location within the visual field
Every visual sensory neuron has a receptive field
part of the visual field
That part of the visual field to which a given neuron is sensitive
High acuity; low sensitivity
Small receptive fields so there is no ambiguity with regard to A alone, B alone, or A plus B (high acuity)
But any given ganglion cell is unlikely to get activated under low-light conditions (low sensitivity)
Ganglion cells are good at detecting edges
RGCs with center-surround RFs are edge detectors
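Why a center-surround receptive field acts as an edge detector can be shown with a 1D toy version (excitatory center minus inhibitory surround; values are made up):

```python
def center_surround(signal, i):
    """Toy center-surround operator: excitatory center minus the
    average of the inhibitory surround (1D sketch of an RGC RF)."""
    center = signal[i]
    surround = (signal[i - 1] + signal[i + 1]) / 2.0
    return center - surround

luminance = [1, 1, 1, 1, 5, 5, 5, 5]  # a step edge between index 3 and 4
responses = [center_surround(luminance, i) for i in range(1, len(luminance) - 1)]
# responses: [0.0, 0.0, -2.0, 2.0, 0.0, 0.0]
```

The cell's output is zero over uniform regions and changes only where the luminance edge falls in its receptive field.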
02/03/26
Neurofibers leave the back of the eye → that is where the blind spot is
The primary visual pathway (geniculostriate) and secondary visual pathway (retinotectal) both start at the retina
PVP
Evolutionarily newer
underlies conscious perception
SVP
Evolutionarily older
mainly unconscious processing
Primary visual cortex (V1)
Retinotopic representation
The spatial relations are maintained. Establishing a spatial map of the cortex
Better defined in the earlier processing fields than the later processing fields.
Cortical magnification
The amount of cortical tissue dedicated to a region of the visual field (the fovea gets disproportionately more cortex)
Multiple maps & Increasing receptive field size
Taking the cortex and making it flat.
Each one of the V’s is a separate map of the visual field. (Multiple maps of the visual field slide)
The receptive field size gets larger as it goes on up from V1.
Functional selectivity
When neurons respond more strongly to some visual feature or property than to others, that cell effectively codes for the presence of that visual attribute at that particular location in the visual field
Fusiform Face Area (FFA) → is activated during face perception because it contains lots of individual neurons that are selective for configurations of stimuli corresponding to faces ( like RGCs are selective for edges)
(Rough) Functional selectivity
V1 → basic features
V4 → color, curvatures, and simple shapes
IT/TE and LOC → complex form
FEF/LIP → spatial attention, saccade control
V1 selectivity
Retinal ganglion cells are selective for edges
Firing rate changes when there is an edge in their RF, and does not change when there is not
The tuning curve describes selectivity for a given neuron
The tuning function describes the characteristics of that one cell
V1 cells have orientation selectivity
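A tuning curve for an orientation-selective V1 cell is often modeled as a Gaussian over orientation; the bandwidth and firing rate below are illustrative:

```python
import math

def tuning(theta, preferred, bandwidth=20.0, max_rate=50.0):
    """Gaussian orientation tuning curve: firing rate (spikes/s) peaks at
    the preferred orientation and falls off with angular distance."""
    d = min(abs(theta - preferred), 180 - abs(theta - preferred))  # wraps at 180 deg
    return max_rate * math.exp(-(d ** 2) / (2 * bandwidth ** 2))

peak = tuning(90, preferred=90)   # preferred orientation -> maximal rate
off = tuning(45, preferred=90)    # 45 deg away -> much weaker response
```

The curve describes the selectivity of that one cell: strong response at the preferred orientation, weak response far from it.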
Information processing through selective convergence
A specific subset of RGCs converging onto a common V1 cell can create an orientation-specific V1 cell
Selective adaptation
Increased threshold for adapted orientation and NOT other orientations
Means there are mechanisms selective for that orientation
Humans adapt selectively to specific spatial frequencies as well as specific orientations.
Human Contrast Sensitivity Function (CSF)
The yellow region shows the part of that space where we can see edges and light information over space.
It is determined by underlying (measurable) spatial-frequency channels → these are essentially filters.
It tells us about
Acuity → the smallest spatial detail that can be resolved (depends on contrast)
Sensitivity → the lowest contrast that can be perceived (depends on spatial frequency)
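The bandpass shape of the CSF can be sketched with a toy function (the form and the ~4 cycles/deg peak are illustrative, not measured values):

```python
import math

def csf(freq, peak_freq=4.0, gain=100.0):
    """Toy contrast sensitivity function (bandpass): sensitivity rises with
    spatial frequency, peaks near peak_freq (cycles/deg), then falls.
    Threshold contrast is the reciprocal of sensitivity."""
    return gain * (freq / peak_freq) * math.exp(1 - freq / peak_freq)

mid = csf(4.0)    # near the peak of the curve
low = csf(0.5)    # coarse pattern: lower sensitivity
high = csf(30.0)  # fine pattern: much lower sensitivity (acuity limit)
```

Both very low and very high spatial frequencies need more contrast to be seen than frequencies near the peak.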
2/5/26
Human Contrast Sensitivity Function (CSF) → a space of stimulation that the visual system can deal with.
Its shape tells us that sensitivity (the lowest detectable contrast) depends on spatial frequency, and that acuity depends on contrast
Can be defined without reference to each other
Spatial Frequency and Orientation
defined by orientation, spatial frequency, and contrast
Square-wave gratings → set of superimposed spatial frequencies at increasing spatial frequency and decreasing contrast
More complex than sine waves
To the visual system, it is very complex
Single sine waves are filtered by a single frequency channel
Fourier analysis → can decompose any 2D image into a sum of component sine waves (spatial frequency, contrast, orientation)
Bandpass-filters → only a range of spatial frequencies is passed through
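The claim that a square wave is a sum of sine waves at increasing spatial frequency and decreasing contrast is just its Fourier series (odd harmonics with 1/n amplitudes):

```python
import math

def square_wave_approx(x, n_harmonics):
    """Fourier synthesis of a square wave: sum of odd-harmonic sines with
    amplitude falling as 1/n (higher frequency, lower contrast)."""
    total = 0.0
    for k in range(n_harmonics):
        n = 2 * k + 1  # odd harmonics only: 1, 3, 5, ...
        total += math.sin(n * x) / n
    return (4 / math.pi) * total

coarse = square_wave_approx(math.pi / 2, 1)    # fundamental alone: a sine
fine = square_wave_approx(math.pi / 2, 200)    # many harmonics: near 1.0
```

With more harmonics the sum converges on the ideal square wave, which is why a square-wave grating stimulates many spatial-frequency channels while a single sine wave stimulates essentially one.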
Internal representations of the retinal image (related to the external world) in V1 is the pattern of activity across spatial-frequency and orientation
The same edge at high contrast and at low contrast produces different firing rates (spikes/sec) in the same cell
A low-contrast edge at the preferred orientation elicited the same response as a high-contrast edge at a less-preferred orientation. [Ambiguity in single-cell activity]
Edges at very different non-preferred orientations can produce the same response. [Ambiguity in single-cell activity]
Population Coding → representation in terms of patterns of activity across multiple cells (populations) with different selectivities reduces ambiguity
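The ambiguity-reduction point can be made concrete: a single Gaussian-tuned cell responds identically to two orientations flanking its preferred one, but the pattern across a small population distinguishes them (tuning parameters are illustrative):

```python
import math

def rate(theta, preferred, bandwidth=20.0, max_rate=50.0):
    """Gaussian orientation tuning (orientation wraps at 180 deg)."""
    d = min(abs(theta - preferred), 180 - abs(theta - preferred))
    return max_rate * math.exp(-(d ** 2) / (2 * bandwidth ** 2))

preferred = [0, 45, 90, 135]  # a small population with different selectivities

def population_response(theta):
    return [round(rate(theta, p), 2) for p in preferred]

# One cell is ambiguous: the 90-deg cell fires identically to 70 and 110 deg
single_70 = rate(70, 90)
single_110 = rate(110, 90)
# ...but the population patterns differ, resolving the ambiguity
pattern_70 = population_response(70)
pattern_110 = population_response(110)
```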
Whether a response reflects high or low contrast (versus a change in orientation) can only be resolved from how the population of cells responds to the stimulus
2/10/26
V2
Each visual field is retinotopically mapped
They are coding for different types of visual attributes
Needs edge information in the receptive field
V4
Individual V4 neurons are selective for curvature
If it gets curvature at the right amount, it will change its firing rate to the highest amount on the curve
Inner part is the on part and the outer part is the off part
it is receiving connections from a set of V1 cells (neural signals in image (purple bells)).
they connect within the area to other cells and begin to form other shapes
shape defined by population coding across curvature-selective cells
can help see more complex shapes
shapes are made up by a bunch of curves (A,b,c,d,e,f in Hershey kiss looking image)
IT/ITE
Faces
Complex-form selectivity
far in the visual processing hierarchy
Kobatake and Tanaka (1994) monkey
coding for configural representations
they can show each of the three features but misconfigured
found to code for complex things
population coding
MT/MST
motion selectivity
MT is the critical one
MT is coding for motion without the need for edges
“Pure” motion, no orientation edge needed
displays the motion without the need for an edge in its receptive field
MT/MT+
compared when having a moving stimulus, no edge, vs a stationary stimulus
Frontal Eye Fields (FEF) and Lateral Intraparietal Area (LIP) → Visually Guided Eye Movements and Attention
Functionally Distinct Pathways
Dorsal stream → V1 to parietal cortex
Extension of rod system (and magnocellular pathway of LGN)
Ventral stream → V1 to temporal cortex
Extension of cone system (and parvocellular pathway of the LGN)
Early What vs Where Evidence
Ungerleider & Mishkin (1982)
Double dissociation → one of two functions is damaged without harm to the other, and vice versa
Object discrimination → “what” task
Landmark discrimination → “where” task
Monkey with the ventral lesion was impaired with the what task but not impaired with the where task
Monkey with the dorsal lesion was impaired with the where task but not impaired with the what task
Later What vs How evidence
Milner & Goodale (1991)
Two tasks
Perceptual matching (What)
Posting (How)
Two patients
Ventral Damage (DF)
Dorsal Damage (RV)
Two Tasks
Reaching (How)
Same/different (What)
Same/Different Task (What)
DF was worse than RV who was equal to control
Reaching Task (How)
RV was worse than DF who was equal to control
The patient with dorsal damage (RV) performed poorly
The patient with ventral damage (DF) performs as well as controls
Specific Selectivities Reflect Ventral/Dorsal Functions
Retina → edges (cones, rods)
LGN (parvocellular, magnocellular)
V1 (parvo, magno) → basic visual features
V2 (parvo, magno) → basic visual features in context
V4 → color, curvature
Temporal areas (IT/TE) → complex shapes, faces/configurations
Ventral stream → What (object recognition)
MT → motion
LIP/FEF → eye movements, visually guided grasping
Dorsal Stream → How? (visually guided action)
2/17/26
Object Perception
Perceptual Organization → Processes by which representations of image-based information (proximal stimulus) are transformed into representations that reflect scene structure (distal stimulus)
proximity, similarity, enclosure, symmetry, closure, continuity, connection, figure & ground
Components of Perceptual Organization:
Represent edges (image information)
Represent uniform regions bounded by edges (image information)
Different luminance levels reach the eye from different parts of the scene because of light reflecting off of surfaces with different reflectances
Mosaic → still an image-level representation
Border ownership/ figure vs. ground/ relative depths (beyond the image)
Edges separate regions
Not explicit, it has to be inferred
V2 has cells that are selective for specific border ownerships
Function of selectivity for border ownership
Distinguish figure from (back)ground/ Assign relative depth
Completion- representing “inferred” parts of the scene (beyond the image)
group similar lines together
continue aligned edges (even if dissimilar)
enclose edges to define contiguous regions
relatable edges are completed; unrelatable edges are not
X-junction → transparency and different depths
T-junction → occlusion and different depths
L-junction → adjacent at same depth
Object Recognition → processes by which visual representations are matched to amodal semantic representations in memory
Scene Processing → understanding objects and their relations in context
Interdependence of components → completion depends on assigned border ownership
[Input] Image-based representations (e.g., luminance over space) → Perceptual Organization → [Output] Scene-based representations (e.g., surfaces and their spatial relations)
2/19/26
Recall:
Perceptual Organization → Processes by which representations of image-based information (proximal stimulus) are transformed into representations that reflect scene structure (distal stimulus)
Object Recognition → processes by which visual representations are matched to amodal semantic representations in memory

Theories of object recognition try to explain how that matching process occurs
Template theories are intuitive
Template Models
Point-for-point matching of input against stored representation (“lock and key”)

Problems with this simple template theory? → We would need an infinite number of templates in memory to account for human object recognition capabilities.
The problem of Invariance
A successful object recognition system must be able to recognize an object across different points of view (and other variability in context)
Template-Matching processes are used for a lot of applications where viewpoint can be controlled, and the number of to-be-identified “objects” is limited.
Computer-vision systems (self-driving cars) are increasingly sophisticated template-matching systems…but still template matching.
Template models with extensive image-normalization processes and exposure to massive image sets (for defining the templates) are working for increasingly complex and dynamic applications
Point-for-point matching works (logically) because the visual representation and the memorial representation are the same format and can be compared…point-for-point
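A minimal point-for-point ("lock and key") matcher makes both the idea and its invariance problem visible; the tiny binary "images" below are made up for illustration:

```python
def template_match(image, template):
    """Point-for-point matching: score is the fraction of pixels that
    agree. Requires image and memory template to share the same format."""
    matches = sum(1 for a, b in zip(image, template) if a == b)
    return matches / len(template)

T = [0, 1, 1, 0,
     1, 0, 0, 1]                         # stored template (flattened 2x4 image)
same_view = [0, 1, 1, 0, 1, 0, 0, 1]     # identical viewpoint
shifted = [1, 0, 1, 1, 0, 1, 0, 0]       # same pattern shifted by one pixel

exact = template_match(same_view, T)     # perfect score
moved = template_match(shifted, T)       # score collapses: no invariance
```

A one-pixel shift destroys the match, which is why simple templates would need a separate template for every viewpoint.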
Characteristics of Human Vision that are not Well Explained by Template Theories
Viewpoint invariance
Robust against image degradation
Memory representations cannot depend on sensory modality (templates do)
Recognition is vast and fast
Incredibly reliable…we don’t do this (as much)
Scene Processing → understanding objects and their relations in context
[Input] Image-based representations (e.g., luminance over space) → Perceptual Organization → [Output] Scene-based representations (e.g., surfaces and their spatial relations)
Components of Perceptual Organization:
Represent edges (image information)
Represent uniform regions bounded by edges (image information)
Border ownership/ figure vs. ground/ relative depths (beyond the image)
Completion- representing “inferred” parts of the scene (beyond the image)
Ambiguity and Best Guesses about Organization
It is more likely that two lines cross than that two angles happen to abut, but it’s not impossible that two angles abut.
So, perceptual inference based on likelihood is separate from cognitive inference.
Structural Description Models (alternative to template models)
Object representations are descriptions in terms of the nature of constituent parts and the spatial relations between those parts

Hands are represented as a specific set of parts and their spatial relations
Structural Description Models do three important things:
1. Provide efficiency of representation (like alphabet to words) allowing us to represent many distinct objects
2. Solve the comparison of representation problem (apples-to-apples instead of apples-to-oranges)
3. Solve the problem of viewpoint invariance by defining parts on the basis of viewpoint invariant properties (this needs more explanation)
Parsing Image into Parts
The structural description process has to unfold based on image information
It cannot depend on knowing what the object parts are (4 fingers and a thumb)
The visual system uses matched concavities in the image to parse it (break it apart) and represent it as a set of component parts
Notice that parsing is perceptual organization
Why concavities?
When multiple 3D components join together, they often create concave boundaries in their 2D projection
Concavities in 2D images are therefore useful cues to 3D part boundaries
Parsing at concavities…Image-based process → does not depend on knowing what the parts are → allows us to establish structural descriptions of novel objects
Identifying the Parts
A relatively small set of parts provides efficiency of representation and recognition. Like letters in an alphabet.
26 letters
More than 1,000,000 words
Infinite number of sentences
What are the parts?
Recognition by Components (RBC). A specific structural description model
Parts are represented based on viewpoint invariant properties
Visual properties that remain constant in the 2D retinal image across (most) viewpoints of the 3D object
A solution to the challenge of viewpoint invariance in object recognition
3D curvature projects 2D curvature (except for a single accidental point of view). 3D straight projects 2D straight
So if the image is curved, the visual system infers that the object is curved. If the image is straight, the visual system infers that the object is straight
Cotermination (a viewpoint invariant property) on the image is perceived as co-termination in the world (even though sometimes it’s not)
The default assumption is that things are being viewed from a non-accidental viewpoint
2/24/26
Recognition by Components (RBC) → A specific structural description model
Objects are represented as sets of parts and their spatial relations
Addresses the perceptual ←→ memorial representation problem
Parts are defined based on viewpoint invariant properties
Addresses the challenge of viewpoint invariance


Each of these is an example of a different part (geon) that can be used to create different representations.

These different geons are reliably distinguishable from each other from different points of view.
Object representations consist of combinations of geons with specific spatial relations. (like words are combinations of letters in specific orders).

Cup → parts: {5, 3}. spatial relations: 5 is on the side of 3
Bucket → parts: {5, 3}. spatial relations: 5 is on top of 3
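The cup/bucket example is naturally expressed as a data structure: same parts, different spatial relations. A sketch (the dict layout and relation labels are our own, not from RBC):

```python
# Hypothetical structural descriptions: objects as a set of parts (geon
# ids) plus spatial relations among them, in the spirit of RBC.
cup = {"parts": {5, 3}, "relations": [("side-of", 5, 3)]}
bucket = {"parts": {5, 3}, "relations": [("top-of", 5, 3)]}

def same_parts(a, b):
    """True when two descriptions use the same set of parts."""
    return a["parts"] == b["parts"]

def same_description(a, b):
    """True only when parts AND spatial relations both match."""
    return same_parts(a, b) and a["relations"] == b["relations"]
```

Cup and bucket share the same parts yet are distinct objects, exactly because the relations differ, like the same letters ordered differently in two words.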
Structural descriptions (lists of parts and their spatial relations) serve as a common representational format for perception and memory
apples to apples
Structural Description Models
do three important things
Provides efficiency of representation allowing us to represent many distinct objects (like alphabet to words)
Solve the comparison of representation problem (apples-to-apples instead of apples-to-oranges)
Solve the problem of viewpoint invariance by defining parts on the basis of viewpoint invariant properties (e.g., in RBC)
Facial Recognition
Visual Agnosia → can’t identify non-face objects but can recognize faces
Prosopagnosia → can identify non-face objects but can’t recognize faces
Structural descriptions of faces do not help discriminate between individuals
Thatcher Effect → Why does the altered one look so much weirder when it is right-side up?

Faces are processed (more) holistically (than non-face objects).
“Holistically” means that recognition depends more on the representation of relations between parts or configurations than on parts.

Whole-object advantage for detecting difference (Tanaka & Farah 1993)
Only holds for faces: houses won't engage normal face-recognition processes, but faces will.
Upside-down faces should not engage normal face-recognition processes to the same extent that upright faces do
The whole-object advantage occurs only for upright faces - upside-down faces are not faces to the visual system
Inversion Effect → Evidence that faces are processed differently
Face recognition is impaired more by inversion than non-face object recognition is.
Fusiform Face Area (FFA) & Parahippocampal Place Area (PPA)
PPA → area that is selectively activated by images of places
FFA → area that is selectively activated by images of faces

Both areas are in the temporal cortex (ventral stream)
These areas reflect different types of processing (part-based versus holistic) rather than different categorical functionality (places versus faces)
The processing of upside-down faces is dominated by part-based processing
An inverted face is not treated by the visual system as a “face”… so “regular object” part-based processes dominate …each part is essentially fine here
The processing of right-side up faces is dominated by holistic processing
A right-side up face is processed as a face …so holistic processes dominate … the relations between parts are incongruous.
The “face” is detected when it is at the orientation of normal faces

Alignment supports the perceptual completion of the two halves into a single object. The single object is a face and is therefore processed holistically, making the individual component faces difficult to represent separately. For this (albeit weird) task, holistic processing is a problem.
The difference in difficulty between aligned and misaligned stimuli should be significantly reduced when the faces are turned upside down, because inversion makes them less likely to engage holistic processing, and holistic processing is what causes the greater difficulty for aligned faces
Are faces special?
Sort of. Experts tend to show increased FFA activity when looking at examples of the category for which they are expert. (Fusiform Expertise Area)
Perceptual expertise often involves shifting from more part-based processing to holistic processing
Own-Race Effect is Real
It’s about experience
The inversion effect is greater for own-race faces than other-race faces
Differential experience leads to differential engagement of holistic processing (expertise)
Scene processing → understanding objects and their relations in context
We extract the “gist” of scenes extremely quickly (the “clap when you see water” example)
We do this by using global image (proximal stimulus) properties to coarsely categorize scenes (representation of distal stimulus) of different types.

All of this involves inference.
Depth and Size
The function of vision → is to establish internal representations of the external world, such that we can successfully interact with it.
Retinal images are ambiguous with regard to size and depth
Retinal images are measured in terms of visual angle.
The same object projects a smaller retinal image at further distances.
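The visual-angle relationship, and the size/depth ambiguity it creates, follows from simple geometry (theta = 2·atan(size / 2·distance); the numbers below are arbitrary):

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle subtended by an object of a given size at a given
    distance (same units for both): theta = 2 * atan(size / (2 * distance))."""
    return math.degrees(2 * math.atan(size / (2.0 * distance)))

near = visual_angle_deg(size=1.0, distance=2.0)
far = visual_angle_deg(size=1.0, distance=4.0)       # same object, smaller image
big_far = visual_angle_deg(size=2.0, distance=4.0)   # larger object, same image as 'near'
```

The same object projects a smaller angle when farther away, and a larger, farther object can project exactly the same angle as a smaller, nearer one: the retinal image alone cannot disambiguate size from depth.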

Notice that retinal images are ambiguous with regard to shape too because of orientation in depth.


2/26/26
Perceiving size

Oculomotor cues
Accommodation
The depth cue is that the brain can register the state of the muscles that control lens thickness.
Convergence
The depth cue is that the brain can register the state of the muscles that control the angle of the eyes.
Cues based on the retinal image
Monocular cues: you don’t need information from both eyes; one eye is sufficient.
Static cues:
Position-Based Cues
Partial occlusion:
When one thing is in front of another and blocks the object behind it (occlusion), this is a cue that the occluded object is farther away than the occluding object.
Relative height:
Natural images have multiple cues that the system must integrate in some way.
Relative height in image. Depth information.
Size-Based Cues
Relative size

Familiar size

You are familiar with the sizes of the coins, so you know their physical sizes differ, even though in this image they project the same image size
Texture gradients
Linear perspective
Lighting-Based Cues
Atmospheric perspective
Shading
You can convince yourself of a different lighting direction, and it will change the depth/shape perception
Cast Shadows
Many aspects of cast shadows carry information about depth

Dynamic Cues
Motion parallax
The magnitude of speed difference between two objects is metrically related to the distance between them.
Optic flow
Is the change in the optic array over time (We use it a lot for guiding our action)
The dynamic (changing) optic array
Image that is projected to the retina
Optic flow is separate from object perception
Deletion and accretion
Cues can work together motion and cast shadows
Binocular cues
Binocular disparity
Corresponding points are defined relative to the fovea
Horopter: The set of locations in the world that project to the corresponding points. It defines a surface of zero disparity.
Only exists in the relationship between the images in the two eyes
Direction of disparity indicates direction from the horopter
Uncrossed disparity: perceived as farther than horopter
Images move away from the fovea nasally (toward the nose)
You would have to uncross (diverge) your eyes to fixate the object
Crossed disparity: perceived as closer than the horopter
Images move away from the fovea temporally (toward the ear)
You would have to cross (converge) your eyes to fixate the object
Binocular disparity
The difference in the relative position of the image of a single object (or edge) on the two retinae
Stereopsis
Perceiving depth from binocular disparity
About 7% of the population are stereoblind
3/3/26
Binocular cue
Only exists in the relationship between the images in the two eyes
Binocular disparity
The difference in the relative position of the image of a single object (or edge) on the two retinae
Stereopsis
Perceiving depth from binocular disparity
Corresponding points are defined relative to the fovea
Horopter
The set of locations in the world that project to corresponding points on the two retinae. It defines a surface of zero disparity.
Direction of disparity indicates direction from the horopter.
Uncrossed disparity
Perceived as farther than horopter
Crossed disparity
Perceived as closer than the horopter
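The crossed/uncrossed geometry can be captured with the standard small-angle approximation for angular disparity (interocular distance and viewing distances below are illustrative; the sign convention is ours):

```python
def disparity_rad(interocular, fixation_dist, object_dist):
    """Small-angle approximation of angular disparity (radians):
    delta ~= I * (1/z_object - 1/z_fixation).
    Positive = crossed (object nearer than the horopter),
    negative = uncrossed (object farther), zero = on the horopter."""
    return interocular * (1.0 / object_dist - 1.0 / fixation_dist)

I = 0.065  # ~6.5 cm interocular distance, in meters
near_obj = disparity_rad(I, fixation_dist=2.0, object_dist=1.0)   # crossed
far_obj = disparity_rad(I, fixation_dist=2.0, object_dist=4.0)    # uncrossed
on_horopter = disparity_rad(I, 2.0, 2.0)                          # zero disparity
```

The sign of the disparity tells the system which side of the horopter the object is on, and its magnitude grows with distance from the horopter.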
Binocular Disparity → Neurophysiology

The Correspondence Problem → How does the visual system “know” which image in the right eye corresponds to which item in the left eye?
Feature (color, shape, and image size) matches?
Image features are definitely used, but cannot be the whole story
The Wallpaper illusion (“magic eye”)
Occurs when the correspondence problem is solved “incorrectly”.

Inference-like process: In order for an object to project the same size image as another object from a greater distance, it must be a larger object.

Selective (Visual) Attention
Processing one source of visual information while ignoring others.
Compare identical stimulus conditions under different task conditions.
Visual attention is selective visual processing
Selection in space and time
What happens to selected information?
Neural basis of selective processing
Scene perception and the fate of the unattended
3/5/26
Attention can be captured
Initial eye-movements often go to the non-target additional singleton, but capture depends on control settings. If searching for a singleton is not the optimal strategy, singletons don’t capture attention.
Attentional guidance is determined by more than bottom-up (stimulus driven) and top-down (goal driven) factors.
Humans are social animals. Attention is guided by others’ (overt) attention.
Attentional guidance is understood in terms of an internal map that prioritizes locations for selection based on multiple sources of input.
Priority map integrates information based on salience (bottom-up), task relevance (top-down), and other attributes (e.g., search history and value).
Then the attention is guided to peaks in order of activation level (highest to lowest)
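The priority-map idea — a weighted combination of bottom-up salience, top-down relevance, and selection history, with attention visiting peaks from highest to lowest — can be sketched directly (locations, values, and weights are all made up):

```python
def priority_map(salience, relevance, history, weights=(1.0, 1.0, 0.5)):
    """Toy priority map: weighted sum of bottom-up salience, top-down
    task relevance, and selection history at each location."""
    ws, wr, wh = weights
    return [ws * s + wr * r + wh * h
            for s, r, h in zip(salience, relevance, history)]

def attention_order(priority):
    """Locations are selected in order of priority, highest peak first."""
    return sorted(range(len(priority)), key=lambda i: -priority[i])

salience = [0.9, 0.1, 0.4, 0.2]   # e.g., a bright singleton at location 0
relevance = [0.0, 0.2, 0.8, 0.1]  # the task goal favors location 2
history = [0.0, 0.0, 0.0, 0.6]    # location 3 was recently rewarding
p = priority_map(salience, relevance, history)
order = attention_order(p)
```

With these weights the task-relevant location wins over the salient singleton, illustrating how top-down settings can outvote bottom-up capture.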

Selection in Time → Metaphor for understanding limitations of temporal selection.
Experience can improve temporal selection
Emotional stimuli (especially negative ones) capture our attention and induce an attentional blink
Enhanced activity (basically gain control)
Retinotopically organized enhanced activity in V1 corresponding to cued locations.
Cells with receptive fields at cued locations respond more strongly than cells with receptive fields at uncued locations.
Identical input yet different neural response under different cueing conditions.
You can see retinotopic response changes in visual cortex (V1) (attending to spatial locations).
Enhanced activity of specific types of processing
Recall that functional selectivity is an attribute of cortical processing.
Functionally-specific (objects not locations) changes in visual cortex.
Biased Competition → A theory of selection at a neural level
Stimuli compete for (neural) representation, and attention biases that competition in favor of one thing or another.
Attention changes (biases) population activity. MT/MST (human fMRI)
3/10/26
We have mechanisms to support selective processing because the processing capacity of the visual and cognitive system is limited.
Inattentional Blindness
Knowing that we are susceptible to missing things doesn’t prevent us from missing things.
This is because missing things is a consequence of selective processing… and selective processing is a necessary state of the system given limited processing capacity.

Global Image Information
Axes are defined by image features
Spatial frequency (openness) and edge orientation (expansion)
Scenes cluster by semantic type
Ensemble Perception (perceiving summary image-statistics)
Examples of summary statistics from natural images that people can reliably report:
Gaze direction, family resemblance, size, orientation, hue,
motion direction and speed, heading direction, face expression
Visual Attention
Visual processing has to be selective because processing capacity is limited
Attentional guidance is influenced by both bottom-up (stimulus-driven) and top-down (goal-based) factors, as well as by aspects of an individual’s own selection history → we understand guidance through the construct of a priority map.
Different lab-based tasks that are used to study selective processing (e.g., cueing, RSVP, visual search) have revealed how selected information is processed differently from unselected information
The fate of unattended stimuli is significant… we miss more information than we think we do
There are many ways in which selective processing is embodied at a neural level
Some aspects of scene perception are “unselective” and contribute to guidance of selective processing
Part 3
3/24/26
The study of color vision is a microcosm of vision science
The function of vision
To establish internal representations of the external world, such that we can successfully interact with it.
Reflectance and spectral reflectance
Lightness = perceived reflectance (psychological) → shades of gray
Is a (perceptual) conclusion
Color = perceived spectral reflectance
Surface Reflectance = the proportion of light that a surface reflects (physical)
The inverse optics problem for lightness
Illumination → intensity of incident light (physical)
Luminance → intensity of light at the eye - retinal input (physical)
Reflectance → proportion of incident light that a surface reflects (physical)
The luminance at the eye depends on both the surface’s reflectance and the intensity of the incident light, so reflectance must be inferred
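The inverse optics problem for lightness comes from one multiplication: luminance = illumination × reflectance, so one retinal input is consistent with many scenes. A sketch with made-up intensities:

```python
def luminance(illumination, reflectance):
    """Physics of the proximal stimulus: luminance at the eye is the
    product of incident light intensity and surface reflectance."""
    return illumination * reflectance

# Inverse-optics ambiguity: two different scenes, identical retinal input
bright_light_dark_surface = luminance(100.0, 0.2)
dim_light_light_surface = luminance(25.0, 0.8)
```

A dark surface in bright light and a light surface in dim light can send exactly the same luminance to the eye, which is why lightness must be a perceptual conclusion rather than a direct measurement.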
Simultaneous Contrast
Lateral inhibition explanation of simultaneous contrast: Higher luminance surround suppresses more than lower luminance surround
What matters is to which regions/surfaces the gray squares seem to belong - lightness depends on perceptual organization
More perceptual organization
Edge types are important cues about the illumination conditions
Different edges come from different things
Scene cues about edge type provide information about the illumination conditions
Lightness difference is stronger when the edge is perceived as an illumination edge than when it is perceived as a reflectance edge
The lightness difference increases as the cues about edge type lean increasingly toward them being illumination edges (rather than reflectance edges)
Inference about the structure of the scene
Color → perceived spectral reflectance
The visual system’s conclusion as to what proportion of light a surface reflects as a function of wavelength
The nature of specific light (from a source or at the eye) can be described in terms of its power spectrum: intensity as a function of wavelength
White light (such as from the sun) is light that contains all wavelengths in more-or-less equal proportions (Flat power spectrum)
Flat Power Spectrum → describing white light
Measurements of intensity of specific wavelengths produced by different light sources (their power spectra)
Surfaces have different reflectance profiles
The proportion of light that a surface reflects as a function of wavelength
Spectral reflectance
When reflectance profiles are flat, we talk about lightness (perceived reflectance because it’s constant across wavelength)
When reflectance profiles are not flat, we talk about color (perceived spectral reflectance)
(Light at source): Power spectrum → intensity of incident light as a function of wavelength (physical)
(Surface reflectance): Spectral reflectance → proportion of incident light that a surface reflects as a function of wavelength (physical)
(Light at eye): Power spectrum → intensity of light at the eye as a function of wavelength at the eye - retinal input (physical)
Changes to the illuminant vs changes to the (surface) reflectance
Additive color mixing → mixing lights; changes the power spectrum of the light
Subtractive color mixing → mixing pigments; changes the spectral reflectance of the surface
Additive color mixing (lights) Red + Green = Yellow
Pigment absorbs more of the wavelengths, subtracting from the signal that reaches the eye
More wavelengths are added to the signal that reaches the eye
Additive - it relies on mixing of different wavelengths as they reflect off of different points of pigment
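The two kinds of mixing correspond to two different operations: power spectra add for lights, while spectral reflectances multiply for pigments (each pigment absorbs a further fraction at each wavelength). A coarse three-band sketch with made-up values:

```python
# Coarse 3-band "spectra" for illustration: [short, medium, long] wavelengths
red_light = [0.0, 0.0, 1.0]
green_light = [0.0, 1.0, 0.0]
# Additive mixing (lights): power spectra add -> medium + long = yellow
additive = [a + b for a, b in zip(red_light, green_light)]

yellow_pigment = [0.1, 0.9, 0.9]  # absorbs short wavelengths
blue_pigment = [0.9, 0.9, 0.1]    # absorbs long wavelengths
# Subtractive mixing (pigments): reflectances multiply -> mostly medium = green
subtractive = [round(a * b, 2) for a, b in zip(yellow_pigment, blue_pigment)]
```

Adding red and green lights leaves medium and long wavelengths in the signal (yellow), while mixing yellow and blue pigments subtracts both ends of the spectrum, leaving mostly medium wavelengths (green).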
3/26/26
Light → Power Spectrum (Spectral Power Function)
Surface → Spectral Reflectance Function
Light (with a given power spectrum) shines on surfaces (with a given spectral reflectance function) → their product describes the light (power spectrum) that reaches the eye.
First Step to Perceiving Color
Encoding the wavelength information at the eye
Spectral content of light & Spectral reflectance of surfaces both determine the light at the eye
Law of Three Primaries (psychophysics)
Given control over the intensities of three different primary light sources, any visible spectral color can be matched
Led to the hypothesis of 3 classes of photoreceptors, each with a different peak sensitivity.
Trichromacy (physiology)
Population coding → the pattern of activity across a population of cells.

Still ambiguity
Imagine a system with two classes of receptors with different spectral sensitivities
This system cannot tell certain physically different lights apart
Metamers
Pairs of stimuli that are perceptually identical but are physically different (have different power spectra)
530 + 680 light does not create 580 light
This system just can’t represent the difference between 530 + 680 and 580
The set of discriminable colors is determined by the number of cone types and their specific spectral sensitivities
Color blindness is an inability to discriminate between colors that “normal” (trichromatic) folks can because of fewer distinct cone classes
Color blind individuals simply have more metamers
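The 530 + 680 vs 580 metamer can be demonstrated numerically with a hypothetical two-receptor system (Gaussian sensitivities with made-up peaks and bandwidth): solve for mixture intensities that reproduce the pure light's receptor responses.

```python
import math

def sensitivity(wavelength, peak, width=80.0):
    """Toy Gaussian spectral sensitivity for one receptor class."""
    return math.exp(-((wavelength - peak) ** 2) / (2 * width ** 2))

def receptor_responses(spectrum, peaks=(440, 560)):
    """Response of each receptor class: sum over the light's wavelengths
    of intensity x sensitivity (a two-receptor system for illustration)."""
    return tuple(sum(i * sensitivity(w, p) for w, i in spectrum) for p in peaks)

# Solve the 2x2 linear system for intensities (a, b) of 530 nm + 680 nm
# lights that give the same responses as a unit-intensity 580 nm light.
m = [[sensitivity(530, 440), sensitivity(680, 440)],
     [sensitivity(530, 560), sensitivity(680, 560)]]
t = [sensitivity(580, 440), sensitivity(580, 560)]
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
a = (t[0] * m[1][1] - m[0][1] * t[1]) / det
b = (m[0][0] * t[1] - t[0] * m[1][0]) / det

mix = receptor_responses([(530, a), (680, b)])  # two spectral lines
pure = receptor_responses([(580, 1.0)])         # one spectral line
```

The two lights have different power spectra, yet the receptor responses are identical: the mixture does not create 580 nm light, the system simply cannot represent the difference.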

Red-green colorblindness is often caused by the M and L cones having peak sensitivities that are too close
Blue-yellow colorblindness is much less prevalent
Mantis shrimp have 16 classes of photoreceptors!
Wavelength is physical
Color is psychological
Does not exist in the external world
Is the interaction between wavelength and our particular visual system
Color Space (versus spectrum)
electromagnetic spectrum is linear
380 vs 780 nm are maximally different stimuli
Color space is circular
Color space is a perceptual (physiological) space, not a physical space
The color spindle - three dimensions
The Law of Complementarity (psychophysics)
For any spectral color there is a complementary spectral color such that when the two are combined, the result is white/gray
Worked out by Hering around the same time as the Law of Three Primaries
Hering hypothesized 3 opponent mechanisms, each coding an opponency relationship between a pair of colors (red-green, blue-yellow, black-white)
Trichromacy (neurophysiology) → photoreceptors
three cone types were confirmed definitively by the 1960s
later (1970s or so), color-opponent cells were discovered
Opponency → ganglion cells and beyond
Trichromacy at the level of photoreceptors and opponency in higher-order cells
Both types of physiology coexist: trichromacy at the level of the photoreceptors, opponency in higher-order cells