Why talk about eye movements in a course on perception?
Because it shows that perception is active, not passive
Eye movements:
six muscles are attached to each eye and are arranged in three pairs:
Inferior/superior/lateral/medial rectus
Inferior/superior oblique
Eye muscles are controlled
by 3 cranial nerves
Cranial nerves start in the brainstem and are controlled by several other nuclei for horizontal and vertical eye movements
Superior colliculus:
Structure in midbrain that plays important role in initiating and guiding eye movements
Target for retina
Cerebral cortex: Frontal & parietal (etc.) eye fields
Visual input → superior colliculus → controls eye movements
Six types of eye movements
Smooth pursuit
Saccade
Vergence eye movement
Fixational eye movements, microsaccades
2 more to keep the retinal image stable during (self-)motion
Smooth pursuit:
Eyes move smoothly to follow moving object
Ex. Eyes following a moving fly
Saccade:
Rapid movement of the eyes that changes fixation from one object or location to another
Ex. When reading, your eyes don't smoothly track each letter. Instead, they make a series of quick jumps called saccades, pausing briefly on each word or phrase to allow for processing
Vergence eye movements:
Type of eye movement in which two eyes move in opposite directions
Ex. Making a silly face.
Function of smooth pursuit eye movements:
keep object of interest stable and on the fovea
Ex. Move an object in front of a person and ask them to follow it with their eyes
Function of saccadic eye movements:
move (rotate) the fovea to the object of interest, as quickly as possible, to reduce the travel time during which vision is blurred (because photoreceptors are slow). → Ex. when reading, you want each word at high resolution
Yarbus (1967): scanpaths reveal intentions and interests.
3-4 saccades/sec
Thinner lines in the scanpath → the eye was moving quickly while analyzing the girl's face
Function of vergence movements:
looking at objects in depth so that retinal images are overlapping
Converging/ diverging movements
Stereovision
Done deliberately
Converging
Focus on Tree (Near Object)
The eyes are rotated inward, so they are converging.
This brings the nearby tree into focus.
Visual Perception:
The tree is clear and sharp.
The mountain is now blurry due to being outside the focal plane.
Neural Representation:
The tree is now centered on the fovea in both eyes.
Diverging
Focus on Mountain (Far Object)
The eyes are oriented outward, so they are diverging.
This allows both eyes to point toward the distant mountain.
Visual Perception:
The mountain appears sharp and in focus.
The trees (which are closer) appear blurry, because they are not on the focal plane.
Neural Representation:
The mountain's image falls on the fovea (the center of vision) in each eye.
The tree falls on a non-foveal part, leading to a blurrier image.
Why do we get seasick?
On a boat, especially when below deck:
Your body senses the motion of waves (vestibular system).
Your eyes don’t see the movement, especially if you're not looking outside.
Disagreement between vestibular system and vision
This makes seasickness common in passengers who aren’t watching the horizon.
Why don’t we get seasick from eye movements?
Spatial constancy
Spatial constancy:
the ability to perceive the world as stable and continuous despite eye movements.
Enables us to discriminate motion across the retina that is due to eye movements vs. object movements
Enables us to tell where things are
How do we perceive the world as stable?
Compensation theory:
Compensation theory:
Perceptual system receives information about the eye movement and discounts changes in retinal image that result from it
Motor system sends motor command to eye muscles
A copy of that command (“efference copy” or “corollary discharge”) goes to an area of the visual system that has been dubbed the “comparator”
Comparator compensates for image changes caused by the eye movement, inhibiting any attempts by other parts of the visual system to interpret changes as object motion
Corollary Discharge Pathway for Eye Movements
Frontal Eye Field (FEF) sends a motor signal to the eye muscles to initiate movement.
At the same time, it sends a corollary discharge signal (also called an efference copy) to a comparator in the brain.
The comparator receives:
The corollary discharge signal (predicting the eye movement).
The image movement signal from the visual cortex (actual sensory input).
This allows the brain to distinguish between:
Motion caused by the world (external changes).
Motion caused by your own eye movement (internal changes).
Helps you ignore the blur that results from moving your own eyes
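The comparator idea can be sketched as simple arithmetic. This is only an illustration under an assumed sign convention (rightward = positive, speeds in deg/s); the function name is hypothetical and nothing here claims to be how the brain actually computes it.

```python
def perceived_world_motion(retinal_motion_deg_s, efference_copy_deg_s):
    """Comparator sketch: add back the image motion the eye movement causes.

    A rightward eye rotation of +v deg/s sweeps the image of a stationary
    world across the retina at -v deg/s, so summing the two signals cancels
    self-generated motion and leaves only motion of the world.
    """
    return retinal_motion_deg_s + efference_copy_deg_s


# Stationary world, eye rotates right at 300 deg/s: image slips at -300 deg/s.
stationary = perceived_world_motion(-300.0, 300.0)

# Eye is still, object drifts right at 5 deg/s.
object_only = perceived_world_motion(5.0, 0.0)

# Eye pursues at 10 deg/s while the object moves at 15 deg/s in the world,
# so the image moves at 15 - 10 = 5 deg/s on the retina.
pursuit = perceived_world_motion(5.0, 10.0)

print(stationary, object_only, pursuit)
```

In all three cases the sum recovers the motion of the object in the world, whatever the eye is doing.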
But: compensation alone wouldn’t be precise enough.
Bayesian inference (e.g., Niemeier et al., 2003)
The brain achieves spatial constancy because it assumes a priori that the world is moving very little
Small movements in the world that coincide with saccades are ignored
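One way to picture the stationarity prior is a Gaussian sketch: a zero-centred prior on world displacement combined with a noisy measurement. The standard deviations below are made-up illustration values, and this is not claimed to be the model used by Niemeier et al.

```python
def posterior_displacement(measured_deg, noise_sd_deg, prior_sd_deg=0.5):
    """Posterior mean of world displacement under a zero-mean Gaussian prior.

    The noisier the measurement, the more the estimate is pulled toward
    the prior belief that the world did not move (0 deg).
    """
    prior_var = prior_sd_deg ** 2
    noise_var = noise_sd_deg ** 2
    weight = prior_var / (prior_var + noise_var)
    return weight * measured_deg


# Same 1 deg measured jump, with assumed noise levels:
during_fixation = posterior_displacement(1.0, noise_sd_deg=0.1)  # reliable signal
during_saccade = posterior_displacement(1.0, noise_sd_deg=2.0)   # unreliable signal

print(during_fixation, during_saccade)
```

With an unreliable signal around the saccade, the estimate shrinks toward zero, which matches the idea that small movements coinciding with saccades are ignored.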
How do we perceive the world as continuous? Why don’t we notice retinal smear during saccades?
Saccadic suppression
Saccadic suppression (of vision, including motion):
Reduction of visual sensitivity that occurs when one makes a saccadic eye movement; eliminates smear during an eye movement
In the plot, 0 ms marks the start of the eye movement; negative values are times before it and positive values are milliseconds after it
Measured sensitivity dropped below zero during the eye movement because observers perceived motion in the opposite direction
Sensitivity to motion is very poor during an eye movement. That is useful, because we don’t want to perceive the blur
I.e., there are times of “grey-out”
How do we perceive the world as continuous? Why don’t we notice short periods of blindness (“grey-out”) when we make a saccade?
Distorted time perception around the time of saccades
Euclidean geometry:
Parallel lines remain parallel as they are extended in space
Objects maintain the same size and shape as they move around in space
Which sense is governed by Euclidean geometry?
Touch because size doesn’t change
Problem for vision:
recover 3D info from 2D projections
Most depth cues can be derived from geometrical consequences of the projection
The two retinal images of a three-dimensional world are not the same!
Parallax
Binocular disparity:
The differences between the two retinal images of the same scene.
It is the basis of stereopsis, a vivid perception of the three-dimensionality of the world that is not available with monocular vision.
With both eyes (binocular vision), your brain fuses the two images and gives you a strong sense of depth and 3D structure—you can feel like objects are "popping out" or receding into space.
With one eye (monocular vision), you can still see some depth (using cues like size or perspective), but it’s not as vivid or accurate.
Our retinas are 2D projection surfaces.
The brain creates a 3D image from the projections.
Monocular depth cues vs. Binocular depth cues:
One eye sufficient vs. two eyes necessary
Binocular depth cues (from overlapping visual fields) provide:
Convergence
Stereopsis
Ability of two eyes to see more of an object than one eye
What is convergence in binocular depth perception?
Convergence is the inward turning of the eyes when focusing on a nearby object.
The brain uses the angle of convergence to estimate distance.
Greater convergence = closer object.
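The link between convergence angle and distance is plain triangle geometry. A rough sketch, assuming the object sits straight ahead on the midline and an interocular distance of 63 mm (an assumed typical value, not from these notes); the function names are hypothetical.

```python
import math


def vergence_angle_deg(distance_m, ipd_m=0.063):
    """Angle between the two lines of sight when fixating at distance_m."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))


def distance_from_vergence(angle_deg, ipd_m=0.063):
    """Invert the geometry: recover fixation distance from the vergence angle."""
    return (ipd_m / 2) / math.tan(math.radians(angle_deg) / 2)


near = vergence_angle_deg(0.25)  # reading distance
far = vergence_angle_deg(2.0)    # across the room

print(near, far, distance_from_vergence(near))
```

The angle shrinks quickly with distance, which is one reason convergence is mainly informative for nearby objects.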
What is stereopsis and how does it arise?
Stereopsis is the vivid perception of three-dimensional depth that arises from binocular disparity, the slight differences between the retinal images in each eye.
It is not available with monocular vision.
It's a key result of comparing images from both eyes
How does having two eyes improve object perception?
Two eyes can see more of an object than one eye because of their slightly different viewing angles.
This provides more complete visual information.
It also helps in detecting the shape, edges, and depth of objects better.
Occlusion:
A cue to relative depth order when, for example, one object obstructs the view of part of another object
Two types
Nonmetrical depth cue
Metrical depth cue
Nonmetrical depth cue:
provides information about depth order but not magnitude.
Metrical depth cues:
Provide quantitative information about distance
Monocular depth cue
Occlusion
Relative size
Position cue
Familiar size
Aerial perspective
Linear perspective
Motion cues
Relative Size:
A comparison of size between items without knowing the absolute size of either one
Monocular depth cue where we judge the distance of objects based on their apparent size relative to each other, even without knowing their actual size.
If two objects are known or assumed to be similar in size, the one that appears smaller is perceived as being farther away.
No need to know the real size—just compare them to each other.
Texture Gradient:
A monocular depth cue based on the geometric fact that items of the same size form smaller images when they are farther away
Relative Height:
Monocular depth cue
Objects at different distances from the viewer on the ground plane will form images at different heights in the retinal image
Euclid’s remoteness theorem
Euclid’s Remoteness Theorem states that more distant points on a surface below the eye (like a floor or ground plane) will appear higher in the visual image.
For example, segment BC (farther away) appears higher in the image than segment AB (closer), even though both lie on the same ground plane.
This principle helps the brain interpret depth and distance in flat 2D images using projection geometry.
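The remoteness theorem follows from projection geometry. A small sketch, assuming an eye 1.6 m above a flat ground plane and looking at the horizon; the distances for points A, B and C are made up for illustration.

```python
import math


def elevation_deg(ground_distance_m, eye_height_m=1.6):
    """Direction of a ground point relative to the horizon (negative = below)."""
    return -math.degrees(math.atan(eye_height_m / ground_distance_m))


# Hypothetical ground points at increasing distance from the viewer.
points = {"A": 2.0, "B": 4.0, "C": 8.0}
elevations = {name: elevation_deg(d) for name, d in points.items()}

for name, angle in elevations.items():
    print(name, round(angle, 1))
```

Every ground point projects below the horizon, and the farther the point, the closer to the horizon (that is, the higher in the image) it lands.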
Natural scene statistics
The visual system uses natural scene statistics—regularities found in the natural world—to interpret ambiguous visual input.
These expectations (like the ground being below us and receding into the distance) help guide depth perception and shape our assumptions about 3D space.
Scenes upside down look
Less deep
Familiar size:
depth cue based on knowledge of the typical size of objects
Absolute metrical depth cue vs. relative depth cues
What is the difference between absolute (metrical) and relative depth cues?
Absolute (Metrical) Depth Cues provide quantitative information about how far an object is (e.g., "2 meters away").
Example: Familiar size, motion parallax, convergence angle
Relative Depth Cues provide information about which objects are closer or farther, but not exact distances.
Example: Occlusion, relative size, linear perspective
Aerial perspective:
A depth cue that is based on the implicit understanding that light is scattered by the atmosphere
Reduction in contrast, saturation, hue ➔ cooler colours, blue
Example: Haze
Linear perspective:
A depth cue based on the fact that lines that are parallel in the three-dimensional world will appear to converge in a two-dimensional image
Vanishing point:
The apparent point at which parallel lines receding in depth converge
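Why parallel lines converge can be seen with a pinhole projection. A minimal sketch, assuming a focal length of 1 and two rails on the ground 1.5 m apart (all values hypothetical):

```python
def project(x, y, z, focal_length=1.0):
    """Pinhole projection of a 3D point; z is depth in front of the eye."""
    return (focal_length * x / z, focal_length * y / z)


def rail_gap(z, half_width=0.75, eye_height=1.6):
    """Projected horizontal gap between two parallel ground rails at depth z."""
    left = project(-half_width, -eye_height, z)
    right = project(half_width, -eye_height, z)
    return right[0] - left[0]


gaps = [rail_gap(z) for z in (2.0, 20.0, 200.0)]
print(gaps)

# Very far away, both rails project essentially onto the same image point.
print(project(0.75, -1.6, 1e9))
```

The projected gap scales with 1/z, so the rails approach a single image point, the vanishing point, as depth grows.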
3-point perspective:
discovered after the invention of photo cameras.
Foreshortening
Refers to the visual effect that an object or distance appears shorter than it actually is because it is slanted toward (or away from) the projection screen/retina/picture plane.
Raphael’s tricks
Linear Perspective (Yellow Oval)
The architecture uses linear perspective: all parallel lines converge toward a single vanishing point at the center of the image, behind Plato and Aristotle.
This draws your eye into the depth of the scene, enhancing 3D structure.
Relative Size (Red Arrow)
The philosopher under the red arrow appears smaller because he’s farther away.
Raphael uses relative size as a depth cue: people in the background are painted smaller to appear more distant.
Occlusion & Texture Gradient (Cyan Circles)
The figures in the foreground partially block those behind them (occlusion), indicating which people are closer.
You can also see a texture gradient: floor tiles and details become smaller and more compressed with distance.
Pictures are relatively robust to vantage point of the observer. But only to a certain point
Anamorphosis:
a distorted projection or perspective requiring the viewer to use special devices or occupy a specific vantage point to reconstitute the image.
Ex. The skull on the floor
Where monocular cues fail
Ames room
Here the depth cues are removed. The girl on the left is much further away, but the perspective cues are manipulated.
Only works for a single view point.
Motion cues: parallax in time
Motion parallax: the fact that objects moving at a constant speed relative to the observer sweep across the retina by a greater amount (i.e., faster) the closer they are to the observer
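A quick numerical sketch of motion parallax, assuming an observer translating sideways past stationary objects that are directly abeam, so the angular speed is roughly speed divided by distance. The function name and numbers are illustrative only.

```python
import math


def retinal_speed_deg_s(observer_speed_m_s, distance_m):
    """Angular speed of a stationary object directly abeam of a moving observer."""
    return math.degrees(observer_speed_m_s / distance_m)


walking = 1.0  # m/s, assumed
fence = retinal_speed_deg_s(walking, 2.0)   # nearby fence post
tree = retinal_speed_deg_s(walking, 20.0)   # distant tree

print(fence, tree, fence / tree)
```

The object ten times closer moves ten times faster across the retina, which is the information the cue relies on.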
The stereokinetic effect (SKE)
is a visual illusion where rotating 2D patterns, like nested circles, create the perception of three-dimensional depth.
Most scenes have multiple cues
Texture gradient
Relative height
Aerial perspective
Linear perspective
Accommodation and vergence
help eyes perceive depth
Accommodation:
Eye changes its focus
Monocular
Convergence:
Binocular but not stereo
Ability of the two eyes to turn inward; reduces the disparity of a feature to (near) zero
Divergence:
Binocular but not stereo
Ability of the two eyes to turn outward; reduces the disparity of the feature to (near) zero
Triangulation
is the process by which the brain determines the distance to an object by comparing the angles from each eye to that object.
It relies on binocular vision and the separation between the eyes (called the interocular distance).
The brain uses the angle of convergence and the difference in image position (binocular disparity) to calculate depth.
Binocular disparity
Differences between the images falling on the two retinas due to parallax
Stereopsis:
“Popping out in depth”
Stereopsis is the perception of depth that arises from the brain combining the slightly different images from each eye (known as binocular disparity).
It gives a vivid, 3D sense of the world.
Stereopsis is only possible with binocular vision (both eyes open and aligned).
It's a key result of the brain using triangulation based on disparity between the retinal images.
Most humans are able to see this way
How exactly does this translation from stimulus attribute to perception take place?
Images on Bob’s 2 retinas.
Bob fixates the red crayon, so its images fall on corresponding retinal points: points on the two retinas that lie at the same distance and in the same direction from the fovea. “Zero binocular disparity”.
The same happens to be true for the blue crayon.
Horopter:
location of objects in space whose images lie on corresponding points. The surface of zero disparity
Diplopia:
double vision for points outside the horopter (actually: Panum’s fusion area).
Panum’s fusion area:
region of space in front of and behind the horopter within which binocular single vision is possible.
What does zero disparity mean?
"Disparity" means difference.
There is no difference between where the object you are looking at lands on your left eye and right eye.
It hits the same place (the fovea) in both eyes → so: zero disparity.
Disparity provides info about distance from horopter.
Crossed disparity
Uncrossed disparity
Crossed disparity
Image shifts outward on both retinas
Object is closer / in front of the horopter
Uncrossed disparity
Image shifts inward on both retinas
Object is farther / behind the horopter
Absolute disparity:
A difference in the actual retinal coordinates in the left & right eyes of the image of a feature in the visual scene
Relative disparity:
The difference in absolute disparities of two elements in the visual scene
What is the difference between absolute disparity and relative disparity in binocular vision?
Absolute disparity: Difference in retinal position between an object and the fixation point; indicates how far an object is from where you're looking.
Relative disparity: Difference between the absolute disparities of two objects; indicates depth between objects, regardless of fixation.
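The difference between the two kinds of disparity can be checked numerically. A sketch under simplifying assumptions: all points lie on the midline straight ahead, the interocular distance is 63 mm, and positive values are treated as crossed disparity (this sign convention and the function names are choices made here, not part of the notes).

```python
import math

IPD_M = 0.063  # assumed interocular distance


def subtended_angle_deg(distance_m):
    """Angle between the two eyes' lines of sight to a midline point."""
    return math.degrees(2 * math.atan((IPD_M / 2) / distance_m))


def absolute_disparity_deg(object_m, fixation_m):
    """Positive = crossed (nearer than fixation), negative = uncrossed."""
    return subtended_angle_deg(object_m) - subtended_angle_deg(fixation_m)


def relative_disparity_deg(object_a_m, object_b_m, fixation_m):
    """Difference between the absolute disparities of two objects."""
    return (absolute_disparity_deg(object_a_m, fixation_m)
            - absolute_disparity_deg(object_b_m, fixation_m))


# Absolute disparity changes when fixation changes...
print(absolute_disparity_deg(0.5, 1.0), absolute_disparity_deg(0.5, 3.0))
# ...but relative disparity between the same two objects does not.
print(relative_disparity_deg(0.5, 2.0, 1.0), relative_disparity_deg(0.5, 2.0, 3.0))
```

The fixation term cancels in the subtraction, which is why relative disparity signals depth between objects regardless of where the eyes are pointing.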
Free fusion:
The technique of converging (crossing) or diverging the eyes in order to view a stereogram without a stereoscope
Some people do not experience stereoscopic depth perception because they have stereoblindness
An inability to make use of binocular disparity as a depth cue
Can result from a childhood visual disorder, such as strabismus, in which the two eyes are misaligned
Strabismus:
A disorder in which the two eyes are misaligned
The person may need to wear an eyepatch: the brain may otherwise tire of the double vision and suppress vision from one eye
Julesz:
random dot stereograms can only be seen with binocular cues; they contain no monocular depth cue
Evidence that disparity is sufficient for stereopsis. No need for cues from object perception
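A rough sketch of how a random-dot stereogram could be built: both images are the same random dots, except that a central square is shifted horizontally in one of them. Image size, square size, shift and the function name are arbitrary choices for illustration, not Julesz's actual parameters.

```python
import random


def random_dot_stereogram(size=40, square=16, shift=3, seed=0):
    """Return (left, right) binary dot images that differ only by a shifted square."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    start = (size - square) // 2
    end = start + square
    for r in range(start, end):
        # Copy the central square into the right image, shifted leftward.
        for c in range(start, end):
            right[r][c - shift] = left[r][c]
        # Fill the strip uncovered by the shift with fresh random dots.
        for c in range(end - shift, end):
            right[r][c] = rng.randint(0, 1)
    return left, right


left, right = random_dot_stereogram()
print("".join(str(v) for v in left[20]))
print("".join(str(v) for v in right[20]))
```

Each image on its own is just noise with no visible square; the square exists only in the horizontal offset between the two images, so it can only be recovered by comparing them.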
Correspondence problem:
Figuring out which bit of the image in the left eye should be matched with which bit in the right eye
Correspondence between two apples that actually are the same apple (easy).
Correspondence between pixels that are the same (hard!!!).
A few ways to solve the correspondence problem:
Blurring the image: Focusing on low-spatial frequency information
In early stages of matching features between the two eyes, it's easier to match large, simple shapes than detailed textures or noise.
Uniqueness constraint: A feature in the world will be represented exactly once in each retinal image (1 feature in one eye paired with 1 feature in the other eye)
If a tree branch appears in both your left and right eye views, your brain assumes it's the same branch, not multiple copies.
Prevents the brain from mismatching one point in the left eye with multiple possible points in the right eye, which would cause confusion about depth.
Continuity constraint: Except at the edges of objects, neighboring points in the world lie at similar distances from the viewer
The brain assumes gradual changes in depth, not sudden jumps, when deciding how to match points across the two eyes.
Exception: At the edges of objects, there can be depth discontinuities (e.g., the edge of a table).
Helps reduce ambiguity—if point A is matched and point B is nearby, the brain can infer B’s match should be close in disparity, leading to smooth depth perception.
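To get a feel for the matching problem, here is a simple window-matching sketch on one row of random dots. It is a standard computer-vision style stand-in, not a claim about how the brain solves correspondence; the names, window size and shift are assumptions. Comparing a whole neighbourhood at once is loosely in the spirit of matching coarse structure first, and returning a single best shift per location reflects the uniqueness constraint.

```python
import random


def best_disparity(left_row, right_row, center, half_window=10, max_shift=6):
    """Find the shift d for which the window around `center` in the left row
    best matches the window around `center - d` in the right row."""
    best, best_cost = 0, float("inf")
    for d in range(max_shift + 1):
        cost = 0
        for offset in range(-half_window, half_window + 1):
            cost += abs(left_row[center + offset] - right_row[center + offset - d])
        if cost < best_cost:
            best, best_cost = d, cost
    return best


rng = random.Random(1)
left_row = [rng.randint(0, 1) for _ in range(60)]

# Right row: the same dots shifted 3 pixels to the left, padded with new dots.
true_shift = 3
right_row = left_row[true_shift:] + [rng.randint(0, 1) for _ in range(true_shift)]

estimate = best_disparity(left_row, right_row, center=30)
print(estimate)
```

A single dot would match about half of the dots in the other row; a window of neighbouring dots is what makes the match nearly unambiguous.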
How is stereopsis implemented in the human brain?
Input from two eyes converges onto the same cell (V1 or later) ➔ neurons have RFs for both eyes
Many binocular neurons respond best when the retinal images are on corresponding points in the two retinas: Neural basis for the horopter
No disparity (aligned points) → "on the horopter" → certain binocular neurons fire best.
However, many other binocular neurons respond best when similar images occupy slightly different positions on the retinas of the two eyes (tuned to particular binocular disparity)
Binocular Rivalry:
The competition between the two eyes for control of visual perception, which is evident when completely different stimuli are presented to the two eyes
Bayesian approach:
A statistical model based on Reverend Thomas Bayes’ insight that prior knowledge could influence your estimates of the probability of a current event
Optimal inference from cues: perception should choose the solution depending on which one is most likely.
Very often perception comes close to what is optimally possible.
In the diagram, scenario A is the most likely
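Choosing the most likely interpretation can be written out with Bayes' rule. A toy sketch: three candidate scenes that would all produce the same retinal image, so prior knowledge decides. The labels and probabilities are invented for illustration and are not taken from the diagram.

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a discrete set of hypotheses."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical numbers: how common each scene is in the world (prior)...
priors = {"A": 0.7, "B": 0.2, "C": 0.1}
# ...and how well each scene explains the retinal image (all equally well).
likelihoods = {"A": 0.9, "B": 0.9, "C": 0.9}

post = posterior(priors, likelihoods)
best = max(post, key=post.get)
print(post, best)
```

When the image is equally consistent with every scene, the posterior simply follows the prior, so the most common scene wins.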
How does the visual system decide what you are actually seeing?
Which interpretation is most likely? (Basis of the Bayesian approach)
Familiar size cue + familiar shape cue: Prior knowledge
Specific distance tendency
When a simple object is presented in an otherwise dark environment, observers usually judge it to be at a distance of 2-4 m.
Equidistance tendency.
In a dark room, an object is usually judged to be at about the same distance from the observer as neighbouring objects.
Starry sky
Statistics of natural scenes
What happens when our guesses are wrong?
Illusions