Second Half Lectures
Lecture 10 - 10/16/24
Space
Object distance and size
Perceived size ∝ size on retina × perceived distance
Size constancy
The image of an object on the retina gets smaller as the object gets further away
Vision relies on many cues to judge distance or depth
Given that distance, an estimate of size can be recovered
Emmert's Law
The apparent size of an afterimage is directly proportional to the perceived distance of the surface on which you see it
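The size-distance relation above can be sketched with hypothetical numbers (the function name and units are illustrative, not from the lecture):

```python
def perceived_size(retinal_size, perceived_distance):
    """Size-distance scaling: perceived size is proportional to
    retinal (angular) size times perceived distance."""
    return retinal_size * perceived_distance

# Emmert's law: an afterimage has a fixed retinal size, so its
# apparent size grows with the perceived distance of the surface.
afterimage_retinal = 2.0                              # arbitrary angular units
near_wall = perceived_size(afterimage_retinal, 1.0)   # wall at 1 m
far_wall = perceived_size(afterimage_retinal, 4.0)    # wall at 4 m
print(far_wall / near_wall)   # 4.0 -> afterimage looks 4x larger on the far wall
```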
Size constancy illusions
Left line appears closer and right one appears farther because of implied distance to inner and outer corners when angles are taken as perspective cues (as they are on the right)
Automatic processing of depth cues triggers inappropriate distance scaling of perceived size on the left
Top line looks longer because of apparently greater distance
Automatic processing of depth cues triggers inappropriate distance scaling of perceived size on the left
Over-application of distance scaling to apparent size
Moon Illusion
Visual phenomenon where the moon appears significantly larger when it is near the horizon than when it is high in the sky, even though its actual size and distance from Earth remain the same
Apparent Distance Hypothesis
Relative Size Hypothesis
Atmospheric Effects
Quiz
Q. If people are asked to judge the size of a traffic light, their estimate is very bad. What does this tell us about size perception?
Size constancy does not exist for all objects, and thus it is not a useful cue.
The size of an object on our retina is the most reliable cue for computing the size of an object.
Perceived size cannot be a reliable cue if we do not have a good estimate of the distance.
Size perception relies heavily on contextual information, including distance cues. When we view a traffic light, without a clear sense of its distance from us, it becomes challenging for our visual system to accurately judge its true size. This emphasizes the role of distance cues in helping us interpret the size of objects correctly; without them, our size estimates tend to be inaccurate
We make fewer size-judgment errors with objects that are above the ground compared to those that are on the ground.
Extrapolating from 2D retinal image
The brain's ability to interpret a flat, 2D image captured by the retina and perceive a 3D world from it.
The mind uses cues and prior knowledge to build a coherent understanding of the spatial relationships, depths, and distances of objects in our environment
The retina records a 2D projection of light and it lacks depth information
Several visual cues it uses:
Monocular depth cues
Information available from one eye, like perspective, texture gradient, and occlusion (one object blocking part of another)
Binocular cues
Arise from slight difference between images each eye sees (binocular disparity) and help provide depth perception
Motion parallax
Objects closer to us appear to move faster across our field of view than objects that are farther away
Prior knowledge and expectations
Brain uses stored information from previous experiences to fill in gaps and resolve ambiguities
Pictorial cues
Cues that support depth perception in flat, static images
Occlusion
Closer objects block farther objects
T-Junctions indicate which objects are in front of which others
But not by how much
Linear perspective
Lines parallel in the real world will appear to converge in 2D
Buildings with right angles and parallel lines
Geometry requires that parallel lines converge in the distance
Introduced into painting by the Romans
Height in field
Assuming a flat, level ground plane
Height in field corresponds to distance
Horizon is at eye level
Known size (object knowledge)
If object is familiar and has a typical size
Then reverse the relation and recover distance to the object
Useful cue but often overruled
Very Big Kid
Mistaken judgment of distance overrules knowledge of size of children
Texture gradients
Assume stuff on ground is uniform in size
Change in size must be due to change in distance
Doesn’t require level ground plane but this helps
Atmospheric perspective
Distant surfaces have less contrast because the intervening atmosphere superimposes a haze
The farther the distance, the more the haze, the lower the contrast
Shadows
The effects of light - shadows, shading, highlights - may be discounted in order to recover the surface reflectance but they are not just thrown out
They also tell us about depth
Shadows tell us about the relative placement of objects
And about the relief of the surface on which they fall
If a shadow is in contact with the object that casts it, then the object rests on the shadowed surface
If there is a gap, the object floats over the surface
The distance between the object and the shadow it casts indicates the distance between the object and the shadowed surface
Quiz:
Which of the following pictorial cues is NOT metric?
Linear perspective
Relies on geometric rules (converging parallel lines)
Occlusion
Tells us which objects are in front of others but not by how much
Provides qualitative information, not exact measurements
Height in field
Assumes a flat ground plane, with height corresponding to distance
Atmospheric perspective
Uses contrast reduction due to haze as an indicator of distance
Why do these cues work?
These cues arise from depth differences between objects in the 3D world
Unlikely to arise by chance
They are highly informative and trustworthy
Not so in 2D pictures, of course, where our visual system is tricked into attributing 3D depth as the cause of these cues
How do these cues work?
Not laws or real rules that oblige a fixed interpretation
Not fixed rules
Depth cues don’t function as absolute laws and instead offer suggestions or probabilistic interpretations about spatial relationships
Ie:
Occlusion
Suggest relative depth
Linear perspective
Suggests that converging lines meet at a point in the distance but assumes parallel lines in 3D
Each cue “suggests” a small number of possibilities from most likely to least
Choose an interpretation compatible with greatest number of cues in local region
Even if there are too few cues to have much certainty
Sudoku puzzle of vision
Constraint satisfaction process
The brain “fills in” missing information about depth, distance, and object relationships by respecting the constraints provided by multiple visual cues
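A toy sketch of this constraint-satisfaction idea: each cue "votes" for a small set of interpretations, and the one compatible with the most cues wins. The cue names and data structure are illustrative, not from the lecture:

```python
# Each cue gives rise to a set of possible depth interpretations;
# choose the interpretation compatible with the greatest number of cues.
def best_interpretation(cue_possibilities):
    votes = {}
    for possibilities in cue_possibilities:
        for interp in possibilities:
            votes[interp] = votes.get(interp, 0) + 1
    return max(votes, key=votes.get)

cues = [
    {"A in front of B"},                   # occlusion: unambiguous
    {"A in front of B", "A behind B"},     # shading: ambiguous
    {"A in front of B"},                   # height in field
]
print(best_interpretation(cues))   # A in front of B
```

Like a sudoku, only a few local cues are usually needed to settle on an answer, which is why many regions can be solved in parallel.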
What if the inference is wrong?
Optical illusions occur
The brain's assumptions based on typical cue interactions fail
Ie:
Ames room
Creates a distorted perception of size due to conflicting depth cues
Confetti illusion
Misinterprets color and shading due to contextual influences
Errors in this “puzzle-solving” are a natural consequence of the probabilistic and heuristic-based nature of perception
Special Point of View Effects
From one point of view, the cues suggest one form
But this is only seen from that one view
These cues may even suggest a form that is impossible
This demonstrates that these cues are interpreted locally
Ames Room:
Perspective cues to depth override known size
Beuchet chair
Exemplifies how visual perception prioritizes local consistency in interpreting cues over global spatial coherence
Underscores that depth and form perception rely heavily on the observer's viewpoint and the alignment of visual information
An example of how perspective can override spatial inconsistencies
Quiz:
Recall the Ames room display in which two people stand on opposite sides of the room. One person appears smaller because he or she
Has a larger visual angle, so size constancy causes him to appear smaller
Has a smaller visual angle than the other person but appears to be at the same distance
Visual angle is smaller of the person in the far corner because they are farther away
Usually a smaller visual angle would indicate greater distance, but since the brain incorrectly perceives them to be at the same distance as the nearer person, it interprets the smaller visual angle as the person being smaller
Has the same visual angle as the other person but is actually much further away
Perceived distance changes but his visual angle does not
Depth Reversals
When a single 2D image or scene supports two equally plausible 3D interpretations and the perception alternates between them
In previous examples, size and depth were misjudged
Here, for each of these figures, there are two possible shapes that differ only in the sign of the depth
There is no misjudgment; both are consistent with the image
When the perceived object reverses, convex becomes concave and vice versa, front and back are exchanged
Shadow/Shading Reversals
Convex (bump) versus concave (dent) depends on where the light is coming from
Light assumed from above in room
Or above head? Turn head upside down
General Convex / Concave reversals
Maid of the Mist draws close to the bottom of Niagara Falls
Or about to be swept over the falls?
Mach’s folded Card
An illusion that highlights how depth reversals, when applied to real 3D objects, can create striking and unexpected perceptual effects, particularly when motion is introduced
Nothing else happens in flat images
But a depth reversal for a real object has truly surprising effects when it moves or you move
These reveal the elaborate construction underlying our perception of each object
Demo:
Note changes in surface material and shadow
Note following motion as you move your head
Action at a distance
Because the depth is reversed, the shape must appear to move in the same direction you do to explain or keep up with the changes in the view you now have of the object
Phenomenon that occurs during depth reversals in real 3D objects
The term captures how the perceived motion of the object adapts in a way that seems to defy normal physical constraints
The depth reversal effect
When the perceived depth of an object flips, the brain assumes a new 3D configuration for the object
The brain then adjusts how the object's position and movement are interpreted
Perceived motion of the object:
If you move left or right, the reversed object appears to shift its orientation or “follow” you, even though it is stationary or moving independently
This occurs because the brain recalculates how the object must behave to match the new, reversed depth perception.
The perceived motion seems to act at a distance, beyond the object’s immediate physical influence
Constraints and sudoku puzzles
Cue:
A local, informative part of an image
Each cue gives rise to a set of possibilities
Each possibility constrains the assignment of surfaces and edges around the cue
The final interpretation of the image is the one that is most compatible with “all” the cues
This is called constraint satisfaction and it is also the method for solving crossword puzzles and sudoku
In these puzzles:
The final answer must build from starting cues and satisfy all the rules
How many cues are checked before finding the best interpretation?
Usually only a few are enough to find a good solution
Evidence is that only a few are checked within a local region
Allows many different regions to be analyzed in parallel
Allows rapid determination of depth over whole scene
Cost:
Inconsistencies across regions not noticed
Lecture 11 - 10/21/24
One Minute Quiz
Perceived size ∝ size on retina × perceived distance
Q: Size constancy?
The image of an object on the retina gets smaller as the object gets farther away
Vision relies on many cues to judge distance or depth
And given that distance, an estimate of size can be recovered
Constraints and sudoku puzzles
How many cues are checked before finding the best interpretation?
Usually, only a few are enough to find a good solution
Evidence is that only a few are checked within a local region
Allows many different regions to be analyzed in parallel
Allows rapid determination of depth over the whole scene
Cost: inconsistencies across regions not noticed
Binocular Vision
Accommodation
This process only works at short distances
Bring the target into your focus
The lens is stretched or relaxed
The visual system can sense how the lens is modulated through its muscles
As the lens relaxes and stretches, this signal provides a measure of distance
Translates this into a distance for the object
Problem: can only work for a short distance
Emmetropia:
Perfect
Myopia:
Nearsightedness
Focal point falls in front of the retina
Can see near but not far
Hyperopia:
Far sightedness
Can see far away, not near
Corrective lenses fix this problem by shifting the focal point so it lands on the retina
Convergence
Only works in short distance
Not much use beyond 1m
Both eyes move inwards or outwards
Angle of convergence (large)
Close objects
Angle of convergence (small)
Far objects
Primarily good thing for near events
Motion Parallax
As we move, we create relative motion of the objects around us
Distant objects move least, close objects move most
Imagine you are driving: objects far away move less, while closer objects move the most
Closer objects move more quickly than objects far away
If an object is moving faster, it is closer
You feel like the object closer to you is moving the other way
This relative motion is seen as depth
But we can see depth between two objects without moving our heads (parallax) and without moving our eyes back and forth between them (convergence and accommodation)
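The parallax relationship can be sketched with hypothetical numbers; the 90-degrees-to-the-side small-angle approximation and the specific speeds/distances below are illustrative, not from the lecture:

```python
# Motion parallax: the angular speed of an object roughly 90 degrees
# to the side is observer speed divided by distance, so nearer
# objects sweep across the field of view faster.
def angular_speed(observer_speed, distance):
    return observer_speed / distance   # radians per second

near = angular_speed(30.0, 5.0)    # fence post 5 m from the road
far = angular_speed(30.0, 500.0)   # hill 500 m away
print(round(near / far))   # 100 -> the near object moves 100x faster
```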
Stereoscopic vision
Depth information from binocular disparity
Found in many predatory and/or arboreal species
Daily life:
Important cue for estimating close distances
Geometry of binocular disparity
Corresponding retinal points
Same position on each retina with respect to the point of fixation (fovea)
Horopter
Locus of all points in 3d space that fall on corresponding retinal points
Vieth-Müller circle
Zero disparity (single image)
Points on the horopter produce a single (fused) image
What about objects off the horopter?
Fall on different retinal points
Ie: finger test
Binocular Disparity
A tiny difference
Perceived depth increases with disparity
Crossed or uncrossed
Fixating on a:
A falls on corresponding points in the two retinas: zero disparity
B falls in different points: has disparity
Disparity = θR − θL
As perceived depth increases, disparity increases
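The disparity definition above can be written out directly; the angle values below are made up for illustration:

```python
def disparity(theta_right, theta_left):
    """Binocular disparity: the difference between a point's angular
    position in the right- and left-eye views (degrees)."""
    return theta_right - theta_left

# Fixated point A falls on corresponding retinal points: zero disparity.
print(disparity(0.0, 0.0))            # 0.0
# Point B off the horopter falls on non-corresponding points.
print(round(disparity(1.2, 0.5), 2))  # 0.7 -> B is seen in depth
```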
Stereograms
Stereoscope
What they see is not a 3D image
It's a 2D image with forced depth perception
Overlaying two images so the brain thinks there is depth
3D glasses
Free fusion w/o optical aids
Stereo methods
To view the 3D image you must converge your eyes at a different plane of depth from the picture
This superimposes the picture on itself with an offset
Free fusion
Uncrossed (parallel) fusion:
Focus behind actual image
Crossed fusion:
Focus in front of actual image
Autostereograms (“magic eye”)
Single image with repeating 2d patterns
Generally use uncrossed fusion
Stereopsis: depth from binocular disparity
How does it work?
In depth perception, we can skip shape analysis and combine the left and right images directly
Bela Julesz
Random dot stereograms
If depth can still be perceived, stereopsis cannot require prior shape analysis
The images are otherwise featureless random black and white pixels, essentially the same texture in each eye
Some dots, however, are shifted laterally with respect to the others
If fixating on background dots, they fall on corresponding points in both eyes
But the dots of displaced square do not
They have a disparity in the two eyes views
Therefore, the square is seen in depth
Puzzle of random dot stereogram
How to match up images in left and right eyes?
Correspondence problem
Suggests stereopsis occurs at a coarser scale
Kaufman & Pitablado (1965): letter stereograms
Individual elements are not identical
Yet stereoscopic depth is perceived
Physiology of disparity
~10% of the population lacks normal stereopsis
Amblyopia (lazy eye)
Blurred non-foveal image suppressed by the visual system
Often results from strabismus (misalignment of eyes)
Binocular critical period
Best treated before age 5
Summary of physiology
Binocular cells have receptive fields in each eye
Range of different separations
Stereoblindness
If you see the number 1, you have depth perception
Problems learning to read
3d space is compressed into a 2d image
3-4 months of age is when people develop a sense of 3d space
“Fixing My Gaze” book
Binocular Rivalry
When input to two eyes is completely different
Competition between eyes
Best guess about the world given inconsistent retinal images
During binocular rivalry, all or part of one image appears totally suppressed from consciousness
Lecture 12 - 10/23/24
Motion
Directly measured not inferred
Evidence for “motion detectors”
Motion after effects
Kinematograms, motion then shape
The waterfall illusion
First described by Aristotle, ca. 350 BC
Variant of the motion aftereffect (MAE)
Motion Aftereffect
Motion is experienced on the test even though the pattern is not seen to move anywhere
Motion cannot be only an inference based on noticing change in location
Motion Aftereffect (MAE)
Reveals the properties of motion detectors in the visual cortex
Kinematogram
Perceive shape defined by motion
Therefore motion cannot depend only on first seeing shape and then tracking it to infer motion
If motion is directly measured? Or inferred from position change of object?
Motion Aftereffects
Saw motion when there was no change in position of object
Kinematogram
Saw motion when there was no noticeable object
Directly measured by what?
Physiology of “motion detectors”
Reichardt Detector
Response of a directionally selective cell in striate cortex
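A minimal sketch of the Reichardt detector idea: two receptors offset in space, each subunit multiplying one receptor's delayed output with the other's current output, with the subunits subtracted. The one-frame delay and pulse stimuli are simplifying assumptions, not the lecture's exact model:

```python
def reichardt(signal_a, signal_b):
    """Minimal Reichardt detector. Receptors A and B are offset in
    space; each subunit multiplies one receptor's delayed signal by
    the other's current signal, and the two subunits are subtracted,
    so the two motion directions give opposite-signed responses."""
    out = 0.0
    prev_a, prev_b = 0.0, 0.0   # one-frame delay lines
    for a, b in zip(signal_a, signal_b):
        out += prev_a * b - prev_b * a
        prev_a, prev_b = a, b
    return out

# A stimulated before B: motion from A toward B -> positive response.
print(reichardt([0, 1, 0, 0], [0, 0, 1, 0]) > 0)   # True
# B stimulated before A: opposite direction -> negative response.
print(reichardt([0, 0, 1, 0], [0, 1, 0, 0]) < 0)   # True
```

Because the subunits only compare samples one step apart, the detector also responds to a discrete jump, which is why apparent motion from stationary flashes looks smooth.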
Consequences of motion detectors
Can't distinguish continuous motion vs. a discrete jump over time and space within a short range
“Apparent motion”
Perceived smooth motion from rapidly alternating stationary targets
Underlies many “motion” percepts
In movies (fps)
Correspondence Problem
Which way are the dot pairs moving?
Areas in cortex specialized for motion: MT (V5), MST
MT (V5)
All cells respond to motion
Many respond to “global” motion, independently of local directions
Large receptive fields
MST (V5a)
Most respond to motion
Very Large fields, can extend into both hemifields
Like expansion, shift, rotation, contraction
What is motion good for?
Recover the 3rd dimension (3D structure)
Aka structure from motion (kinetic depth effect)
Parallax: motion of our head or body reveals depth because closer objects move faster
Summoning attention
Motion captures your attention
Form from motion
Extract form from motion
As soon as the shapes start moving, you can separate the shapes
Break camouflage, segment objects from background
Motion blindness
Akinetopsia
Motion Blind Patient LM - Ellen
Damage in MT Area
Sees motion as still images
Can’t tell when to stop filling a cup
Can't read facial expressions
Moving objects don’t attract her attention
Motion Blindness in normals
Strobe environment demo
Trouble catching balls
Holding posture
Judging relative location of moving and steady objects
Aperture problem
The direction of motion of a straight line is ambiguous
The visible displacement of the line can arise from an infinite set of possible physical motions
What does this have to do with motion detectors?
Receptive fields act as small windows hiding the end points
How to resolve this ambiguity?
One solution is to rely on local 2D features that don't require 3D interpretation
Line endings and corners
End stopped cells respond to line endings and corners
Motion measured within small local receptive fields is often different from actual (global) motion of object
But some parts of object give unambiguous cues to direction
Barber Pole Illusion
Whose terminator is it?
Does the line ending belong to the line?
End stop to the rescue
Terminator motion disambiguates line or edge motion
End-stopped V1 neurons respond selectively to the endpoints of contours
The direction of the end of the line is more important than the orientation of the line
Effectiveness of terminator depends on who it belongs to
Summary
Directly measured not shape based
Motion detector subunits offset in space and time
Motion useful for seeing shape, depth, drawing attention, breaking camouflage
Damage to MT causes motion blindness
Aperture problem, local motion differs from global motion
Lecture 13 - 10/28/24
One Minute Quiz:
Stereoscopic Vision
Our ability to see depth comes from binocular disparity
Physiological Mechanism of Stereopsis
Binocular cells have receptive fields in each eye
These come with a range of different separations between the receptive field centers (relative to the fovea of each eye)
These cells are selective for disparity
The separation defines the preferred disparity for that cell
Color I
Perceiving Color
Why is it important?
Ex: finding fruit; in birds, finding mates
Photoreceptor Sensitivities
X axis is wavelength, Y axis is sensitivity
"S" cones
Short wavelengths – high energy (blue)
Respond to the cool blue end of the spectrum
Humans have the smallest number of these cones
"L" cones
Long wavelengths – low energy (red)
Respond to the warm red end of the spectrum
"M" cones
Medium wavelengths
Centered around greenish-yellow
Why 3 cones?
Principle of Univariance
1 receptor: the same response can be produced by different wavelength/intensity combinations
If we had only 1 cone type, it would be hard to distinguish colors – we would have no sensation of color and would mostly see light and dark
By combining the 3 different types of receptors we can perceive color
The three cone types have different sensitivity distributions
Not all colors look the same
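Univariance can be illustrated with a toy Gaussian sensitivity curve; the peak, width, and intensity values here are made up, not real cone fundamentals:

```python
import math

# Principle of univariance: a single cone's output confounds
# wavelength and intensity.
def cone_response(wavelength, intensity, peak=560.0, width=50.0):
    sensitivity = math.exp(-((wavelength - peak) / width) ** 2)
    return intensity * sensitivity

# A dim light at the cone's peak and a brighter off-peak light can
# produce identical responses -- one cone cannot tell them apart.
r1 = cone_response(560.0, 1.0)                 # peak wavelength, dim
r2 = cone_response(610.0, 1.0 / math.exp(-1))  # off-peak, brighter
print(abs(r1 - r2) < 1e-9)   # True: indistinguishable to this cone
```

With three cone types, the ratio of the three responses disambiguates wavelength from intensity, which is why three cones suffice for color.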
Limitations
Perceived color varies with ratio of responses of the three cones
There is an infinite set of wavelengths and many combinations
But, we only have 3 cones, or 3 values, one for each cone
Non-invertible: can’t recover the wavelength distribution
It is impossible to recover the original wavelength spectrum
Metamer: different spectra, same color
Natural Light vs LED
Same sensation of the color, but different spectra
“This is a unique term”
Invertible Code:
If a signal, say, the price of a car in dollars, is converted into a new code, say, the price in euros, the new code is invertible if we can recover the original value (dollars) from the new value (euros)
Metamer:
Two lights that have same perceived color but different spectra
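A toy demonstration of a metamer, using a made-up 4-band "spectrum" and made-up cone weights (not real human cone fundamentals):

```python
# Metamers: physically different spectra that produce identical
# cone responses, and therefore look the same color.
CONE_WEIGHTS = {
    "S": (1.0, 0.0, 0.0, 0.0),
    "M": (0.0, 1.0, 1.0, 0.0),
    "L": (0.0, 0.0, 1.0, 1.0),
}

def cone_signals(spectrum):
    # Round to absorb floating-point noise in the dot products.
    return {
        name: round(sum(w * e for w, e in zip(weights, spectrum)), 9)
        for name, weights in CONE_WEIGHTS.items()
    }

spectrum_1 = (0.2, 0.3, 0.3, 0.4)   # e.g. a broadband light
spectrum_2 = (0.2, 0.4, 0.2, 0.5)   # a physically different light
# Different spectra, identical (S, M, L) triplet: a metamer.
print(cone_signals(spectrum_1) == cone_signals(spectrum_2))   # True
```

Because 4 (really, infinitely many) spectral bands collapse onto only 3 cone values, the code is non-invertible: the original spectrum cannot be recovered.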
Color Coding: trichromacy vs. opponency
Trichromatic Theory
3 primaries are enough
Population coding
Three photoreceptor primaries
L,M, and S Cones
Thomas Young's theory (1802)
It is impossible to conceive each sensitive point of the retina to contain an infinite number of particles, each capable of vibrating in perfect unison with every possible undulation, it becomes necessary to suppose the number limited, for instance to the three principal colors, red, green[yellow] and blue
Consequences of Trichromacy
Infinite # of spectra can activate cones similarly
Metamers: different spectra that are perceptually indistinguishable
Evidence for Trichromacy
Color Matching Experiment
Two primaries are not enough, 4 are too many
Direct recording from photoreceptors in retina
Single receptor drawn into microelectrode
Measure stimulation by light beam
Opponent process theory
4 primaries + B/W
Opponent organization
Opponent coding from retina to brain
Retinal ganglion cells and LGN
Hering’s Opponent process theory (1892)
Retinal ganglion cells (“LGN 2”)
There are a group of cells that have the opponent component characteristics
If they are more excited by greenish and less excited by reddish, they will be more excited when there is more green, and vice versa
Chromatically opponent but not spatially opponent
Non-opponent cells: spatially opponent but not chromatically
The same cells carry color and luminance information; the spatial organization of color differs from that of luminance
Evidence for color opponency
Unique Hues
Certain color combinations don’t exist
We have reddish-orange, blue-green
But no red-green or yellowish-blue
Hue Cancellation
Adjust blue light to cancel out yellow
Blue-yellow seen as white
No blue-yellow mix
Adjust red light to cancel out green
Red-green combination seen as yellow
No reddish-green mix
Negative afterimages
Boundaries create stronger afterimages
Seeing opposite colors
The adapted ("tired") cells respond less, so the opponent color appears
Physiological evidence: Opponency
Two-stage Model of color coding
Three photoreceptor primaries
L,M, and S cones
Opponent coding retina to brain
Retinal ganglion cells & LGN
How do we build opponency
Combinations of L,M, and S
Building red- Green opponency
Interplay of excitation and inhibition
L-M or M-L
Building blue-yellow opponency
Interplay of excitation and inhibition
S-(M+L) or (L+M)-S
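The two-stage recombination above can be written out directly; the example cone values are made up for illustration:

```python
# Two-stage model: cone signals (L, M, S) are recombined into
# opponent channels at the retina/LGN stage.
def opponent_channels(L, M, S):
    red_green = L - M            # +: reddish, -: greenish
    blue_yellow = S - (L + M)    # +: bluish,  -: yellowish
    luminance = L + M            # non-opponent channel
    return red_green, blue_yellow, luminance

# Long-wavelength light drives L more than M: red-green goes positive.
rg, by, lum = opponent_channels(L=0.9, M=0.3, S=0.1)
print(rg > 0 and by < 0)   # True: signaled as reddish and yellowish
```

The interplay of excitation (+) and inhibition (−) in these sums is what builds opponency out of the three cone types.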
Correct answer: C
Color Mixing: additive vs subtractive
Huge range of possible wavelength combinations
How can we get a particular color?
Color mixing
Additive for lights
Tells us about response of visual system
Subtractive for paints
Tells us about physics of the stimulus
Subtractive color mixing (CMYK)
Start with white light
Combination of pigments
Light subtracted from pigments
Cyan + magenta = blue
Printing, optical filters
Subtractive Mixing
Blue absorbs red, reflects some green and lots of blue
Yellow absorbs red and blue, reflects some green and lots of yellow
Mix the two and only green survives
Another approach to color mixing
Newton’s prism experiments
Color as combination of lights, not pigments
Additive color mixing (RGB)
Start without light
Combination of lights
Combination color depends on cone properties
3 primaries sufficient
RGB add together to produce every possible color
Red + green = yellow
TV/Computer monitors
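The additive/subtractive contrast can be sketched with idealized RGB triples: lights add their intensities, while pigments each filter the light, so their reflectances multiply. The clamping and the pure-color values are simplifying assumptions:

```python
# Additive mixing (lights): intensities add (clamped at 1.0).
def additive(c1, c2):
    return tuple(min(1.0, a + b) for a, b in zip(c1, c2))

# Subtractive mixing (pigments): each pigment absorbs part of the
# light, so the reflectances multiply.
def subtractive(c1, c2):
    return tuple(a * b for a, b in zip(c1, c2))

RED, GREEN = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(additive(RED, GREEN))      # (1.0, 1.0, 0.0) -> yellow, as on a monitor
print(subtractive(RED, GREEN))   # (0.0, 0.0, 0.0) -> each absorbs the other's light
```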
Using Subtractive color, additively
Pointillists, like color TV phosphors, used additive color when their spots of paint did not overlap
In contrast, typical paintings use subtractive color
Summary
Perceiving color: 3 cones, 3 dimensions, metamer
Coding color
Trichromacy + opponency
Opponent pathways
Non-opponent
Color Mixing:
Additive, 3 dimensions
Subtractive, complex
Bring Favorite item with color to next class
Lecture 14 - 10/30/24
One minute quiz
Opponent Process Theory
Evidence
Unique hues
Hue cancellation
Negative afterimages
Physiological evidence LGN
4 primaries + B/W
Opponent organization
Color Lecture 1
Color constancy & contexts
Surface properties and Illumination
Light from surface = illumination x reflectance
Color constancy
Even if illumination changes, we still perceive the same color
Perceived color largely unaffected by illumination
Color constancy & illumination
One more example of a more general problem
Light from surface = illumination x reflectance
Is surface color due to light or paint?
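The light-from-surface equation can be sketched with hypothetical RGB values; the specific illuminant and reflectance numbers below are made up for illustration:

```python
# Light from surface = illumination x reflectance (per channel).
def light_from_surface(illumination, reflectance):
    return tuple(round(i * r, 2) for i, r in zip(illumination, reflectance))

surface = (0.8, 0.4, 0.2)    # a reddish surface reflectance (R, G, B)
daylight = (1.0, 1.0, 1.0)   # neutral illuminant
tungsten = (1.0, 0.8, 0.5)   # warmer illuminant

print(light_from_surface(daylight, surface))  # (0.8, 0.4, 0.2)
print(light_from_surface(tungsten, surface))  # (0.8, 0.32, 0.1)
# The physical signal at the eye differs, yet we perceive the same
# surface color: color constancy discounts the illuminant.
```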
Assumptions about shadows
Shadow darkens surfaces without changing colors
Luminance change, but not hue change
Shadows have fuzzy edges
Signals “light”, not “paint”
Color Contrasts
Surrounding context may affect how you perceive color - it does matter
Color assimilation
The color changes based on how it is interlaced
Color blindness
Neural basis of color deficiency
3 types of cone receptors
Maximal sensitivity to different wavelengths of light
Short (S) wavelengths - Blue
Medium (M) wavelengths - Green
Long(L) wavelengths - Red
Lose R or G cones (1/20 males, 1/400 females)
Lose B cones (very rare)
Lose two cone types (cone monochromat, rare)
Lose all cones (rod monochromat, rare)
Abnormal or missing cones
Most often the M or L type (“red-green” color deficiency)
Lose one kind of cone and your color vision now has only two dimensions
What does it look like?
Testing for color vision deficiency
Ishihara color test: pseudoisochromatic plate
What's the number?
Color is the only cue
Specific profiles for different color deficiencies
No other visual cues to help distinguish numbers
Evolution of color receptors
3rd chromosome for rods
7th chromosome for B (S) cones
X chromosome for R (L) and G (M) cones
Most recent mutation
Unstable
Distribution of color blindness
8.0% of european males
Lower percentages elsewhere
0.0% of old world primates
Genetics of red-green color blindness
If mother is a carrier
No daughters are color blind
Half of daughters are carriers
Half of sons are color blind
If Father is color blind and mother is carrier
Half of daughters are color blind
If mother is color blind
All sons are color blind
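The inheritance pattern above follows from X-linked recessive genetics and can be checked by enumerating offspring; the allele labels and function name are illustrative:

```python
# X-linked recessive inheritance of red-green color blindness.
# 'N' = normal allele, 'c' = color-blind allele.
def offspring_rates(mother_alleles, father_allele):
    """Fraction of color-blind daughters and sons, given the mother's
    two X alleles and the father's single X allele."""
    daughters, sons = [], []
    for mx in mother_alleles:
        # Daughters get one X from each parent; need 'c' on both.
        daughters.append({mx, father_allele} == {"c"})
        # Sons get their only X from the mother (hemizygous).
        sons.append(mx == "c")
    return sum(daughters) / 2, sum(sons) / 2

# Carrier mother x normal father: no daughters, half of sons affected.
print(offspring_rates(("N", "c"), "N"))   # (0.0, 0.5)
# Carrier mother x color-blind father: half of daughters affected.
print(offspring_rates(("N", "c"), "c"))   # (0.5, 0.5)
# Color-blind mother: all sons affected.
print(offspring_rates(("c", "c"), "N"))   # (0.0, 1.0)
```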
Island of colorblindness
In the pacific atoll of pingelap
Achromatopsia is caused by a genetic mutation
Complete or partial absence of cones
Vision that is dominated by rods instead of cones
Poor visual acuity and sunlight is very painful
Cortical achromatopsia
Damage to area V8 (very rare)
Cerebral achromatopsia
Despite normal cone function
Lose subjective experience of color
World is like a black and white movie
Color lost but not shape or motion based on color
Demo - achromatopsia
How?
Using low-pressure sodium lamp
They emit only at 577 nm, so all reflected light from surfaces contains only that one wavelength
The ratio of cone responses will be identical for every surface and they will all look the same color
They will vary only in total response, that is, in luminance but not color
Perceiving the world through color alone
Lose spatial acuity, motion, and depth more than pattern
Motion slows down
Lecture 15 - 11/4/24
Exam 2 Review Location and Time TBD
Pop quiz:
The way a television generates the color images is based on:
Additive color mixture
Color contrast
Simultaneous color contrast
Subtractive color mixture
Ecological Perception and Action
Overview
Optic Flow
Gibson et al. claim that optical flow:
Specifies exact direction of travel (heading)
Specifies distance of surfaces
Supplies information for postural control
Indicates exact time of contact
Focus of expansion (FOE):
Central point from which motion seems to emanate
Only stationary point in the optic flow field
SPECIFIES EXACT DIRECTION OF TRAVEL
Gibson’s analysis of flying during World War II
How do pilots make successful high-speed landings?
They use optic flow
You don't have to remember anything or hold a cognitive representation in memory; all the information is outside, and the optic flow tells you how to behave/operate
Optic Flow as a motion cue
Lack of optic flow signals you are stationary
Optic flow can trigger perception of self motion
Vection (self-motion illusions)
If you are stationary outside, and optic flow is moving clockwise, you will feel like you are moving counter clockwise
Vection Illusion
Optic flow in periphery overrides vestibular input
Dominance of vision over vestibular information
Perception of self motion
Visual motion control of posture
“Swinging room” experiment
Stationary floor, but moveable walls and ceiling
Lee & Aronson (1974)
The floor is not moving, but the experimenters move the walls and ceiling
If you were not using optic flow and relied on vestibular cues, you would know that you are not moving
For infants
Visual information is more important
Adults
Less sensitive than infants
When standing on a narrow beam
When the room moves slightly, they will lose their balance
Moving toward vs moving away
Even more evident in children
Quiz:
Answer: A
Why:
Optic Flow from forward motion of room
Interpreted as backward sway
Compensate by swaying forward
To compensate for the perceived backward sway, you stabilize your posture by swaying forward
Time to contact (Time to collision)
How do you know when something is approaching?
Scientists propose that we know when objects are approaching the observer via tau (τ)
Time to collision (tau) = S/V
Time to collision can be calculated without knowing distance or velocity → only the rate of optic expansion is needed
Tau = 100 / percent change of visual angle per second
Can measure visual angle and rate of change
But size, distance, and speed of object unknown
Compute time to collision (tau)
Only need to know the change of the visual angle per second
Expanding patterns
Larger arrows mean faster expansion
Sec = unit of time
Quiz:
Answer:
20 months
Why:
100 / (% change in visual angle per month)
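The tau formula from the notes can be computed directly; the 5%-per-month rate below is the value implied by the quiz's 20-month answer:

```python
# Time to contact (tau) from the rate of optic expansion alone:
# tau = 100 / (percent change in visual angle per unit time).
def time_to_contact(percent_change_per_unit):
    return 100.0 / percent_change_per_unit

# Visual angle growing 5% per month -> contact in 20 months.
print(time_to_contact(5.0))   # 20.0
```

No distance, size, or speed is needed: the expansion rate alone fixes tau.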
Possible uses for TAU
Heading for a soccer ball
Streamlining during a dive
Quiz:
A gannet plunges directly down toward the ocean hoping to catch a fish. The gannet should ensure that the fish is lined up with
The gannet’s line of sight
The optic array
The focus of expansion
The direction of the ocean waves
Answer:
C – the Focus of Expansion
Why:
Pilots use the same cue to figure out where they will land
The focus of expansion does not move or change, it is the point that you are moving towards
Maintaining collision path
How do you catch a fly ball?
Run so that the ball looks like it's moving in a straight line
“Linear optical tracking”
Keep a fixed line of sight
Symmetrical looming
Problem with catching a ball
About 100 msec between the retina and the response of higher visual areas
Ball is moving at 100 mph
In the time the brain can register the ball's location, it has moved 15 ft
For moving objects, visual system acts as if it extrapolates where it will be
Predicts the present
Only for moving objects
Doesn't work for flashed objects
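The 15 ft figure follows from simple unit arithmetic; this sketch just checks it:

```python
# 1 mph in feet per second (5280 ft per mile, 3600 s per hour)
MPH_TO_FT_PER_SEC = 5280 / 3600

speed_mph = 100      # ball speed from the notes
latency_sec = 0.100  # ~100 msec retina-to-higher-visual-area delay

distance_ft = speed_mph * MPH_TO_FT_PER_SEC * latency_sec
print(round(distance_ft, 1))  # ~14.7 ft, roughly the 15 ft in the notes
```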
Flash lag effect
Moving objects seen ahead of flashed bar
Moving object extrapolated forward to expected location
Visual system seems to predict where moving object will be
“Predicting the present”
Only for moving objects
Not for flashed images
“Flash-lag” effect
https://upload.wikimedia.org/wikipedia/commons/6/60/Flash_lag.gif
Blindsight
Visually guided actions can occur without conscious vision
Blindsight after lesion of the striate cortex
Patient reports no awareness of target
But when asked to point or grasp target, does so
They cannot consciously see the object, yet they can point to or grasp it
Can't tell you what it is, or why they moved
Separate visual input to the where/action pathway bypassing LGN and striate cortex
A working hypothesis
We see what we see and we know what we are seeing
No input to striate cortex means no awareness
Locomotion and visual development
Role of locomotion in human depth perception?
Eleanor Gibson & Richard Walk (1960): Visual cliff
Glass gives illusion of depth
Measure willingness to go to “deep” side
The Visual Cliff
Avoidance of visual cliff
Tied to locomotion
Early in locomotion: no avoidance
More crawling experience: refusal to “deep” side
Depth perception emerges from interaction with world
Testing a causal role of locomotion
Held & Hein (1963): Kitten carousel
“Active” kitten pulled “passive” kitten around enclosure
Same visual stimulation, but only one moved actively
Active kitten showed normal visual development
Passive kitten never developed depth perception
Summary:
Optic Flow
Optic flow offers rich source of information
Optic flow as a motion cue
Visual motion control of posture
Impressions of self motion
Can be used to determine time to contact
Flash Lag effect
Moving object extrapolated forward to expected location?
Blindsight
Visually guided actions can occur without conscious vision
There is another vision pathway
Role of locomotion in human depth perception
Lecture 16 - 11/6/24
Development & Social Perception
Two fundamental aspects of emotional expressions:
Producing facial expressions
Emerge Early in a predictable Sequence
Basic reactions
As early as birth
Simple emotions
2-3 months
Social emotions
1 year
Complex emotions
3 years
Many expressions appear before social learning is possible
Nature vs Nurture: Evidence from blind individuals
Blind individuals show same victory/pride expressions as sighted individuals
Many expressions are innate, not learned through observation
Cross Species Expressions
Emotional expressions across primates suggest evolutionary origins of expressions
Perceiving Facial Expressions
Early Development of emotion recognition
Social smiling
1-3 months
Respond to other’s expressions
2-4 months
Universal Emotions: Paul Ekman’s Research
Emotions are found across all cultures, including isolated societies
Cultural Universality: The Fore Study
Fore tribe in Papua New Guinea is an isolated society with minimal outside contact
Could accurately identify basic emotions, suggesting that emotional expressions are not culturally learned; instead they’re universal
Face Orientation and Emotional Recognition
We are better at processing information in upright faces
Summary:
Production of expressions:
Emerges very early in development
Present in congenitally blind individuals
Shared across species
Perception of Expressions
Develops in early infancy
Universal across cultures
Specialized processing for upright faces
The Physiology of Emotion Perception
Speed of emotional processing
The brain processes emotions remarkably fast
Emotional valence (positive/negative) detected in just 15ms
Faster than conscious awareness
Equivalent to a single video frame
Why fast?
Threat detection is evolutionary advantageous
Neural Pathway: A Dual Route
Standard Visual Route
Through visual cortex
Detailed processing
Conscious awareness
“Quick Route”
Direct pathway to amygdala
Bypasses detailed processing
Enables rapid emotional responses
Evolutionary older pathway
The amygdala
Triggers the body's response to perceived threats and plays a key role in processing emotions, particularly fear
Research spotlight: Vuilleumier et al (2003)
Strong amygdala response to brief, blurry fearful faces (low SF)
Minimal activation in face recognition areas (FFA)
Supports existence of “quick route”
Asymmetry in Emotional Processing
Fixate the nose of each face in turn
Which appears happier?
The left side of faces carries more emotional weight
Why?
The right hemisphere receives left visual field input, and the right amygdala is more sensitive to emotional content
Quiz:
Which is not evidence of a quick route for emotional processing
We can process emotional content even with unclear visual stimuli
The amygdala responds strongly to brief, blurry fearful faces that barely activate the fusiform face area (FFA)
Recognition of emotional expressions is strongly dependent on face orientation
Emotional responses can occur before conscious awareness of the stimulus
Summary:
Ultra-rapid emotion processing (15ms)
Dual route system
Standard visual pathway
Direct amygdala pathway
Right hemisphere advantage
Mirror Neurons: Perceiving Other Minds
Social perception beyond visual processing
We don’t just see actions, we understand intentions
Early development of social understanding
Infants prefer helpers over hinderers; suggests innate social perception system
What are Mirror Neurons?
Neurons that fire when:
We perform an action
We see others perform an action
Mirror neuron system in action
Mirroring creates an internal simulation of other’s actions
Seeing someone smile → activates our own smile muscles
Watching someone reach → activates our reaching neurons
The simulation theory
How mirror neurons help us understand others:
Observe action/emotion
Activate corresponding neural circuits
Simulate experience internally
Understand other’s intentions/feelings
Example:
We don’t just see a curved mouth, we see a happy person
Gestures, Gaze, and Congruence in Social Communication
Non-Verbal Communication
Three Key Components
Gaze Patterns
Gestures
Behavioral Congruence
Can be:
Conscious or unconscious
Cultural or universal
Learned or innate
Conscious, meaningful gestures are often culture specific
Ie: Indian Nod
Unconscious gestures are universal across cultures
Defensive and victory poses
Present in blind individuals, so occurs without learning or sight
Basic Properties of Gaze
Head and gaze direction indicates focus of attention
Sclera (eye whites) help with gaze detection
Crucial for social interaction
Shared Attention
The ability to follow and share attention with others by tracking their gaze
Developmental milestone
Present as early as 1 month
Critical for social learning and cognitive development
Research evidences
Infants preferentially look in the direction others are gazing
Experiment:
In the following experiment:
Infants looked at the display of the face and then saw two rectangles
Experimenters observed how often the infants looked at left or right rectangle
Even at 1 month, infants showed shared attention
They looked first and longer where the face had just looked
Eye contact in Social interaction
Extraordinarily sensitive to the direction of people’s gazes
Eye contact initiates social engagement
Reveals interests, scrutiny, dominance, and deception
Congruence in Social Interaction
Reciprocal behaviors in social settings
Yawning
Laughing
Unconscious reproduction of postures and gestures
The rocking Chair Study
Unintentional synchronization
People sitting in rocking chairs naturally match each other's movements without trying
Visual attention matters
The more directly people can see each other, the stronger the synchronization becomes
Gait and Gender
Takeaway:
Male and female gaits differ significantly
We can judge gender based on gait alone
Role of culture in perception
Pictorial depth cue
Hudson (1960) study: Schooled Children vs unschooled tribal children
Is the spear closer to the elephant or antelope?
Tribal children had difficulty interpreting depth cues in 2D pictures
Carpentered World Hypothesis
Our environment shapes our perception
Modern environments have many straight lines and edges
African Zulus were less affected by the Müller-Lyer illusion
Suggests reduced susceptibility due to less exposure to "carpentered" environments
Size Constancy Study
Size Constancy
Size of object is constant under different conditions
Pygmies living in dense forest had difficulty perceiving size/distance of objects in open spaces
Example:
A Pygmy youth mistook distant buffalo for insects
Western vs Eastern Cultures
Analytic thought (Western)
Origin: Greek Philosophy
Determinants:
Relatively mild climates
Economy does not require strong social ties
World view:
Things exist by themselves and can be defined by their attributes (context independent, object-oriented)
These patterns are observed in Euro-American Societies
Holistic Thought (Eastern)
Origin: East Asian Philosophy (Taoism, Buddhism, East Asian Animism)
Determinants
Frequently changing climate
Economy requires strong social ties
Worldview:
Things are interrelated
Various factors are involved in an event (context dependent, context sensitive)
These patterns are observed in Chinese, Japanese, and Korean Cultures
Self Concepts
Independent View
In North America, people tend to conceptualize "the self" as an entity detached from others and its context
Interdependent View
In East Asia, people tend to conceptualize "the self" as a relational and contextual existence
Attention to Object vs Field
Western Participants
Focused on focal objects
Better object recognition when background changes
Eastern Participants
Focused on background/context
Worse object recognition when background changes
Change blindness across cultures
Results:
Japanese participants detected more changes in contextual/background elements, while Americans detected more changes in focal objects
Point of View (Picture Taking Task)
Picture produced by American Participants
Picture Produced by Japanese Participants
Results
Americans take photos with larger face-to-frame ratios (close up), while Japanese take photos with more background and smaller face-to-frame ratios
Own-Race Bias
Face Recognition Bias
Own Race Bias
People have more difficulty differentiating and remembering faces of another race compared to faces of their own race
Explained by Contact Hypothesis
Frequencies of exposure to different racial groups affects recognition ability
Sometimes called the “cross-race effect” or “other-race effect”
Summary:
Facial Expressions
Developmental sequence
Universal & Innate Emotions
Emotional Physiology
Rapid Processing
Dual Route System
Right Hemisphere Bias
Mirror Neurons
Internal Simulation
Gesture, Gaze, & Congruence
Gait & Gender
Cultural Perception
Depth Perception variations
Carpentered World Hypothesis
Size constancy
East vs West
Object vs context focus
Differing viewing styles
Own Race Bias
Better own-race recognition
Lecture 17 - 11/11/24
Selection in Space
Cueing Experiment
Posner Cueing Experiment
Peripheral Cue:
Validly cued trial because the target X was on the same side as the cue
Exogenous cueing / involuntary Attention
Symbolic cue:
Invalidly cued trial because the target X was on the opposite side
The red cue captured our attention and made us focus on the red square side, but when the X was presented on the other side, we had to shift our focus to the other side
Endogenous Cueing / Voluntary Attention
Both cue types can facilitate performance for stimuli presented at the cued location
Ie: faster reaction time for validly cued targets compared with invalidly cued targets
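The validity effect described above is usually quantified as the difference between invalid and valid reaction times; the RT values below are invented for illustration:

```python
def cueing_effect(valid_rts, invalid_rts):
    """Posner cueing effect: mean invalid RT minus mean valid RT (ms).

    A positive value means responses were facilitated at the cued location.
    """
    mean = lambda xs: sum(xs) / len(xs)
    return mean(invalid_rts) - mean(valid_rts)

# Hypothetical reaction times in milliseconds
valid = [310, 295, 305, 290]
invalid = [350, 340, 360, 330]
print(cueing_effect(valid, invalid))  # 45.0 ms validity effect
```

The same computation applies to both peripheral (exogenous) and symbolic (endogenous) cues, although their time courses differ.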
Metaphors for Selective Attention
“Spotlight” model:
Attention is restricted in space and moves from one point to the next
Areas within the spotlight receive extra processing
“Zoom Lens” model:
The attended region can grow or shrink depending on the size of the area to be processed
Dichotic Listening Experiments & Theories of Selective Attention
Supports early selection model
Dichotic Listening Experiment
Different messages are presented to the two ears
Pay attention to the message presented to one ear (attended message)
Repeat the attended message out loud (shadowing)
Ignore the message presented to the other ear (unattended message)
Ignored message does not reach awareness
Led to early selection model of attention
Also called a bottleneck because the filter restricts information flow
The cocktail party effect
In Moray's (1959) experiment, about ⅓ of the participants detected their names presented to the unattended ear
This phenomenon of hearing distinctive messages that are not being attended is called the cocktail party effect
Dear Aunt Jane Experiment
“Dear Aunt Jane” experiment
Attend to and shadow the message presented to the left ear
Participants reported hearing the message "Dear Aunt Jane", which starts in the left ear, jumps to the right ear, then goes back to the left
The late selection Model
Dear Aunt Jane EXP. & Cocktail Party Problem challenge the Early Selection Model
Model proposes that all sensory information is processed to a certain degree for meaning before attention selects what to focus on
Contrasts with the early selection model, which suggests that information is filtered out at an early stage, based primarily on physical characteristics like volume or pitch, before deeper processing for meaning occurs
Visual Search
Feature Search
Search for a target defined by a single attribute
Such as a salient color or orientation
The efficiency of visual search
Is the average increase in RT for each item added to the display
Measured in terms of search slope, or ms/item
The larger the search slope (more ms/item), the less efficient the search
Feature search is efficient
RT is not influenced by set size
Ie: search slope is flat
Conjunction Search
Search for a target defined by the presence of two or more attributes
Conjunction search is inefficient
RT increases as the set size increases
Comparison
Feature Search
Efficient
RT not influenced by set size
Parallel processing
Simultaneously processing stimuli
Conjunction Search
Inefficient
RT increases as set size increases
Serial processing
Attending and processing one item at a time
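Search slope (ms/item) can be estimated with a least-squares fit of RT against set size; the RT values below are made up to illustrate the flat-vs-rising contrast:

```python
def search_slope(set_sizes, rts):
    """Least-squares slope of RT (ms) vs. display set size, in ms/item."""
    n = len(set_sizes)
    mx = sum(set_sizes) / n
    my = sum(rts) / n
    num = sum((x - mx) * (y - my) for x, y in zip(set_sizes, rts))
    den = sum((x - mx) ** 2 for x in set_sizes)
    return num / den

sizes = [4, 8, 16, 32]
feature_rts = [450, 452, 449, 451]       # flat: efficient, parallel search
conjunction_rts = [500, 600, 800, 1200]  # rises: inefficient, serial search

print(round(search_slope(sizes, feature_rts), 2))       # near 0 ms/item
print(round(search_slope(sizes, conjunction_rts), 2))   # 25.0 ms/item
```

A slope near 0 ms/item is the signature of feature (parallel) search; slopes of roughly 20-30 ms/item are typical of conjunction (serial) search.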
Quiz:
Which of the following would take the longest?
Finding a green triangle among a mix of 40 yellow squares and green circles
Differs from distractors by two features
Conjunction search
Large set size
Finding a green triangle among a mix of 40 green squares and yellow triangles
Challenging conjunction search with a large set size and high similarity between the distractors and the target
Differences from distractors by both colors and shape
Conjunction search
Large set size
Finding a green triangle among a mix of 20 yellow squares and yellow triangles
Differs from distractors by a single feature
Feature search
Smaller set size
Finding a green triangle among a mix of 20 green squares and yellow triangles
Differs from distractors by two features
Conjunction search
Smaller set size
Quiz Answer: B
Feature Integration Theory & Illusory Conjunctions
The binding problem
Perceiving the vertical red bar moving to the right
Color, motion, and orientation are represented by separate neurons
How do we combine these features when perceiving the bar?
Feature integration theory
What color was the #1
What color was the small triangle? Large triangle?
What color was the large circle?
Illusory Conjunctions
Support the idea that some features are represented independently and must be correctly bound together with attention
Attentional Limits in Time
Rapid Serial Visual Presentation (RSVP) Task & Attentional Blink
Attentional blink
The difficulty in perceiving and responding to the second of two target stimuli amid an RSVP stream of distracting stimuli
ATTENTION IS LIMITED
Green and Bavelier (2008)
Reported that people who play first-person shooter video games have a reduced attentional blink
This suggests that visual attention performance can be improved with practice
Marvin Chun’s fishing metaphor for attentional blink
You can see all the stuff in the river as it drifts by
Ie: the boot and the fish
You commit to netting fish number 1 (f1).
The boot drifts by and fish 2 (f2) appears
Because you are tied up with f1, you will not be able to capture f2, and that second fish swims away
Detected perhaps, but uncollected
The Physiological Basis of Attention
Brain Areas
Attention to a specific part of the visual field causes neurons coding those locations to have increased activity
Attention can enhance the neural processing of a specific object
The activity of the FFA (“face area”) or PPA (“place area”) is modulated as a function of attended object type (face or house)
Activation in the visual cortex measured with fMRI
Single Cell
Attention
Increases the firing rate of a single cell in the monkey parietal cortex
Different ways in which attention could alter the activity of a single cell
Attention shifts the receptive fields of cells in the monkey parietal cortex
Note the change in the receptive field map when the monkey directs attention to diamond (a) or circle (b)
Quiz
Recall our discussion of the cueing experiment and our ability to attend to specific objects. Based on what you have learned, which of the following do you think would be correct?
People would be equally fast on condition 1 and 2
People would be faster on Condition 1
We are attending to the left target and expect the x to appear in the left target
Faster reaction times
People would be faster on Condition 2
Attention must shift to the target’s exact location, causing a delay
People would fail on both conditions
Disorders of Visual Attention
Quiz Answer: B
Neglect
They are not blind, their vision is perfect
But because of right parietal lobe damage, they have a hard time processing the left visual field
The inability to attend to or respond to stimuli in the contralesional visual field
Typically, neglect of the left visual field after damage to the right parietal lobe
Not restricted to vision:
auditory/somatosensory domains
Different from hemianopia (loss of V1)
5 "slices" through the brain of a patient with neglect (MRI viewed as though from above). The damage, shown here in yellow, includes the right parietal and frontal lobe. The patient neglects the left side of space
Object copying task:
A patient with neglect often omits one side of the object
Line bisection task:
A patient with neglect might miss the lines on the left side of the image following damage to the right parietal lobe
The neglect can be relative to the object, not to the whole scene
Step 1: The patient neglected the left side of the barbell
Step 2: The barbell was rotated through 180 degrees
Step 3: the neglect rotated with the object in this example of object-centered neglect
Extinction
The inability to perceive a stimulus to one side of the point of fixation in the presence of another stimulus, typically in a comparable position in the other visual field
Milder form of neglect
Balint’s Syndrome
Thought to be related to neglect (more severe)
It involves:
The inability to perceive the visual field as a whole (simultanagnosia)
Difficulty in fixating the eyes (oculomotor apraxia)
Inability to move the hand to a specific object by using vision (optic ataxia)
Lesions to both sides of the brain (typically posterior parietal cortex)
Summary:
Inattentional Blindness
Cueing Experiment
“Spotlight” and “Zoom lens” models of attention
Early & Late Selection Models
Visual Search
FIT & Illusory Conjunctions
RSVP and Attentional Blink
Attention in the brain: Visual areas and single cells
Disorders of Attention
Concepts:
External Attention
Attention to stimuli in the world
Internal Attention
Our ability to attend to one line of thought as opposed to another or to select one response over another
Overt Attention
Involves directing attention from one place/object to another by moving the eyes
Covert Attention
Involves directing attention without moving the eyes
Voluntary vs Involuntary Attention
Attention is Selective and Limited
Lecture 18- 11/13/24
Exam 2 Information:
11/20 Wednesday 8:30-9:30 - 70 minutes (come 5 minutes early)
Lecture 10 - Lecture 18
50% MC 50% SA
Review Nov 18 (Monday) @ Metcalf AUD @ 5:30
Post questions by Nov 17 11:59
Review:
Q: Feature Integration Theory
The Binding Problem
Perceiving the vertical red bar moving to the right
Color, motion, and orientation are represented by separate neurons
How do we combine these features when perceiving the bar?
Feature integration theory
FIT suggests that the solution to the binding problem is attention
Illusory Conjunctions
Support the idea that some features are represented independently and must be correctly bound together with attention
Q: RSVP and Attentional Blink
Bad performance:
Attentional blink
You can catch first target but miss the second target
The difficulty in perceiving and responding to the second of two target stimuli amid a RSVP stream of distracting stimuli
ATTENTION IS LIMITED
Fishing metaphor
You miss the second fish after catching the first fish if the second one comes at a critical time point
Q: Brain Areas
Attention to a specific part of the visual field causes neurons coding those locations to have increased activity
Attention can enhance the neural processing of a specific object
The activity of the FFA or PPA is modulated as a function of attended object type
Q: neural mechanisms of attention
Different ways in which attention could alter the activity of a single cell
Enhancement
Attention increases the overall firing rate of a neuron in response to its preferred stimulus
Sharper tuning
Reduces response to stimuli that deviate slightly from the preferred feature, enhancing specificity
Altered tuning
Attention shifts the preferred stimulus or feature of a neuron, modifying what the neuron responds to most strongly
Notes:
Conscious perception limited by attention and memory
Motion-Induced Blindness
Attention for awareness
Things may fade out of your attention, disappearing from awareness
ie: concentrate on a book, and you are aware of little else
Attention acts as a gateway to conscious perception determining what enters awareness and what is filtered out
Objects or events not attended to may "disappear" from awareness even if they are within the visual field
Attention limits the scope of conscious perception to a manageable subset of the available sensory information
Inattentional blindness
Failure to notice a fully visible and unexpected object because attention is focused elsewhere
Gorilla experiment
Demonstrated that without attention, even significant stimuli can go unnoticed
Change Blindness
Difficulty detecting changes between two images or scenes when the change occurs during a visual interruption or distraction
Ie: a person talking to a stranger fails to notice when the stranger is replaced by someone else if the swap occurs during a brief interruption
Our brains retain a general gist of scenes rather than detailed representations, relying on attention to detect changes
Changes are often missed unless they are directly attended to or highly salient
Only attended items enter visual short-term memory
Sensory memory (Iconic memory)
Only lasts for 200-500 msec
A kind of photographic memory (no limit)
ADD ATTENTION →
Short term memory (working memory)
Lasts over many seconds
Very limited capacity
ADD REHEARSAL →
Long-term memory
Capacity and duration unlimited
Limited Capacity of visual working memory
You can remember only up to 4 items
Conscious perception is limited
Because your visual attention and memory are limited
All these results show that you are only aware of things that you select for your attention and short-term memory
What about unattended things?
Most of them will be decayed, forgotten, and discarded, so you cannot use them
The fate of unseen stimuli
A stimulus below an individual's threshold for conscious perception is registered and processed without our awareness
Subliminal perception
Only appears for a single frame
Too short to consciously pick up
Things we don't notice influence us too
Republican ad shows Al Gore and then "RATS" appears for one frame (1/30 of a second)
Despite your limited conscious perception of individual items
Your visual scenes feel much richer
Influence by subliminal perception
Invisible stimulus can attract attention
Understanding visual scenes
Gist
Fast visual scene understanding, even when the image is blurred
You can recognize a scene within 20 msec
Spatial layout
Refers to the structural and geometric arrangement of elements within a scene
Overall characteristics of a scene
Openness
Depth
geometry
For global structure of the scene
How can perceiving scenes be so fast?
Two different components of a visual scene
Low spatial frequency
coarse
High spatial frequency
Fine
Global information about a whole scene relies on the low-spatial frequency component
The visual system can quickly analyze this information while we are not even aware of it
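Isolating the low-spatial-frequency (coarse) component amounts to low-pass filtering; a minimal sketch with a 1-D box blur (real images would use a 2-D Gaussian filter):

```python
def box_blur(signal, radius):
    """Crude low-pass filter: average each sample with its neighbors.

    The output keeps coarse (low spatial frequency) structure and
    discards fine (high spatial frequency) detail, analogous to the
    blurred information that supports fast scene gist.
    """
    out = []
    n = len(signal)
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

# Rapidly alternating fine detail is smoothed toward its local mean (~5)
fine = [10, 0, 10, 0, 10, 0, 10, 0]
print(box_blur(fine, 1))
```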
Guided search by global information of a scene
Gist of a scene
Spatial layout of a scene
Ensemble representations
Knowledge about the properties of a group of objects
Mean size
Approximate number
Centroid
Mean emotion
They are about a "group" of similar objects
They are useful because the natural scenes often contain many similar objects
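Ensemble statistics like mean size and centroid are cheap summaries over a group of objects; the positions and sizes below are hypothetical:

```python
def ensemble_summary(objects):
    """Summarize a group of objects by mean size and centroid.

    objects: list of (x, y, size) tuples.
    """
    n = len(objects)
    mean_size = sum(s for _, _, s in objects) / n
    centroid = (sum(x for x, _, _ in objects) / n,
                sum(y for _, y, _ in objects) / n)
    return mean_size, centroid

group = [(0, 0, 2.0), (4, 0, 4.0), (2, 6, 3.0)]
print(ensemble_summary(group))  # (3.0, (2.0, 2.0))
```

Two numbers stand in for the whole group, which is why ensemble coding is economical given the ~4-item limit of visual working memory.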
Redundancy and Regularity
Ensemble Representations are efficient and economical
Given the limited capacity of attention and memory
Given the remarkable ability to group things together
By similarity
By distance
You use ensemble representations everyday
Global information makes your visual experience of a scene rich and vivid
Without explicit effort, you may know about spatial layout of the structure
You recognize that this is an outdoor, man-made, navigable scene
You know about groups of similar objects
Buildings, cars, or people
You may not need to attend to and remember every single element of this scene in order to understand the scene
Memory for scenes
Memory for scenes is amazingly good
Participants were shown 10,000 images for 5 seconds each
They were about 90% correct when quizzed 2 days later
Because you can understand visual scenes fast and efficiently
Because you already have so much knowledge about scenes in your long-term memory
Neural basis for scene perception
PPA (Parahippocampal place area)
Retrosplenial complex (RSC)
Complementary function of the PPA and RSC
PPA treats each view of panoramic scene as different images
Viewpoint specific representation
RSC treats different views of panorama as the same stimulus
Together, they enable both specific and integrative representations of scenes across viewpoints