How do we organize visual scenes?
Designing a “perceiving machine”
Vision is complex - early “self-driving” vehicles used cameras
Current “self-driving” vehicles use cameras
No vehicle is truly “self-driving” - they have advanced driver assistance systems (NTSB)
We’re going to focus on human vision
What are the problems in identifying objects?
Stimulus on the receptors is ambiguous
Objects can be hidden or blurred
Objects look different from different viewpoints
Inverse Projection Problem
Image on retina is 2D
Objects are 3D
Many different objects can create the same image on the retina
How do we know which object actually created the image?
Ambiguous
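The ambiguity can be made concrete with a minimal sketch, assuming a simple pinhole model of projection (an illustration only, not a model of the eye's optics). The function name project and the focal length value are hypothetical. Points at different depths along the same line of sight land on the same 2D image location, so the image alone cannot say which one was there.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (x, y, z) onto a 2D image plane.

    Assumes an idealized pinhole model; focal_length is an arbitrary value.
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Three different 3D points lying along the same line of sight.
near = (1.0, 2.0, 4.0)
middle = (2.0, 4.0, 8.0)
far = (4.0, 8.0, 16.0)

# All three produce the same 2D image point, so depth is lost.
images = [project(p) for p in (near, middle, far)]
print(images)
```

Going from the 2D point back to a unique 3D point is not possible without extra assumptions, which is the inverse projection problem.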
Hand Shadow Puppets
Good example of vision with limited info
Create image from flat shadow
Facial Recognition
Viewpoint Invariance - recognize face from different viewpoints
Humans are very good at this
Facial recognition ability peaks in the 30s
Faces from racial groups other than your own are harder to recognize
Computer Facial Recognition
Can recognize you even if face is blurred
Looks at multiple faces, clothing, etc.
Performs worse than with unblurred faces
Theories of Object Recognition
Gestalt principles of organization
“The whole is different from the sum of the parts”
Emphasis on top-down
Recognition by components theory
Geons - basic components of objects
Emphasis on bottom-up
Origins of Gestalt
Apparent motion
Max Wertheimer - questioned the idea that perception arises simply from adding up sensations
No actual motion involved, so how can the perception of motion arise?
Gestalt Principles
Principles of Perceptual Organization
Prägnanz - good figure, the simplest interpretation is preferred
Closure - small gaps are overlooked
Similarity - similar items get grouped
Good Continuation - lines are seen as following a smooth path rather than an abrupt change in direction
Proximity - items close together get grouped
Common Region - items within the same region get grouped
Uniform Connectedness - a connected region with the same visual properties (e.g., color, texture) is seen as one unit
Synchrony - items that occur at the same time get grouped together
Common Fate - items moving in the same direction get grouped
Meaningfulness/Familiarity - things that form familiar patterns get grouped
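As a rough illustration of how one of these principles could be stated as a rule, here is a minimal sketch of grouping by proximity, assuming dots placed along a single line. The function name group_by_proximity and the max_gap threshold are hypothetical, and nothing here is meant as a model of how the visual system actually groups.

```python
def group_by_proximity(positions, max_gap):
    """Split 1D positions into groups wherever the gap exceeds max_gap."""
    ordered = sorted(positions)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev <= max_gap:
            # Close enough to the previous dot: same group.
            groups[-1].append(curr)
        else:
            # Large gap: start a new group.
            groups.append([curr])
    return groups


dots = [0, 1, 2, 6, 7, 8, 12, 13]
print(group_by_proximity(dots, max_gap=2))
# prints [[0, 1, 2], [6, 7, 8], [12, 13]]
```

Changing max_gap changes the grouping, which fits the idea that these are rules of thumb rather than fixed laws.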
Perceptual Segregation
Also known as figure-ground segregation
What is the figure and what is the ground?
Must look at properties of figures and grounds
Useful to look at reversible figure ground images
Figure Ground
What makes the figure?
What makes the ground?
Shared contours
Border Ownership - whoever owns the border is the figure
Reversible Figures
Symmetry
Size
Orientation
Meaning
Gestalt Summary
Heuristics, not principles or algorithms - rules of thumb
They’re quick
They’re correct most of the time but not all
Reflect properties of the environment - they work
Used frequently to make logos and artwork
Recognition by Components Theory
Geons - “geometric ions”
Basic components of objects
Proposed by Irving Biederman
36 different geons - represent most shapes
Accidental versus non-accidental viewpoints of objects
Object Viewing
Accidental Viewpoint - a viewpoint from which an object is not normally seen
Non-accidental property - a property that is visible from almost any viewpoint, except when the object is viewed straight on or from an accidental viewpoint, e.g., the curved edge of a coin
Ex: 3D sidewalk drawings use accidental viewpoint
Non-Accidental Properties (NAP)
Each geon has a unique set of NAPs
Discriminability - each geon can be discriminated from others
Principle of Componential Recovery - We can identify an object if we identify its geons
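The ideas of discriminability and componential recovery can be sketched as a toy lookup. The attribute names and values below are an assumption recalled from Biederman's description (four attributes with 2, 3, 3, and 2 values, which gives 36 combinations) and should be checked against the original source. The object definitions and the identify function are hypothetical, and the sketch ignores the spatial relations between geons.

```python
from itertools import product

# Assumed attribute values (illustrative; verify against Biederman's account).
EDGE = ("straight", "curved")
SYMMETRY = ("rotational+reflective", "reflective", "asymmetric")
SIZE = ("constant", "expanding", "expanding+contracting")
AXIS = ("straight", "curved")

# Each geon is one unique combination of attribute values.
geons = list(product(EDGE, SYMMETRY, SIZE, AXIS))
print(len(geons), len(set(geons)))

# Hypothetical geons and objects, for illustration only.
cylinder = ("curved", "rotational+reflective", "constant", "straight")
brick = ("straight", "reflective", "constant", "straight")
curved_handle = ("curved", "rotational+reflective", "constant", "curved")

OBJECTS = {
    "mug": frozenset({cylinder, curved_handle}),
    "briefcase": frozenset({brick, curved_handle}),
}


def identify(seen_geons):
    """Return names of objects whose geon set matches the geons seen."""
    seen = frozenset(seen_geons)
    return sorted(name for name, parts in OBJECTS.items() if parts == seen)


print(identify([cylinder, curved_handle]))
```

Because every geon has a distinct combination of properties, no two can be confused, and an object can be looked up once its geons are identified.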