Chapter 11: Perceiving Objects and Scenes

  • How do we organize visual scenes?

    • Designing a “perceiving machine”

    • Vision is complex - early “self-driving” vehicles used cameras

    • Current “self-driving” vehicles use cameras

    • No vehicle is truly “self-driving” - they have advanced driver assistance systems (NTSB)

    • We’re going to focus on human vision

  • What are problems identifying objects?

    • Stimulus on the receptors is ambiguous

    • Objects can be hidden or blurred

    • Objects look different from different viewpoints

  • Inverse Projection Problem

    • Image on retina is 2D

    • Objects are 3D

    • Many objects can create same image on retina

    • How do we know which object actually produced the image?

    • Ambiguous
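
    The ambiguity can be illustrated with a small sketch. It assumes a simple pinhole model of the eye, in which a 3D point (x, y, z) projects to the 2D image point (f * x / z, f * y / z). The function name `project` and the focal length `f` are illustrative assumptions, not part of the lecture material.

```python
# Pinhole projection sketch (assumed model): a 3D point (x, y, z) lands on
# the 2D image plane at (f * x / z, f * y / z). f is an assumed focal length.

def project(point, f=1.0):
    """Project a 3D point onto a 2D image plane at distance f."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the eye (z > 0)")
    return (f * x / z, f * y / z)

# Three different 3D points that lie on the same line of sight.
near = (1.0, 2.0, 4.0)
middle = (2.0, 4.0, 8.0)
far = (4.0, 8.0, 16.0)

images = [project(p) for p in (near, middle, far)]

# All three points produce the same 2D image point, so the image alone
# cannot tell us which 3D point was actually out there.
same_image = all(img == images[0] for img in images)
print(images, same_image)
```

    Going from 3D to 2D throws away depth, so going backwards from the image to the object has many possible answers.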

  • Hand Shadow Puppets

    • Good example of vision with limited info

    • Create image from flat shadow

  • Facial Recognition

    • Viewpoint Invariance - recognize face from different viewpoints

    • Humans are very good at this

    • Facial recognition ability peaks in the 30s

    • Harder to recognize faces from racial groups other than your own (other-race effect)

  • Computer Facial Recognition

    • Can recognize you even if face is blurred

    • Looks at multiple faces, clothing, etc.

    • Performs worse than with unblurred faces

  • Theories of Object Recognition

    • Gestalt principles of organization

      • “The whole is different from the sum of the parts”

      • Emphasis on top-down

    • Recognition by components theory

      • Geons - basic components of objects

      • Emphasis on bottom-up

  • Origins of Gestalt

    • Apparent motion

      • Max Wertheimer - questioned the structuralist idea that perception arises from adding up sensations

      • No actual motion involved, so how can the perception of motion arise?

  • Gestalt Principles

    • Principles of Perceptual Organization

      • Pragnanz - good figure, simplest figure is best

      • Closure - small gaps are overlooked

      • Similarity - similar items get grouped

      • Good Continuation - points are seen as following a smooth curve rather than an abrupt change in direction

      • Proximity - items close together get grouped

      • Common Region - items within the same region get grouped

      • Uniform Connectedness - a connected region with the same visual properties is seen as one unit

      • Synchrony - items that occur at the same time get grouped together

      • Common Fate - items moving in same direction get grouped

      • Meaningfulness/Familiarity - things that form familiar patterns get grouped
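
      As one concrete illustration, the proximity principle can be sketched as a simple grouping rule: dots that are close enough to each other, directly or through a chain of neighbors, end up in the same group. This is only a toy sketch. The function name `group_by_proximity`, the threshold `max_gap`, and the chaining rule are assumptions made for illustration, not a model taken from Gestalt research.

```python
from math import dist

def group_by_proximity(points, max_gap):
    """Group point indices; neighbors at most max_gap apart share a group."""
    groups = []
    unvisited = list(range(len(points)))
    while unvisited:
        # Start a new group from the first dot not yet assigned.
        frontier = [unvisited.pop(0)]
        group = []
        while frontier:
            i = frontier.pop()
            group.append(i)
            # Pull in every unassigned dot that is close to dot i.
            near = [j for j in unvisited if dist(points[i], points[j]) <= max_gap]
            for j in near:
                unvisited.remove(j)
            frontier.extend(near)
        groups.append(sorted(group))
    return groups

# Two clusters of three dots, separated by a wide gap.
dots = [(0, 0), (1, 0), (2, 0), (6, 0), (7, 0), (8, 0)]
groups = group_by_proximity(dots, max_gap=1.5)
print(groups)  # [[0, 1, 2], [3, 4, 5]]
```

      Like the Gestalt principles themselves, the rule is a heuristic: a different choice of `max_gap` gives a different grouping.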

  • Perceptual Segregation

    • Also known as figure-ground segregation

    • What is the figure and what is the ground?

    • Must look at properties of figures and grounds

    • Useful to look at reversible figure-ground images

  • Figure Ground

    • What makes the figure?

    • What makes the ground?

    • Shared contours

    • Border Ownership - the region that owns the border is seen as the figure

  • Reversible Figures

    • Symmetry

    • Size

    • Orientation

    • Meaning

  • Gestalt Summary

    • Heuristics, not principles or algorithms - rules of thumb

    • They’re quick

    • They’re correct most of the time, but not always

    • Reflect properties of the environment - they work

    • Used frequently to make logos and artwork

  • Recognition by Components Theory

    • Geons - “geometric ions”

    • Basic components of objects

    • Proposed by Irving Biederman

    • 36 different geons - represent most shapes

    • Accidental versus non-accidental viewpoints of objects

  • Object Viewing

    • Accidental Viewpoint - a viewpoint from which an object is not normally seen

    • Non-accidental property - a property that is visible from every viewpoint except an accidental one, e.g. the curved edge of a coin (lost only when the coin is viewed exactly edge-on)

    • Ex: 3D sidewalk drawings rely on an accidental viewpoint

  • Non-Accidental Properties (NAP)

    • Each geon has a unique set of NAPs

    • Discriminability - each geon can be discriminated from others

    • Principle of Componential Recovery - We can identify an object if we identify its geons
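
    The principle of componential recovery can be sketched as a lookup: if all of an object's parts have been identified, the object can be named. The catalog below is a hypothetical toy example (a cylinder with a curved handle on the side versus on the top); the names `CATALOG` and `recognize` and the part labels are illustrative assumptions, not an actual list of Biederman's geons.

```python
# Hypothetical toy catalog: each object is a set of (geon, position) parts.
CATALOG = {
    "mug": frozenset({("cylinder", "body"), ("curved handle", "side")}),
    "pail": frozenset({("cylinder", "body"), ("curved handle", "top")}),
}

def recognize(detected_parts):
    """Return the names of catalog objects whose parts were all detected."""
    detected = set(detected_parts)
    return sorted(name for name, parts in CATALOG.items() if parts <= detected)

# All parts of the mug are detected, so the mug is recovered.
print(recognize([("cylinder", "body"), ("curved handle", "side")]))  # ['mug']

# With only the cylinder detected, no object can be recovered.
print(recognize([("cylinder", "body")]))  # []
```

    The same two geons give different objects depending on how they are arranged, which is why the sketch stores a position with each part.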
