What are the challenges to object recognition?
Going from a 3D world to a 2D retinal image
Viewpoint and orientation independence
Distinguishing objects that share features
Recognizing objects without all the parts visible
Going from a 3D world to a 2D retinal image - challenge
Although the retina only captures 2D images, the brain must interpret these images in a way that accurately represents the 3D world around us
Viewpoint and orientation independence - challenge
Even though an object can look very different depending on the angle or position from which we view it, our brain can still recognize it as the same object despite those changes in its retinal image
Distinguishing objects that share features - challenge
Objects can look almost identical - they have the same colour, size, shape - but we can still tell them apart because of their spatial relationships (where they’re located)
Recognizing objects without all the parts visible - challenge
We can still identify an object even when parts of it are hidden or missing
If a cat is partly behind a fence, we don’t just see random shapes - we still recognize it as a cat
Brain fills in the missing information based on our knowledge and experience
What are the Gestalt laws?
Closure
Proximity
Continuation
Figure and ground
Similarity
Closure - Gestalt law
How we tend to “fill in gaps” in incomplete visual information to perceive a whole, complete object
If you see a circle with a small gap in it, you’ll still perceive it as a circle rather than a random curved line - because your mind closes the shape
Proximity - Gestalt law
How we tend to group together elements that are close to each other in space
if you see a bunch of dots arranged in rows, you’ll automatically perceive each row as a group because the dots within each row are closer together than to those in other rows
About distance
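The proximity rule above can be illustrated with a minimal sketch. It assumes dots placed along a single line and a hypothetical `max_gap` threshold; neither comes from the notes, and real perceptual grouping is certainly richer than this.

```python
def group_by_proximity(positions, max_gap):
    """Group 1D dot positions: a dot joins the current group when it is
    within max_gap of the previous dot, otherwise it starts a new group."""
    groups = []
    for p in sorted(positions):
        if groups and p - groups[-1][-1] <= max_gap:
            groups[-1].append(p)   # close enough: same perceptual group
        else:
            groups.append([p])     # large gap: start a new group
    return groups


# Two clusters of dots separated by a large gap (illustrative values).
dots = [1, 2, 3, 10, 11, 12]
groups = group_by_proximity(dots, max_gap=2)
```

Here the dots fall into two groups purely because of distance, which is the sense in which proximity is "about distance".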
Continuation - Gestalt law
We tend to perceive elements in ways that follow smooth, continuous paths rather than abrupt changes in direction
In a logo with a swooping curve, your eye will automatically follow that curve through the design, even if parts of it are missing
Similarity - Gestalt law
We group together elements that look alike - whether that’s in shape, colour, size, texture, or any other visual quality
if all the blue dots are spread out among red dots, you’ll still see the blue ones as a group because they share the same colour
About appearance
Figure and ground - Gestalt law
Describes how we separate visual scenes into 2 parts
figure: main object or focus of attention
ground: background or everything else behind it
Rubin’s Vase illusion
you see either a vase or 2 faces as the figure (with the other becoming the ground), but not both at once
your perception flips depending on which part your brain decides to treat as the figure
What do Gestalt laws enable us to achieve?
The capacity to recognize objects for what they are despite dramatic changes in the retinal image
This refers to object constancy
What is a bistable image?
Visual information supports multiple interpretations
duck vs rabbit example
What is object constancy?
The brain is capable of recognizing objects despite huge changes in the retinal image
change in the lighting conditions
orientation
viewer’s perspective
What is the order of image visualization in the hierarchical nature of visual processing? (ventral ‘what’ pathway)
Retina
LGN
V1
V2
V4
IT (Inferotemporal Cortex)
Retina
First stage of visual processing
Contains retinal ganglion cells, each responding to a small, localized region of the visual field - known as its receptive field
These cells detect basic light patterns, contrast, and edges
Each retinal cell processes a tiny portion of what you see, so no single cell “sees” the whole picture
LGN
Receives input from multiple retinal ganglion cells
Each LGN neuron combines signals from these retinal cells, forming a larger receptive field
Preserves spatial information and refines contrast and brightness differences before sending the signal to the cortex
Still handles low-level visual features
V1
Primary visual cortex
First cortical area to process visual input
Many LGN neurons project to a single V1 neuron, resulting in larger receptive fields
Specialized for detecting orientation, direction of motion, edges, and spatial frequency
Starts to represent the structure of the visual world, identifying where edges and patterns occur
V2
Sits just beyond V1 and begins to integrate the simple features detected earlier
Pattern organizer, combining lines into meaningful shapes or surfaces
Starts to represent the whole object shape rather than disconnected lines
V4
Integrates input from V1 and V2 to process colour
Its neurons have larger receptive fields and are sensitive to specific colour combinations
IT
Inferotemporal cortex
Highest stage in the visual hierarchy
Receives input from many V4 neurons - meaning each IT neuron integrates massive amounts of visual information from large areas of the visual field
Allows object and category recognition
Visual perception becomes semantic - recognizing what the object is
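The stage-by-stage growth in receptive field size described in the cards above can be sketched as a toy calculation. The assumptions are loud ones: a 1D visual field, every unit pooling the same hypothetical number of neighbouring units (`fan_in=3`) from the stage below, and sizes in arbitrary units. None of these numbers come from the notes or from physiology.

```python
STAGES = ["Retina", "LGN", "V1", "V2", "V4", "IT"]


def receptive_field_sizes(stages, base_size=1, fan_in=3):
    """Toy model: each unit pools `fan_in` adjacent units from the stage
    below, so its receptive field widens by (fan_in - 1) at every stage."""
    sizes = {}
    size = base_size
    for i, stage in enumerate(stages):
        if i > 0:
            size += fan_in - 1  # pooling neighbours widens the field
        sizes[stage] = size
    return sizes


sizes = receptive_field_sizes(STAGES)
for stage in STAGES:
    print(f"{stage}: receptive field covers {sizes[stage]} unit(s)")
```

The only point of the sketch is the trend: because each stage combines several inputs from the stage before it, no retinal cell "sees" the whole picture, while an IT unit covers a large part of the field.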
What is orientation tuning? - Basic preferences
Individual cells showed their highest firing rates for specific orientations and directions of motion - so the cell could be said to ‘prefer’ that orientation
Cells were also arranged in a columnar format - so columns of cells showed the same orientation preference
Columnar organization is common elsewhere (V5/MT) - motion sensitive regions (like a bird flying or a ball rolling)
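A cell's orientation "preference" is often pictured as a tuning curve. The sketch below assumes a Gaussian-shaped curve with a hypothetical peak rate and tuning width; these parameter values are illustrative and are not taken from the notes.

```python
import math


def orientation_response(stimulus_deg, preferred_deg,
                         max_rate=50.0, width_deg=20.0):
    """Firing rate (arbitrary units) of a model cell for a bar at
    stimulus_deg, given the cell's preferred orientation."""
    # Orientation is circular with a 180-degree period: a bar at 170
    # degrees is only 20 degrees away from a bar at 10 degrees.
    diff = abs(stimulus_deg - preferred_deg) % 180.0
    diff = min(diff, 180.0 - diff)
    return max_rate * math.exp(-(diff ** 2) / (2 * width_deg ** 2))


# A model cell that prefers vertical bars (90 degrees).
rate_at_preferred = orientation_response(90, 90)
rate_at_orthogonal = orientation_response(0, 90)
```

The model cell fires most at its preferred orientation and progressively less as the stimulus rotates away from it. A column of such cells sharing the same `preferred_deg` would correspond to the columnar arrangement described above.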
What are the complex preferences in the ventral stream?
Some neurons fire strongly when the monkey sees hands, even if they’re covered by a mitten (so the neuron recognizes the “hand-ness”, not just visual details)
When shown non-hand objects, even ones designed to look like a hand, the neurons don’t respond
LOC - object form
V4/V8 - colour
V5/MT - motion
AIT (anterior) - complex object representations
What is the fovea?
Tiny central part of your retina where you have the sharpest vision
Why is the cell density highest at the fovea?
There are many neurons dedicated to the fovea, because we tend to look directly at the things we want to see clearly
High density = more processing “power” devoted to central vision
Peripheral vision gets fewer neurons, so it’s less detailed
Why does receptive field size increase with distance from the fovea?
Receptive field is the region of space a neuron looks at
Near the fovea, RFs are small → neurons are picky and respond to fine details
Farther from the fovea, RFs are larger → neurons respond to bigger chunks of space, with less detail
This is true even early in the visual pathway, like in V1
Why does receptive field size increase in more anterior regions of temporal cortex?
As you move forward along the ventral stream (V1 → V4 → IT), RFs keep getting larger
V1 has a higher cell density than V4
By the time you reach IT, RFs are huge and always cover the fovea
What are the advantages of having larger receptive field sizes in IT?
Allows neurons to respond to objects regardless of their location in space or their size
Can respond to the global shape (as opposed to local features only) of an object
IT cells always include fovea - this means that central vision is always represented
anterior portions of the dorsal stream also have larger RFs, but only 60% include the fovea
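The location-independence advantage listed above can be sketched as pooling over many small receptive fields. This is a toy illustration, assuming a 1D "image" of 0s and 1s and a hypothetical feature pattern; it is not a model of real IT neurons.

```python
def local_responses(image, feature):
    """Small receptive fields: one detector per position, each responding
    only if the feature appears exactly at its own location."""
    n = len(feature)
    return [1 if image[i:i + n] == feature else 0
            for i in range(len(image) - n + 1)]


def pooled_response(image, feature):
    """Large receptive field: pools over all local detectors, so it
    responds wherever the feature appears."""
    return max(local_responses(image, feature))


feature = [1, 1]                  # hypothetical "object"
image_left = [0, 1, 1, 0, 0, 0]   # object on the left
image_right = [0, 0, 0, 1, 1, 0]  # same object shifted to the right

left_local = local_responses(image_left, feature)
right_local = local_responses(image_right, feature)
left_pooled = pooled_response(image_left, feature)
right_pooled = pooled_response(image_right, feature)
```

The local detectors give different activity patterns for the two positions, while the pooled unit gives the same response to both. The cost is that the pooled unit alone no longer signals where the object is.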
What is a disadvantage of having smaller receptive field size?
Ganglion cells with very small RF give the brain information about only a very small portion of the image
What is modularity in vision?
Visual system is organized into distinct, specialized processing units (or modules), each responsible for analyzing a particular type of visual information
What is grapheme-colour synaesthesia?
Perceive letters and numbers as they normally appear, but also experience them as being associated with specific, consistent colours
What are the possibilities underlying synaesthesia?
Reduced synaptic pruning
Abnormally strong back projections
Reduced synaptic pruning
As you grow the brain gets rid of synapses as you learn, keeping the ones that are useful
If someone has less pruning, their brain keeps extra connections between areas that usually shouldn’t communicate
For a synaesthete who sees the number 7 and associates it with red,
the number area of the brain (that recognizes digits) and the colour area (that processes colour) might still be physically connected
Senses bleed together
Abnormally strong back projections
Re-entrant activation
When the part of the brain that recognizes the number 7 becomes active, that activity loops back into areas that normally process colour, creating a link between them
Feedback loops - higher areas can send signals back to the lower ones
How does synaesthesia improve memory?
When shown a large set of letters and numbers in black or in colours that match their usual synaesthetic associations, synaesthetes can recall significantly more information than non-synaesthetes
When the same info is presented in colours that don’t match their typical experiences, their memory performance drops to the same level as neurotypical people
What is Marr’s computational model?
The brain recognizes objects by building up representations step by step - first processing basic visual information (edges or contrasts), then constructing a 2.5D sketch (a description of surfaces and depth from a particular viewpoint), and finally forming a 3D model that’s view-invariant
What is view dependent recognition?
Your brain would recognize objects based on the exact way they look from specific angles
To recognize a bike, you would need to have seen and stored it from every possible angle - below, front, side, etc.
Too much for memory - you would need to store many templates for every object you’ve ever seen
What is view-invariant (view-independent) recognition?
Your brain ignores angles - it recognizes the essential features or structural properties that stay the same no matter how the object is rotated
Recognizes a bike from the top - you can extract the principal axes of the frame and handlebars
What is Irving Biederman’s recognition-by-components theory?
Objects are combinations of parts, or geons, which form a kind of visual/perceptual alphabet
An object, then, is defined by its unique set and arrangement of geons
What is the problem with Irving Biederman’s recognition-by-components theory?
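The idea that an object is a set of geons plus their arrangement can be sketched as a lookup on structural descriptions. The encoding below (geon name paired with a relation label) and the two example objects are illustrative assumptions, not the theory's actual notation.

```python
def describe(parts):
    """A structural description: an unordered set of (geon, relation) pairs."""
    return frozenset(parts)


# Hypothetical object memory: same geons, different arrangements.
KNOWN_OBJECTS = {
    describe([("cylinder", "body"), ("curved handle", "side of body")]): "mug",
    describe([("cylinder", "body"), ("curved handle", "top of body")]): "bucket",
}


def recognize(parts):
    """Return the stored object whose description matches, else 'unknown'."""
    return KNOWN_OBJECTS.get(describe(parts), "unknown")


seen = [("curved handle", "side of body"), ("cylinder", "body")]
label = recognize(seen)
```

In this sketch the mug and the bucket share both geons and differ only in where the handle attaches, so the arrangement is what separates them. An exact-match lookup like this one also returns "unknown" for any object built from unfamiliar geons.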
How do we recognize objects with very different geons (a rotary dial phone vs. a cell phone) as the same object (a phone)?
How is it that we can distinguish between objects that share many, if not most or all, of the same geons?
What is the grandmother cell?
Proposes that there could be individual neurons responsible for recognizing specific objects or people
A single neuron that activates only when you see your grandmother
Extending this idea would imply the existence of “table cells,” “phone cells,” etc
What are the limitations of the grandmother cell?
If recognition depended on a single neuron, brain damage could completely erase the ability to recognize certain objects or individuals
People may lose the ability to recognize someone visually but can still identify them by voice or other cues
Fails to explain how we can recognize unfamiliar faces or new objects without pre-existing cells for them
What does object recognition rely on instead of the “grandmother cell hypothesis”?
Distributed neural representation, rather than single specialized cells
Recognition emerges from patterns of activity across many neurons working together
Each neuron contributes to multiple representations, and each object or face is encoded by the collective activity of many cells
Recognition via synchronized firing across multiple regions
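The contrast with the grandmother cell can be made concrete with a toy population code. The activity patterns below are invented for illustration: each object is a pattern across six model neurons, each neuron takes part in more than one pattern, and recognition picks the stored pattern that best matches the current activity.

```python
# Hypothetical activity patterns across six model neurons (1 = firing).
PATTERNS = {
    "grandmother": [1, 0, 1, 1, 0, 1],
    "table":       [0, 1, 1, 0, 1, 0],
    "phone":       [1, 1, 0, 0, 0, 1],
}


def lesion(pattern, dead_index):
    """Silence one neuron, as if it had been damaged."""
    return [0 if i == dead_index else v for i, v in enumerate(pattern)]


def match_score(a, b):
    """Number of neurons whose activity agrees between two patterns."""
    return sum(1 for x, y in zip(a, b) if x == y)


def recognize(activity):
    """Return the stored object whose pattern best matches the activity."""
    return max(PATTERNS, key=lambda name: match_score(activity, PATTERNS[name]))


# Damage each neuron in turn while "viewing" the grandmother.
results = [recognize(lesion(PATTERNS["grandmother"], i)) for i in range(6)]
```

With these patterns, silencing any single neuron still leaves the grandmother pattern as the best match, whereas a strict one-cell-per-object scheme would lose the object entirely if its one cell were damaged.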
What is apperceptive agnosia?
Processing of visual properties such as brightness, colour, and texture remains intact
Patients struggle to accurately match shapes and cannot copy simple line drawings
They cannot name objects based on visual information alone
But they can name things using other senses
Animal noises or naming an object they hold in their hand
Shows they don’t have deficits in semantic or memory processes
What is associative agnosia?
Patients do not have difficulty representing the form of what they see
They can copy objects and their matching ability remains largely intact
They have trouble attaching an appropriate label to what they see
They fail to name objects accurately despite being able to draw them correctly, and they struggle to make drawings from memory alone
What is category specific agnosia?
Specific form of associative agnosia
Patient may fail to recognize living things but have less difficulty recognizing non-living things
What is the reason for category specific agnosia?
Tools and man made objects share fewer things in common than do living things, which commonly have appendages (arms, legs, claws, tentacles), heads and bodies
similar items activate overlapping areas of the brain
man-made items vary more in how they look and how they’re used, so they likely rely on different brain networks - less vulnerable to damage
What is the psychological neighbour theory?
Mike Dixon studied a patient with visual form agnosia arising from herpes encephalitis
The patient had category-specific agnosia for a specific class of musical instruments - strings but not brass
Objects that share some property (e.g., perceptual, action affordances, function, etc.) would be closer together in a psychological neighbourhood
more susceptible to damage
What is prosopagnosia?
Most common category specific visual agnosia
Patient can’t recognize faces using vision alone
Recognize the image as a face but can’t associate it with a specific identity
Can recognize people through other modalities such as their voice
What did Ekman claim about faces?
Ekman showed many years ago, some facial expressions are seemingly universal
There was a notion that there are only 6 classes of emotional expression (sadness, happiness, anger, fear, surprise, and disgust)
What do monkey firing rates say about Ekman’s claim about faces?
Strong firing rates with certain images (faces)
Very little response to a hand or a scrambled image
Some images where the mouth and eyes are obstructed still elicit strong responses
Research has shown that certain areas of the primate brain respond more strongly to faces than to other types of visual stimuli
Who is Giuseppe Arcimboldo?
Made art that incorporated face structure in objects that don’t normally have them
He created an art piece of a bowl of vegetables that looked like a face
What is the face inversion effect?
A visual perception effect: turning a stimulus upside down impairs recognition of faces much more than recognition of other objects
What is holistic processing?
Perceiving the overall arrangement of facial features - how eyes, nose, mouth relate to each other
This is the dominant way humans recognize upright faces
What is featural processing?
Recognizing a face by its individual features independently
Works well for many objects because you can identify them from distinct parts
Faces can be processed this way, but it’s less efficient, especially for subtle differences
How do featural and holistic processing connect to the face inversion effect?
Inverting a face disrupts holistic processing, because the usual spatial relationships are harder to perceive
People then rely more on featural processing, which is slower and less accurate for faces
How does holistic memory affect faces and houses?
Participants are asked to study a range of faces and houses
Later on they are asked to do a recognition memory test - presented with either the whole image or only a part of the image
Faces are poorly recalled when only part of the face is available
Houses show no such deficit
What is the FFA, and is it only for faces?
Fusiform Face Area
Region of the brain thought to be specialized for face recognition
The question is whether it’s truly face-specific, or whether it also responds to objects we have extensive experience with
Are humans “face experts,” or is it about visual memory capacity?
We have exceptional visual memory especially for faces
A study has shown that people can study thousands of images in a single day and later recognize around 90% of them
Questioning if this suggests that face recognition is just a reflection of our broader visual memory abilities?
Does the FFA reflect face specialization or general expertise?
The result suggests that the FFA may not be strictly for faces
It may respond to categories for which we develop expertise
indicating that face selectivity could partly reflect experience rather than innate specialization