What is object recognition?
The perception of familiar items.
Why is object recognition difficult?
Because the environment has overlapping objects, and despite large variations in the retinal image (e.g., size, shape, rotation, occlusion), we still perceive coherent, stable objects.
What are the types of invariance important for object recognition?
Translation invariance: the brain's response doesn't change when the input is shifted in space.
Rotation invariance: the brain's response doesn't change when the input is rotated.
Size invariance: the brain recognises an object regardless of its size.
Colour invariance: the brain recognises an object regardless of changes in its colour.
Partial occlusion: the object is partly hidden behind another object, yet the brain can still recognise the whole object.
Presence of other objects: context or nearby objects can affect how the brain perceives a target object.
What is intraclass variation in object recognition?
Members of the same category can look very different from each other, yet are still recognised as belonging to that category (e.g., wooden, plastic, and bean bag chairs are all recognised as "chairs").
What is viewpoint variation?
Recognising an object even when it looks different from various angles or perspectives.
What is the template theory of 2D pattern matching?
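The brain stores miniature copies (templates) of all known patterns and matches incoming stimuli to them. Matching relies on normalisation: the incoming visual input is first transformed (e.g., in position, size, or orientation) into a standard form so it can be compared with the templates more easily.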
What is a real-life example of template matching?
Barcodes and fingerprints — matched to stored templates in a database.
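A minimal sketch of template matching with a normalisation step, for illustration only. The 3x3 letter templates, the grid encoding, and the function names are hypothetical; normalisation here only corrects for position (cropping to the shape's bounding box), not size or rotation.

```python
# Hypothetical stored templates; '#' marks an "on" cell.
TEMPLATES = {
    "L": ("#..", "#..", "###"),
    "T": ("###", ".#.", ".#."),
}

def normalise(grid):
    """Crop a grid to the bounding box of its '#' cells (removes position shifts)."""
    rows = [r for r, line in enumerate(grid) if "#" in line]
    cols = [c for line in grid for c, ch in enumerate(line) if ch == "#"]
    top, bottom = min(rows), max(rows)
    left, right = min(cols), max(cols)
    return tuple(line[left:right + 1] for line in grid[top:bottom + 1])

def overlap(a, b):
    """Fraction of cells that agree; 0.0 if the two grids differ in size."""
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        return 0.0
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in cells) / len(cells)

def recognise(grid):
    """Return the best-matching template name and its overlap score."""
    shape = normalise(grid)
    scores = ((name, overlap(shape, template)) for name, template in TEMPLATES.items())
    return max(scores, key=lambda pair: pair[1])

# An "L" drawn away from the top-left corner of a larger grid.
shifted_L = (
    ".....",
    "..#..",
    "..#..",
    "..###",
)
print(recognise(shifted_L))  # ('L', 1.0)
```

Because the match depends on cell-by-cell overlap, an input that differs from every stored template in a way normalisation does not undo (here, a rotated or resized letter) scores poorly, which is the inflexibility that basic template theories are criticised for.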
What is a problem with basic template theories?
They cannot account for the flexibility of the pattern recognition system: we can still recognise a stimulus when it has little or no overlap with the stored template.
What is the prototype theory of 2D pattern matching?
Instead of multiple templates, the brain stores an average (prototype) of an object and compares new stimuli to this general representation.
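A minimal sketch of the prototype idea, assuming objects can be described by a few numeric features. The feature choices, the numbers, and the function names are made up for illustration: each category's prototype is the average of its stored examples, and a new item is assigned to the category whose prototype is nearest.

```python
from statistics import mean

# Hypothetical examples: (seat height, back height, leg count) in made-up units.
EXEMPLARS = {
    "chair": [(45, 40, 4), (50, 50, 4), (40, 30, 3)],
    "stool": [(60, 0, 3), (70, 0, 4), (65, 0, 3)],
}

def prototype(examples):
    """Average each feature across the examples of one category."""
    return tuple(mean(values) for values in zip(*examples))

PROTOTYPES = {name: prototype(examples) for name, examples in EXEMPLARS.items()}

def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def categorise(item):
    """Return the category whose prototype is closest to the item."""
    return min(PROTOTYPES, key=lambda name: distance(item, PROTOTYPES[name]))

print(categorise((48, 35, 4)))  # chair
print(categorise((62, 5, 3)))   # stool
```

Note that the chair prototype in this sketch is not identical to any stored chair (its average leg count is not a whole number), yet it is the reference point against which new items are compared.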
What evidence supports prototype theory?
Franks & Bransford (1971) — participants were confident they saw prototypes they had never actually seen.
What is a limitation of prototype theory?
Not every object may have a clear prototype.
What are feature theories of 2D pattern matching?
Recognition happens by detecting key features (edges, angles, lines) rather than whole shapes.
How do feature theories differ from prototype theories?
Feature theories focus on individual features; prototype theories focus on matching the object to a general average representation.
What are structural description theories?
They recognise objects based on individual features and how those features are arranged relative to each other.
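A minimal sketch contrasting a feature-only match with a structural description, assuming letters described by line features. The relation labels and function names are hypothetical. "T" and "L" share the same two features, so only the arrangement tells them apart.

```python
# Hypothetical descriptions: a set of features plus relations between them.
LETTERS = {
    "T": {
        "features": {"vertical", "horizontal"},
        "relations": {("horizontal", "on_top_centre_of", "vertical")},
    },
    "L": {
        "features": {"vertical", "horizontal"},
        "relations": {("horizontal", "at_bottom_end_of", "vertical")},
    },
}

def match_by_features(features):
    """Feature-only matching: ignores how the features are arranged."""
    return sorted(name for name, d in LETTERS.items() if d["features"] == features)

def match_by_structure(features, relations):
    """Structural matching: features and their spatial relations must both agree."""
    return sorted(
        name for name, d in LETTERS.items()
        if d["features"] == features and d["relations"] == relations
    )

seen_features = {"vertical", "horizontal"}
seen_relations = {("horizontal", "on_top_centre_of", "vertical")}

print(match_by_features(seen_features))                   # ['L', 'T']
print(match_by_structure(seen_features, seen_relations))  # ['T']
```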
What is the first step of 3D object recognition according to Marr’s computational model?
Interpreting visual input as separate, cohesive structures distinct from the background.
What are the key questions in object recognition?
What are the primitive elements in the description?
How are these elements related?
How is the description invariant across views?
How does viewpoint dependence affect recognition?
What did Marr & Nishihara (1978) propose about 3D object recognition?
They proposed that objects are represented by a hierarchical organisation of cylinders, each defined by an axis and relationship to other cylinders, resulting in descriptions that are invariant across viewpoints.
What is Recognition-by-Components Theory (Biederman, 1987; 1989)?
A theory suggesting objects are recognised by breaking them down into simple 3D shapes called geons (geometrical ions), making recognition mostly viewpoint invariant.
What are geons and how many are there?
Geons are basic volumetric shapes like blocks, cylinders, arcs, and wedges; there are approximately 36 geons.
What are the key structural relationships between geons?
Relative size = how big one part is compared to another
Verticality = whether parts are stacked vertically or oriented up/down
Centring = whether one part is centred on another
Relative size of surfaces at join = how large the joined surfaces are at the point where two parts meet
What are non-accidental properties used to define geons?
Properties that stay the same no matter which angle you view an object from
Curvature - points on a curve
Parallelism - sets of points (edges) that run in parallel
Co-termination - edges terminating in a common point
Symmetry - versus asymmetry
Co-linearity - points in a straight line
How does edge detection help in Recognition-by-Components theory?
Detects edges and concavities to find parts.
Identifies geons based on invariant clues.
Matches components to stored object representations.
What did Biederman’s 1987 experiment show about object recognition?
Deleting key edges slowed recognition.
Recognition was better when midsegments were deleted at longer exposure.
Findings support the geon theory.
What did Vogels et al. (2001) find about neurons and geons?
Some cortical neurons in monkeys' inferior temporal cortex responded more strongly to geon changes than to changes in object size.
What are the strengths of Recognition-by-Components theory?
Flexible and comprehensive.
Small set of primitive shapes (parsimonious).
Experimental support.
What are criticisms of Recognition-by-Components theory?
Arbitrary number of geons (why 36?).
Doesn’t explain matching process well.
Surface details and context can be more important in some cases.
Simplifies viewpoint dependence.
What is viewpoint dependent theory?
A theory suggesting that changes in viewpoint reduce the speed and/or accuracy of object recognition because objects are stored as specific views.
When is viewpoint dependence more important?
For complex within-category discriminations, like telling apart different types of cars.
When is viewpoint invariance more likely used?
For easy categorical decisions, like recognising a chair.
What did Tarr & Hayward (2018) argue about object representations?
They argued that object representations are neither strictly viewpoint-dependent nor strictly viewpoint-invariant.
What is the binding problem in object recognition?
It’s the challenge of correctly integrating different features (like a handle and a spout) into one coherent object representation.
What happens after a structural description of an object is formed?
It must be matched to stored representations; if there is a match, the object is recognised.
What model explains how object recognition continues after forming a structural description?
The cascade model based on Humphreys et al. (1988).
What is a key idea of the cascade model of object recognition?
Stages are connected — problems at one stage can affect later stages (not fully independent).
What are the four key stages in the cascade model?
Structural descriptions – perceptual analysis of the object's shape.
Semantic representations – conceptual knowledge about the object (e.g., category, function).
Name representations – lexical/word form of the object.
Name – output (saying or thinking the object's name).
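The idea that a problem at one stage carries over to later stages can be sketched as activation passed along the four stages above. This is a toy illustration only, not the model as specified by Humphreys et al.; the `efficiency` values and the function name are assumptions.

```python
STAGES = [
    "structural description",
    "semantic representation",
    "name representation",
    "name output",
]

def cascade(input_strength, efficiency):
    """Pass activation through the stages in order.

    efficiency maps a stage name to a value in [0, 1]; a missing entry means
    the stage is intact (1.0). Each stage passes on whatever activation it has,
    so a deficit early on lowers activation at every later stage.
    """
    activation = input_strength
    trace = {}
    for stage in STAGES:
        activation *= efficiency.get(stage, 1.0)
        trace[stage] = activation
    return trace

intact = cascade(1.0, {})
impaired = cascade(1.0, {"semantic representation": 0.5})

print(intact["name output"])                # 1.0
print(impaired["structural description"])   # 1.0
print(impaired["name output"])              # 0.5
```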
Why is the cascade model considered an oversimplification?
Because later processes can start before earlier ones are fully complete.
What kind of patient evidence supports the cascade model?
Patients with associative agnosia — e.g., Patient JB struggled to name visually similar objects (birds, animals), which also made it harder to categorise them.
What is visual agnosia?
A condition in which basic feature processing and memory remain intact, yet objects cannot be recognised by sight; the recognition deficit is limited to the visual modality.
What abilities are unaffected in visual agnosia?
Alertness, attention, intelligence, and language remain unaffected.
How can individuals with visual agnosia still recognise objects?
Through other sensory modalities like touch or smell, even though vision fails.
What is apperceptive agnosia, and what type of deficit occurs?
A perceptual deficit: the components of a visual image are picked up but cannot be integrated into a whole, so recognition fails because no coherent visual representation is formed.
Affects the structural description stage.
How do perceptual deficits in apperceptive agnosia typically present?
The effects are often graded and may particularly affect recognition from unusual views of objects.
What happens in associative agnosia regarding visual representations?
Visual representations are intact but cannot be accessed or used for recognition; semantic information about what is seen is lacking.
Affects the semantic representation stage.