Object and Scene Perception
The process of interpreting visual input into meaningful information that can be perceived
Involves coding visual features into units and then interpreting those units as objects
Why is it difficult to emulate human abilities in machines?
Machines cannot judge whether every obstacle is dangerous, and they cannot be programmed for every detail.
Ambiguity
Hidden or blurred objects
Ambiguity
An object’s appearance changes because of different viewpoints, occlusion, and noise
Hidden/ Blurred Objects
Objects may be obstructed by other objects in the environment or can be blurred out.
Viewpoint Invariance
The ability to recognize an object regardless of viewpoint
Why Design a Perceiving Machine?
Rescue robots
Driving robots
Surgical purposes
Reduce accidents
Structuralism (Wundt)
Perceptions are created by combining elements called sensations (basic sensations)
Gestalt (Wertheimer)
Perception is a result of perceptual organization, perceiving the whole.
What two pieces of evidence are used to invalidate the structuralist theory of perception?
Apparent movement and illusory contours
Perceptual Organization
The process by which elements in a person’s visual field become perceptually grouped and segregated to create perception
2 Components of Perceptual Organization
Grouping - putting together
Segregating - see two separate things
Apparent Movement
An illusion of movement that occurs when two objects separated in space are presented rapidly, one after another, separated by a brief time interval (viewed as a whole)
Illusory Contours
Contour that is perceived even though it is not present in the physical stimulus (no physical edge)
Neural data show that illusory contours are processed in V2 (secondary visual cortex)
Von der Heydt
Showed that edge detection cells respond to the illusory edges as strongly as real edges
But did not fire if no edge was implied
Good Continuation
Connected points resulting in straight or smooth curves that are seen as belonging together
Principle of Prägnanz
We perceive complex patterns in their simplest possible form, because that is the interpretation that requires the least cognitive effort from us (Prägnanz = “good figure” or “pithiness”)
Similarity
Similar things appear to be grouped together
Proximity
Things that are near to each other are grouped together
Common Fate
Objects moving in the same direction are perceived as a group
Figure Ground Segregation
Determining what part of the environment is the figure so that it “stands out” from the background
Figure: More “thinglike” and memorable than the ground; seen in front of the ground
Ground: More uniform and extends behind the figure; surrounds the figure
Findings from Vecera et al. (2002)
Information within the image determines perceptual grouping
Areas lower in the field of view are more likely to be perceived as a figure
A figural cue for segregation
Findings from Max Wertheimer (1912)
A - seen as a W & M, using our knowledge to perceive
B - seen as a pattern, so principles of continuation override knowledge (connected points)
Built-in perceptual organization can override knowledge
Findings from Gibson & Peterson (1994)
Figure-ground formation can be affected by the meaningfulness of a stimulus
The black part is seen as a figure when upright and appears like a woman, but does not appear that way when it’s viewed upside down
Meaningfulness can influence identifying a figure in an image
Recognition by Components
A theory stating that object recognition occurs by representing each object as a combination of basic units
Geons
The visual system breaks down objects into geometric units or geons
Geons (36 in Biederman's account) - basic units of objects consisting of simple shapes such as cylinders and pyramids
Recognition by Components - Advantages and Disadvantages
Advantage
Accounts for viewpoint invariance - objects are seen as the same regardless of the vantage point
Disadvantages:
Doesn’t account for all types of object recognition, like face recognition, oddly shaped objects (like clouds)
Doesn’t allow for fine-grained discriminations (two types of dogs have the same shape but are different breeds)
Scene Perception
A scene contains background elements and objects organized in meaningful ways with each other and the background
Scene - acted within
Objects - acted upon
Potter (1976) - Method
Showed that people can perceive the gist of an image when it is presented for 250 ms
She first presented either a target photograph or, as shown here, a description, and then rapidly presented 16 pictures for 250 ms each. The observer’s task was to indicate whether the target picture had been presented
Masking Procedure
A stimulus is covered by a random pattern to eliminate the persistence of vision
Persistence of vision: perception of any stimulus persists for about 250 ms after the stimulus is physically terminated
Fei-Fei (2007)
Used masking to show that the overall gist is perceived first, and then followed by details
27 ms - tell between dark and light
67 ms - identify large objects
500 ms - smaller objects and details
Fusiform Face Area (FFA)
A region in the fusiform gyrus of the brain’s inferior temporal lobe that is specialized for processing faces
Nancy Kanwisher et al. (1997) - Functional Localizer
Identifying the region of interest (ROI) with fMRI
“The only region in which most of our subjects (12/15) showed a significantly greater activation for faces than objects was in the right fusiform gyrus.”
Confounds in Kanwisher’s Study
Luminance, Categories, Human Body Parts, and Orientation of the Face
What type of representation does Kanwisher’s study support?
A specialized area dedicated to one function, which supports the hypothesis of modularity
Parahippocampal Place Area (PPA)
Located in the temporal lobe, the PPA processes visual scenes and environments (landscapes, buildings, rooms)
Tong et al. (1998)
In the binocular rivalry procedure, a photo of a person's face is presented to one eye and a photo of another object (such as a house or a bike) to the other eye. Tong et al. used colored glasses so that the face reached only one eye and the object only the other. The results showed that changes in perception and changes in brain activity mirrored each other. The two brain areas highlighted in this research are the fusiform face area (FFA) and the parahippocampal place area (PPA): when the face was perceived the FFA was active, and when the house was perceived the PPA was active. This finding is important because it provides evidence about how the brain represents consciously perceived faces versus objects or places.
Binocular Rivalry
The observer perceives either the left or the right image, but not both at the same time
Expertise Explanation of the FFA
Novice and Expert groups
Brain scans taken while looking at faces and greebles
Greebles - artificial objects designed to be used as stimuli in psychological studies of object and face recognition
Novices show less greeble FFA activity than the experts
Isabel Gauthier showed that we are experts at recognizing faces
Voxel
A little cube of brain area, and each voxel contains a single number representing the signal measured at that location
Haxby et al. (2001) - Within vs Between
Tested whether the pattern of response to one category in the ventral visual pathway could be distinguished from the patterns of response to all of the other categories
Within Category Correlation
Correlation between the response patterns to the same category, measured in two separate halves of the data
Between Category Correlation
Correlation between the response patterns to two different categories
If the within correlations are bigger than the between correlations, then that was used as evidence for distinct patterns of neural responses, which was true for all categories
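The within-versus-between comparison can be sketched in a few lines. This is only an illustration of the logic, not Haxby et al.'s analysis code: the voxel values, the five-voxel pattern size, and the names `half1`, `half2`, and `pearson` are all made up for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of voxel responses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical responses of five voxels in two halves of the data (invented numbers).
half1 = {"faces": [2.0, 1.5, 0.2, 0.1, 0.9], "houses": [0.3, 0.2, 1.8, 2.1, 1.0]}
half2 = {"faces": [1.8, 1.6, 0.4, 0.2, 1.0], "houses": [0.1, 0.4, 2.0, 1.9, 0.8]}

# Within: same category across the two halves. Between: different categories.
within = pearson(half1["faces"], half2["faces"])
between = pearson(half1["faces"], half2["houses"])

# A distinct pattern for faces is inferred when within exceeds between.
faces_distinct = within > between
print(f"within={within:.2f}, between={between:.2f}, distinct={faces_distinct}")
```

Dropping the most responsive voxels from every list and recomputing `within` mirrors the follow-up step in which the most active voxels were removed.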
Haxby et al. (2001) - Correlations
Got rid of the most active voxels
Within correlations were not reduced that much even with the FFA voxels removed
Holds true for the other categories as well
Haxby et al. (2001) - Results
Distributed representation of categories - Haxby et al. (2001) provided evidence of a distributed representation for faces, because even with the FFA removed there was still a large within correlation
Attention
Process of focusing on a specific object while ignoring others
Overt vs Covert Attention
Overt: looking directly at the attended object
Covert: attention without looking
Selective Attention
Focus on a specific object or activity while ignoring distractions (can be overt or covert)
Spatial Attention
Attending to a specific location in space (can be overt or covert)
Dichotic Listening Task - Cherry (1953)
The participant listens to different audio inputs in different ears and shadows back what they are hearing
Shadowing
Repeat what you are hearing
Finding from the Dichotic Listening Task
Also known as the cocktail party effect
The participant can shadow the message in the attended ear. For the unattended ear, they do not remember specific details but do remember physical characteristics of the audio input.
ADHD and Dichotic Listening
ADHD group was impaired at reporting words in the selected ear as compared to controls
Demonstrating a deficit in selective attention
Broadbent Filter Model of Attention
Cherry’s work led to this early attention theory
Explains how it is possible to focus on one message
Posner (1978) - Pre-Cueing Procedure
The pre-cueing procedure starts with participants looking at a fixation point, which prevents overt attention, and then presents an arrow as a pre-cue. Cues are either consistent (valid) or inconsistent (invalid) with the target location. Valid trials are those in which the cue points to the side where the target appears, which produces a faster reaction time. Invalid trials are those in which the cue indicates a location different from where the target actually appears.
Posner (1978) - Findings
Information processing is more effective at the place where attention is directed, because attention enhances information at that location. Invalid cues lead to a slower response because the brain needs to suppress the response prepared for the cued side and initiate the response for the correct side.
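The size of the cueing effect is usually summarized as the difference between mean reaction times on invalid and valid trials. A minimal sketch, assuming made-up reaction times (these are not data from Posner's study) and a hypothetical helper named `cueing_effect`:

```python
from statistics import mean

# Hypothetical reaction times in milliseconds; not data from Posner (1978).
valid_rts = [245, 250, 240, 255]
invalid_rts = [300, 310, 295, 305]

def cueing_effect(valid, invalid):
    """Mean invalid RT minus mean valid RT.

    A positive value means responses were faster after valid cues.
    """
    return mean(invalid) - mean(valid)

effect_ms = cueing_effect(valid_rts, invalid_rts)
print(f"Cueing effect: {effect_ms} ms")
```

A larger positive difference means a larger cost of reorienting attention after an invalid cue.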
What does Posner’s Study Demonstrate about Spatial Attention?
Helps us to understand attentional processes like alerting (readiness to respond) or orienting (selection)
Important for navigating our world
Spatial Attention and Aging
“Slower responses in the invalid relative to the valid condition became greater with the increase in age.”
It takes older adults longer to process the invalid cue and reorient their attention
Binding
Process by which features are combined to create a perception of coherent objects
Binding Problem
Features of objects are processed separately in different areas of the brain
Illusory Conjunctions
Features that should be associated with one object become incorrectly associated with another
Treisman and Schmidt (1982) - Findings
Found the effect of divided attention on feature integration
Divided attention: completing multiple tasks at once, like having to report the numbers and shapes
Incorrect associations occurred 18% of the time
Feature Integration Theory
Explains how an object is broken down into features and how these features are recombined to result in perception
2 Stages of Feature Integration Theory
Pre-attentive stage: features of objects are separated
Focused attention stage: features are bound into a coherent perception
Visual Search
Looking for an object among other objects
Feature Search (parallel) vs Conjunction Search (serial)
Feature Search (parallel): when the target can be found by one feature (green line)
Rapid; pop-out effect; does not require attention; everything is processed simultaneously (parallel)
Conjunction Search (serial): search for two features (horizontal and green line)
Slower; needs attention; search serially
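The parallel/serial distinction makes different predictions about how reaction time changes with the number of items on the display. The sketch below is a toy model, not a fitted one: `base_ms` and `per_item_ms` are arbitrary assumed values, and the serial case assumes a self-terminating search in which the target is equally likely to be at any position.

```python
def expected_checks_serial(set_size):
    """Average number of items inspected before finding the target in a
    serial, self-terminating search: (n + 1) / 2."""
    return (set_size + 1) / 2

def predicted_rt(set_size, search_type, base_ms=400, per_item_ms=50):
    """Toy reaction-time prediction; base_ms and per_item_ms are assumed values."""
    if search_type == "feature":
        # Parallel search: all items are processed at once, so set size does not matter.
        return base_ms
    if search_type == "conjunction":
        # Serial search: each inspected item adds a fixed amount of time.
        return base_ms + per_item_ms * expected_checks_serial(set_size)
    raise ValueError(f"unknown search type: {search_type}")

for n in (4, 8, 16):
    print(n, predicted_rt(n, "feature"), predicted_rt(n, "conjunction"))
```

In this model the feature-search line is flat (pop-out), while the conjunction-search line rises with set size.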
Balint Search
Balint’s Syndrome: parietal lobe damage, which reduces the ability to shift and focus attention
A patient could complete task A, but failed task B
Couldn’t combine features with attention
Attention at a particular location isn’t required for single feature detection
Demonstrates illusory conjunctions when attention cannot be focused on a location
Visual Scanning - Fixations vs Saccades
Each time you paused on one face, you were making a fixation, which allows us to focus
When you move your eyes to another face, you make a saccadic eye movement - a rapid, jerky eye movement from one fixation to the next
What directs our attention?
Stimulus salience: physical properties of an object that direct our attention to that object (e.g., color, loudness, contrast, movement)
Attentional Capture
Process of involuntarily directing our attention to a particular object because of stimulus salience
Saliency Map
Characterizes the saliency of objects or areas within a scene
Parkhurst et al. (2002)
Initial fixations are determined by saliency
Subsequent fixations are determined by top-down factors, such as observer interest
Selection Based on Cognitive Factors
Picture meaning and observer knowledge
Fixations are influenced by this meaning or knowledge
Task demands: can affect the directing of attention
Yarbus (1967): eye movements change depending on the goal
Inattentional Blindness
The failure to see fully visible objects or events in the visual display because one’s attention is focused elsewhere
Example: Gorilla video (46% fail to notice visually salient gorilla)
Change Blindness
The failure to detect obvious changes in a scene when vision has been disrupted
Flicker test with masking
J.J. Gibson
Questioned the traditional lab approach
Felt it was too artificial (observers weren’t allowed to move their heads)
Unable to provide an explanation for many things (e.g., how pilots use their environment to land planes)
Suggested we needed to investigate senses together, not individually (balanced)
Came up with the ecological approach
Ecological Approach
Emphasizing the study of moving observers to determine how their movement results in perceptual information that both creates perception and guides further movement
Ecological Validity
When an experiment’s stimuli, conditions, and procedures match the natural world
Optic Flow
The apparent motion of objects as the observer moves past them (perceptual cue processed for self-motion control)
Gradient of Flow
Provides information about how fast we are moving
Faster near observer and slower further away (provides clues about our speed)
Change in this relative difference provides information on speed (smaller difference between far away and near equals a slow speed)
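The gradient can be illustrated with a standard geometric approximation: for an observer moving in a straight line, a stationary point sweeps across the visual field at an angular speed of speed × sin(angle) / distance, where the angle is measured between the direction of travel and the line of sight. The function name `angular_flow` and the numbers below are assumptions chosen for illustration.

```python
import math

def angular_flow(speed, distance, angle_deg):
    """Angular speed (radians per second) of a stationary point seen by an
    observer moving in a straight line at `speed` (m/s).

    `distance` is the distance to the point in meters; `angle_deg` is the
    angle between the direction of travel and the line of sight.
    """
    return speed * math.sin(math.radians(angle_deg)) / distance

# Points directly to the side (90 degrees), one near and one far.
for speed in (5, 10):
    near = angular_flow(speed, distance=2, angle_deg=90)
    far = angular_flow(speed, distance=20, angle_deg=90)
    print(f"speed={speed}: near={near:.2f}, far={far:.2f}, difference={near - far:.2f}")

# A point straight ahead along the direction of travel has no flow.
print(angular_flow(10, distance=5, angle_deg=0))
```

Near points flow faster than far points at any speed, and the near-far difference shrinks as the observer slows down.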
Focus of Expansion
No flow at the destination point
This is a type of invariant information: remains constant regardless of whether the person changes their final destination point. It will always be perceived as a non-moving point
Self-Produced Information
When a person makes a movement, that movement provides information, which guides further movement (reciprocal relationship)
Bardy and Laurent (1998)
Study on novice and professional gymnasts doing somersaults with their eyes closed
The vestibular system is the part of the inner ear that controls balance
Maintaining balance is difficult with your eyes closed, because vision works with the vestibular system, which provides information to help your muscles make adjustments
Lee and Aronson (1974)
Swinging room experiment
Demonstrated how visual information influences balance by overriding the vestibular system because no actual movement was occurring
Walking - Visual Direction Strategy
Observers keep their bodies pointed toward a target (correcting themselves when they drift left or right)
Blind Walking
Shows that people can navigate without any visual stimulation from the environment
Philbeck (1997): “Blind walking” procedure
People observe a target object located up to 12 meters away, then walk to the target with their eyes closed
Accomplished by combining memory of position with knowledge of movements
Observers use spatial updating, which is keeping track of their position as they move
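Spatial updating can be pictured as dead reckoning: start from the remembered target position and update an estimate of your own position after every step. This is a simplified sketch, not a model from Philbeck's study; the step length, heading, and function names are assumptions.

```python
import math

def update_position(x, y, heading_deg, step_m):
    """Return the new (x, y) after one step of length step_m along heading_deg."""
    rad = math.radians(heading_deg)
    return x + step_m * math.cos(rad), y + step_m * math.sin(rad)

def remaining_distance(position, target):
    """Straight-line distance from the current position estimate to the target."""
    return math.hypot(target[0] - position[0], target[1] - position[1])

target = (12.0, 0.0)    # remembered target location, 12 meters straight ahead
position = (0.0, 0.0)   # starting point

# Walk ten 1-meter steps straight ahead with eyes closed, updating after each step.
for _ in range(10):
    position = update_position(*position, heading_deg=0.0, step_m=1.0)

left = remaining_distance(position, target)
print(f"Estimated distance left: {left:.1f} m")
```

The walker never sees the target during the walk; the estimate comes only from the remembered position plus knowledge of the movements made.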
Way Finding
Navigating through an environment (like driving from Austin to LA without a GPS)
Landmarks
Objects on the route that serve as cues to indicate where to turn
Hamid et al. (2010) - Findings
Removing landmarks hinders wayfinding
Non-decision vs decision landmarks
When the landmarks that were looked at most were removed, navigation performance dropped
Janzen & van Turennout (2004) - Findings
Greatest brain activation for objects at decision points (landmarks) was in the parahippocampal gyrus (spatial memory and navigation) at test
Brain encodes landmarks at decision points automatically, even if they aren’t remembered
Brain GPS
Tolman (1930s and 1940s)
Rats created a cognitive map when exploring a maze
Cognitive map: a mental map of the spatial layout of an environment
They did not just learn turn right for food
Place Cells
O’Keefe (1970s): discovered place cells in the hippocampus
Place cell: fires when an animal is in a certain place in the environment
Only in the physical place, not just looking at it
Grid Cells
Moser & Moser (2000s): discovered grid cells in the entorhinal cortex
Grid cell: fires when an animal is in a certain place in the environment and has multiple place fields arranged in regular, grid-like patterns (hexagon) - creates a cognitive map
Fires at regular intervals as an animal navigates an open area
Allowing the animal to understand its position in space by storing and integrating information about location, distance, and direction
Forms a unique pattern of coordinates, which is shifted with respect to the coordinates formed by other nearby grid cells
The whole environment is “filled” with grid patterns
Each grid cell is active in multiple locations
Parietal Reach Region (PRR)
In the parietal cortex, it is involved in reaching for objects (contains neurons for control of reaching and grasping)
Mirror Neurons
A class of neurons that modulate their activity both when an individual executes a specific motor action and when they observe the same or similar action performed by another individual
Responds when an animal grasps an object and when viewing someone grasp an object
PRR responds to the observed action, “mirrors” the response of actually grasping
Has a diminished response if grasped by a tool, indicating it is specialized to the type of motion, not the pattern
Audiovisual Mirror Neurons
Respond to action and the accompanying sound (in premotor cortex)
Kohler et al. (2002):
Hearing or watching a peanut being broken caused brain activity that is associated with the action
Audiovisual neurons respond to “what is happening”, not just the pattern of movement
Mirror Neurons and Intentions
Iacoboni et al. (2005)
Hypothesis: If mirror neurons are only influenced by actions, then the control action and intention action should have the same response in the brain
Found more activation in the mirror neuron networks when there was intention
Mirror neurons can be influenced by different intentions
Mirror neurons encode the “why”
Possible Functions of Mirror Neurons
To help understand another animal’s actions (intentions) and react to them appropriately
To help imitate the observed action
May help link sensory perceptions with motor actions
Implicated in communication, empathy, etc.