Navigating a 3D World
We perceive some objects as farther away than others, but everything projects a 2D image onto the retina → depth cannot be judged simply from which retinal points are stimulated
Navigating a 3D world is hard: inverted, distorted image stimulates retina
-Eyes have different views of the world
-Placement of eyes depends on being hunter or prey
-Same object projects onto different locations of retinae (binocular disparity)
Monocular Depth Cues
Single eye depth cues used to infer 3D from 2D retinal image
Pictorial depth cues:
-Occlusion: only wrong given accidental viewpoints, nonmetrical (doesn’t tell whether object is small triangle or distant mountain)
-Relative size: image on retina gets smaller when further away
-Texture gradient: changes in size smoothly across image suggest a plane tilted in depth
-Relative height: for objects on ground plane, more distant objects project higher in visual field
-Familiar size: knowing an object’s typical size (e.g., a hand) helps judge whether it is close or far
Non Metrical Information
Relative size and relative height cues together give some metrical info (G farther from B than B from R)
Visual system knows about atmospheric scattering, haze aka aerial perspective: short wavelengths scatter more
Can also use linear perspective: renaissance paintings
We can compensate for distortion pretty well based on context
Triangulation Cues to 3D Space
View image from different vantage points: triangulation cues, can be monocular or binocular
Motion parallax: closer something is to you the faster it moves by, projects larger on the retina, depth cue based on head mvmt
Optic flow: objects that get bigger are approaching
Focus Cues Aiding in Depth Perception
To focus at different distances: accommodation, but also convergence and divergence
Disparity & Horopter
Has to be converted into stereopsis (seeing in depth)
Bob looks at (foveates) the red crayon; the blue and red crayons then fall on corresponding points on the two retinas (as do all stimuli lying on the Vieth–Müller circle)
Objects with corresponding retinal points have zero binocular disparity: when both eyes look at the same spot, a surface of zero disparity runs through that spot → the horopter (where you see no disparity)
Objects on or near the horopter (within Panum’s fusional area) are fused and perceived as single
Outside Panum’s area, nonzero disparity → double vision
Distance from Horopter
The larger the retinal disparity, the farther the object is from the horopter
Whether an object is in front of or behind the horopter is given by the sign of the disparity: crossed (in front), where the left eye sees the object to the right of fixation and the right eye sees it to the left; uncrossed (behind), where the left eye sees it to the left of fixation and the right eye sees it to the right
Can disparity cause stereopsis?
To demo, disparity info needs to be isolated from other depth cues
Random dot stereograms: section of dots from L display is shifted one unit to the right
If you present each image to one eye only you can see a floating square, need stereoscope
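A minimal sketch (assuming numpy; the image size, square size, and 2-dot shift are arbitrary) of how such a random dot stereogram could be generated:

```python
import numpy as np

rng = np.random.default_rng(0)

# Base random-dot image: 0/1 dots (illustrative size)
size, square, shift = 100, 30, 2
left = rng.integers(0, 2, (size, size))

# Right-eye image: copy the left image, then shift a central square
# of dots horizontally by a few dots (the binocular disparity)
right = left.copy()
r0 = c0 = (size - square) // 2
patch = left[r0:r0 + square, c0:c0 + square]
right[r0:r0 + square, c0 + shift:c0 + square + shift] = patch

# The strip uncovered by the shift is refilled with fresh random dots,
# so neither image alone contains any visible square
right[r0:r0 + square, c0:c0 + shift] = rng.integers(0, 2, (square, shift))

# Viewed monocularly each image looks like noise; viewed through a stereoscope
# (one image per eye) the shifted square appears to float in depth
```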
Correspondence Problem
Which part of image in L eye goes with image in R eye?
Match up foveal projections first, then match projections to their left and right
Harder for complicated stimuli: we may use low spatial freq info as a starting point, then the visual system adds in higher frequency detail
2 heuristics simplify this problem:
-uniqueness constraint: feature in world is represented once in each retinal image (ex match nose in L & R retinal images)
-continuity constraint: except for edges, neighbouring points in world lie at similar distances from viewer
Physiological Basis of Stereopsis
Requirement: neuron that receives input from both eyes → doesn’t exist until V1 (binocular neurons)
Neurons with receptive fields for each eye, share nearly identical orientation, spatial-freq tuning, preferred speed, and direction of mvmt → well suited to matching images in both eyes
Many binocular neurons respond best to images occupying same location on retina (horopter)
Physiological Basis of Stereopsis 2
Other binocular neurons fire most strongly to objects occupying slightly different locations on the retina → neural basis for disparity
Red neuron prefers stimulus to right of fixation point, blue neuron prefers stim to left / behind fixation
V2 and higher cortical areas have neurons that respond to near zero, crossed, and uncrossed disparity
Nonmetrical stereopsis: feature is in front / behind fixation
But a more precise metrical stereopsis also used by the where pathway → accurate localization
Depth Cues Studied
In monkeys, the middle temporal (MT) area signals the sign of depth (near / far) based solely on motion parallax
V2 may compute depth order (what’s in front of what)
V4 encodes depth intervals based on disparities; IT cortex represents complex 3D shapes
Combining Depth Cues
As with object recog, depth percep is combining and weighing guesses
Helmholtz: unconscious inference
Use Bayesian approach: prior knowledge influences estimates of probability of current situation
ex pennies → infinitely many size/distance combinations could produce the same image, but we know all pennies are the same size, and option C works only from an accidental viewpoint, so one interpretation is far more probable
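A minimal sketch of the Bayesian idea with made-up numbers: every hypothesis explains the retinal image equally well, so the prior (pennies are all the same size, accidental viewpoints are rare) decides:

```python
# Hypotheses that all produce the same retinal image size (made-up numbers):
# a normal penny at 1 m, a giant penny at 2 m, a tiny penny at 0.5 m
hypotheses = {
    "normal penny, 1 m": {"prior": 0.98, "likelihood": 1.0},
    "giant penny, 2 m":  {"prior": 0.01, "likelihood": 1.0},
    "tiny penny, 0.5 m": {"prior": 0.01, "likelihood": 1.0},
}

# Bayes: posterior is proportional to prior * likelihood; the likelihoods are
# equal here (each hypothesis explains the image), so the prior decides
unnorm = {h: v["prior"] * v["likelihood"] for h, v in hypotheses.items()}
total = sum(unnorm.values())
posterior = {h: p / total for h, p in unnorm.items()}
print(posterior)   # the "normal penny" interpretation dominates
```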
Size Constancy
Palm trees farther away take up less of field of view, but aren’t perceived as mini
Size constancy: perception of object’s size relatively constant when viewed from different distances, aids in perceiving depth (smaller trees = farther)
Size constancy and depth perception are related
Size Distance Scaling
Size constancy is based on this
S = K(R x D)
S: object’s perceived size, K: constant, R: size of retinal image, D: perceived distance of object
As a person walks away, the retinal image gets smaller but perceived distance gets larger; the two balance out, so the person’s perceived size stays constant → crucial for everyday perception
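A worked example of S = K(R x D) with arbitrary numbers, showing why perceived size stays constant as someone walks away:

```python
# Size-distance scaling: S = K * (R * D)
# (numbers are arbitrary; only the ratios matter)
K = 1.0

def perceived_size(retinal_image_size, perceived_distance):
    return K * retinal_image_size * perceived_distance

# Same person at 2 m vs 4 m: retinal image halves, perceived distance
# doubles, so perceived size S stays the same
print(perceived_size(retinal_image_size=10, perceived_distance=2))  # 20.0
print(perceived_size(retinal_image_size=5,  perceived_distance=4))  # 20.0
```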
Binocular Rivalry and Suppression
Objects in real world often project onto different areas of retina (noncorresponding retinal points)
Visual system physiologically prepared to deal with those discrepancies via disparity tuned neurons in V1 and above
But different images presented to both eyes: binocular rivalry
In this case the visual system suppresses one image, typically you see the most interesting one with the most salient features → high > low contrast, bright > dim, moving > stationary
Binocular Vision Gone Wrong
Hubel & Wiesel: visual cortex has critical period → normal binocular visual stimulation is required for normal cortical development
Strabismus: misalignment of 2 eyes
→ Esotropia: one eye points inwards
→ Exotropia: one eye points outwards
Result: a single object is foveated in one eye but not the other
Those who suffer from strabismus in first 18 mo of life don’t show interocular transfer (~18 mo critical period in humans) → early onset strabismus reduces # of binocular neurons in visual cortex
Tmt: surgery
Depth Perception Across Species
Praying mantis: large compound eyes made up of ~10,000 ommatidia each
Tiny anaglyphic glasses w/ blue & green lens, glued to forehead with beeswax and resin, place it in hunting position in front of screen
2D movies of bugs played: did nothing
3D bug movies w/ bug positions varying in distance through binocular disparity: mantises struck to grab dinner when bugs were at an appropriate distance
Conclusion: mantises have stereoscopic vision and respond to depth created by disparity
Attention
Cannot process all incoming visual info
Attention: large set of selective processes in the brain
Varieties: internal vs external, overt vs covert, focused vs divided, sustained vs selective
Selection in Space
What does it mean to attend to a stimulus?
Posner’s spatial cuing task: dep var is RT, coloured box is an exogenous cue (automatically draws attention due to salience)
Endogenous cue (instructions you choose to obey, red = right, green = left)
Both cue types can be valid and invalid
How long does it take for cue to redirect attention?
Vary stimulus onset asynchrony (SOA)
SOA = 0: cue/probe are simultaneous, no time for cue to direct attention
SOA = 150 ms cuing effect from valid exogenous (peripheral) cue increases
Endogenous cue takes longer: require processing to interpret them
Inhibition of return: if you’ve already looked somewhere and moved eyes it is harder to return attention to that location
How does attention move from one spot to another?
Spotlight of attention & alt theories like zoom lens
Cuing experiments: artificial
→ visual search: find target among distractors
Feature (pops out) vs conjunction searches
Can vary in difficulty (feature are easier) and in set size (bigger set = harder): consider RTs
Feature Search
Unique colour or orientation, provided sufficient saliency of target it will pop out
Can process colour/ orientation of all items at once
Parallel search: RT doesn’t scale with set size, several target attributes support parallel search → colour, shape, lighting
Inefficient Searching
Targets/distractors share features, search is inefficient
Requires serial self terminating search: serially search all features until target is found
→ sometimes you find it immediately, sometimes it’s the last item scanned; on average you scan 50% of items before finding the target, so RT grows with set size and varies from trial to trial
In real life a feature search is rare
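A toy sketch of the RT predictions, with made-up timing constants: parallel (feature) search is flat across set size, while serial self-terminating search scans half the items on average:

```python
# Toy reaction-time model (all constants are made up for illustration)
BASE_RT = 400      # ms of non-search processing
PER_ITEM = 50      # ms to scan one item in a serial search

def feature_search_rt(set_size):
    # Parallel: target pops out, RT does not scale with set size
    return BASE_RT

def serial_self_terminating_rt(set_size):
    # On average the target is found after scanning half the items
    return BASE_RT + PER_ITEM * set_size / 2

for n in (4, 8, 16):
    print(n, feature_search_rt(n), serial_self_terminating_rt(n))
# 4 400 500.0 / 8 400 600.0 / 16 400 800.0 -> only the serial search has a slope
```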
Real World Searches
Guided search: attention restricted to subset of items based on info on target’s basic features ex find tomatoes (conjunction of features, red, round, large)
Efficiency: feature > conjunction > serial
Search for arbitrary items (faucet) isn’t efficient
Real world: scene based guidance, use understanding of scene to find faucet
The Binding Problem
Selective attention allows us to solve it (bind features into objects)
Task: report #s, then objects at each of the 4 spots
PP’s ability to focus on shapes was reduced (divided attention)
~20% trials PPs report illusory conjunctions, put together shapes and colours wrong → feature integration theory
Preattentive stage: basic features (colour, orientation) available, but don’t know how they’re bound together until we pay attention to them
Found illusory conjunctions with stimuli until PPs told they were watching carrot, lake, etc (assigned meaning)
Attentional Blink
Attention in time
T2 follows T1 within 200-500 ms and T1 is correctly reported, T2 is typically missed
Somehow, once attention captured by T1, there is a temporary inhibition
→ the locus coeruleus cannot fire again yet
Physiological Basis of Attention
(V1) When attending to part of the visual field, neurons representing that part of the field become more active
Further into cortex (extrastriate regions): more pronounced effects
Earlier effects may actually be due to feedback from later stages in the process
Can also focus attention on basis of stimulus properties as opposed to location: act of selection changes what you perceive
Attentional selection can recruit specialized brain areas ex FFA PPA
→ faces on houses
Attention and Single Cells - 3 Ways
Attention could change the activity of a neuron in 3 different ways:
1. Enhancement: cell becomes more responsive
2. Sharper tuning: cell detects signal more easily
3. Altered tuning: neuron radically changes its preference, e.g. changes the size and shape of its receptive field when attending to one or another location
Attention & Single Cells
Cells restrict their processing to object attention is focused on → resources drawn from neighbouring cells, inhibition surrounds object of attention
3 hypotheses not mutually exclusive: enhancement of one neuron may be signal for other neuron to increase tuning
Attention may serve to increase synchronicity between brain areas, or it may decrease it within brain areas
Desynchronized neurons represent facets of a stimulus: face → angry → forgot bday
Disorders of Visual Attention
Common: brain damage leading to a visual field defect, e.g. damage to right V1 → blind in the left visual field
Damage to the (typically right) parietal lobe (dorsal/where pathway): problems attending to objects in the left visual field
→ neglect or extinction
Neglect
Behave as if part of the world contralateral to lesion doesn’t exist
Lab test: line cancellation or copying pictures
Neglect may be related to stimulus, not to the world → rotated barbell
Pts could pay attention to contralesional field
Extinction
Related to/milder neglect
Neurologist shows pt fork in ipsilesional field: can describe it; spoon in contralesional field: also described
Fork and spoon held up together: spoon ignored
Neglect patients can’t focus attention on contralesional field at all, extinction pts can, provided the object is salient and there’s no competition
Perceiving & Understanding Scenes
Visual experience: 2 Pathways
Early visual stages shared by both paths, spotlight of attention associated with selective pathway
Specific locations/objects selected for processing that allows for binding and object recognition: selective path
Visual world outside spotlight of attention is processed by nonselective pathway: allow you to know overall ‘picture’ before specifics
Ensemble statistics: knowledge of properties of group → features not yet bound, requires scrutiny to figure out if strongest tilted fish are also smallest
Such conjunctions require selective pathway
Perceiving & Understanding Scenes 2
Don’t need much time to process scenes: PPs can move eyes to picture containing animal in just 120 ms
PPs can distinguish natural/urban scenes in 19 ms
Spatial layout (description of structure of scene, ex navigable/non navigable can be extracted in 20-50ms)
Too short to string together objects in picture, so what then? → several dimensions can be used to quickly break down a scene ex openness, roughness, naturalness
Measure local spatial frequencies: a pattern emerges (low openness, low expansion in buildings & high openness/expansion in highways)
Scenes with the same gist cluster together when described this way (i.e., by their sine-wave / spatial-frequency content)
Memory for Objects and Scenes
Can be good, observers presented with 612 pictures correctly id’d 98% as old/new, 90% correct after a week
Or bad: study picture, disappears for 80ms, then similar picture presented
→ pictures kept flipping back and forth, blank screen between stimuli, until PP found the differences
Results: PPs required several seconds to find changes: some weren’t able to find all changes in given time
Memory for Objects and Scenes - Change Blindness
Without a blank screen, differences are easier to spot
If changes are made while PP’s eye moves, fail to notice change (visual input is suppressed during saccades)
Controversy: how can we remember 1000s of objects in seconds but not notice large changes in pictures right in front of us?
→ maybe because changes that preserve the gist of the scene (e.g., the identity of the person giving the instructions) don’t matter to us
Gist must be more than just verbal descriptions, can’t be put into words
What do we actually see?
Ensemble statistics
When rough representation gets combined with stream of objects recognized by selective pathway: inference of coherent world
But we don’t perceive objects floating in sea of ensemble statistics, remember: only fovea offers acuity, both retinae have slightly different copies of a scene
→ our conscious experience of the world is a massive inference
Our world is much more stable than a lab world: put your coffee down and it usually remains where it is
Feels counterintuitive → change blindness
Can monitor, ~20 objects/sec
Lab paradigms underlie real phenomena → eyewitness testimony, radiologists
World in Motion
World is in motion constantly - can be stationary or moving and see motion or moving and perceive something as stationary
Many V1 cells respond to motion in one direction
→waterfall illusion: motion aftereffect (MAE) - observe motion in one direction, then perceive stationary objects to move in opposite direction
A motion based opponent process
Motion Aftereffects
Neurons tuned to different directions don’t typically respond to stationary objects, so they continue to fire at a spontaneous rate
→ spont rates are balanced for upward and downward movement, signals cancel out and no motion gets perceived
Staring at a waterfall adapts downward neurons, looking at a stationary object upward neurons fire faster than downward ones → you perceive upwards MAE
MAEs and Interocular Transfer
MAEs show interocular transfer, so the adaptation is NOT in monocular stages like the LGN
Conclusion: MAE must reflect activity in part of visual system where information from both eyes is combined
Middle temporal area (MT = V5)
What would make for effective motion detector?
2 adjacent receptors, A/B: moving object passes through their receptive fields, 3rd cell (M) listening in could detect motion
But M can’t just add up excitation from A/B: 2 bugs would set off M and cause motion perception
Solution = Cell D: receives input from A, briefly delays that input, adapts quickly (fires when light hits A, stops firing when light continues to stimulate A)
B & D connected to X, multiplication neuron fires only when B AND D are active
So D delays A’s response, X multiplies delayed response by B
Reichardt Detectors
For each B/D pair, one X neuron: Reichardt detectors
Within a Reichardt circuit, the mechanism is motion sensitive (detects L-R but not R-L) and speed sensitive
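A minimal discrete-time sketch (not from the lecture) of the delay-and-multiply circuit described above; the stimulus patterns and one-time-step delay are illustrative:

```python
# Toy Reichardt detector: receptor A feeds a delay cell D; cell X
# multiplies D's (delayed) signal by receptor B's current signal.
# X fires only for motion from A toward B at the matching speed.

def reichardt_response(a_signal, b_signal, delay=1):
    responses = []
    for t in range(len(a_signal)):
        d = a_signal[t - delay] if t >= delay else 0   # delayed copy of A
        responses.append(d * b_signal[t])              # multiplication cell X
    return responses

# A bright spot moving left-to-right: hits A at t=1, B at t=2
a = [0, 1, 0, 0]
b = [0, 0, 1, 0]
print(reichardt_response(a, b))   # [0, 0, 1, 0] -> motion detected
print(reichardt_response(b, a))   # [0, 0, 0, 0] -> right-to-left ignored
```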
Issue: correspondence problem ex with movies: how does visual system know which features in frame 1 match up to features in frame 2?
Ex of correspondence problem: aperture problem = individual neurons: each V1 cell sees the world through small aperture of receptive field
Computation of Visual Motion - correspondence problem solution
Take ends of objects into account
Striate cortex has neurons that respond to the ends of objects, so they can signal the true (e.g., sideways or upward) motion of those ends
Combine responses from multiple neurons
Only one direction is consistent with motion being perceived, V1 cell has access to all this info & can find common denominator
Where would global motion detectors be in the brain?
Lesions to magnocellular layers of LGN impair perception of large, rapidly moving objects
Info from M neurons feeds into V1, then into middle temporal area of cortex and medial superior temporal area (MST)
MT (hMT+/V5 in humans)/MST are motion hubs
→ MT cells are selective to motion in a specific direction (plus a bit of colour/form): are they the integrating cell? → correlated dot displays
Correlated Dot Motion Displays
As in aperture problem, single dot isn’t enough to determine overall direction of correlated motion
So neuron must integrate info from many local motion detectors
Trained monkeys could recognize correlated motion with just 2-3% of dots moving in the same direction
After MT lesion, monkeys needed 10x that coherence, but could id orientation of stationary objects
Electrically stimulate MT areas that respond to motion in a certain direction: then dots are perceived to move in that direction, even if they moved in the opposite direction
1st & 2nd Order Motion
1st: change in position of luminance defined objects
2nd: change in position of texture defined objects
2nd Order Motion
Apparent motion based on texture or contrast, visual system includes mechanisms for this order of motion
Pts with brain damage who have 1st order but not 2nd order motion detection, or vice versa → double dissociation
2nd order MAEs transfer more strongly b/ eyes than 1st order MAEs
Evolution of 2nd order mech: see through camouflage
Motion Induced Blindness
When carefully fixating on target, stationary targets in periphery disappear: Troxler effect
→ maybe because retinal image gets stabilized so that invol eye mvmts occurring during fixation don’t move target onto new receptive fields: target doesn’t change, neurons become adapted
Using Motion Information
How can we use motion info to interpret the world?
Optic array: collection of light rays that interact with objects in front of viewer
As we move, patterns of optic flow → changing angular positions of points in a perspective image as we move
Focus of expansion (FOE) : one stationary point used to determine heading
Time to Collision Rate
TTC = distance / rate
People are bad at judging distance but accurate at estimating TTC
Lee → people also use tau: object getting closer = larger retinal image; tau is ratio of retinal image size to rate at which image expands, and TTC is proportional to tau
Specific neurons in pigeons and locusts respond to objects on a collision course
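A worked example, with made-up numbers, of the two routes to time-to-collision described above: distance/rate, and tau computed purely from the expanding retinal image:

```python
# Time to collision two ways (all numbers are made up for illustration)

# 1) Physical: TTC = distance / approach rate
distance = 30.0      # m
speed = 10.0         # m/s
print(distance / speed)          # 3.0 s

# 2) Optical (tau): ratio of retinal image size to its rate of expansion.
# An object of size w at distance d subtends roughly theta = w / d, so theta
# grows as d shrinks; tau = theta / (d theta / dt), with no need to know d or speed.
w = 0.5
d1, d2, dt = 30.0, 29.9, 0.01    # distances 10 ms apart at 10 m/s
theta1, theta2 = w / d1, w / d2
tau = theta1 / ((theta2 - theta1) / dt)
print(round(tau, 2))             # about 3.0 s, from the image alone
```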
Using Motion to ID objects
Biological motion: something special about the pattern of movement of humans and animals, helps us to id a moving object and its actions
Studied with point light walkers, PPs can say whether male or female walker → possibly body sway or shoulder width
Eye Movements
Eyes move constantly, then image on retina moves too, so how does the brain know which mvmts are the object and which are eye/head mvmts?
6 extraocular muscles control eye mvmt
Stimulating superior colliculus in monkeys: generates eye mvmt in specific direction, diff cells = diff mvmts
Continuous microsaccades, if they are absent the periphery fades and disappears
Physiology of Eye Mvmts - Voluntary
3 types of vol eye mvmts
1. Smooth pursuit: smoothly following object
2. Vergence: eyes moving in opposite directions
→ convergence (inward)
→ divergence (outward)
3. Saccade: fast jump, shifts gaze from one spot to another, ~3-4/sec (≈172,800/day, i.e. about 3/sec over ~16 waking hours)
→ made voluntarily to interesting features
Physiology of Eye Mvmts - Reflexive
Vestibular eye mvmts
Optokinetic nystagmus: reflexive eye mvmt where eyes invol track continually moving object smoothly, then snap back
No info processing during saccades
Saccadic Suppression and the Comparator
How does your brain know whether your eyes move or an object moves across the retina?
→saccadic suppression: when we make a saccade, the visual system suppresses input from magnocellular pathways
Useful: eliminates moving world looking like smear, but no saccadic suppression during smooth pursuit
So how can object moving across retina seem stationary? Motor system sends out 2 copies of each order to move the eyes
-one to extraocular muscles
-one to comparator in visual system (corollary discharge signal, aka efference copy)
Comparator can compensate for image changes caused by eye mvmt
The Comparator
By compensating for image changes caused by eye mvmt, the comparator inhibits attempts by other parts of the visual system to interpret those changes as object motion
3 signals
1. Image displacement signal (IDS): when image moves across retinal receptors
2. Motor Signal (MS): signal sent from brain to eye muscles
3. Corollary Discharge Signal (CDS): copy of motor signal but sent to comparator in brain
3 Situations
1. Maria eyes stationary, J walks → IDS
2. Maria eyes move, J walks → MS & CDS
3. Maria scans room → IDS & CDS + MS
When either the IDS or the CDS alone reaches the comparator, the brain perceives movement
When both IDS and CDS arrive at the comparator at the same time, they cancel and no motion is perceived
Simulate IDS: gently jiggle eye → moving world (no MS or CDS)
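A tiny sketch of the comparator logic summarized above, with the three situations as test cases (naming and structure are illustrative):

```python
# Corollary discharge comparator as a truth table:
# motion is perceived when exactly one of IDS / CDS reaches the comparator
def motion_perceived(ids, cds):
    return ids != cds   # one signal alone -> motion; both (or neither) -> none

print(motion_perceived(ids=True,  cds=False))  # eyes still, John walks -> motion
print(motion_perceived(ids=False, cds=True))   # eyes track John -> motion still perceived
print(motion_perceived(ids=True,  cds=True))   # scanning a stationary room -> cancellation
# Jiggling the eye with a finger gives IDS without CDS -> the world appears to move
```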
Physiological Evidence for Corollary Discharge Theory
Real motion neurons that only fire to actual mvmt (IDS) not to eye mvmt (CDS)
Where is comparator? → maybe superior colliculus, active field of research
Attention probably plays an important role
What is sound?
Sound is created when objects vibrate
Speed of sound ~340 m/s in air, ~1500 m/s in water, whereas light is ~300,000,000 m/s
Qualities of sound waves:
→ freq = pitch (Hz) high or low
→ amplitude = loudness (dB)
→purity = timbre
Healthy young humans hear ~20-20,000 Hz, can hear range of amplitudes: intensity ratio between faintest/loudest sound exceeds 1:1 million
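A quick unit check: if the 1:1,000,000 faintest-to-loudest figure is read as an amplitude (pressure) ratio (an assumption here), it corresponds to 120 dB; read as an intensity (power) ratio it would be 60 dB:

```python
import math

# Decibels from a sound-pressure (amplitude) ratio: dB = 20 * log10(p / p_ref)
def pressure_ratio_to_db(ratio):
    return 20 * math.log10(ratio)

print(pressure_ratio_to_db(1_000_000))   # 120.0 dB for a 1:1,000,000 pressure ratio

# For comparison, an intensity (power) ratio uses 10 * log10:
print(10 * math.log10(1_000_000))        # 60.0 dB
```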
Sine Waves
Sine wave is called a pure tone, in real world sounds are combos of sound waves
Complex sounds are described by spectra: a spectrum displays how much energy (amplitude) is present at each frequency
Most sounds have harmonic spectra: each frequency component is called a harmonic
Harmonics
1st harmonic: fundamental frequency
Subsequent harmonics are integer multiples of the fundamental freq
3 instruments can produce a middle C note, but each instrument’s sound produces a unique spectral shape, creating that instrument’s timbre
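A minimal sketch of a harmonic spectrum in code: harmonics at integer multiples of the fundamental, with per-harmonic amplitudes (made up here) standing in for the spectral shape that gives each instrument its timbre:

```python
import numpy as np

def harmonic_tone(fundamental, amplitudes, duration=1.0, sr=44100):
    """Sum of sine waves at integer multiples of the fundamental.
    The amplitude pattern (one value per harmonic) is what differs between
    instruments playing the same note (their timbre)."""
    t = np.linspace(0, duration, int(sr * duration), endpoint=False)
    tone = np.zeros_like(t)
    for n, amp in enumerate(amplitudes, start=1):   # n-th harmonic = n * f0
        tone += amp * np.sin(2 * np.pi * n * fundamental * t)
    return tone

# Same middle C (~261.6 Hz), two made-up amplitude profiles -> different timbres
instrument_a = harmonic_tone(261.6, [1.0, 0.5, 0.3, 0.2])
instrument_b = harmonic_tone(261.6, [0.6, 0.1, 0.9, 0.4])
```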
Basic Structure: Outer Ear
Pinna/auricle + ear canal
Pinna collects sound waves, channels into ear canal ~25mm
Length and shape enhance frequencies in 2-6kHz range (human speech)
Ear canal protects tympanic membrane at end from damage
Basic Structure: Middle Ear
Tympanic membrane is border b/ outer and middle ear, connected to the tympanic membrane are the ossicles (malleus, incus, stapes)
The stapes connects to the oval window (the membrane that is the border between middle and inner ear)
Ossicles amplify sound vibration by acting as levers at their hinges (~1/3 increase in pressure) and by concentrating energy from the larger tympanic membrane onto the much smaller oval window (18x increase)
Amplification is crucial because inner ear filled with liquid, takes more energy to move than air
Tensor tympani and stapedius muscles can stiffen to dampen loud sounds (reactive: not good with gun shots but good @ concert)
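A back-of-the-envelope check of the two gains mentioned above (lever ≈ 1.3x, area ratio ≈ 18x); converting the product to dB is an extra step not in the notes, and exact figures vary by source:

```python
import math

lever_gain = 1.3    # ~1/3 pressure increase from the ossicular lever action
area_gain = 18      # tympanic membrane area / oval window area (approx.)

total_pressure_gain = lever_gain * area_gain          # ~23x
gain_db = 20 * math.log10(total_pressure_gain)        # pressure ratio -> dB
print(round(total_pressure_gain, 1), round(gain_db, 1))   # 23.4x, ~27.4 dB
```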
Basic Structure: Inner Ear
Canals, Reissner’s and basilar membranes, helicotrema, cochlear partition, round window
Bulging of the vestibular canal puts pressure on the middle canal, which pushes the basilar membrane down
The round window bulges out to relieve the pressure; the basilar membrane’s motion drives the organ of Corti
Organ of Corti
3 rows of 10,500 Outer Hair Cells (OHCs)
1 row 3,500 Inner hair cells (IHCs)
Above the organ of Corti lies the tectorial membrane
Basilar membrane moves → tectorial membrane shears → pulls on stereocilia → if they bend toward the tallest, tip links pull on K+ channels and they open → K+ rushes in → depolarization → vesicles fuse → release neurots
If the stereocilia bend the other way channels close
Hair cells bathed in endolymph (high K+, low Na+) → endolymph in the middle canal, perilymph (high Na+) in the vestibular and tympanic canals
How does cochlea code amp & freq?
Amp: higher amp = more forceful vibration of tympanic membrane, more mvmt of cochlear partition, more HC displacement, more neurot released
Freq: base / apex are different thicknesses
The travelling wave takes time to move along the membrane: high frequencies peak early, at the stiff base, while low frequencies peak later, at the floppy apex
Coding Amplitude & Frequency
Cochlea sharpens tuning, > 90% afferent fibres (into brain) of auditory nerve synapse onto IHCs
What about OHCs?
Most have efferent fibres, determine what kind of info sent to brain
AN fibres lined up along length of basilar membrane, allows for threshold tuning curves of ind fibres
Characteristic Freq: freq that increases IHC’s firing rate at lowest intensity (bottom of tuning curve)
When OHCs not active → IHC loses sharpness of tuning, OHCs can change shape and stiffen basilar mem locally (respond less to certain frequency →tunes)
eg low amp pure tone: brain needs to figure out which fibres firing (character freq)
2 tone suppression: when 2nd tone in comparable freq added, AN fibre reduces firing rate
Are AN fibres also selective for frequencies well above barely noticeable amplitude? eg amp of speech
Isointensity curves: map AN fibre’s response to range of frequencies presented at same amplitude
Neuron’s tuning curve widens at higher amplitudes
Due to limitations in hair cell displacement (determines maximal firing rate ; rate saturation)
Solution: low- (like cones, require high-intensity sound), mid-, and high-spontaneous (like rods, respond at low intensity) fibres
AN fibres and Temporal Info
Phase locking
Temporal code: 100 AP/sec = 100 Hz sound
Due to refractory period, single neurons can’t do this at frequencies > 1000 Hz → but assemblies of neurons can (Volley principle)
Different neurons lock to different phases and they all cover different parts to interpret
Complex sound w/ 8 kHz & 200 Hz components combined: a neuron at the base of the basilar membrane (high freq) gets excited by the 8 kHz component and phase locks to the 200 Hz component; the brain gets the 8 kHz component via place coding & the 200 Hz component via temporal coding (timing of the wave)
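A small sketch of the volley principle mentioned above: one fibre with a refractory period cannot mark every cycle of a high-frequency tone, but a pool of fibres locking to different cycles can (numbers illustrative):

```python
# Volley principle sketch: a 2000 Hz tone has a 0.5 ms period, but a fibre with
# a ~1 ms refractory period can phase-lock only to a subset of the cycles.
period_ms = 0.5
cycles = [i * period_ms for i in range(12)]            # peak times of the tone

# Four fibres, each firing on every 4th peak at a different phase offset
fibres = [[t for i, t in enumerate(cycles) if i % 4 == k] for k in range(4)]

# Pooled across the assembly, every cycle of the stimulus is marked
pooled = sorted(t for f in fibres for t in f)
print(pooled == cycles)   # True: the population preserves the 2000 Hz timing
```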
Auditory Brain Structures
CN VIII: vestibulocochlear nerve, connected to IHC
→Cochlear nucleus (in brainstem): specialized neurons eg sensitive to onset of sound @ particular freq, sharpen tuning by suppressing nearby freqs thru lateral inhib
→Some cochlear nucleus neurons decussate (cross to other side) and project to superior olive (audio info from both ears reaches both sides of brain much earlier than in vision)
→ then inferior colliculi mostly receive contralateral input
→ medial geniculate nucleus of thalamus has more projections from cortex than to (two way street)
Auditory Brain Structures 2 - after MGN of thal
All these structures have tonotopic organization (fan of high freq to low freq)
First cortical point: primary auditory cortex A1, less strong tonotopic organization
A1 neurons project to belt area, then to parabelt area
A1 responds to almost all sounds, the (para)belt areas are more selective
Audio processing goes simple → complex just like vision
How do people psychologically perceive sounds?
Humans are not electronic devices: perception doesn’t map one-to-one onto physical sound, and psychoacousticians study these characteristics of human audition
eg 2 waves of same amp may not be perceived to have same loudness
Audibility threshold: lowest sound pressure level that can be detected at a given freq
→lowest audibility thresholds in 2-6kHz range
Equal loudness curves: ex 200 Hz @ 70 dB sounds as loud as 900 Hz @ 60 dB (freq & amp both affect perception)
Loudness and Duration
Loudness also depends on duration: longer sound is perceived as louder
Due to temporal integration: summation of energy over brief period of time
Humans are sensitive enough to pick up 1 dB differences in loudness
Possible due to sensitivity of ind AN fibres: some sensitive to 0-25 dB, others 15-40, etc so full assembly can encode wide range of intensities
What about freq / pitch?
Pitch perception is important, as suggested by tonotopic organization along auditory system
A bigger pitch change is perceived for low-freq sounds: 500 → 1000 Hz is heard as a larger pitch increase than 5000 → 5500 Hz
Test this with masking: using 2nd sound, usually white noise to complicate detection of sound
White Noise
Signal with equal energy in every freq of human auditory range ~20-20,000 Hz
In exp: embed 2kHz sine wave into noise band of 1,975-2,025 Hz
Adjust intensity of test tone until it can just be heard over noise, then increase noise bandwidth (1,950 - 2,050 Hz) → listener must increase intensity to hear tone over noise
At some point, widening the noise band no longer makes the tone harder to detect: the critical bandwidth (only noise within this band around the tone contributes to masking)
Critical Bandwidth Conclusions
Width of critical bandwidth depends on tone freq, widths correspond to physical spacing of freq on basilar membrane
Greater portion basilar mem vibrates to low freq, lower freq critical bandwidths are smaller than higher freq
Asymmetrical masking findings: masking sounds at freqs lower than test tone are more effective
Basilar Membrane
Different freqs have diff peaks on wave at diff locations of basilar membrane
Where fibres are stimulated brain uses to figure out the freq
Mainly at base = high freq
Mainly at apex = low freq
Interaural Time Difference
Small differences in the timing of sounds arriving at the two ears, because sound has to travel farther to reach the far ear
Sound locations are on azimuth → biggest difference 640 microsec, smallest = 0
Listeners can detect ITDs of 10 microsec for 1000Hz tones
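The notes give the ~640 microsec maximum; one common textbook approximation for ITD as a function of azimuth (an assumption here, not stated in the notes) is Woodworth's formula, which roughly reproduces that maximum for a typical head radius:

```python
import math

# Woodworth's approximation (an assumption, not from the notes):
# ITD ~= r * (theta + sin(theta)) / c
# r = head radius (m), theta = azimuth in radians, c = speed of sound (m/s)
def itd_microsec(azimuth_deg, head_radius=0.0875, c=343.0):
    theta = math.radians(azimuth_deg)
    return 1e6 * head_radius * (theta + math.sin(theta)) / c

print(round(itd_microsec(0)))    # 0 us (straight ahead)
print(round(itd_microsec(90)))   # ~656 us, close to the ~640 us maximum above
```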
MSO & ITDs
Binaural input required to calculate ITD: medial superior olives are first binaural input locations
MSO neurons increase firing rate to short ITDs
How?
Where signals meet in MSOs is where signal is coming from
Wave travels down basilar membrane: will reach farther (lower freq toward apex) sooner in ear on side of sound source
So tiny time differences between ears = wave at different places
These tiny differences in timing / wave position are used to determine the source
Interaural Level Differences
Sounds more intense to ear closer to sound source, largest at 90 deg
Head blocks high freq sounds more than low freq (they just go around the head)
So ILDs are large for high-freq sounds (blocked most by the head) and nearly useless for low freqs
Lateral Superior Olives (LSO): sensitive to intensity differences
Excitatory input from ipsilateral ear, inhibitory input from contralateral ear through medial nucleus of trapezoid body (MNTB): competing signals, helps focus on one specific sound at one time
Cone of Confusion
Sounds from different elevations (or front/back positions) at the same azimuth can produce the same interaural differences, so ITD/ILD won’t help distinguish them
A given ITD/ILD could arise from any point on the surface of a cone of confusion
Solution: move your head, visually locate sound source, shape of pinnae: funnel some freq better than others, upper torso shape
Due to all this, intensity of freq varies slightly according to direction of sound
Smoothing Folds of Pinnae
Smoothing with molding compound reduces ability to localize sound elevation
Initially, actual sound elevation detection suffered but azimuth location intact
Binaural cues: azimuth, spectral cues for elevation
Removal of mold: immediate high accuracy localization
Maybe diff sets of neurons respond to diff sets of cues
What about distance percep?
Use relative intensity of sound (cartoon bomb sound), but hard → soft birdsong may be far away or close but muffled
Sound intensity drops with the square of distance (inverse square law): doubling the distance lowers the level by ~6 dB, i.e. halves the sound pressure
Effect is particularly pronounced close to listener, hence underestimation of faraway sound sources
Source or you moving aids in distance perception
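A quick check of the inverse-square point above, assuming free-field spreading where sound pressure falls as 1/distance:

```python
import math

# Free-field level change relative to a reference distance:
# pressure falls as 1/d, so level change = 20 * log10(d_ref / d)
def level_change_db(d_ref, d):
    return 20 * math.log10(d_ref / d)

print(round(level_change_db(1, 2), 1))   # -6.0 dB when distance doubles
print(round(level_change_db(1, 4), 1))   # -12.0 dB at four times the distance
```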
Spectral Composition of Sounds
Qualities of air dampen high freqs more than low freqs, so farther away sounds have fewer high freqs (thunder crack vs boom)
Relative amounts of direct vs reverberant energy, closer source = more direct energy
Some blind people can learn to use echolocation, which recruits visual regions in their brains
Complex Sounds
Sine wave tones useful but not realistic
Most common sounds have harmonic spectrum
The lowest freq of a harmonic spectrum is the fundamental freq, and further harmonics are integer multiples of the fundamental
ex fundamental @ 250 Hz, next harmonic 500 Hz, next 750 Hz, and so on until no energy is left
Pitch
Perceived pitch of complex sounds determined by fundamental, harmonics add to perceived richness of sound
Auditory system is sensitive to relationships between harmonics: when fund is removed from sound and only harmonics remain, listener still hears pitch of fund freq: missing fundamental effect
Due to temporal coding eg 500Hz tone peaks every 2 ms, 750/1000 Hz tone peaks 1.3/1 ms, peaks align every 4 ms which is fund freq period of 250 Hz
Some neurons in auditory nerve / cochlear nucleus will fire every 4 ms
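A small check of the timing argument above: the first moment at which peaks of the 500, 750, and 1000 Hz components coincide is 4 ms, the period of the missing 250 Hz fundamental:

```python
# Periods of the harmonics present in the sound (ms)
periods = [1000 / f for f in (500, 750, 1000)]     # 2.0, 1.33..., 1.0 ms

# Step forward in time until the peaks of all components coincide
t, step = 0.0, 1e-3
while True:
    t = round(t + step, 6)
    if all(abs((t / p) - round(t / p)) < 1e-6 for p in periods):
        break
print(t)   # 4.0 ms -> period of the missing 250 Hz fundamental
```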
Timbre
Psych sensation by which listener can judge that 2 sounds of = loudness and pitch are dissimilar ex piano and banjo play same note
Related to relative energies of different acoustic spectral components, just like colour vision (lvls of energy @ diff wavelengths)
Timbre related to harmonic structure of tone
Attack & decay are also important
Finally, there are non repeated aperiodic sounds like door slamming shut which can also be decomposed into freqs
Auditory Scene Analysis
Many sound sources in real world, often need to focus on one
Sound waves from the environment are all summed into a single complex sound wave, not like vision where we can focus on smth
Audio syst uses:
1. Spatial Segregation: sounds coming from same spatial location likely coming from same source
2. Spectral Seg: sounds w/ similar pitch are from same source
3. Temporal Seg: same time
Sounds perceived to be coming from same source are part of same auditory stream
Auditory Stream Segregation
2 tones w/ similar freq alternated, perceived as warbling up and down of tones
But if they differ strongly in freq, 2 diff streams are heard, 1 high 1 lower pitch
Gestalt principles →similarity
Adding tones gets perceived in diff ways:
-pops out
-same timbre in succession creates one stream
-different timbres create segregated streams
Because instruments have diff timbres, ind melodies easy to seg
Neural auditory stream segregation occurs at all levels: simple early on, more complex segregation is probably cortical
Temporal Segregation
Sound components beginning at nearly same time likely to come from same source, helps group harmonics into one complex sound
Start to play notes at diff times, ind instruments stand out
Gestalt → common fate: sounds grouping together increases if they begin/end at same time
Hearing is Very Fast
When careful timing is required, rely on hearing more than vision
Ex When judging how long a flash of light seen, PPs report longer times when longer sound played
Single flash of light presented to periphery perceived as multiple if mult sounds presented
Experience Shapes Perception
Ex hearing your name
PPs better at identifying and seg sounds that were repeated vs single instances
People have trouble recognizing warped melody → can’t access melody schema until told what it is
Continuity & Restoration
Sounds aren’t always continuous, but if you pay attention you can hear through disruption
Gestalt → good continuation
Delete part of pure tone & replace w/ white noise, tone sounds continuous if noise intense enough to have masked tone
When white noise masks a pure tone of sliding freq (a sliding whistle), PPs report hearing the glide
However, when noise is shortest/most intense, d’=0 → PPs lost discriminative ability, didn’t know whether glide or not
Can hear through if you recognize that white noise is loud enough to block it
Perceptual Continuity
We maintain perceptual continuity for quite long when listening to noise that would actually continue for longer time (ex applause)
fMRI research suggests A1 activity consistent with PPs claiming to hear/not hear sound through noise
Suggests that restored missing sounds actually encoded in brain as if present in signal, which is why they sound real
Restoration of Complex Sounds
Can also be restored, PPs reported hearing missing notes blocked by white noise: syst so effective they couldn’t tell which noise blocked, PPs missed notes more often in unfamiliar melodies
European starlings: did not peck when note in starling song was blocked by white noise (they restored it) but pecked when note blocked by silence
Continuity Research
Measure with electrodes from brain of pt waiting for surgery, have them listen to words like novel and nozzle, block middle sound and get them to report which they heard → researchers could predict which one they would report by looking at brain activity
Auditory Attention
Has parallels with visual attention
Auditory system guides us in certain situations ex darkness
Acoustic startle reflex: first muscle twitches 10 ms after sound onset, threshold lowers when frightened
Can be selective: controls & musicians missed a guitar solo in classical music when told to count timpani beats → inattentional deafness