Hearing: Physiology and Psychoacoustics
9.1 The Function of Hearing
Function of hearing: helps you stay aware of your surroundings and identify and recognize objects in the world based on the sounds they produce.
Emphasizes auditory scene analysis: extracting meaningful information from acoustic environments.
9.2 What Is Sound?
Sound is created when objects vibrate. These vibrations set the molecules of the surrounding medium vibrating, producing pressure changes in the medium.
Sound waves travel at a speed that depends on the medium:
In air: v_{\text{air}} \,\approx\, 340\ \text{m/s}
In water: v_{\text{water}} \,\approx\, 1500\ \text{m/s}
Physical qualities of sound waves:
Amplitude or Intensity: magnitude of displacement of the pressure wave; perceived as loudness.
Frequency: number of pressure-change cycles per second; perceived as pitch.
Psychological qualities:
Loudness: related to perceived intensity (amplitude).
Pitch: related to perceived frequency.
Timbre: the qualitative difference between two sounds with the same loudness and pitch.
Decibels (dB): unit of measure for physical sound intensity (sound pressure level).
Reference pressure: p_0 = 20\ \mu\text{Pa} in air, defined as 0 dB.
The relation: L_p = 20 \log_{10}\left(\frac{p}{p_0}\right), where p is the sound pressure.
Example relationships:
A 10:1 pressure ratio corresponds to 20\ \text{dB}.
A 100:1 ratio corresponds to 40\ \text{dB}.
We commonly perceive a 10-fold increase in acoustic power as roughly double the loudness (note: loudness perception is not perfectly linear).
A doubling of pressure yields an approximate increase of \Delta L_p = 20 \log_{10}(2) \approx 6.02\ \text{dB}.
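The decibel examples above can be checked numerically. This is a minimal sketch, assuming pressures are given in pascals; the function names are illustrative and not part of the notes.

```python
import math

P0 = 20e-6  # reference pressure in air: 20 micropascals (0 dB SPL)

def spl_db(pressure_pa, p0=P0):
    """Sound pressure level in dB for a sound pressure in pascals."""
    return 20.0 * math.log10(pressure_pa / p0)

def pressure_from_db(level_db, p0=P0):
    """Inverse relation: sound pressure in pascals for a level in dB SPL."""
    return p0 * 10.0 ** (level_db / 20.0)

print(round(spl_db(10 * P0), 2))   # 10:1 pressure ratio
print(round(spl_db(100 * P0), 2))  # 100:1 pressure ratio
print(round(spl_db(2 * P0), 2))    # doubling of pressure
```

The three printed values correspond to the 10:1, 100:1, and doubling examples in the notes.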
Range and limits:
Human hearing range: typically f \in [20,\ 20{,}000]\ \text{Hz}.
The audible dynamic range for humans spans roughly 1:1{,}000{,}000 in pressure amplitude (about 120 dB; up to roughly 140 dB depending on conditions).
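The pressure-based and power-based statements in this section can be reconciled with one worked step. This sketch assumes the standard relation that intensity (power per unit area) scales with the square of pressure; the reference intensity I_0 is an assumed symbol for the intensity that corresponds to p_0 and does not appear in the notes.

```latex
\begin{aligned}
I \propto p^2 \;\Rightarrow\; L &= 10 \log_{10}\!\left(\frac{I}{I_0}\right) = 20 \log_{10}\!\left(\frac{p}{p_0}\right) \\
\frac{p}{p_0} = 10^{6} \;\Rightarrow\; \frac{I}{I_0} &= 10^{12}, \quad L = 20 \log_{10}\!\left(10^{6}\right) = 120\ \text{dB} \\
\frac{I}{I_0} = 10 \;\Rightarrow\; L &= 10 \log_{10}(10) = 10\ \text{dB}
\end{aligned}
```

So a million-fold range of pressure amplitudes is a 120 dB range, and the 10-fold power increase that sounds roughly twice as loud is a 10 dB step.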
Frequency units:
Hertz (Hz): unit of frequency; 1 Hz = 1 cycle per second.
Pure tones vs. complex sounds:
Sine wave: a pure tone where the waveform is a sine function in time.
Most real-world sounds are complex and can be described as a combination of sine waves (Fourier analysis):
Complex sound x(t) can be expressed as
x(t) = \sum_{k} A_k \sin\left(2\pi f_k t + \phi_k\right)
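The sum-of-sines form can be evaluated directly. A minimal sketch: the component list (a 200 Hz fundamental plus two harmonics) and the sample rate are hypothetical values chosen only for illustration.

```python
import math

def complex_tone(t, components):
    """Evaluate x(t) = sum_k A_k * sin(2*pi*f_k*t + phi_k).

    components: list of (amplitude, frequency_hz, phase_rad) tuples.
    """
    return sum(a * math.sin(2.0 * math.pi * f * t + phi)
               for a, f, phi in components)

# Hypothetical harmonic complex: 200 Hz fundamental plus two harmonics.
components = [(1.0, 200.0, 0.0), (0.5, 400.0, 0.0), (0.25, 600.0, 0.0)]

sample_rate = 8000  # samples per second (illustrative choice)
# 80 samples = 10 ms = two periods of the 200 Hz fundamental.
waveform = [complex_tone(n / sample_rate, components) for n in range(80)]
```

Because every component is an integer multiple of 200 Hz, the waveform repeats every 5 ms (40 samples), which is the period of the fundamental.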
Spectral representations:
Spectrum: energy distribution across frequencies (magnitude vs. frequency).
Harmonic spectrum: energy at integer multiples of the fundamental frequency; fundamental frequency is the lowest frequency component.
Fourier analysis: decomposes a complex function into sine/cosine components.
Spectrogram vs. waveform vs. spectrum:
Waveform: time vs. amplitude.
Spectrogram: time vs. frequency with color/intensity indicating energy.
Spectrum: frequency vs. energy (often at a fixed time).
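Going from a waveform to a spectrum can be sketched with a direct discrete Fourier transform. This is a slow, plain-Python DFT used for clarity, not an FFT; the test signal, sample rate, and the 0.1 peak cutoff are all illustrative assumptions.

```python
import cmath
import math

def magnitude_spectrum(samples, sample_rate):
    """Return (frequency_hz, magnitude) pairs from a direct DFT.

    Magnitudes are scaled so that a sine of amplitude A appears as
    roughly A at its frequency (the scaling is not meant for the 0 Hz bin).
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        spectrum.append((k * sample_rate / n, 2.0 * abs(coeff) / n))
    return spectrum

sample_rate = 1600  # Hz (illustrative)
n_samples = 160     # 0.1 s of signal, so frequency bins are 10 Hz apart
signal = [1.0 * math.sin(2 * math.pi * 100 * i / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 200 * i / sample_rate)
          for i in range(n_samples)]

# Keep only bins that carry noticeable energy.
peaks = [(f, round(m, 3))
         for f, m in magnitude_spectrum(signal, sample_rate) if m > 0.1]
print(peaks)
```

The waveform list is the time-vs-amplitude view; the peaks list is the frequency-vs-energy view of the same signal, with energy only at the 100 Hz fundamental and its 200 Hz harmonic.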
Simple vs. complex sounds:
Sine waves are rare in everyday sounds; most sounds are a mix of multiple frequencies.
Quick review prompts (from slides): consider examples like leaves rustling, library, heavy truck, jet takeoff to infer amplitude and frequency characteristics.
9.3 Basic Structure of the Mammalian Auditory System
Overview: sounds travel from outer ear → middle ear → inner ear → neural signals to brain; key anatomical structures include pinna, ear canal, eardrum, ossicles, cochlea, hair cells, auditory nerve.
Outer ear:
Pinna collects sounds and funnels them into the ear canal.
Length/shape of the ear canal enhances certain frequencies and helps insulate/protect the tympanic membrane.
Middle ear:
Ossicles: malleus (hammer) → incus (anvil) → stapes (stirrup).
The ossicles amplify and transfer energy to the cochlea via lever action and concentration of energy from the tympanic membrane to the smaller oval window.
The oval window is the boundary between the middle and inner ear; movement transmits pressure into the vestibular canal.
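Acoustic reflex: the middle-ear muscles (tensor tympani and stapedius) tense in response to loud sounds, stiffening the ossicular chain and reducing the pressure changes passed to the inner ear.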
Eardrum (tympanic membrane): vibrates in response to sound, driving the ossicles.
Inner ear (cochlea): transduction of fine pressure changes into neural signals.
Cochlear canals and membranes:
Three parallel canals in the cochlea filled with different fluids:
Vestibular canal (scala vestibuli) and tympanic canal (scala tympani) are filled with perilymph.
Middle canal (scala media) is filled with endolymph and houses the cochlear partition.
The cochlea is a spiral structure containing the organ of Corti.
Stria vascularis in the scala media maintains ionic balance in endolymph.
The three cochlear canals are separated by membranes:
Reissner’s membrane: separates vestibular and middle canals.
Basilar membrane: base of the cochlear partition; separates middle and tympanic canals.
Fluid compartments:
Perilymph in vestibular and tympanic canals; Endolymph in the scala media; essential for hair cell activity.
Organ of Corti:
Located on the basilar membrane; contains hair cells and dendrites of auditory nerve fibers.
Hair cells include inner hair cells (IHC) and outer hair cells (OHC).
Stereocilia on hair cells bend in response to basilar membrane motion, triggering neurotransmitter release to auditory nerve fibers.
Tectorial membrane: gelatinous flap that interacts with stereocilia.
Hair cells and their roles:
Inner hair cells convey most auditory information to the brain via afferent fibers.
Outer hair cells receive brain input (efferent fibers) and provide feedback that sharpens tuning and sensitivity.
Basilar membrane tonotopy and place coding:
Different frequencies cause peak motion at different locations along the basilar membrane (traveling wave mechanics): high frequencies peak near the base, low frequencies near the apex.
Characteristic frequency (CF) of an auditory nerve (AN) fiber is the best frequency to which that fiber responds.
Place coding: neural response is related to the place along the basilar membrane where the wave peak occurs.
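The notes give no formula for the place-to-frequency map. One commonly cited approximation is Greenwood's function; the sketch below uses the human constants usually quoted for it, which should be treated as assumptions external to these notes. Position is expressed as a fraction of basilar membrane length, with 0 at the apex (low frequencies) and 1 at the base (high frequencies).

```python
def greenwood_frequency(x):
    """Approximate characteristic frequency (Hz) at relative position x.

    x is the fractional distance along the basilar membrane:
    0.0 at the apex, 1.0 at the base (assumed convention).
    """
    A, a, k = 165.4, 2.1, 0.88  # commonly quoted human constants (assumption)
    return A * (10.0 ** (a * x) - k)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f} -> about {greenwood_frequency(x):.0f} Hz")
```

With these constants the two ends of the membrane land close to the 20 Hz and 20,000 Hz limits of human hearing, and equal steps in place correspond to roughly equal ratios of frequency over most of the range.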
Auditory nerve and pathways:
Afferent AN fibers originate in the organ of Corti and project to the cochlear nucleus in the brainstem.
Primary auditory cortex (A1) in the temporal lobe is the first cortical area to process auditory information.
Pathways include: Cochlear nucleus → Superior olive → Inferior colliculus → Medial geniculate nucleus (MGN) of the thalamus → Primary auditory cortex (A1).
Some fibers project to opposite sides after the cochlear nucleus or superior olive (bilateral representation).
Higher-order auditory areas:
Belt area (secondary auditory cortex) and parabelt area involved in more complex sound features and multisensory integration.
Tonotopic organization:
Neurons responding to different frequencies are arranged anatomically in order of frequency, from the cochlea through A1.
This organization suggests frequency composition is central to auditory perception.
Coding strategies:
Temporal (timing) code: phase locking and precise timing of neural spikes convey frequency information, especially for low to mid frequencies.
Volley principle: multiple neurons can collectively code frequency by firing at distinct phases of the period without firing on every cycle.
Temporal aspects and neural coding:
Phase locking: a neuron fires at a consistent phase of the sound wave cycle.
Temporal code uses spike timing relative to the period of the sound.
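Phase locking and the volley principle can be illustrated with a toy simulation. This is one simple reading of the idea, under stated assumptions: four hypothetical neurons each fire at the same phase of a 2000 Hz tone but only on every fourth cycle, staggered by one cycle, so that the pooled spike train still marks every period.

```python
def spike_times(period, n_cycles, fire_every, offset):
    """Spike times of one phase-locked neuron that fires on every
    `fire_every`-th cycle, starting at cycle `offset`, always at phase 0."""
    return [cycle * period for cycle in range(offset, n_cycles, fire_every)]

frequency_hz = 2000.0        # tone frequency (illustrative)
period = 1.0 / frequency_hz  # 0.5 ms
n_neurons = 4                # hypothetical pool size

# Each neuron fires once every 4 cycles, staggered by one cycle.
pool = [spike_times(period, 20, n_neurons, offset)
        for offset in range(n_neurons)]
pooled = sorted(t for neuron in pool for t in neuron)

# Intervals between successive spikes in the pooled train.
intervals = [b - a for a, b in zip(pooled, pooled[1:])]
print(min(intervals), max(intervals))
```

No single neuron here fires faster than 500 spikes per second, yet the pooled intervals all equal the 0.5 ms period, so the group as a whole carries the 2000 Hz timing.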
Important concepts:
Place coding dominates high frequencies; temporal coding is more robust at lower frequencies.
CF (characteristic frequency) and the density of AN fibers contribute to precise frequency determination.
Anatomical/functional notes:
Outer hair cells contribute to cochlear amplification and sharper tuning, improving sensitivity of inner hair cells.
Hair cells do not regenerate in mammals; some other vertebrates show regenerative capacity.
9.4 Basic Operating Characteristics of the Auditory System
Psychophysics and psychoacoustics:
Psychoacoustics studies the relationship between physical acoustics and perceptual responses (how the brain interprets sounds).
Key concepts:
Loudness: psychological aspect related to perceived intensity.
Pitch: psychological aspect related mainly to perceived frequency.
Audibility threshold:
The lowest sound pressure level detectable at a given frequency.
Equal-loudness curves: graphs of sound pressure level vs. frequency for constant perceived loudness; sensitivity is best at roughly 2,000 to 5,000 Hz (the region marked by the pink oval in the slides).
Temporal integration:
A sound at a constant level is perceived as louder when its duration is longer.
Typical integration window is about 100-200\ \text{ms}; sounds shorter than ~100 ms may be perceived as quieter than the same sound played for ~200 ms.
Masking and critical bandwidth:
Masking: a second sound (often noise) makes detection of another sound more difficult.
White noise contains all audible frequencies in equal amounts; serves as a reference mask.
Critical bandwidth: the range of frequencies that can be conveyed within a single auditory channel.
Example: for a 2000 Hz tone, the critical bandwidth is about 400 Hz; widening a masking noise beyond that width adds little further masking of the tone.
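The critical-band idea can be sketched with an idealized model in which only the noise that falls inside one auditory channel contributes to masking. The rectangular channel shape and the noise spectrum level of 30 dB per Hz are simplifying assumptions; the 400 Hz default is the notes' value for a 2000 Hz tone.

```python
import math

def effective_masker_level_db(noise_bandwidth_hz, spectrum_level_db,
                              critical_bandwidth_hz=400.0):
    """Level (dB) of the noise power that falls inside one auditory channel.

    Idealized rectangular-channel model: noise outside the critical band
    around the tone is assumed not to contribute to masking.
    """
    effective_bw = min(noise_bandwidth_hz, critical_bandwidth_hz)
    return spectrum_level_db + 10.0 * math.log10(effective_bw)

for bw in (50, 100, 200, 400, 800, 1600):
    level = effective_masker_level_db(bw, spectrum_level_db=30.0)
    print(f"noise bandwidth {bw:>4} Hz -> effective masker level {level:.1f} dB")
```

In this model each doubling of noise bandwidth adds about 3 dB of effective masker until the bandwidth reaches 400 Hz, after which the level stays flat.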
Real-world example:
Manatees have good underwater hearing but are less sensitive to low-frequency sounds from boat engines; a high-frequency alert sound can be used in front of boats to protect them.
Practical implications:
Noise exposure can cause temporary or permanent threshold shifts or tinnitus.
9.5 Hearing Loss
Definition:
Hearing loss = the need for higher sound levels to detect and understand sounds, affecting perception.
Distinguishes sensation (detection) from perception (interpretation).
Hearing loss types:
Conductive hearing loss: caused by problems with the bones of the middle ear, e.g., otosclerosis (abnormal bone growth), which can be treated with surgery.
Sensorineural hearing loss: the most common type, involving cochlear or auditory nerve defects; causes include hair cell damage (infection, ototoxic drugs, metabolic factors) and aging.
Hair cell damage and neural consequences:
Damage to outer hair cells reduces frequency selectivity and sensitivity of AN responses.
Aging can reduce AN fiber counts; Stria vascularis may fail to maintain endolymph ionic balance, reducing hair cell activity.
Regeneration across species:
Mammals: hair cells do not regenerate.
Other vertebrates (e.g., fish, amphibians, birds) can regenerate hair cells.
Additional phenomena:
Tinnitus: a perceived ringing in the ears without an external sound source, often following prolonged exposure to loud sounds.
Hidden hearing loss: normal sensation but reduced perception due to synaptopathy (loss of synapses between AN fibers and hair cells), leading to poorer information transfer in auditory cortex.
Common causes and remedies:
Noise-induced hearing loss is common; exposure to sounds above roughly 120 dB can cause immediate damage.
Temporary threshold shift: muffled hearing after noise exposure; may recover unless exposure repeats.
Hearing aids vs. cochlear implants:
Hearing aids amplify sounds; modern devices also apply dynamic range compression so that loud inputs stay below painful or damaging levels and the output fits within the listener's comfortable range.
Cochlear implants: surgically implanted device that bypasses damaged hair cells by directly stimulating the auditory nerve via electrodes; external microphone/transmitter communicates with implanted receiver.
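Dynamic range compression in a hearing aid can be sketched as a mapping from input level to output level. All parameter values below (gain, threshold, ratio, output ceiling) are hypothetical and chosen only to show the shape of the mapping, not to describe any real device.

```python
def compress_level(input_db, gain_db=20.0, threshold_db=60.0,
                   ratio=3.0, max_output_db=100.0):
    """Map an input level (dB SPL) to an output level (dB SPL).

    Below the threshold the full gain is applied; above it, each `ratio` dB
    of extra input adds only 1 dB of output. The result is capped at
    `max_output_db`. All defaults are hypothetical.
    """
    if input_db <= threshold_db:
        output_db = input_db + gain_db
    else:
        output_db = threshold_db + gain_db + (input_db - threshold_db) / ratio
    return min(output_db, max_output_db)

for level in (40, 60, 90, 120):
    print(f"input {level} dB -> output {compress_level(level):.1f} dB")
```

Quiet sounds receive the full 20 dB of gain, while loud sounds receive progressively less, so a wide range of input levels is squeezed into a narrower, more comfortable output range.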
Summary points:
Hearing loss can be congenital or acquired; can be hereditary or age-related.
Sensorineural loss is the most common and involves the cochlea or auditory nerve.
Management includes hearing aids, cochlear implants, and protective strategies to avoid noise-induced damage.
Connections to foundational principles and real-world relevance
The auditory system is organized to convert mechanical energy into neural signals via a cascade of mechanical, hydrodynamic, and neural processes, illustrating physical-to-neural transduction.
Place coding and tonotopy reflect a fundamental principle: spatially distributed encoding maps onto a perceptual dimension (here, frequency content) and can be traced from the cochlea to A1.
Temporal coding (phase locking and volley) complements place coding, especially for encoding lower frequencies where timing information is more reliable.
Psychoacoustic phenomena (masking, critical bandwidth, temporal integration) demonstrate how perception constrains and shapes the use of sounds in real-world environments (speech, music, alarms).
Practical implications for health: understanding thresholds, dynamic range, and noise exposure informs hearing conservation, device design (hearing aids, cochlear implants), and public health guidelines.
Ethical/practical implications include access to hearing care, protection in noisy environments, and the quality of life impacts of hearing loss and tinnitus.
Key equations and numerical references (LaTeX)
Speed of sound in different media:
v_{\text{air}} \approx 340\ \text{m/s}
v_{\text{water}} \approx 1500\ \text{m/s}
Reference pressure for air: p_0 = 20\ \mu\text{Pa}
Sound pressure level (SPL): L_p = 20 \log_{10}\left(\frac{p}{p_0}\right)
Pressure ratios and dB examples:
\frac{p}{p_0} = 10 \Rightarrow L_p = 20\ \text{dB}
\frac{p}{p_0} = 100 \Rightarrow L_p = 40\ \text{dB}
Doubling pressure: \Delta L_p = 20 \log_{10}(2) \approx 6.02\ \text{dB}
Human hearing frequency range: f \in [20, 20000]\ \text{Hz}
Dynamic range (amplitude): about 1:10^6 \Rightarrow 120\ \text{dB}
Temporal integration window: \approx 100-200\ \text{ms}
Critical bandwidth example: for a 2000 Hz tone, the critical bandwidth is about 400 Hz.
Sine wave and Fourier form for a complex signal:
x(t) = \sum_{k} A_k \sin\left(2\pi f_k t + \phi_k\right)
Relationship between place and frequency: tonotopic organization maintained from cochlea to A1.
Temporal coding concepts:
Phase locking: neurons fire at a consistent phase of the waveform cycle.
Volley principle: multiple neurons share frequency coding by firing at distinct phases.
Practical study tips based on the notes
Memorize the key structures of the outer, middle, and inner ear and the sequence of sound transmission.
Familiarize yourself with the difference between dB SPL, frequency (Hz), and perceptual concepts (loudness, pitch, timbre).
Be able to explain place vs. temporal coding and identify which frequencies rely more on timing cues.
Understand masking and critical bandwidth for real-world hearing, including practical implications like speech in noise.
Recognize the differences between conductive and sensorineural hearing loss and typical treatments.
Use the provided numbers to answer quick calculations (e.g., compute dB changes from pressure ratios, or interpret typical frequency ranges).
Relate the anatomy to function: how the cochlear partitions, hair cells, organ of Corti, and basilar membrane contribute to transduction and frequency processing.