
Unit 3

Hearing

I. Stimulus for hearing

Movement of air molecules. A sound source vibrates, and those vibrations are carried through the air as pressure waves. We hear frequencies from about 20 Hz to 20,000 Hz.

A simple sine wave is a pure, smooth wave that vibrates at only one frequency; it represents the most basic kind of auditory stimulus.
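To make this concrete, here is a minimal Python sketch (not from the lecture; the 440 Hz and amplitude values are arbitrary examples) that generates a pure tone:

```python
import numpy as np

def pure_tone(freq_hz, amplitude, duration_s=1.0, sample_rate=44100):
    """Generate a pure sine wave: a single frequency, the simplest auditory stimulus."""
    t = np.arange(0, duration_s, 1.0 / sample_rate)  # time axis in seconds
    return amplitude * np.sin(2 * np.pi * freq_hz * t)

# A 440 Hz tone (the note A), well inside the audible 20 Hz to 20,000 Hz range.
wave = pure_tone(freq_hz=440.0, amplitude=1.0)
```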

II. Relationship between physical properties of sound and their related perceptual qualities

Physical:

1. Frequency: the number of cycles per second, measured in hertz (Hz).

2. Amplitude: the height of the waveform, measured in decibels (dB). More energy = more decibels.

3. Complexity: the number of pure tones combined in the waveform.

Psychological:

1. Pitch/Tone: e.g., the notes in music

2. Loudness: e.g., a lecture voice is about 80 dB

3. Timbre: the quality of sound that allows one sound to be distinguished from another
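To connect amplitude with decibels concretely, here is a hedged sketch of the standard sound pressure level formula, dB SPL = 20·log10(p/p0) with reference pressure p0 = 20 µPa (the formula and the example pressure are textbook values, not from the lecture):

```python
import math

P0 = 20e-6  # reference pressure in pascals (20 micropascals), near the threshold of hearing

def spl_db(pressure_pa):
    """Convert a sound pressure in pascals to decibels of sound pressure level (dB SPL)."""
    return 20 * math.log10(pressure_pa / P0)

# Each tenfold increase in pressure adds 20 dB: more energy = more decibels.
print(spl_db(0.2))  # ~80 dB, roughly the lecture-voice level mentioned above
```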

III. Other factors that affect our perception of Loudness, Pitch, and Timbre

A. Loudness - determined primarily by Amplitude

  1. Also affected by Frequency

    a. Equal loudness curves - we are most sensitive to middle-range frequencies (500 Hz to 5000 Hz); tones lying on the same equal loudness curve sound equally loud to our ears.

B. Pitch - determined primarily by Frequency

  1. Also affected by Complexity

    a. Increases in complexity increase perceived pitch

  2. In a complex waveform, the pitch perceived is that associated with the fundamental frequency (the slowest regularly oscillating component); see the sketch below.
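Here is a small sketch of that point (the component frequencies are arbitrary examples): summing pure tones at 200, 400, and 600 Hz gives a complex waveform that repeats at the rate of the slowest component, 200 Hz, which is the pitch heard.

```python
import numpy as np

sample_rate = 44100
t = np.arange(0, 0.05, 1.0 / sample_rate)  # 50 ms of signal

# A complex tone built from a 200 Hz fundamental plus two higher components.
components = [200.0, 400.0, 600.0]
complex_tone = sum(np.sin(2 * np.pi * f * t) for f in components)

# The waveform repeats every 1/200 s, so the perceived pitch matches the
# fundamental frequency (the slowest regularly oscillating component): 200 Hz.
fundamental_hz = min(components)
```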

C. Timbre - depends primarily on the Complexity of the waveform

  1. Also affected by the "attack" and "decay" portions of the waveform

    a. Attack refers to the speed of onset (e.g., slow versus rapid attack)

    b. Decay refers to the speed of offset (e.g., slow versus rapid decay)

Example: "bowed" versus "plucked" on a violin

IV. Mechanisms associated with locating sound source

A. Interaural Time Difference (ITD)

  1. The brain uses the difference in the time at which a sound reaches the two ears to determine the location of the sound source.

    a. ITD is a useful cue for frequencies below ~1300 Hz (e.g., a low-frequency sound source); see the sketch after this section.

B. Interaural Intensity Difference (IID)

  1. IID is a useful cue for frequencies above ~1300 Hz; the head blocks high-frequency sounds, casting a "sound shadow" that makes the sound less intense at the far ear.

C. Cone of Confusion - refers to points in space such that sound sources produce identical interaural time and intensity differences
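For a rough sense of the time scales in ITD, here is a sketch using the common far-field approximation ITD ≈ (d·sin θ)/c; the formula and the head-width value are standard textbook assumptions, not from the lecture:

```python
import math

HEAD_WIDTH_M = 0.18   # assumed distance between the ears, in meters
SPEED_OF_SOUND = 343  # speed of sound in air, m/s

def itd_seconds(azimuth_deg):
    """Approximate interaural time difference for a distant source at a given azimuth."""
    return HEAD_WIDTH_M * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

print(itd_seconds(90))  # source directly to one side: ~0.0005 s (about half a millisecond)
print(itd_seconds(0))   # source straight ahead: 0 s, no time difference (and no cue)
```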

Other Terms

An octave describes doubling or halving a frequency; it is a very natural "jump" in sound (or even light). Going up an octave takes you from one note to the same note higher in pitch: not louder, just higher.

A harmonic is a sound produced by extra vibrations stacked on top of the main vibration. When you pluck a guitar string, you don't hear just one pure note; you hear the fundamental note plus a bunch of harmonics. The fainter, fading sounds are the harmonics.
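Both ideas are simple arithmetic, sketched below (the 110 Hz starting value is an arbitrary example): an octave multiplies or divides a frequency by 2, and a note's harmonics are integer multiples of its fundamental frequency.

```python
def octave(freq_hz, steps=1):
    """Shift a frequency up (positive steps) or down (negative steps) by whole octaves."""
    return freq_hz * (2 ** steps)

def harmonics(fundamental_hz, count=4):
    """The first `count` harmonics: integer multiples of the fundamental frequency."""
    return [n * fundamental_hz for n in range(1, count + 1)]

print(octave(110.0))     # 220.0: the same note, one octave higher
print(harmonics(110.0))  # [110.0, 220.0, 330.0, 440.0]: what rings out of a plucked string
```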
Place Theory says: different places along your inner ear (specifically, the basilar membrane in the cochlea) respond to different pitches (frequencies). High-pitched sounds (like a whistle) vibrate the beginning (base) of the cochlea; low-pitched sounds (like a bass drum) vibrate the end (apex) of the cochlea.

Temporal Theory (also called Timing Theory) says: we hear pitch based on how fast the neurons in our ear fire; the timing of their firing matches the sound's frequency. A sound at 100 Hz = neurons fire about 100 times per second. Phase locking means that neurons fire at the same point (phase) of every sound wave cycle; they "lock on" to the rhythm of the wave. The Volley Principle says that groups of neurons take turns (like a team!) firing together to keep up with higher-frequency sounds.

Fusion Effect:

When two sounds are very close together in time (milliseconds apart), your brain combines them into one single sound.


Precedence Effect (also called the "Law of the First Wavefront"):

When two identical sounds come from different places, but with a tiny delay, your brain hears only the first one and ignores the later one to figure out where the sound came from.


Echolocation in the Blind:

Some blind individuals learn to use echoes (reflected sound waves) to "see" their environment — just like bats and dolphins do! They make clicking sounds with their tongue, tapping canes, snapping fingers, or even listening to footsteps.

They then analyze the returning echoes to figure out:

How far away an object is

What shape it is

Whether it's soft, hard, wide, narrow, etc.

Architectural acoustics is the science of designing buildings (like concert halls, classrooms, churches, theaters) to control how sound behaves inside them.

It’s about making sure that:

Sounds are clear (not muddy or echoey)

Sounds are loud enough (but not too loud)

Sounds reach everyone in the room evenly

Background noise is minimized

Hearing Part 2

I. The Three Main Structures of the Ear

A. Outer Ear (collects sound waves and funnels them to the eardrum)

  1. pinna (fleshy 'articulated' outer portion)

  2. auditory canal (funnels sound to the eardrum)

B. Middle Ear (magnifies vibrations from the eardrum to the cochlea)

  1. tympanic membrane (ear drum)

  2. ossicles – three bones (increase the pressure transmitted to the inner ear)

a. malleus (hammer)

b. incus (anvil)

c. stapes (stirrup)

C. Inner Ear (converts mechanical energy into electrochemical signals sent to the brain: sound!)

  1. Cochlea (snail-shaped, pea-sized)

a. Three fluid-filled chambers

  1. Scala vestibuli (upper chamber)

  2. Scala media (middle chamber, also called the ‘cochlear duct’; contains Organ of Corti) 

  3. Scala tympani (bottom chamber)

b. Oval window – flexible structure on upper chamber of cochlea; stapes rests on this structure; transmits pressure wave to upper and middle chambers 

c. Round window – flexible structure on bottom chamber of cochlea; pressure release

d. Helicotrema – opening at the apex of the cochlea that allows vibrations from the upper chamber to be funneled back down to the lower chamber.

D. Scala media (middle chamber; also called the cochlear duct)

1. Reissner's membrane (separates Scala vestibuli from Scala media)

2. Basilar membrane (separates Scala media from Scala tympani; vibrations transmitted to hair cells)

3. Contains Organ of Corti

  a. Inner hair cells (frequency perception)

  b. Outer hair cells (loudness perception)

  c. Tectorial membrane (outer hair cells embedded in it)

II. Neural Code for Pitch Perception 

A. Timing/Frequency Theory – pitch coded with respect to the rate of neural firing. 

1. The rate at which neurons fire depends upon the rate of hair cell/cilia bending, which in turn depends upon frequency.

For instance, a 500 Hz tone causes the hairs/cilia to bend 500 times per second, which causes neurons to fire 500 times per second.

2. Problem: this places the upper limit of pitch perception at ~1000 Hz (the fastest rate at which a neuron can fire).

3. Solution: the brain can code for frequencies above 1000 Hz by virtue of the Volley Principle; neurons fire sequentially (see the sketch after this list).

  i. Several neurons firing sequentially can code for frequencies above 1000 Hz as long as the neurons are phase locked.

  ii. Neurons are phase locked as long as they fire at the same point in the waveform.

4. Next Problem: the upper limit of phase locking is ~ 5000 Hz!

So, how does the brain code for frequencies above 5000 Hz? 
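A toy simulation of the volley idea follows; the 3000 Hz tone and the four-neuron team are invented examples, and the ~1000 spikes/s ceiling is the limit noted above. Neurons take turns firing on successive cycles, so the group as a whole marks a frequency no single neuron could follow:

```python
def volley_spike_times(freq_hz, n_neurons, duration_s=0.01):
    """Assign successive cycles of a tone to neurons in rotation (the volley principle).

    Returns one list of spike times per neuron; every neuron fires at the same
    point of the waveform (the start of its assigned cycles), i.e. phase locked.
    """
    period = 1.0 / freq_hz
    spikes = [[] for _ in range(n_neurons)]
    cycle = 0
    while cycle * period < duration_s:
        spikes[cycle % n_neurons].append(cycle * period)  # neurons take turns
        cycle += 1
    return spikes

# A 3000 Hz tone covered by 4 neurons: each neuron fires only 750 times/s
# (under the ~1000 spikes/s ceiling), yet together they mark every cycle.
trains = volley_spike_times(3000.0, n_neurons=4)
```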

B. Place Theory – pitch coded with respect to the place on the basilar membrane that vibrates most vigorously in response to a particular frequency (see the sketch after this section).

C. Timing/Frequency and Place Information also work together: Note that the Timing/Frequency and Place mechanisms overlap between 1000Hz and 5000Hz. This happens to be the frequency range associated with human speech.
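Place coding can be sketched with the Greenwood (1990) frequency-position function for the human cochlea; this formula is a standard textbook addition, not from the lecture:

```python
def greenwood_freq(x_from_apex):
    """Best frequency (Hz) at fractional position x along the human basilar membrane.

    Greenwood's function: f = 165.4 * (10**(2.1 * x) - 0.88), with x running
    from 0 at the apex (tip) to 1 at the base of the cochlea.
    """
    return 165.4 * (10 ** (2.1 * x_from_apex) - 0.88)

# Low frequencies peak near the apex, high frequencies near the base:
print(round(greenwood_freq(0.0)))  # ~20 Hz at the apex
print(round(greenwood_freq(1.0)))  # ~20,700 Hz at the base, near the upper limit of hearing
```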

Speech Perception

The capacity to hear separate words is an example of segmentation and parsing; the two terms are used interchangeably.

I. Segmentation/Parsing:

Refers to our ability to hear separate words in an otherwise continuous vocal stream; there are no physical spaces between words, yet we hear them as separate.

II. Fundamental Units of Speech

A. Phonemes: the fundamental unit of speech, such that changing a single phoneme in a word changes the meaning of that word. (In the lecture's phoneme chart, the upper three rows are vowel phonemes and the bottom three rows are consonant phonemes.)

  1. Vowels: created by manipulating a relatively open vocal tract. 

  2. Consonants: created by manipulating a relatively closed vocal tract.

    a. Place of Articulation: refers to where in the vocal tract the obstruction of the air stream occurs (e.g., "p" in put = front of mouth; "c" in couch = back of mouth)

    b. Manner of Articulation: refers to how the air stream is obstructed (e.g., "p", "b", and "f" are all obstructed at the front of the mouth but differ in manner)

    c. Voicing/Voice Onset Time: refers to the timing and degree to which the vocal cords vibrate (e.g., voicing begins earlier for "da" than for "ta")

III. Is Speech Special? Does our brain process the speech signal differently from non-speech signals?

A. Evidence that speech is special

  1. Infant babbling includes phonemes from all languages; there are universal stages of babbling.

  2. Invariant characteristics of the speech signal (e.g., formants and formant transitions for "shoo cat" are the same irrespective of the speaker)

    a. Formants: the narrow bands of sound-frequency energy associated with vowels; they depend on the fundamental frequency.

    b. Formant transition: the transition from a broad energy spectrum to the narrow formant spectrum. The transition per se appears to code for consonants.

  3. Categorical perception: a phenomenon whereby a phoneme is perceived as invariant within some specific range.

B. Limitations of the evidence that speech is "special"

1. Invariant characteristics: perception of speech is robust even with significant deterioration of the formant components of the waveform.

2. Categorical perception: non-speech signals have also been shown to be processed categorically

• key: sound is given a name (e.g., “plucked” vs. “bowed”)

C. Infant babbling, formant characteristics, and categorical processing are involved in speech perception but are only part of the story.

1. The Pollack & Pickett (1963) "sliced conversation" study demonstrated the importance of context in speech perception.

The McGurk Effect occurs when your visual perception of someone's mouth movements changes what you think you are hearing.

Damage to Broca’s Area → Broca’s Aphasia:

People understand language pretty well. But they struggle to speak — speech is slow, broken, and effortful.


Damage to Wernicke’s Area → Wernicke’s Aphasia:

People can speak fluently, but what they say often doesn't make sense (nonsense words, jumbled ideas). They can't understand what others are saying well either.

Music Perception

Extra credit:

Crazy Little Thing Called Love - Queen

Sugar Sugar - The Archies

Love Her Madly - The Doors

Georgia Satellites

Bad Things - Jace Everett

Dream On - Aerosmith

The Trees - Rush

Bahamian Rap City - 

A. Proximity - notes that are played together close in time are perceived as belonging together.

Proximity forms the basis of melody perception.

B. Similarity - notes that are similar in pitch are perceived as belonging together.

C. Good continuation - A musical principle where melody progresses logically, allowing listeners to predict the path of the notes.

1. Basic Components of Music

A. Consonance / Dissonance - notes that sound pleasing when played together are referred to as consonant. Notes that sound unpleasant when played together are referred to as dissonant.

B. Rhythm - refers to the temporal relationship among sounds. Rhythm refers specifically to the duration of each note or chord. Polyrhythm refers to the simultaneous playing of two independent rhythms.

C. Note - the listener's perception of the frequency of the sound waves. Each musical sound of a distinct frequency is known as a note (e.g., the note A is 440 Hz; the note C is 523.3 Hz). Pitch indicates how high or low a note sounds. Harmonics are integer multiples of a note's fundamental frequency; the harmonics of a note sound consonant with it. (See the sketch after this list.)

D. Chord - multiple notes (usually 3 or 4) occurring simultaneously in time. Major chords sound happy and minor chords sound sad; the middle note of a minor chord is a little lower in pitch than that of the major chord.

E. Melody - refers to the combined effect of pitch and rhythm.

F. Melody schema – refers to the representation of a familiar melody stored in memory.
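Here is the sketch referenced above, assuming standard twelve-tone equal temperament (f = 440·2^(n/12), n semitones from the note A at 440 Hz); the tuning system is an assumption, though it reproduces the note values given above:

```python
A4_HZ = 440.0  # the note A, used as the tuning reference

def note_freq(semitones_from_a):
    """Equal-temperament frequency: each semitone multiplies frequency by 2**(1/12)."""
    return A4_HZ * (2 ** (semitones_from_a / 12))

print(round(note_freq(3), 1))  # the note C, ~523.3 Hz, matching the value above

# Chords as sets of simultaneous notes (semitone offsets from the root note):
MAJOR = (0, 4, 7)  # middle note 4 semitones above the root
MINOR = (0, 3, 7)  # middle note one semitone lower than in the major chord

a_major = [note_freq(n) for n in MAJOR]
a_minor = [note_freq(n) for n in MINOR]  # the slightly lower middle note sounds "sad"
```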

2. Musician Stylistic Contributions to Music

A. Playing out of synchronization - chords tend to sound more musical when each note is played slightly out of synchronization; playing all the notes at exactly the same time tends to sound less musical.

B. Staccato vs. Legato - "choppy" playing (staccato) tends not to sound as musical as a smoother transition (legato) between notes/chords.

C. Rubato – refers to “small expressive changes of pace”; changing the pace of play tends to sound more musical than to play at the same rate throughout the piece.