Based on Assignment 1, the study guide, and items on the slides.
sound
the transfer of energy through an elastic medium
What are the three necessary components for sound to occur?
energy - ex: lungs
a body capable of vibration - ex: vocal folds
transmitting medium - ex: air
What are the physical properties of a sound source and transmitting medium?
the source must be able to vibrate; vibration requires mass and elasticity
the medium must be capable of being set into vibration; it also requires mass and elasticity
mass
the amount of matter present
weight
gravitational force on an object
mass vs weight
our mass is the same on Earth and on the moon, but our weight is different; ex: a 110 lb person on Earth would weigh about 18 lb on the moon, but their mass would be the same regardless of where they are
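The moon example can be checked with a little arithmetic. This is a minimal sketch, assuming the moon's surface gravity is about one sixth of Earth's (a rounded textbook value that is not stated in these notes); the function name is illustrative only.

```python
# Assumption: moon gravity is roughly 1/6 of Earth's gravity (rounded value).
MOON_TO_EARTH_GRAVITY = 1 / 6

def weight_on_moon_lb(weight_on_earth_lb):
    """Weight scales with gravity; the person's mass does not change."""
    return weight_on_earth_lb * MOON_TO_EARTH_GRAVITY

print(round(weight_on_moon_lb(110)))  # roughly 18 lb, as in the example
```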
density
the amount of mass per unit volume
elasticity
the property that enables recovery from distortion of shape or volume ex: air
pressure
the amount of force per unit area ex: atmospheric pressure is 14.7 lb/in²
Why aren't we able to hear sound in space?
There are too few particles; without a transmitting medium (ex: air molecules to bump into one another), there is nothing to carry the sound
simple harmonic motion
the back-and-forth motion of particles about their resting position when a medium is disturbed; plotted over time, the motion traces a sine wave, so it is also known as sinusoidal motion
condensation
when simple harmonic motion occurs, particles in a medium are pushed together
rarefaction
when simple harmonic motion occurs, particles in a medium are pulled apart
What are the 5 measurements of a sine wave?
amplitude
frequency
period
phase
wavelength
amplitude
perceptually equates to loudness; maximal displacement of particles in a medium
rms amplitude
root-mean-square amplitude; an effective average amplitude of the waveform over time (for a sine wave, rms = 0.707 x peak amplitude); used to measure the amplitude/intensity of a voice
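The 0.707 factor can be checked numerically for a sine wave. This is a minimal sketch; the helper name `rms` and the choice of 1,000 samples per cycle are assumptions made for illustration.

```python
import math

def rms(samples):
    """Root-mean-square: square each sample, average, then take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

peak = 1.0
n = 1000  # samples in exactly one cycle (arbitrary choice)
one_cycle = [peak * math.sin(2 * math.pi * k / n) for k in range(n)]

print(round(rms(one_cycle), 3))  # close to 0.707 x peak for a sine wave
```

The 0.707 ratio holds for sine waves only; a complex sound such as speech has a different ratio between peak and rms amplitude.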
frequency
number of complete cycles per second (Hz); pitch = the perceptual correlate of frequency; pitch associated with the human voice = F0 ex: 250 Hz is perceived as a higher pitch than 100 Hz
f = 1/T
cycle
a complete repetition of the oscillation
simple sounds
single frequency (pure tone ex: tuning fork); not commonly found in nature and has no change in frequency; useful for tests and measurements
complex sounds
sounds containing more than 1 frequency (speech)
period
time that it takes for a vibrating object to complete one cycle of vibration
T = 1/f
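Frequency and period are reciprocals of each other, so either one can be computed from the other. A minimal sketch (function names are illustrative):

```python
def period_sec(frequency_hz):
    """T = 1/f: time for one complete cycle."""
    return 1.0 / frequency_hz

def frequency_hz(period_s):
    """f = 1/T: cycles completed per second."""
    return 1.0 / period_s

print(period_sec(100))             # period of a 100 Hz tone, in seconds
print(period_sec(250))             # period of a 250 Hz tone, in seconds
print(round(frequency_hz(0.004)))  # frequency of a tone with a 4 ms period
```

The higher the frequency, the shorter the period: the 250 Hz tone completes a cycle in less time than the 100 Hz tone.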
phase
represents the point in the cycle at which the vibrating object is located at a given instant in time
in-phase
two sinusoids are in phase when their wave disturbances crest and trough at the same time
out-of-phase
two sine waves are out of phase when their crests and troughs do not line up; at 180° out of phase, one wave crests while the other troughs, which creates a cancellation effect; commonly used for active noise reduction systems ex: headphones
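The cancellation effect can be seen by adding two sine waves sample by sample. This is a sketch under assumed values (a 100 Hz tone sampled 8,000 times per second); none of these numbers come from the notes.

```python
import math

def sine(freq_hz, t_sec, phase_deg=0.0):
    """Value of a unit-amplitude sine wave at time t with a starting phase."""
    return math.sin(2 * math.pi * freq_hz * t_sec + math.radians(phase_deg))

freq = 100                             # Hz (arbitrary)
times = [k / 8000 for k in range(80)]  # one 10 ms cycle

in_phase = [sine(freq, t) + sine(freq, t, 0) for t in times]
out_of_phase = [sine(freq, t) + sine(freq, t, 180) for t in times]

print(max(abs(v) for v in in_phase))      # about 2: the amplitudes add
print(max(abs(v) for v in out_of_phase))  # about 0: the waves cancel
```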
wavelength
distance that a sound wave travels during one complete cycle of vibration
λ = s/f - measured in meters (f in Hz and speed = 340 m/sec)
frequency vs wavelength
they have an inverse relationship; low frequency = longer wavelength, high frequency = shorter wavelength
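The inverse relationship follows directly from λ = s/f. A minimal sketch using the 340 m/sec speed from these notes; the three test frequencies are arbitrary.

```python
SPEED_OF_SOUND = 340.0  # m/sec, the value used in these notes

def wavelength_m(freq_hz, speed=SPEED_OF_SOUND):
    """Wavelength in meters: speed divided by frequency."""
    return speed / freq_hz

for f in (100, 1000, 10000):
    print(f, "Hz ->", round(wavelength_m(f), 3), "m")
```

Each tenfold increase in frequency makes the wavelength ten times shorter.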
decibel
unit used for measuring the amplitude/intensity (loudness) of sound; we use decibels because it makes our numbers easier to work with and interpret
peak clipping
Peaks of the waveform are cut off due to amplifier circuits being overdriven
Solutions: dynamic compression and limiting -> makes sounds more consistent
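Peak clipping can be imitated by capping every sample at a maximum output level. This is only a sketch: the overdriven amplifier is modeled as a hard limit of ±1.0, and the 1.5 peak input is an assumed value.

```python
import math

def hard_clip(samples, limit):
    """Cut off anything beyond +limit or -limit, as an overdriven amplifier would."""
    return [max(-limit, min(limit, s)) for s in samples]

# Input sine wave whose peaks (1.5) exceed what the amplifier can output (1.0).
wave = [1.5 * math.sin(2 * math.pi * k / 16) for k in range(16)]
clipped = hard_clip(wave, 1.0)

print(max(wave), max(clipped))  # the peak is flattened to the limit
```

Samples that stay below the limit pass through unchanged; only the peaks are cut off, which is what distorts the waveform.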
transverse wave
the medium is displaced perpendicular to the direction of the traveling wave ex: light
longitudinal wave
the medium is displaced parallel with the direction of the traveling wave ex: sound
speed of sound
depends on 2 properties: 1. stiffness 2. density
in dense materials (ex: liquids and solids), the high speed of sound is due mainly to their stiffness
imperial system
foot, pound, second (fps)
metric system
meter, kilogram, second (mks); centimeter, gram, second (cgs)
band pass filter
a filter that only allows a certain range of frequencies through the system; the reason why a voice sounds thin + raspy over the phone
telephone band pass filter = 300-3,400 Hz - the frequency range most important for speech
Note: digital systems like FaceTime are not tied down by this band pass, which is why FT sounds crisper (not much filtering is occurring)
fundamental frequency
the lowest rate of vocal fold vibration (F0, first harmonic); associated with pitch - smaller/shorter vocal folds will vibrate at a higher frequency than larger/longer vocal folds (the reason why children have higher-pitched voices)
F0 of adults = 80-300 Hz
F0 of children = 200-500 Hz
harmonics
an integer multiple of the fundamental frequency
ex: F0 = 100 Hz, 2nd harmonic = 200 Hz, 3rd = 300 Hz, 4th = 400 Hz
F0 = 150 Hz, 2nd= 300 Hz, 3rd = 450 Hz
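Because each harmonic is an integer multiple of F0, the series can be generated by simple multiplication. A minimal sketch (the function name is illustrative):

```python
def harmonics_hz(f0_hz, count):
    """First `count` harmonics; the 1st harmonic is F0 itself."""
    return [n * f0_hz for n in range(1, count + 1)]

print(harmonics_hz(100, 4))
print(harmonics_hz(150, 3))
```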
Which harmonics make up formants necessary for vowel discrimination?
3rd and 5th harmonics
formants
created by the resonance of sound transmitted through the vocal tract
What is the relationship between vocal tract length and formant frequencies?
formants have an inverse relationship with vocal tract length
ex: longer vocal tract = lower frequency formant values; shorter vocal tract length = higher frequency formant values
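One way to see the inverse relationship is to treat the vocal tract as a uniform tube closed at one end (the vocal folds) and open at the other (the lips). That tube model is an assumption brought in for illustration, not something stated in these notes, and the two tract lengths below are example values only.

```python
SPEED_OF_SOUND = 340.0  # m/sec, the value used elsewhere in these notes

def tube_formants_hz(tract_length_m, count=3, speed=SPEED_OF_SOUND):
    """Resonances of a uniform tube closed at one end (assumed model):
    F_n = (2n - 1) * speed / (4 * length)."""
    return [(2 * n - 1) * speed / (4 * tract_length_m) for n in range(1, count + 1)]

longer = tube_formants_hz(0.17)   # assumed longer tract length (17 cm)
shorter = tube_formants_hz(0.10)  # assumed shorter tract length (10 cm)

print([round(f) for f in longer])
print([round(f) for f in shorter])
```

Every formant of the shorter tube is higher than the matching formant of the longer tube. Real vocal tracts are not uniform tubes, so measured formants differ from these values.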
Which tongue positions affect F1 and F2?
F1 is determined by the tongue height (the higher the tongue is, the lower the frequency); F2 is determined by the back-ness/forwardness (the further forward the tongue, the higher the frequency)
F3
related to vocal tract length
acoustic resonance
an object or a system vibrates with greater amplitude when it is exposed to an external sound wave or vibration at its natural frequency; all objects have a frequency at which they vibrate best
cavities/spaces
larger cavities generally have lower resonant frequencies; smaller cavities have higher resonant frequencies (based on particle velocity and wavelength)
categorical perception
the psychoacoustic phenomenon where sounds are generally perceived as distinct categories; perception changes rapidly when some acoustic attribute (ex: VOT) is varied
source filter theory
describes acoustic output during speech-sound production; comprised of:
source - voiced or voiceless signal
filter ex: vocal tract
voice onset time
refers to the duration between the release of a plosive/stop sound and the beginning of vocal fold vibration; voiceless consonants have longer VOT than voiced consonants
long term average spectrum (LTAS)
describes the distribution of energy across frequency for speech averaged over a sample of about 1-2 minutes; amplitude decreases at about 6 dB/octave
male vs female LTAS
LTAS for male speakers contains more energy below 1000 Hz
female voices have increased high-frequency energy
What are the 3 factors used to classify consonants?
place - location of airflow restriction in the oral cavity during speech production
manner - how sound is produced
voicing - presence or absence of vocal fold vibration
Ling 6 speech sounds test
sound test that is a set of 6 speech sounds: /ɑ, i, u, ʃ, s, m/
each sound represents a frequency region (low, mid, high)
commonly used as a daily biologic check to confirm amplification device function (hearing aids)
continuous signal
a signal that has a value at any given time
discrete signal
a signal produced by sampling amplitude at a given rate
amplitude quantization
capturing the amplitude of the continuous signal by rounding each sample to one of a fixed number of levels; more bits = more levels of amplitude
What does a high bit rate wave look like vs. a low bit rate wave?
most current digital systems use 16- or 24-bit, as these values provide for sufficient and high-quality reproduction of continuous signals for most auditory applications
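The number of amplitude levels doubles with every added bit. This sketch assumes samples are scaled to the range -1 to 1 and uses a simple rounding scheme; both are illustrative choices, not a description of any particular system.

```python
def quantization_levels(bits):
    """Number of distinct amplitude levels available with this many bits."""
    return 2 ** bits

def quantize(sample, bits):
    """Round a sample in the range -1..1 to the nearest available level."""
    steps = quantization_levels(bits) - 1
    return round((sample + 1) / 2 * steps) / steps * 2 - 1

for bits in (2, 16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels;",
          "0.3 is stored as", quantize(0.3, bits))
```

With only 2 bits the stored value lands far from 0.3, so the wave looks like a coarse staircase; with 16 or 24 bits the stored value is nearly identical to the original.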
sampling frequency/rate
refers to the number of amplitude samples taken (i.e. sampled points) in a given period of time, typically one second
commercially available audio and speech processing systems use sampling rates ranging from 8 kHz to over 300 kHz
ex: sampling freq. of 16 kHz means that amplitude values are obtained 16,000 times each second
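Sampling can be sketched by evaluating a continuous sine wave at evenly spaced times. The 1,000 Hz tone and the function name are assumptions chosen for illustration.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_sec):
    """Amplitude of a sine wave taken at evenly spaced sample times."""
    n = int(round(sample_rate_hz * duration_sec))
    return [math.sin(2 * math.pi * freq_hz * k / sample_rate_hz) for k in range(n)]

samples = sample_sine(1000, 16000, 1.0)
print(len(samples))  # number of amplitude values obtained in one second
```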
Nyquist Theorem
states that the sampling frequency required for a given application must be at least twice that of the highest frequency of interest in the output signal (at least two samples per cycle are required)
ex: for 20 kHz, sample rate would be 40 kHz
According to the Nyquist Theorem, what is the minimum sampling rate required to accurately digitize a 20,000 Hz sine wave?
40 kHz (20,000 Hz = 20 kHz, so 20 kHz x 2 = 40 kHz)
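The same calculation works for any highest frequency of interest. A minimal sketch (the function name is illustrative):

```python
def min_sampling_rate_hz(highest_freq_hz):
    """Nyquist: sample at least twice per cycle of the highest frequency."""
    return 2 * highest_freq_hz

print(min_sampling_rate_hz(20_000))  # for a 20,000 Hz sine wave
print(min_sampling_rate_hz(8_000))   # for an 8,000 Hz sine wave
```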
aliasing
occurs when a signal is sampled at a frequency that is insufficient for the application; distortion that occurs when reconstructed digital signal is different from the original signal - occurs when Nyquist Theorem has been violated
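Aliasing can be demonstrated by sampling a tone that is too high for the sampling rate. In this sketch a 30 kHz tone is sampled at 40 kHz (which is only enough for tones up to 20 kHz); the frequencies and the folding formula are assumptions chosen for illustration.

```python
import math

def alias_frequency_hz(freq_hz, sample_rate_hz):
    """Frequency that a sampled tone appears to have after folding."""
    return abs(freq_hz - sample_rate_hz * round(freq_hz / sample_rate_hz))

def sample_cosine(freq_hz, sample_rate_hz, n):
    """First n samples of a cosine wave at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * k / sample_rate_hz) for k in range(n)]

fs = 40_000
print(alias_frequency_hz(30_000, fs))  # apparent frequency of the 30 kHz tone

too_high = sample_cosine(30_000, fs, 8)
lower = sample_cosine(10_000, fs, 8)
print(all(abs(a - b) < 1e-9 for a, b in zip(too_high, lower)))
```

The two sets of samples are the same, so once digitized the 30 kHz tone cannot be told apart from a 10 kHz tone. That is the distortion described above.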
auditory recruitment
when quiet sounds become inaudible and loud sounds become uncomfortable; a result of sensory hearing loss (cochlear outer hair cell damage), which leads to a reduced dynamic range and abnormal growth of loudness
solution: hearing aids can help amplify sounds, and compression helps with maintaining levels of sound
How does stiffness and density of a medium affect the speed of sound?
greater stiffness of a medium will result in greater speed -> stiffness determines speed more than density does; lower density of the medium will also result in greater speed
ex: sound is faster in water than in air due to the greater stiffness of water
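The water vs. air comparison can be sketched with the standard relation speed = sqrt(stiffness / density). The formula and the stiffness (bulk modulus) and density values below are approximate textbook figures brought in as assumptions; none of them appear in these notes.

```python
import math

def speed_of_sound(stiffness_pa, density_kg_m3):
    """Speed in m/sec: rises with stiffness, falls with density."""
    return math.sqrt(stiffness_pa / density_kg_m3)

# Approximate values (assumptions): stiffness in pascals, density in kg/m^3.
air = speed_of_sound(1.42e5, 1.2)
water = speed_of_sound(2.2e9, 1000.0)

print(round(air), "m/sec in air")
print(round(water), "m/sec in water")
```

Water is far denser than air, which on its own would slow sound down, but it is so much stiffer that the speed still comes out several times higher.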
Label the axis of the spectrogram of the vowel sound /i/ using the following values: F1, F2, F3
F1 250 Hz, F2 2,200 Hz, F3 2,800 Hz