vocal tract made up of
vocal folds (sound source) and resonating cavities: glottal, pharyngeal and nasal
production of sound has 3 steps
initiation from an energy source: vocal fold vibration, or turbulence of air through a constriction
resonance of the sound produced
radiation of sound into the air
Sound comes out of the oral cavity or nasal cavity, via the pharyngeal cavity, after passing through the filters.
Depending on the length or the shape of these different cavities that sound is passing through, different frequencies are being resonated or attenuated.
block diagram of speech production explain
air flows up through the vocal folds (vibrating or not) → makes a source sound (periodic or aperiodic) → passes through the vocal tract (open or constricted) and is filtered to make speech
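The block diagram can be sketched in code as a source followed by a filter. This is a minimal illustration, not a speech synthesiser: the impulse train stands in for vocal fold vibration, white noise stands in for turbulence, and a single two-pole resonator stands in for one vocal tract resonance. The sample rate (8000 Hz), F0 (100 Hz), resonance centre (500 Hz) and bandwidth (100 Hz) are all assumed values chosen for the example.

```python
import math
import random

def periodic_source(f0_hz, duration_s, sample_rate):
    """Impulse train standing in for vocal fold vibration (periodic source)."""
    n = int(duration_s * sample_rate)
    period = int(round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def aperiodic_source(duration_s, sample_rate, seed=0):
    """White noise standing in for turbulence at a constriction (aperiodic source)."""
    rng = random.Random(seed)
    n = int(duration_s * sample_rate)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: boosts frequencies near centre_hz, attenuates others."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius (< 1, stable)
    theta = 2.0 * math.pi * centre_hz / sample_rate      # pole angle
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

SAMPLE_RATE = 8000  # assumed value for the sketch

# Voiced path: vibrating folds -> periodic source -> vocal tract filter
voiced = periodic_source(100, 0.1, SAMPLE_RATE)
voiced_out = resonator(voiced, 500, 100, SAMPLE_RATE)

# Unvoiced path: folds not vibrating -> aperiodic source -> same filter
noise = aperiodic_source(0.1, SAMPLE_RATE)
noise_out = resonator(noise, 500, 100, SAMPLE_RATE)
```

Either source can feed the same filter, which is the point of the diagram: the source decides whether the sound is periodic or aperiodic, and the tract shapes which frequencies are resonated or attenuated.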
when breathing vocal folds are
open
when eating vocal folds are
closed
Cartilage is responsible for tightening and loosening of
vocal folds, changing tension.
what causes vocal folds to open and close really fast.
Subglottal pressure
Fundamental frequency affected by
- length of vocal folds (men and women)
- tension in vocal folds
- mass of vocal folds
- pressure in trachea (can be affected in people with respiratory issues)
Men have longer vocal folds so
lower fundamental frequency.
Mucus increases mass of vocal folds
lower fundamental frequency of source sound.
subglottal pressure relationship with F0
If we increase the subglottal pressure we increase the F0 of a sound.
Linear relationship.
linguistic significance of F0
we can change it as we speak; its rises and falls during speech give speech its intonation
provides the prosody or suprasegmentals (musicality) of speech, conveying meaning, emotion and emphasis
3 major contributors to prosody
fundamental frequency (voice pitch)
amplitude envelope (vocal effort)
duration and rhythm (timing)
F0 of speech is in what range
60 to 500 Hz
quarter wavelength resonators
The vocal tract acts as a resonant tube or quarter wave resonator (tube open at one end and closed at the other) that amplifies certain frequencies produced by the vocal folds.
prosody (musicality) suprasegmental information is affected by what
• Requires good perception and control of one’s own voice (laryngeal and breathing musculature)
• Hearing impairment (particularly congenital) may affect this control, leading to unnatural speech: difficulty altering the F0 of the voice, and so difficulty providing intonation and prosody in speech
what frequencies are maximally resonated by quarter wavelength resonators
- When the wavelength of the tone is 4 times the length of the tube, that frequency will be maximally resonated.
- Quarter (of the) wavelength resonators provide acoustic conditions where compressions of the longitudinal wave line up and reinforce themselves.
F1 also named
first formant
the centre frequency of the first of a series of resonances.
F1 =
s/(4L), where L is the tube length in metres and s is the speed of sound in m/s
EXAMPLE: If a tube is 30 cm long, what is the first resonant frequency (i.e. first formant, F1)?
F1 = s/(4L)
F1 = 334/(4 x 0.30)
F1 = 334/1.2
F1 = 278.3 Hz
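The same calculation as a short sketch, in case it helps to try other tube lengths. The function name is made up for this example, and the default speed of sound (334 m/s) is simply the value used in the worked example above.

```python
def quarter_wave_f1(tube_length_m, speed_of_sound=334.0):
    """First resonance (F1) of a tube open at one end and closed at the other."""
    # The maximally resonated wavelength is four times the tube length.
    wavelength_m = 4.0 * tube_length_m
    return speed_of_sound / wavelength_m

f1 = quarter_wave_f1(0.30)   # 30 cm tube
print(f"F1 = {f1:.1f} Hz")   # prints "F1 = 278.3 Hz"
```

A shorter tube gives a shorter resonated wavelength, so F1 goes up as the tube gets shorter.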
the centre frequency of a formant is the
formant frequency (Fn).
first peak = formant 1
spectrograms
Show amplitude and frequency variations of speech against time:
y-axis shows frequency in Hz (linear scale).
x-axis shows time (ms).
darkness (or lightness) shows amplitude (dB).
spectrograms made using what type of filters
sets of band pass filters.
wide band or narrow band
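Better accuracy for timing leads to poorer accuracy for frequency, and vice versa: wide band filters resolve timing well, narrow band filters resolve frequency well.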
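The trade-off can be illustrated with a common rule of thumb: an analysis window lasting T seconds cannot separate frequencies closer together than roughly 1/T Hz. The sketch below assumes that approximation; the function name and the two window lengths (3 ms and 20 ms) are illustrative choices, not values from these notes.

```python
def approx_frequency_resolution_hz(window_s):
    """Rule-of-thumb frequency resolution (Hz) for an analysis window of window_s seconds."""
    return 1.0 / window_s

# Assumed example windows: a short one (wide band) and a long one (narrow band)
for label, window_s in [("short window, wide band", 0.003),
                        ("long window, narrow band", 0.020)]:
    resolution = approx_frequency_resolution_hz(window_s)
    print(f"{label}: {window_s * 1000:.0f} ms -> about {resolution:.0f} Hz")
```

The short window pins down when something happened but blurs frequency; the long window does the opposite.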
Prosody requires good
biofeedback. The speaker must accurately perceive their own voice to monitor and control their laryngeal and respiratory musculature.
Individuals with hearing impairment, particularly those with congenital or early-onset hearing loss, lose this auditory feedback loop and produce reduced prosodic variability: a flatter, more monotone F0 contour.
spectrograms frequency part that is important for speech
100 Hz to 3 kHz is the region where the most important speech information lies
the brain listens for the relative difference between
F1 and F2
formants change position (formant transition) due to
movements of tongue and mouth during pronunciation of different words