summary
Phonation is the main sound source in speech. Phonation has a number of important linguistic and paralinguistic uses. Modal voice is characterized by complete approximation, regular cycles, and sharp closures. Whisper, breathy, creaky, and falsetto voice can be contrasted with modal voice. Voice quality can serve as a social marker and can therefore differ across generations and social groups. Signal measures exist that can be used to characterize voice quality.
Tutorial 4: Linking Ideas
Beware of confounding factors/variables! There is a relationship between F0 and height, but the relationship is due to gender (very little F0 overlap between genders), not to height (much height overlap between genders). We can predict F0 range if we know the gender of the speaker; we can't predict it if we know only the speaker's height. Acoustic measurements can characterize different voice qualities because they capture the effects of the physiological and aerodynamic characteristics of phonation in each voice quality. Using these measurements, we can identify differences in phonation (e.g., between speaker groups, between languages, or in voice health).
IPA symbols for vowels = loosely based on lingual position
The IPA vowel chart gives you information related to three characteristics: (1) height (high/low = close/open); (2) backness (posterior/anterior = back/front); (3) rounding.
Source-filter model
The vocal tract filters the glottal source sound
Frequency response peaks = formants
The amplitude spectrum of an unobstructed vocal tract shows a small number of peaks caused by resonances of the tube. These resonant peaks are called formants.
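As a rough illustration of where such resonant peaks fall, the sketch below treats the unobstructed vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances follow the quarter-wavelength formula F_n = (2n − 1)c / 4L. The tube length (17.5 cm), the speed of sound (350 m/s), and the function name `tube_formants` are illustrative assumptions, not values taken from these notes.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    The default speed of sound is an assumed round value for warm, moist air.
    """
    if length_m <= 0:
        raise ValueError("tube length must be positive")
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, n_formants + 1)
    ]


# Assumed tube length of 17.5 cm: peaks near 500, 1500 and 2500 Hz.
formants = tube_formants(0.175)
print([round(f) for f in formants])
```

A real vocal tract is not a uniform tube, so measured formants move away from these evenly spaced values depending on the vowel.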
Applying the filter to the source
The effects of the filter can be characterized by measuring the first few formants
Cardinal vowel formants: F1 & F2
Most vowels can be characterized well by the frequencies of only the first two formants, a.k.a. F1 and F2.
Formant frequency normalization
Normalization = “Standardize measurements to single speaker”. Normalization in perception = “Adaptation”. Vocal-tract length normalization: scale frequencies by relative height or relative vocal tract length. Usage normalization: scale formant frequencies to the same mean and standard deviation for all speakers (z-score transformation). Calculation: $z = \frac{x - M}{SD}$, where $M$ = the mean of the data (centering) and $SD$ = its standard deviation (scaling).
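The z-score calculation can be sketched in a few lines of Python. This is a minimal example under two assumptions: the function name `z_normalize` is hypothetical, and the sample standard deviation is used (the notes do not say whether sample or population SD is intended). The F1 values are made-up numbers chosen so the arithmetic is easy to follow, not measurements.

```python
from statistics import mean, stdev


def z_normalize(values):
    """Return z-scores (x - M) / SD for one speaker's values of one formant.

    Assumption: SD is the sample standard deviation (statistics.stdev).
    """
    m = mean(values)      # centering term M
    sd = stdev(values)    # scaling term SD
    if sd == 0:
        raise ValueError("standard deviation is zero; cannot scale")
    return [(x - m) / sd for x in values]


# Made-up F1 values (Hz) for one speaker: M = 500, SD = 200.
speaker_f1 = [300.0, 500.0, 700.0]
print(z_normalize(speaker_f1))
```

Running the function separately on each speaker's data (and on each formant) puts all speakers on a common scale with mean 0 and standard deviation 1.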
Diphthong:
transition between two vowel qualities
Spectrogram: changing filter!
The first spectrum comes from a portion of the [ɔ] phonetic component of the vowel /ɔɪ/. The second spectrum comes from a portion of the [ɪ] phonetic component of the vowel /ɔɪ/.
Tutorial 5: Linking Ideas
Tasks: measure F0 and F1-F3 in vowels; synthesize vowels using these values. Vowels can be characterized and separated by their first two formants (F1, F2). Formant variation across speakers could be due to differences in pronunciation and/or differences in the physical size of the speakers’ vocal tracts. Formant normalization helps reveal linguistically relevant variation by accounting for and adjusting linguistically irrelevant variation. Simple vowel synthesis can be obtained by filtering a sawtooth waveform using a filter with a frequency response similar to that of the human vocal tract. By setting the F0 of the sawtooth waveform to the F0 of your own voice, and by using the formant frequencies that are present in your own voice, the synthesized vowel will sound more similar to your own vowel productions: the source characteristics are more similar to your own (phonation), and the filter characteristics are more similar to your own (vocal tract resonances).
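The sawtooth-plus-filter idea can be sketched with the Python standard library alone. This is one possible implementation, not necessarily the tool used in the tutorial: the filter is assumed to be a cascade of two-pole digital resonators (one per formant), and the function names, formant bandwidths, and example values are hypothetical placeholders to be replaced with your own measured F0 and formants.

```python
import math
import struct
import wave


def sawtooth(f0, duration, sample_rate):
    """Source: a sawtooth wave in [-1, 1) with fundamental frequency f0 (Hz)."""
    n = int(duration * sample_rate)
    return [2.0 * ((i * f0 / sample_rate) % 1.0) - 1.0 for i in range(n)]


def resonator(signal, freq, bandwidth, sample_rate):
    """Filter: one two-pole resonator with a peak near freq (Hz).

    Difference equation: y[n] = a*x[n] + b*y[n-1] + c*y[n-2],
    with coefficients chosen so that the gain at 0 Hz is 1.
    """
    r = math.exp(-math.pi * bandwidth / sample_rate)
    c = -r * r
    b = 2.0 * r * math.cos(2.0 * math.pi * freq / sample_rate)
    a = 1.0 - b - c
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0, formants, bandwidths, duration=0.5, sample_rate=16000):
    """Pass a sawtooth source through one resonator per formant, in series."""
    signal = sawtooth(f0, duration, sample_rate)
    for freq, bw in zip(formants, bandwidths):
        signal = resonator(signal, freq, bw, sample_rate)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]  # scale so the largest sample is +/-1


def write_wav(path, samples, sample_rate=16000):
    """Save samples in [-1, 1] as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))


# Placeholder values only; substitute the F0 and F1-F3 measured from your own voice.
samples = synthesize_vowel(f0=120.0, formants=[700.0, 1100.0, 2500.0],
                           bandwidths=[80.0, 90.0, 120.0])
# write_wav("vowel.wav", samples)  # uncomment to listen to the result
```

Changing only `f0` alters the source while leaving the filter untouched, and changing only `formants` does the reverse, which mirrors the source/filter split described above.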
Lecture 5: Big Ideas
Vowels may be described in terms of phonology, articulation, and acoustics. There are about 20 phonological choices for vowels in British English. Vowel quality can be described using terms such as front-back, open-close, rounded-unrounded, short-long, and monophthong-diphthong. The source-filter model of vowel production explains the acoustic form of vowels. The frequency response of the vocal tract tube/pipe used for vowels can be characterized using the frequencies of the first few (or even the first couple of) formants. There is a rough correspondence between the position of a vowel in the IPA vowel chart, its position in an acoustic F1-F2 vowel space, and the height of the tongue. To account for physical differences between speakers, we often normalize vowel formants using a z-score transformation before interpreting variation.