Vowels and Formants
Source Filter Model
- The speech signal is the result of the vocal tract functioning as a frequency-selective filter (Textbook p. 197-8).
- Both source and filter vary continuously during speech.
Source Filter Theory
- Source: Sound generated by the vibrating vocal folds at the glottis, a complex waveform that can be broken down into frequency x amplitude components (harmonics).
- Filter: The vocal tract filters this sound, boosting some frequencies and suppressing others.
- Output: Acoustic output with peaks corresponding to formants (peaks of resonance).
- Spectrum = Frequency x Amplitude display.
- Spacing between harmonics = f0 (the fundamental frequency).
- Speech synthesis involves creating an artificial speech waveform by combining the source and the filter.
- Frequencies are generated by the source, and the filter response is generated by the vocal tract.
- On a spectrogram, amplitude is shown as darkness in the patterns. A change in f0 changes the spacing of the harmonics.
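The source and filter steps above can be sketched in a few lines of Python. This is only a toy illustration: the source roll-off of -12 dB per octave, the bump-shaped filter, and the formant values of 500, 1500 and 2500 Hz are all assumptions chosen for the example, not values from the textbook.

```python
import math

def harmonics(f0, max_hz):
    """Source: harmonic frequencies are whole-number multiples of f0."""
    return [f0 * k for k in range(1, int(max_hz // f0) + 1)]

def source_db(freq, f0, rolloff_db_per_octave=-12.0):
    """Assumed source spectrum: level falls steadily as frequency rises."""
    return rolloff_db_per_octave * math.log2(freq / f0)

def filter_db(freq, formants, bandwidth=100.0, peak_db=20.0):
    """Assumed filter: one simple resonance bump per formant (illustrative shape only)."""
    gain = 0.0
    for fc in formants:
        gain += peak_db / (1.0 + ((freq - fc) / bandwidth) ** 2)
    return gain

def output_spectrum(f0, formants, max_hz=3000):
    """Output level in dB = source level + filter gain, evaluated at each harmonic."""
    return [(h, source_db(h, f0) + filter_db(h, formants))
            for h in harmonics(f0, max_hz)]

# Hypothetical speaker: f0 = 100 Hz, formants at 500, 1500 and 2500 Hz.
spectrum = output_spectrum(f0=100, formants=[500, 1500, 2500])
for freq, level in spectrum[:6]:
    print(f"{freq:5.0f} Hz  {level:6.1f} dB")
```

The harmonics are spaced f0 apart, and the harmonics nearest each assumed formant come out stronger than their neighbours, which is the peak pattern seen in a spectrum.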
Acoustic Analysis
- Acoustics is the analysis of sound waves and the physical properties of speech.
- All sound results from vibration, which depends on a source of energy to generate it (CYF 2007, p.205).
Vowel Space
- Vowel space shows where vowels are located in acoustic space relative to one another, often labeled in Hz.
- The space is continuous, and vowel production can vary greatly, meaning there are no fixed F1/F2 values.
- Praat is software used for acoustic phonetics to load sound files (typically .wav) and view waveforms and spectrograms.
- Formants are resonating frequencies of the air in the vocal tract; peaks of resonance (L&J Ch.8, p.197).
- A vowel sound contains a number of different pitches simultaneously.
- The quality of a vowel depends on its overtone structure, i.e., formants (usually F1 and F2).
- Vowels are distinguished from one another by differences in overtones (formant differences in F1 and F2, also F3).
Tube Models
- The vocal tract can be modeled as a series of tubes open at one end.
- Vocal fold vibration sets air in vibration.
- Different vowels have different shapes of the vocal tract and different places of constriction.
- e.g., at the hard palate [i], back/pharyngeal region [ɑ]
- A = Front cavity, C = Rear Cavity, B = area of maximum constriction
- F1 frequency is closely related to the area of the lower portion of the pharyngeal cavity (C).
- F1 is also related to the degree of mouth opening at the lips (M).
- F2 is closely related to the length of the front cavity (A).
- The first resonant frequency (F1) is low for close vowels and higher for open vowels.
- The second resonant frequency (F2) is high for close front vowels and lower for open or back vowels.
- There isn’t an easy way to demonstrate F3 and higher formants.
- Resonances or Formants numbered from low to high frequency, e.g., F1 F2 F3.
- Vowels are primarily classified in terms of the first two formants, e.g., F1 and F2, although F3 can be used to determine rounding.
- Changes in the relative formant values give vowels their quality.
- F1 equates to opening:
- Close vowel = low F1
- Open vowel = high F1 (inverse relationship to aperture/height)
- F1 of /i/ is lower than F1 of /æ/
- F2 equates to backing:
- Backer vowels, lower F2
- F2 of AusEng /ʉ/ would be higher than F2 of Spanish /u/
- Rounding lowers both F1 and F2, so [y] has a lower F1 and F2 than /i/
- F1 value INCREASES as vowel aperture opens from close to open & pharyngeal cavity decreases in volume
- F2 LOWERS as you move from front to back – lengthening the front cavity by backing the tongue.
- ROUNDING lowers F1 & F2 - elongates oral cavity (works well with backing!)
- F3 thought to be important for distinguishing front unrounded [i] from [y]
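As a rough worked example of the tube idea, a single uniform tube closed at the glottis and open at the lips resonates at odd multiples of a quarter wavelength, F_n = (2n - 1)c / 4L. The sketch below assumes a speed of sound of 350 m/s and a tract length of 17.5 cm. Both are illustrative values, and a real vowel needs a tube with constrictions rather than a uniform one.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances of a uniform tube closed at one end and open at the other.

    F_n = (2n - 1) * c / (4 * L); the uniform shape is a simplifying assumption.
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]

# Assumed adult vocal tract length of 17.5 cm.
print([round(f) for f in tube_formants(0.175)])  # [500, 1500, 2500]
```

These values are close to those usually given for a neutral, schwa-like vowel. Changing the tube shape at the constriction moves the resonances away from this pattern, which is what distinguishes one vowel from another.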
Mini Quiz
- Low F1 and high F2: likely to be /i/
- Low F1 and low F2: likely to be /u/
- High F1 and high F2: likely to be an open, non-back vowel such as /a/ (a back /ɑ/ would have a lower F2).
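The quiz logic can be written as a small rule of thumb. The function name and the boundary values of 450 Hz for F1 and 1500 Hz for F2 are hypothetical, picked only to make the example run. Because the vowel space is continuous, no fixed cutoffs hold across speakers.

```python
def describe_vowel(f1_hz, f2_hz, f1_boundary=450.0, f2_boundary=1500.0):
    """Rough two-way description of a vowel from F1 and F2.

    Low F1 -> close, high F1 -> open; high F2 -> front, low F2 -> back.
    The boundaries are arbitrary assumptions, not measured category edges.
    """
    height = "close" if f1_hz < f1_boundary else "open"
    backness = "front" if f2_hz > f2_boundary else "back"
    return f"{height} {backness}"

print(describe_vowel(300, 2420))  # close front
print(describe_vowel(360, 1000))  # close back
print(describe_vowel(800, 1614))  # open front
```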
Vowel Spectrograms
- Typically measure at the vowel midpoint in AmEng.
- Speaker physiology (a larger vocal tract typically has lower formants).
- Language or Dialect (e.g., American English versus Standard Australian English).
- Number of contrastive vowels (Aus Indigenous languages vs. AusE).
- Stress & Accent:
- Unstressed vowels might show formant “undershoot.”
- Vowels do not achieve their F1/F2 targets.
- Reduced vowel quality (vowels that are unstressed tend to head towards schwa [ə]).
- Casual speech or rapidly spoken vowels also show undershoot.
- Consonant environment – coarticulation.
- CLEAR SPEECH – more peripheral vowels.
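Undershoot can be pictured as a vowel stopping short of its target and ending up closer to the centre of the vowel space. The sketch below is a toy model only: the neutral point of 500 Hz / 1500 Hz and the straight-line movement toward it are assumptions made for illustration, not a claim about how reduction actually works.

```python
def undershoot(target, neutral=(500.0, 1500.0), reduction=0.5):
    """Move an (F1, F2) target part of the way toward a schwa-like neutral point.

    reduction = 0.0 keeps the full target; reduction = 1.0 lands on the neutral point.
    """
    f1_t, f2_t = target
    f1_n, f2_n = neutral
    return (f1_t + reduction * (f1_n - f1_t),
            f2_t + reduction * (f2_n - f2_t))

# A stressed vowel at F1 = 300 Hz, F2 = 2420 Hz, reduced halfway toward the centre.
print(undershoot((300.0, 2420.0), reduction=0.5))  # (400.0, 1960.0)
```

In this picture, clear speech corresponds to a small reduction value (more peripheral vowels), while unstressed or rapid speech corresponds to a larger one.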
Physiological Differences
- The length and thickness of the vocal folds affect the fundamental frequency.
- Thicker, bigger vocal folds = lower F0 or pitch.
- The length of the vocal tract changes the resonant frequency of a voice.
- Longer vocal tracts = lower formants in general.
- Children have shorter, smaller vocal tract lengths, on average, than adults.
- The relative positions of F1 and F2, rather than their absolute values, are the key.
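A quick way to see why relative values matter is to scale a set of formants by vocal tract length. The sketch assumes that formants scale uniformly and inversely with length, which is a simplification, and the tract lengths of 17.5 cm and 12.5 cm are hypothetical.

```python
def scale_formants(formants_hz, reference_length_cm, new_length_cm):
    """Rescale formants for a different vocal tract length.

    Assumes every formant scales by the same factor (reference / new length).
    """
    factor = reference_length_cm / new_length_cm
    return [f * factor for f in formants_hz]

adult = [500.0, 1500.0]                      # assumed F1, F2 for a 17.5 cm tract
child = scale_formants(adult, 17.5, 12.5)    # hypothetical shorter tract

print([round(f) for f in child])             # [700, 2100]
print(round(adult[1] / adult[0], 2), round(child[1] / child[0], 2))  # 3.0 3.0
```

The absolute values rise for the shorter tract, but the ratio of F2 to F1 stays the same, so the vowel keeps its position relative to the speaker's other vowels.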
Kunwinjku (Australian) Vowel System
- Small vowel systems typically show a lot of variation within a vowel category.
- There is not the same need to keep 15 monophthongs apart (unlike North Frisian)!
Biological Differences
- Same shape, different values.
- German lax vowels transcribed in MRPA (machine-readable phonetic alphabet), with IPA symbols shown in red (from Harrington 2010).
- Vowel midpoint
- /æ/ F1 = 800 Hz, F2 = 1614 Hz
- /iː/ F1 = 300 Hz, F2 = 2420 Hz
- /ʊ/ F1 = 360 Hz, F2 = 1000 Hz
- /ɔ/ F1 = 600 Hz, F2 = 980 Hz
- /ʉ/ F1 = 327 Hz, F2 = 1760 Hz
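Before plotting by hand, it can help to order the midpoint values listed above along each dimension. The sketch uses only those five values. The variable names are made up for the example.

```python
# (F1, F2) midpoint values in Hz, copied from the list above.
midpoints = {
    "æ": (800, 1614),
    "iː": (300, 2420),
    "ʊ": (360, 1000),
    "ɔ": (600, 980),
    "ʉ": (327, 1760),
}

# Low F1 = close, high F1 = open.
close_to_open = sorted(midpoints, key=lambda v: midpoints[v][0])

# High F2 = front, low F2 = back.
front_to_back = sorted(midpoints, key=lambda v: midpoints[v][1], reverse=True)

print(close_to_open)   # ['iː', 'ʉ', 'ʊ', 'ɔ', 'æ']
print(front_to_back)   # ['iː', 'ʉ', 'æ', 'ʊ', 'ɔ']
```

On a conventional vowel chart both axes are reversed, so the first vowel in each list sits nearest the top (low F1) and nearest the left (high F2).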
Additional Activities
- Watch the video Praat demonstration, which shows you how to plot, then practice.
- Download sound files and save them in an easily accessible location.
- Use a printed vowel space or annotate a PDF on screen (or draw in your textbook).