Ultimate Study Guide
Core Concepts & Definitions
Phonetics and Phonology
Phonetics: The study of the mechanisms of speech production and speech perception, including the physical structure of speech (articulatory vs. acoustic) and the psychological/neurological components.
Linguistic Phonetics: A sub-field that uses the methods of "pure" phonetics to understand why the sound patterns of languages look the way they do.
Phonology: The study of sound patterns and how they function in a language. Some sound patterns are recurrent across many languages, while others are rarely or never seen.
The Vocal Tract
Vocal Tract: The passage in the human body through which air passes from the lungs to the lips, including the pharynx, larynx, and oral and nasal cavities.
Larynx: An organ in the neck containing the vocal folds. The larynx as a whole can be raised or lowered, so it can function as a piston to create an airstream.
Glottis: The space between the vocal folds. It can be closed to produce certain sounds, like ejectives.
Vocal Folds: Folds of tissue within the larynx that vibrate to produce sound (voicing). Their vibration frequency determines pitch.
Velum (Soft Palate): The soft tissue at the back of the roof of the mouth that can be raised to close off the nasal cavity or lowered to allow air to pass through the nose.
Airstream Mechanisms
Airstream Mechanism: The method of producing airflow for speech sounds.
Pulmonic Airstream: The most common airstream mechanism, generated by the lungs.
Pulmonic Egressive: The usual type of speech, where air is pushed out of the lungs.
Pulmonic Ingressive: Drawing air into the lungs, which is very unusual for speech (e.g., for hisses in Japan).
Glottalic Airstream: Airstream created by the larynx acting as a piston.
Ejective: Produced by a glottalic egressive airstream. The glottis is closed and the larynx is moved upward, compressing the air in the oral cavity. Oral pressure is very high before the sound is released.
Implosive: Produced by a glottalic ingressive airstream. The larynx moves downward, lowering the pressure in the oral cavity (glottalic suction), so that air is drawn inward when the closure is released. The vocal folds usually vibrate during the lowering.
Velaric Airstream: The least common airstream mechanism. It's used for clicks, drinking through a straw, and smoking.
Clicks: Sounds made with two closures in the mouth: one at the velum and one in front of it. The lowering of the tongue body increases the volume and decreases the pressure of the trapped air, and the release of the front closure creates the click sound. Clicks can be voiced, nasalized, aspirated, or glottalized.
Phonation and Voicing
Phonation (Voicing): The vibration of the vocal folds.
Pitch: Determined by the frequency of vocal fold vibration. Faster vibration results in a higher pitch.
Loudness: Largely a function of subglottal pressure.
Bernoulli Effect: A principle in phonetics where the fast-moving airflow through the glottis causes a drop in pressure, sucking the vocal folds back together during the closing phase of vibration.
Voice Onset Time (VOT): The time difference between the release of a stop consonant and the onset of vocal fold vibration.
Positive VOT (+): Voiceless aspirated stops (e.g., English /tʰ/).
Zero VOT (0): Voiceless unaspirated stops (e.g., Spanish /t/).
Negative VOT (-): Voiced stops (e.g., English /d/).
Breathy Voice (Murmur): A phonation type where the vocal folds are slightly apart, producing a "leaky" sound with high airflow and some turbulence.
Creaky Voice: A phonation type where the arytenoid cartilages are held tightly together, causing the vocal folds to vibrate irregularly only at the front.
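The three VOT categories defined above can be sketched as a small classifier. This is only an illustration: the function name `classify_vot` and the 10 ms tolerance band around zero are assumptions made for the example, not values taken from the handouts.

```python
def classify_vot(vot_ms: float, tolerance_ms: float = 10.0) -> str:
    """Label a stop by its voice onset time (VOT) in milliseconds.

    VOT is measured from the stop release to the onset of voicing, so a
    negative value means voicing began before the release.
    tolerance_ms is an assumed band around zero, chosen only for illustration.
    """
    if vot_ms < -tolerance_ms:
        return "negative VOT: voiced stop"
    if vot_ms > tolerance_ms:
        return "positive VOT: voiceless aspirated stop"
    return "zero VOT: voiceless unaspirated stop"


# Hypothetical measurements, for illustration only.
for label, vot in [("prevoiced", -80.0), ("short lag", 5.0), ("long lag", 70.0)]:
    print(label, "->", classify_vot(vot))
```

In real data the boundaries between categories differ from language to language, so the tolerance would have to be set per language rather than fixed.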
Consonants
Stops (Plosives): Consonants produced by completely obstructing the airstream in the vocal tract and then releasing it with a burst.
Fricatives: Consonants produced by turbulent airflow through a narrow constriction. They require high pressure behind the constriction and low pressure in front of it.
Affricates: A sound that begins as a stop and is released as a fricative (e.g., English /tʃ/ and /dʒ/). They are phonetically sequences but often function as single segments phonologically.
Doubly-Articulated Segments: Sounds produced with two simultaneous points of articulation, such as the labial-velar stops /kp/ and /gb/ in West African languages.
Sonorants: Sounds produced with less extreme constrictions that don't impede airflow, including nasals, liquids, glides, and vowels. They are typically voiced.
Nasals: Sonorants produced by a complete oral closure with the velum lowered, allowing air to escape through the nasal cavity (e.g., /m/, /n/). They are sometimes referred to as stops.
Trills: Sounds produced by an articulator vibrating against another. They are obstruent-like but don't endanger voicing because the obstruction is brief.
Taps and Flaps: Sounds produced by a single, quick interruption of the airstream.
Laterals: Sounds produced by an obstruction in the middle of the vocal tract, with airflow escaping laterally around the sides of the tongue (e.g., /l/).
Approximants (Glides): Sounds produced with an "uninterrupted flow of air that does not pass through a constriction sufficiently narrow to produce local turbulence".
Vowels
Vowels: Sounds produced with an open vocal tract, allowing it to resonate.
Vowel Space: A two-dimensional plane that represents possible vowel articulations based on the position of the tongue body's highest point (high/low and front/back).
Cardinal Vowels: A system of auditory reference points for describing vowels in different languages. They were devised by Daniel Jones, and their accurate production was passed down through a master-apprentice relationship.
Primary vs. Secondary Cardinal Vowels: The primary cardinal vowels are more common in languages. If a language has a secondary cardinal vowel, it's likely to also have the corresponding primary cardinal vowel.
Diphthongs: "Dynamic" vowels whose quality changes over time, often having a steady-state component followed by a gliding component.
Tense vs. Lax Vowels: A phonological distinction in English where 'tense' vowels can occur in open syllables while 'lax' vowels cannot.
Advanced Tongue Root (ATR): A phonetic feature where the tongue root is advanced, often accompanied by an enlargement of the pharyngeal cavity. It is a distinctive feature in many African languages, where it forms the basis of vowel harmony systems.
Answering the Handout Questions
Stops.pdf
Why is there an empty cell in the table for common places of articulation? The table in question (1) likely represents common voiceless and voiced stops. The empty cell is for the voiced velar stop [g], which is cross-linguistically disfavored. The empty cell may also be due to a missing place of articulation, but the context of the document strongly suggests the missing sound is [g].
What's wrong with [g]? There's a cross-linguistic bias against the voiced velar stop [g]. The reason for this bias is likely the Aerodynamic Voicing Constraint. To produce a voiced stop, the air pressure in the oral cavity must stay lower than the subglottal pressure so that air keeps flowing through the vibrating vocal folds. For velar stops, the cavity between the glottis and the closure is small, so pressure builds up very quickly, making it difficult to maintain the pressure drop that voicing requires. This is less of a problem for bilabial and apical stops: the cavity behind their closure is larger, and its walls can expand slightly during the closure, which slows the pressure build-up.
What is wrong with [p]? The handout asks "But what's wrong with [p]?" in the context of a table showing stop inventories. The implication is that nothing is "wrong" with [p]. The document points out that all languages have at least voiceless stops. The table in (4) shows that voiced stops are more likely to have gaps than voiceless stops.
Why are stops excellent sounds? They have "prominent release bursts" and "contrast with adjacent segments," which is key to their perceptibility due to modulation. The perception of sound is tied to modulation, as the auditory nerve fibers adapt to monotonous stimuli.
Why do some languages only have voiceless stops? The example of Khanty shows a language with only voiceless stops: /p/, /t/, and /k/. The handout suggests this is due to the principle of "modulation," which makes voiceless stops highly perceptible.
Vowels.pdf
Why is one set of cardinal vowels primary and the other secondary? The primary set is more common across the world's languages. The relationship is implicational: if a language has a vowel similar to a secondary cardinal vowel, it will generally also have a vowel similar to the corresponding primary cardinal vowel. This is because lip rounding lengthens the vocal tract. On back vowels (rounded in the primary set) rounding enhances backness, while on front vowels (rounded in the secondary set) it decreases frontness, so the primary combinations of front unrounded and back rounded vowels are more distinct from one another.
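The implicational relationship can be sketched as a check over a vowel inventory. The pairing below follows the standard numbering of the cardinal vowels (secondary 9 to 16 matched with primary 1 to 8); the function name `implicational_violations` and the sample inventories are hypothetical, made up for illustration.

```python
# Each secondary cardinal vowel mapped to its primary counterpart
# (same height and backness, opposite lip rounding).
SECONDARY_TO_PRIMARY = {
    "y": "i", "ø": "e", "œ": "ɛ", "ɶ": "a",
    "ɒ": "ɑ", "ʌ": "ɔ", "ɤ": "o", "ɯ": "u",
}


def implicational_violations(inventory):
    """Return the secondary vowels present without their primary counterpart."""
    present = set(inventory)
    return sorted(
        secondary
        for secondary, primary in SECONDARY_TO_PRIMARY.items()
        if secondary in present and primary not in present
    )


# A made-up inventory that respects the implication: [y] with [i], [ø] with [e].
print(implicational_violations(["i", "y", "e", "ø", "a", "o", "u"]))
# A made-up inventory that does not: [y] appears without [i].
print(implicational_violations(["y", "e", "a"]))
```

Because the relationship is a tendency rather than an absolute law, a non-empty result flags an unusual inventory, not an impossible one.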
Obstruents.pdf
Why are ejectives more common for stops and affricates than fricatives? Ejectives are produced by raising the larynx with a closed glottis, which creates high pressure in the oral cavity. A stop closure completely seals this cavity, allowing pressure to build up efficiently. For a fricative, the constriction is open, making it very difficult to build and maintain the high pressure needed for the ejective mechanism.
Why are voiceless fricatives favored over voiced fricatives? This is also related to the Aerodynamic Voicing Constraint. To maintain voicing, the air pressure in the oral cavity must be lower than the subglottal pressure. Frication, however, requires high pressure behind the constriction to generate turbulence. A voiced fricative has to satisfy both requirements at once, and the vibrating vocal folds reduce the airflow available for frication, so either the voicing or the frication tends to weaken. Voiceless fricatives don't have this issue.
Why are contrastively nasalized fricatives rare? This is due to aerodynamic issues. Nasalization requires the soft palate to be lowered to allow air to flow through the nasal cavity. Fricatives, however, require high oral pressure to create the necessary turbulent airflow. If the velum is open, air escapes through the nose, making it nearly impossible to build and maintain the high oral pressure needed for robust frication.
Sonorants.pdf
Why are there no symbols for nasals pronounced farther back in the vocal tract (e.g., pharyngeal nasals)? Pharyngeal sounds are produced deep in the vocal tract. To produce a nasal, the oral passage must be completely closed while the velum is lowered, so that air is diverted through the nasal cavity. A closure in the pharynx lies below the opening into the nasal cavity, so it would block the airstream before it could reach the nose. Nothing would flow through the nasal cavity, making a pharyngeal nasal impossible to produce.
Why Not Many Languages Have 'g'
The scarcity of the voiced velar stop /g/ in language inventories is a direct result of the Aerodynamic Voicing Constraint. To produce any voiced sound, the air pressure below the vocal folds (subglottal pressure) must be greater than the air pressure above them (supraglottal pressure). This pressure difference forces air through the glottis, causing the vocal folds to vibrate.
When producing a voiced stop like /b/, /d/, or /g/, the vocal tract is completely closed, trapping air between the glottis and the closure. For voicing to continue, the cavity holding this trapped air must be able to expand to maintain the pressure differential.
For bilabial and apical stops (/b/ and /d/): The cavity behind the closure is relatively large. The cheeks can expand slightly for /b/, and the tongue can be pushed forward slightly for /d/, which increases the volume of the oral cavity. According to the principle that pressure and volume vary inversely, this slight expansion lowers the oral pressure, allowing voicing to be maintained for a longer period.
For the velar stop (/g/): The cavity behind the closure (at the velum) is extremely small. This tiny space provides little room for expansion. As air from the lungs enters this small, closed space, the pressure builds up almost instantly. The supraglottal pressure quickly equals the subglottal pressure, halting the airflow and stopping the vocal folds from vibrating. This makes it very difficult, if not impossible, to produce a long, fully voiced velar stop.
This physiological limitation explains why some languages have no voiced velar stop, why /g/ is the voiced stop most likely to be missing when a language has a gap in its voiced stop series, and why implosives, which create their own ingressive airflow, can develop from voiced stops.
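The reasoning above can be made concrete with a deliberately simplified model. It assumes a rigid-walled cavity behind the closure and a glottal airflow proportional to the pressure drop across the glottis, so the pressure drop decays exponentially with a time constant proportional to the cavity volume. Every number below (cavity volumes, subglottal pressure, the pressure drop needed for voicing, the glottal conductance) is an illustrative assumption, not a measurement from the handouts.

```python
import math

ATMOSPHERE_CM_H2O = 1033.0  # atmospheric pressure expressed in cm H2O


def voicing_duration_ms(cavity_cm3, subglottal=8.0, min_drop=2.0, conductance=25.0):
    """Time in ms until the transglottal pressure drop falls below min_drop.

    Assumptions (all illustrative):
      - the cavity walls are rigid, so no cheek or tongue expansion is modeled
      - glottal flow = conductance * pressure drop  (cm^3/s per cm H2O)
      - isothermal air: adding volume dV to a cavity of volume V raises its
        pressure by P_atm * dV / V
    Under these assumptions the drop decays as exp(-t / tau),
    with tau = V / (conductance * P_atm).
    """
    tau_s = cavity_cm3 / (conductance * ATMOSPHERE_CM_H2O)
    return 1000.0 * tau_s * math.log(subglottal / min_drop)


# Hypothetical cavity volumes behind the closure, chosen only to show the trend.
for place, volume in [("velar", 30.0), ("alveolar", 60.0), ("bilabial", 80.0)]:
    print(f"{place}: voicing lasts about {voicing_duration_ms(volume):.1f} ms")
```

In this sketch the voicing time scales linearly with the volume behind the closure, which is the core of the argument against /g/. Real speakers extend voicing further by letting the cavity walls expand, which the rigid-wall model leaves out on purpose.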
Vowels
Description | IPA Symbol(s) |
High Front Vowel | [i] |
High Back Vowel | [u] |
Mid Front Vowel | [e] |
Mid Back Vowel | [o] |
Low-Mid Front Vowel | [ɛ] |
Low-Mid Back Vowel | [ɔ] |
Vowels with Advanced Tongue Root | [i, u] (in English) |
Vowels with Retracted Tongue Root | [ɪ, ʊ] (in English) |
Consonants
Stops
Description | IPA Symbol(s) |
Voiceless Bilabial Stop | [p] |
Voiced Bilabial Stop | [b] |
Voiceless Dental/Alveolar Stop | [t] |
Voiced Dental/Alveolar Stop | [d] |
Voiceless Velar Stop | [k] |
Voiced Velar Stop | [g] |
Glottal Stop | [ʔ] |
Voiceless Uvular Stop | [q] |
Voiced Uvular Stop | [ɢ] |
Ejectives | [p', t', k', q'] |
Implosives | [ɓ, ɗ, ʄ, ɠ] |
Fricatives
Description | IPA Symbol(s) |
Voiceless Bilabial Fricative | [ɸ] |
Voiceless Labiodental Fricative | [f] |
Voiced Labiodental Fricative | [v] |
Voiceless Dental Fricative | [θ] |
Voiced Dental Fricative | [ð] |
Voiceless Alveolar Fricative (Sibilant) | [s] |
Voiced Alveolar Fricative (Sibilant) | [z] |
Voiceless Postalveolar Fricative (Sibilant) | [ʃ] |
Voiced Postalveolar Fricative (Sibilant) | [ʒ] |
Voiceless Palatal Fricative | [ç] |
Voiceless Velar Fricative | [x] |
Voiced Velar Fricative | [ɣ] |
Voiceless Glottal Fricative | [h] |
Voiced Glottal Fricative | [ɦ] |
Voiceless Lateral Fricative | [ɬ] |
Voiced Lateral Fricative | [ɮ] |
Affricates
Description | IPA Symbol(s) |
Voiceless Postalveolar Affricate | [tʃ] |
Voiced Postalveolar Affricate | [dʒ] |
Voiceless Alveolar Affricate | [ts] |
Voiceless Labiodental Affricate | [pf] |
Voiceless Lateral Affricate | [tɬ] |
Voiced Lateral Affricate | [dɮ] |
Nasals
Description | IPA Symbol(s) |
Bilabial Nasal | [m] |
Labiodental Nasal | [ɱ] |
Alveolar Nasal | [n] |
Palatal Nasal | [ɲ] |
Velar Nasal | [ŋ] |
Uvular Nasal | [ɴ] |
Liquids
Description | IPA Symbol(s) |
Bilabial Trill | [ʙ] |
Alveolar Trill | [r] |
Uvular Trill | [ʀ] |
Alveolar Tap/Flap | [ɾ] |
Retroflex Tap/Flap | [ɽ] |
Dental/Alveolar Approximant | [ɹ] |
Palatal Approximant | [j] |
Retroflex Approximant | [ɻ] |
Dental/Alveolar Lateral Approximant | [l] |
Velar Lateral Approximant | [ʟ] |
Other Sounds
Description | IPA Symbol(s) |
Labial-palatal Approximant | [ɥ] |
Velar Approximant | [ɰ] |
Labial-velar Approximant | [w] |
Clicks | [ǀ, ǃ, ǂ, ǁ] |
Why Few Languages Have /g/
Few languages have the voiced velar stop /g/ because of the Aerodynamic Voicing Constraint. To maintain voicing, the air pressure below the vocal folds must be greater than the pressure above them. For a stop like /g/, which is produced with a closure at the velum, the cavity behind the closure is very small. As air is forced into this small, closed space, the pressure builds up quickly, eliminating the pressure difference that the vocal folds need to continue vibrating. This makes it difficult to produce a fully voiced velar stop. In contrast, stops at the lips (/b/) or the alveolar ridge (/d/) have a larger cavity behind the closure, which can expand slightly to lower the pressure and help maintain voicing.
Diphthongs
A diphthong is a "dynamic" vowel whose quality changes over time. While diphthongs are sometimes described as having two distinct targets, it is perhaps more accurate to think of them as specified trajectories across the vowel space. A diphthong is most often composed of a steady-state component followed by a dynamic or gliding component.
Cross-Linguistic Variation and Diphthongization
Languages can differ significantly in the timing of the components of a diphthong and in the acoustic distance actually covered along similar trajectories. For instance, a graph shows the acoustic distance and transition percentages for /aɪ/ and /aʊ/ diphthongs in English, Chinese, Hausa, and Arabic. English has a very high transition percentage and large acoustic distance for these diphthongs, while Hausa and Arabic show a much smaller distance. This demonstrates how phonologically identical diphthongs can be realized very differently in various languages.
Diphthong vs. Vowel + Glide Cluster
It is not always possible to use purely phonetic evidence to determine if a string of sounds is a diphthong or a sequence of a vowel plus a glide. For example, the handouts contrast a diphthong like [aɪ] with a consonant cluster like [CjV] in English.
To decide between these two analyses, one might look for sources of evidence such as:
The phonetic characteristics of the sequence itself.
The phonetic characteristics of adjacent segments, such as vowel duration.
Phonological patterning evidence.
Psycholinguistic evidence, like how the sounds behave in speech errors or language games.
English Diphthongs
English has several diphthongs that are commonly cited, which can be categorized based on their glide direction.
Centering Diphthongs: These glide towards the central vowel schwa [ə].
[ɪə] as in "near"
[eə] as in "square"
[ʊə] as in "cure"
Closing Diphthongs: These glide towards a high vowel [ɪ] or [ʊ].
[aɪ] as in "buy"
[aʊ] as in "how"
[ɔɪ] as in "boy"
[eɪ] as in "say"
[oʊ] as in "go"
Diphthongs vs. Monophthongs
The provided handouts highlight the difference between diphthongs and monophthongs. Monophthongs are "steady-state" vowels, while diphthongs are "dynamic," with the quality changing over time. Diphthongs are often described as having a steady-state component followed by a dynamic or gliding component.
Diphthongs vs. Consonant Clusters
Distinguishing a diphthong from a vowel followed by a glide ([Vj] or [Vw]) can be tricky. Purely phonetic evidence is not always sufficient to make this distinction. To determine if a given string of sounds should be analyzed as a single diphthong or a vowel + glide cluster, phonological considerations often end up being primary. This is also true for other complex segments like affricates.
For example, in English, [ju] is described as a "rising diphthong" but can be contrasted with a consonant cluster [CjV] or [CwV].
Psycholinguistic evidence, such as how the sounds behave in speech errors (Spoonerisms) or language games, can also be used to determine if the two halves of a diphthong behave as a single unit or independently.
Cross-Linguistic Differences
Languages vary significantly in how they produce diphthongs, including the timing of their components and the acoustic distance covered. For example, a graph in the handouts compares the [aɪ] and [aʊ] diphthongs in Arabic, Chinese, Hausa, and English, showing a large variation in acoustic distance and transition percentages. The document also lists numerous contrastive diphthongs in Finnish, such as [ie], [yö], and [uo].
Vowel Narrow Transcription Rules
These rules are for transcribing the phonetic details of vowels in American English.
Rule 1: Nasalization 👃
Vowels become nasalized when they appear before a nasal consonant (e.g., [m], [n], [ŋ]). This is represented by adding a tilde [̃] above the vowel symbol.
Rule 2: Diphthongization
All tense vowels are transcribed as diphthongs. Tense vowels in English include [i] (as in bee), [e] (as in bay), [u] (as in boo), and [o] (as in boat).
Rule 3: Vowel Lengthening ⏳
Vowels are lengthened in two specific environments:
In an open syllable (a syllable that ends in a vowel sound).
Before a voiced consonant.
Lengthening is indicated by a colon [ː] placed after the vowel. The handouts note that lax vowels (except for /ɔ/) do not typically appear in open syllables.
Rule 4: Vowel Deletion
Stressless vowels are often deleted, especially when they come before a sonorant consonant (e.g., [m], [n], [l], [ɹ]).
Rule 5: Rhotacization
Mid-central vowels are rhotacized, which means they have an r-like quality.
In stressed syllables, the symbol [ɜ˞] is used.
In unstressed syllables, the symbol [ɚ] is used.
Rule 6: Vowel Tense/Lax Neutralization
The distinction between tense and lax vowels is not made when the vowel is immediately followed by the American [ɹ].
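As a minimal sketch of how such a rule applies to a transcription, the nasalization rule (Rule 1) can be written as a pass over a list of segments. The vowel set, the function name `nasalize`, and the one-symbol-per-segment representation are assumptions made for this example. The tilde is the Unicode combining tilde (U+0303), which attaches to the preceding vowel symbol.

```python
NASALS = {"m", "n", "ŋ"}
# A simplified vowel set for illustration; not a full inventory of English.
VOWELS = {"i", "ɪ", "e", "ɛ", "æ", "ɑ", "ɔ", "o", "ʊ", "u", "ʌ", "ə"}
COMBINING_TILDE = "\u0303"


def nasalize(segments):
    """Add a nasalization tilde to every vowel directly followed by a nasal."""
    result = []
    for index, segment in enumerate(segments):
        following = segments[index + 1] if index + 1 < len(segments) else None
        if segment in VOWELS and following in NASALS:
            result.append(segment + COMBINING_TILDE)
        else:
            result.append(segment)
    return result


print("".join(nasalize(["b", "æ", "n"])))  # vowel before [n] gets the tilde
print("".join(nasalize(["b", "æ", "d"])))  # vowel before [d] is left alone
```

The other vowel rules could be added as further passes over the same list, but several of them (lengthening, deletion, rhotacization) need syllable and stress information that this flat representation does not carry.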
Consonant Narrow Transcription Rules
These rules detail the phonetic features of consonants in American English.
Rule 1: Aspiration
Voiceless stops ([p], [t], [k]) are aspirated at the beginning of a word or a stressed syllable. Aspiration is shown with a superscript [ʰ].
Rule 2: Sonorant Devoicing
Sonorant consonants are devoiced when they follow an initial voiceless stop.
Rule 3: Stop & Fricative Devoicing
Voiced stops ([b], [d], [g]) and fricatives ([v], [ð], [z], [ʒ]) are devoiced if they are not surrounded by two voiced sounds.
Rule 4: Creaky Voice Sonorants
Creaky voice occurs on sonorants when they are immediately followed by a glottal stop [ʔ].
Rule 5: /h/ Airstream
The consonant [h] is subject to two rules:
It is pronounced with breathy voice when it is intervocalic (between two vowels) and at the beginning of a stressed syllable.
It is deleted when it is intervocalic and at the beginning of an unstressed syllable.
Rule 6: No Audible Release
Stops do not have an audible release when they come before another consonant.
Rule 7: Consonant Cluster Deletion
In a three-consonant cluster, the second consonant is deleted if the cluster does not contain [ɹ].
Rule 8: Nasal Assimilation
The place of articulation of a nasal consonant assimilates to the place of the sound that follows it.
Rule 9: Dental/Postalveolar Stops
The alveolar stops ([t], [d]) and nasal ([n]) become dental when they precede another dental consonant.
The alveolar stops ([t], [d]) become postalveolar when they precede [ɹ].
Rule 10: Glottal Stop for /t/ before /n/
The consonant [t] becomes a glottal stop [ʔ] when it is followed by [n].
Rule 11: Flapping
The alveolar stops [t] and [d] become a flap ([ɾ]) when they are intervocalic and precede an unstressed syllable.
Rule 12: Light vs. Dark /l/
Light [l] is used when the consonant is syllable-initial.
Dark [ɫ] is used when the consonant is syllable-final.
Rule 13: American [ɹ]
The American [ɹ] (a glide-like approximant) is used instead of the trill [r].
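Two of the consonant rules above need no stress or syllable information, so they are easy to sketch as passes over a list of segments: nasal assimilation (Rule 8) and the glottal stop for /t/ before /n/ (Rule 10). The function names and the place table are assumptions made for this example, and the table covers only a few following consonants.

```python
NASALS = {"m", "ɱ", "n", "ŋ"}
# Nasal that matches the place of a following consonant (partial table).
NASAL_FOR_PLACE = {
    "p": "m", "b": "m",   # bilabial
    "f": "ɱ", "v": "ɱ",   # labiodental
    "t": "n", "d": "n",   # alveolar
    "k": "ŋ", "g": "ŋ",   # velar
}


def assimilate_nasals(segments):
    """Rule 8: a nasal takes the place of articulation of the next sound."""
    result = list(segments)
    for index in range(len(result) - 1):
        following = result[index + 1]
        if result[index] in NASALS and following in NASAL_FOR_PLACE:
            result[index] = NASAL_FOR_PLACE[following]
    return result


def glottalize_t_before_n(segments):
    """Rule 10: [t] becomes a glottal stop when directly followed by [n]."""
    result = list(segments)
    for index in range(len(result) - 1):
        if result[index] == "t" and result[index + 1] == "n":
            result[index] = "ʔ"
    return result


print("".join(assimilate_nasals(["ɪ", "n", "p", "ʊ", "t"])))  # [n] becomes [m]
print("".join(glottalize_t_before_n(["b", "ʌ", "t", "n"])))   # [t] becomes [ʔ]
```

Rules that mention stress or syllable position, such as aspiration and flapping, would need a richer representation than a flat list of segments.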