Lecture 5

Syllable Structure: onset, nucleus, coda

A syllable is a unit that can be broken down into onset + nucleus + coda. In many descriptions, the nucleus is the sonority peak of the syllable (usually a vowel, sometimes a syllabic consonant). The onset consists of the consonants before the nucleus; the coda consists of the consonants after the nucleus.
Sigma (σ) is used to denote a syllable in some lectures/transcripts.
Key point: The nucleus (often a vowel) is required; both onset and coda can be omitted in some syllables, though onsets commonly exist.
The rhyme of a syllable is defined as the nucleus + the coda. The term rhyme historically relates to rhythm, but in linguistics it specifically refers to the peak of the syllable (nucleus) plus its following sounds (coda).
Syllabification focuses on sounds rather than alphabet letters.
Examples:
- The word I: nucleus only (single syllable, no onset, no coda).
- The word sat: onset = s, nucleus = a, coda = t.
- The word hay: onset = h, nucleus = /eɪ/ (a diphthong; the letter y is part of the vowel sound, not a coda), coda = none.
- The word out: onset = none, nucleus = /aʊ/ (a diphthong), coda = t.
- The word bake: onset = b, nucleus = /eɪ/ (a diphthong), coda = k.
Diphthongs: nucleus can be a diphthong (two sounds in the nucleus, e.g., /eɪ/).
Transcribing syllables helps think about sounds rather than letters.
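The onset/nucleus/coda split can be sketched in a few lines of Python. This is a minimal illustration, not a full syllabifier: it assumes the input is already a single syllable given as a list of phoneme symbols, that diphthongs are written as one symbol, and that the small `VOWELS` set (a hypothetical helper, not from the lecture) covers the examples.

```python
# Illustrative vowel inventory (an assumption); diphthongs count as one nucleus symbol.
VOWELS = {"æ", "ɪ", "i", "ɛ", "ʌ", "aɪ", "eɪ", "aʊ"}

def split_syllable(phonemes):
    """Split one syllable (a list of phoneme symbols) into onset, nucleus, coda, and rhyme."""
    nucleus_positions = [i for i, p in enumerate(phonemes) if p in VOWELS]
    if not nucleus_positions:
        raise ValueError("a syllable needs a nucleus")
    n = nucleus_positions[0]
    onset, nucleus, coda = phonemes[:n], [phonemes[n]], phonemes[n + 1:]
    # The rhyme is the nucleus plus the coda.
    return {"onset": onset, "nucleus": nucleus, "coda": coda, "rhyme": nucleus + coda}

examples = {
    "I": ["aɪ"],
    "sat": ["s", "æ", "t"],
    "hay": ["h", "eɪ"],
    "out": ["aʊ", "t"],
    "bake": ["b", "eɪ", "k"],
}
for word, phonemes in examples.items():
    print(word, split_syllable(phonemes))
```

Working from phoneme symbols rather than spelling keeps the focus on sounds: the silent e in bake never appears in the input.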

Complex Onsets, Codas, and Syllable Types

A syllable can have multiple consonants in the onset (complex onset) and/or multiple consonants in the coda (complex coda).
Onsets and codas can be omitted depending on the word, but every syllable has at least a nucleus.
The concept of complex clusters is important for analyzing real speech (e.g., spl- in splash).

Sonority and How We Identify Syllables

Sonority = relative loudness of sounds within a syllable.
General ordering (from louder to softer, based on typical conventions in the lecture):
- Vowels are the loudest
- Approximants (e.g., l, r, w, y)
- Nasals (e.g., m, n, ŋ)
- Fricatives (voiced fricatives louder than voiceless)
- Affricates
- Stops (e.g., p, t, k), the softest
A voiced fricative is typically louder than a voiceless fricative (e.g., /v/ > /f/ in perceived loudness).
The Sonority Principle: as you move from onset to nucleus to coda, the loudness typically rises to the nucleus and then falls toward the coda.
- Example that partly violates this: splash begins with /s/ (a fricative), followed by /p/ (a stop, which is softer), so loudness dips within the onset before rising toward the nucleus and falling toward the coda.
The brain uses multiple cues (including stress, timing, and phonation) to identify syllables.
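The rise-then-fall pattern of the Sonority Principle can be checked mechanically. The sketch below is illustrative only: the numeric ranks are arbitrary (only their order matters), and placing stops at the bottom of the scale is an assumption consistent with the splash example, where /p/ is described as softer than /s/.

```python
# Arbitrary ranks (assumption): only the relative order is meaningful.
SONORITY = {
    "vowel": 6,
    "approximant": 5,
    "nasal": 4,
    "fricative": 3,
    "affricate": 2,
    "stop": 1,
}

def sonority_profile(sound_classes):
    """Map a sequence of sound classes to their sonority ranks."""
    return [SONORITY[c] for c in sound_classes]

def follows_sonority_principle(sound_classes):
    """True if sonority strictly rises to a single peak and strictly falls after it."""
    profile = sonority_profile(sound_classes)
    peak = profile.index(max(profile))
    rising = all(a < b for a, b in zip(profile[:peak], profile[1:peak + 1]))
    falling = all(a > b for a, b in zip(profile[peak:], profile[peak + 1:]))
    return rising and falling

# sat: fricative, vowel, stop
print(follows_sonority_principle(["fricative", "vowel", "stop"]))
# splash: /s/ /p/ /l/ vowel /ʃ/ -- the dip from /s/ to /p/ breaks the rise
print(follows_sonority_principle(["fricative", "stop", "approximant", "vowel", "fricative"]))
```

The first call returns True and the second returns False, matching the observation that the initial /s/ in splash sits above the following stop.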

Stress, Prosody, and IPA Notation

Stress makes a syllable prominent within a word. Syllables can be primary stressed, secondary stressed, or unstressed.
IPA notation for stress:
- Primary stress: a high vertical stroke ˈ placed before the stressed syllable.
- Secondary stress: a low vertical stroke ˌ placed before the syllable.
Acoustic correlates of stress (4 main cues):
- F0 (fundamental frequency / pitch)
- Duration (length of the sound)
- Intensity (loudness)
- Formant pattern (spectral shape across the vowel region)
English stress patterns:
- Trochaic (stressed on the first syllable): e.g., hot dog (noun: HOT-dog; the pattern may shift when the word is used differently).
- Iambic (stressed on the second syllable): e.g., a verb form in which the second syllable receives primary stress, such as the verb reCORD. Longer words follow other patterns: in the lecture's example, phonetician has primary stress on the third syllable.
Experimental example: phonetic focus on the word phonetician; primary stress on the third syllable; stress is reinforced by lengthening, loudness, and pitch excursion.
Primary vs secondary stress can change meaning or emphasis within phrases; the same word in different parts of speech can shift stress (e.g., noun vs verb contrasts like record).
Foot: a unit consisting of the stressed syllable and all following unstressed syllables; the concept is used to describe stress patterns and timing in poetry and linguistics.
English tends to be stress-timed (roughly regular intervals between stressed syllables, with unstressed syllables shortened or reduced), whereas some languages are syllable-timed (roughly equal syllable length) or mora-timed (beats based on moras, e.g., Japanese).
Trochaic and iambic patterns are common descriptors for English; other languages show different rhythmic and stress patterns.
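The definition of a foot (a stressed syllable plus all following unstressed syllables) translates directly into a grouping routine. The sketch below makes several assumptions that are not from the lecture: syllables are separated by dots in the transcription, both primary (ˈ) and secondary (ˌ) stress marks start a new foot, and any unstressed syllables before the first stress are kept as their own leading group.

```python
STRESS_MARKS = ("ˈ", "ˌ")  # IPA primary and secondary stress marks

def parse_syllables(transcription):
    """Turn a dotted IPA string into (syllable, is_stressed) pairs."""
    pairs = []
    for syllable in transcription.split("."):
        stressed = syllable.startswith(STRESS_MARKS)
        pairs.append((syllable.lstrip("ˈˌ"), stressed))
    return pairs

def group_into_feet(pairs):
    """Start a new foot at each stressed syllable; attach unstressed ones to the current foot."""
    feet = []
    for syllable, stressed in pairs:
        if stressed or not feet:
            feet.append([syllable])
        else:
            feet[-1].append(syllable)
    return feet

# phonetician, with the syllable boundaries marked by hand (an assumption)
print(group_into_feet(parse_syllables("ˌfoʊ.nə.ˈtɪ.ʃən")))
```

For phonetician this yields two feet, each a stressed syllable followed by one unstressed syllable, which is a trochaic pattern at the level of the foot.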

Rhythm Types: Stress-Timed, Syllable-Timed, and Mora-Timed

Stress-timed languages (e.g., English): irregular syllable lengths with a rhythm driven by stressed syllables; speakers may truncate unstressed vowels or reduce sounds to align with a timing pattern.
Syllable-timed languages (e.g., Spanish, French): more uniform syllable length, with less truncation of unstressed vowels; rhythm feels more even.
Mora-timed languages (e.g., Japanese): beats are based on the mora, not the syllable; consonant-vowel sequences and some final consonants count as separate beats, leading to a choppier cadence.
Examples discussed: infatuation (stress-timed tendencies with truncation of unstressed vowels), como te llamas (Spanish rhythm example), Kawasaki (Japanese mora-timed rhythm).
The variety in rhythm affects second language acquisition and perception across languages.
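The difference between counting syllables and counting moras can be made concrete with a rough mora counter for romanized Japanese. This is a simplified sketch under stated assumptions, not a complete account of Japanese phonology: each vowel letter counts as one mora (so long vowels must be written doubled), an n not followed by a vowel or y counts as a moraic nasal, and the first letter of a doubled consonant counts as one mora. The word Nippon is an added illustration that does not come from the lecture.

```python
VOWEL_LETTERS = set("aeiou")

def count_moras(romaji):
    """Rough mora count for a romanized Japanese word (simplified rules, see lead-in)."""
    word = romaji.lower()
    moras = 0
    for i, ch in enumerate(word):
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in VOWEL_LETTERS:
            moras += 1          # every vowel letter is a beat
        elif ch == "n" and nxt not in VOWEL_LETTERS and nxt != "y":
            moras += 1          # moraic nasal, e.g. word-final n
        elif ch == nxt:
            moras += 1          # first half of a geminate consonant
    return moras

print(count_moras("Kawasaki"))  # ka-wa-sa-ki
print(count_moras("Nippon"))    # ni-p-po-n
```

Kawasaki comes out as four beats, the same as its syllable count, while Nippon has two syllables but four beats, because the geminate and the final nasal each count as a beat of their own.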

Tone, Contour, and Intonation

Tone languages use pitch to distinguish lexical/grammatical meaning within syllables or words (e.g., Thai, Yoruba, many African languages).
Contour tone refers to the movement of pitch over time within a syllable or word. A single syllable like ma can have different meanings depending on its pitch contour.
In English (not a tone language for lexical meaning), pitch primarily conveys attitude, emotion, or grammatical structure (intonation).
Contour vs tone: contour describes pitch movement in a sequence; tone is phonemic in tone languages.
Intonation over phrases is used to convey attitude, mood, grammatical structures, and new vs. old information.
Examples of pitch contour: a rising end signals a question, while a falling end signals a statement; uptalk is a rising end on a statement.
Arrows in spectrograms can be used to indicate pitch contour visually.
The bottom line for pitch representation: F0 is the fundamental frequency; harmonics appear as multiples of F0; formants are resonant concentrations shaped by the vocal tract (see Formants section).

Uptalk, Dialects, and Social Variation

Uptalk (rising terminal intonation) is common in some English-speaking regions (e.g., US, Canada) and can affect perceived confidence or authority.
In professional settings (e.g., job interviews), reducing uptalk and ending declaratively can improve perceived confidence and competence.
Dialectal and gender patterns: uptalk varies by dialect and may be more prevalent among some groups (e.g., stereotypes like Valley Girls); some surveys suggest women use uptalk more than men, though there is substantial variation.

Geminates and Consonant Length

Geminates are consonants that are held longer, giving the impression of a longer single consonant (often observed at morpheme boundaries or within words with double consonants).
Example: forerunner vs foreigner – gemination (lengthened /r/) helps distinguish between phonetically similar words.

Vowel Length, Height, and Coarticulation

Vowel height and duration:
- Low vowels are held longer than high vowels; typical difference: $\text{duration}_{\text{low}} - \text{duration}_{\text{high}} \approx 20\text{-}25~\text{ms}$
- This duration difference is above the perceptual threshold (just noticeable difference) for duration.
Labials (lips) affect duration: labial sounds tend to lengthen more than other consonants.
Coarticulation: surrounding sounds influence the articulation of a given vowel; a vowel can lengthen or shorten depending on the following consonant, and it shortens as more syllables are added to the word.
- Example: in fame, the vowel before /m/ lengthens; in longer sequences, adjacent vowels tend to shorten.
- Fruity vs fruitiest: the vowel (or diphthong) is realized progressively shorter as more syllables are added.
Overall speech rate also affects duration and length of sounds; faster speech leads to shorter segments.

Formants, Spectrograms, and Harmonics

Spectrograms (historically also called sonagrams) visualize frequency content over time.
F0 (fundamental frequency): the primary pitch source from the vocal folds.
Harmonics: integer multiples of F0; periodic natural sounds have harmonics, whereas pure tones (often produced by machines) do not.
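The relationship between F0 and its harmonics is simple arithmetic, sketched below. The 120 Hz value is an arbitrary example, and the sketch follows the convention that the first harmonic is F0 itself.

```python
def harmonic_frequencies(f0_hz, count):
    """Return the first `count` harmonics of f0_hz; harmonic n lies at n * F0."""
    return [n * f0_hz for n in range(1, count + 1)]

# An example voice with F0 = 120 Hz (arbitrary illustrative value)
print(harmonic_frequencies(120.0, 5))  # [120.0, 240.0, 360.0, 480.0, 600.0]
```

If F0 rises, every harmonic moves up with it and the spacing between harmonics widens, whereas the formants stay where the vocal tract shape puts them.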
Contours and formants in spectrograms:
- Formants are not harmonics; they are resonant energies shaped by the vocal tract (vocal tract resonances F1, F2, F3, etc.).
- Formants can shift with vowel quality (e.g., /i/ vs /a/ vs /ɐ/): energy concentrates around formant bands.
- Anti-formants: troughs in energy due to specific articulatory configurations (e.g., certain vowels or breathy voice).
Breathiness and voice quality: when breathiness is increased, harmonics can become less prominent and formant structure may appear smeared on the spectrogram.
Glottal fry (creaky voice) and aperiodic sounds: lack periodic harmonics; no clear F0; spectrogram shows noise-like patterns rather than clear formant structure.
Vocally healthy vs pathological voices: spectrograms can be used for biofeedback in voice therapy (e.g., to improve closure and reduce breathiness in patients with vocal nodules).

Onsets, Onset Types, and Articulation Details

Onset types show how a word begins:
- Strong onsets (e.g., plosives with clear release) vs glottal onsets (in some contexts a creaky or glottal stop may precede a vowel for emphasis or as an allophonic variant).
- Glottal onset can be used for emphasis and is a feature sometimes described as a cricotic feature (articulatory emphasis).
Examples of onset contrasts: e.g., a strong onset vs a breathy onset (onsets can affect perception of the following vowel).

Practical and Clinical Relevance

The spectrogram and acoustic cues can be used to diagnose and treat voice disorders (e.g., vocal nodules, poor closure) via biofeedback.
Understanding formants and harmonics helps in distinguishing vowel qualities and diagnosing breathiness or obstructed phonation.

Reading a Spectrogram: Worked Examples

Baseball: F0 contour shows a subtle drop on the second syllable; the first syllable is more energetic (larger amplitude) than the second, indicating higher intensity on the first syllable.
Bacon and eggs: The first syllable shows the strongest energy; there is a notable F0 contour across the phrase with relatively flat initial portion and a rise and fall around 'eggs'.
Foreigner vs Forerunner: Gemination is visible as lengthened r in forerunner, making a longer consonantal segment than in foreigner.
Make: The behavior of the onset consonant /m/ illustrates how sound energy and amplitude are distributed across the syllable; the vowel in the nucleus is typically the loudest part of the syllable, consistent with the sonority principle.
Observations about the waveform and harmonics:
- The first syllable often shows a larger harmonic energy and larger amplitude than subsequent syllables when stressed.
- The absence or reduction of harmonics in certain segments (e.g., glottal fry or whispered speech) changes the visible energy distribution.

Formants, Vowel Quality, and Anti-Formants: A Quick Look

Vowel quality is shaped by the resonant energies of the vocal tract; formants F1, F2 (and higher) define vowel height and backness.
Anti-formants: regions of reduced energy (notches) that can occur in certain vowel configurations or due to anti-resonance effects.
A shift from /e/ to /a/ changes formant positions (F1 tends to rise when vowels become more open; F2 shifts depending on backness).
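The link between formant positions and vowel quality can be illustrated with a nearest-neighbor lookup in F1/F2 space. The reference values below are rough, textbook-style averages for adult male speakers, included as illustrative assumptions rather than measurements from the lecture; real values vary widely by speaker, and plain distance in Hz is a crude similarity measure.

```python
import math

# Approximate (F1, F2) in Hz; illustrative assumptions, not lecture data.
REFERENCE_FORMANTS_HZ = {
    "i": (270, 2290),   # high front: low F1, high F2
    "æ": (660, 1720),   # low front: high F1, fairly high F2
    "ɑ": (730, 1090),   # low back: high F1, low F2
    "u": (300, 870),    # high back: low F1, low F2
}

def nearest_vowel(f1_hz, f2_hz):
    """Return the reference vowel closest to the measured (F1, F2) point."""
    return min(
        REFERENCE_FORMANTS_HZ,
        key=lambda v: math.dist((f1_hz, f2_hz), REFERENCE_FORMANTS_HZ[v]),
    )

print(nearest_vowel(280, 2250))  # a low F1 with a high F2 lands near /i/
print(nearest_vowel(700, 1100))  # a high F1 with a low F2 lands near /ɑ/
```

In the table, F1 is higher for the open vowels than for the close ones, and F2 is higher for the front vowels than for the back ones, which is the pattern the notes describe.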

Final Remarks: What the Exam Emphasizes

The exam covers all chapters read so far; focus on core concepts: syllable structure, onset/nucleus/coda, rhyme, sonority, stress and its acoustic correlates, English rhythm patterns (trochaic vs iambic), stress-, syllable-, and mora-timing differences, tone and intonation concepts, uptalk and social variation, gemination, coarticulation effects, formants/harmonics in spectrograms, and practical clinical applications.
The instructor’s study guide will highlight the key topics from the chapters read; material outside of the class discussions may be less essential for the test.