Speech Science Exam M10 and M11

Last updated 8:02 PM on 3/28/26

83 Terms

New cards

measuring vowel formant frequencies

1. Spectrogram shows measurement of formant frequencies at temporal middle of vowel

2. LPC spectra show formant peaks in spectrum, computed with a 20 ms window centered around the temporal midpoint (shown in the spectrograms)

3. Note correspondence of formant frequencies in the two display types
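The 20 ms analysis window centered on the vowel's temporal midpoint can be located in sample terms roughly as sketched below. This is only an illustration: the function name, the vowel onset and offset times, and the 16 kHz sampling rate are assumptions, not values from the cards.

```python
def midpoint_window(vowel_start_s, vowel_end_s, sample_rate_hz, window_ms=20.0):
    """Return (first_sample, last_sample_exclusive) of a window centered on the vowel midpoint."""
    midpoint_s = (vowel_start_s + vowel_end_s) / 2.0
    half_window_s = (window_ms / 1000.0) / 2.0
    first = round((midpoint_s - half_window_s) * sample_rate_hz)
    last = round((midpoint_s + half_window_s) * sample_rate_hz)
    return first, last


# Hypothetical vowel lasting from 0.100 s to 0.300 s, sampled at 16 kHz (assumed values).
first, last = midpoint_window(0.100, 0.300, 16000)
print(first, last, last - first)  # window boundaries and window length in samples
```

The samples between `first` and `last` would then be passed to whatever LPC or FFT routine is being used.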

New cards

F1-F2 plot

•Each phonetic symbol within an ellipse is an F1/F2 coordinate for an individual talker, produced in an /hVd/ frame where V = vowel

•Variability of formant frequencies for a given vowel is explained by age, sex, and other factors within any one category (such as age)

•Within any ellipse, points in the lower left are likely from men, points in the middle are likely from women, and points in the upper right are likely from children

New cards

corner vowel formant frequencies

for men, women, and children

•Effect of sex and age-related differences in vocal tract length on formant frequencies

•Influence of vocal tract length on formant frequencies is not the same for each of the corner vowels (compare points for /u/ to points for /i/)
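The general direction of the vocal tract length effect can be sketched with the textbook uniform-tube idealization (a tube closed at the glottis and open at the lips), in which resonances fall at odd multiples of c/4L. This is a simplification that fits the neutral configuration best; as the card notes, real corner vowels are not all affected equally. The speed of sound and the tract lengths below are assumed round values, not data from the cards.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed approximate value for warm, moist air


def uniform_tube_formants(tract_length_cm, n_formants=3):
    """Resonances (Hz) of an idealized uniform tube closed at one end, open at the other."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * tract_length_cm)
        for n in range(1, n_formants + 1)
    ]


# Hypothetical tract lengths: longer tracts give lower formants.
for label, length_cm in [("adult male", 17.5), ("adult female", 14.5), ("child", 11.0)]:
    formants = [round(f) for f in uniform_tube_formants(length_cm)]
    print(label, length_cm, "cm ->", formants)
```

With the assumed 17.5 cm tract the first two resonances come out at 500 Hz and 1500 Hz.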

New cards

lax vowel formant frequencies

for men, women, and children

•Effect of vocal tract length on lax vowel formant frequencies is generally the same as the effect on corner vowel formant frequencies

New cards

vowel reduction

•Note axis reversal in the two graphs; left-hand graph shows F1/F2 relations in “articulatory format” by placing F2 on x-axis, F1 on y-axis

•Vowel reduction is defined as the movement of F1-F2 coordinates in the direction of the “neutral” vowel (F1 ~ 500 Hz, F2 ~ 1500 Hz for adult males; F1 ~ 600 Hz, F2 ~ 1800 Hz for adult females). The neutral vowel is the vocal tract configuration with no constrictions (like /ə/)

•Relative to “null context” (vowels in /hVd/ context), speaking rate, syllable stress, type of speech material (citation form (null context) versus connected speech as in reading), and speech style (clear versus casual) may result in vowel reduction (separately and in combination); dialect and language may also affect patterns of vowel reduction

•A shared factor among all these potential influences on vowel reduction is variation in speaking rate; shorter vowels (faster rate) are often accompanied by vowel reduction

•Study of vowels and their variability is important because 1) vowels contribute in a significant way to speech intelligibility, and 2) delayed vowel development is thought to be one diagnostic marker of developmental apraxia of speech
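One way to make "movement toward the neutral vowel" concrete is to compare distances in the F1-F2 plane. The sketch below assumes a plain Euclidean distance in Hz (other scales are possible) and uses made-up formant values for a single vowel; only the neutral-vowel coordinates come from the card.

```python
import math

# Neutral-vowel coordinates (F1, F2) in Hz, as given on the card.
NEUTRAL_VOWEL_HZ = {
    "adult_male": (500.0, 1500.0),
    "adult_female": (600.0, 1800.0),
}


def distance_to_neutral(f1_hz, f2_hz, talker="adult_male"):
    """Euclidean distance (Hz) from an F1-F2 point to the neutral vowel."""
    n1, n2 = NEUTRAL_VOWEL_HZ[talker]
    return math.hypot(f1_hz - n1, f2_hz - n2)


def is_reduced(citation, connected, talker="adult_male"):
    """True if the connected-speech token lies closer to the neutral vowel."""
    return distance_to_neutral(*connected, talker=talker) < distance_to_neutral(
        *citation, talker=talker
    )


# Hypothetical /i/ tokens from one adult male: /hVd/ citation form vs. fast reading.
citation_token = (300.0, 2300.0)
connected_token = (380.0, 2000.0)
print(is_reduced(citation_token, connected_token))
```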

New cards

intrinsic vowel duration

•is a property of the vowel, as shown in either the case of a following voiceless obstruent (lower curve) or voiced obstruent (upper curve); note the variation in vowel duration across vowels even though the CVC frame is constant. Low vowels are generally longer than high vowels, and tense vowels are longer than their lax counterparts.

New cards

extrinsic influences on vowel duration

are variables that affect a vowel's duration in a constant context (in this case, the CVC frame shown in the figure). Speaking rate, syllable stress, phonetic context, speaking style, and the position of the vowel in a multisyllabic word or in an utterance are all extrinsic influences on vowel duration.

New cards

diphthongs

•Well-defined formant structure

•Defined by prominent formant transitions, especially in F2 (see box superimposed on F2 transition of /ɑɪ/)

•Distinguishing features of diphthongs are their extensive and rapid formant transitions. Extensive means covering a large range of frequencies; rapid means the transition has a steep slope

•Separate phonemic category compared with vowels: not “two vowels connected by movement”

New cards

diphthongs in F1-F2 space

•Diphthongs typically do not begin or end at the F1-F2 coordinate for the vowels (example, /ɔɪ/, compare red-circled onsets and offset to the black-circled vowels; red arrows show distance between diphthong onset and /ɔ/, and diphthong offset and /ɪ/)

New cards

nasal murmurs

•Murmurs at all places of articulation have a low-frequency F1 originating in the pharyngo-nasal cavity, and higher formants from the same cavity

•Murmurs have an antiresonance originating in the closed oral cavity, and antiresonances originating in the closed sinus cavities connected to the nasal passageways

•Effect of antiresonances is to reduce energy at and around the frequency of the antiresonance

•Nasal murmurs are much less intense than surrounding vowels (as shown in spectrographic comparisons of /i/ vs. /m/ and /ɑ/ vs. /m/)

New cards

semivowels

•like vowels, nasals, and diphthongs, have a well-defined formant structure

•Semivowels have a brief constriction interval (marked by the black bars) in which formant frequencies are relatively stable

•Semivowels have extensive F2 and F3 transitions into and out of the constriction interval

•The acoustics of semivowels have complex, underlying articulatory causes, which may explain (in part) why they are mastered relatively late in children’s sound development

New cards

fricatives

•Spectrogram shows difference between aperiodicity of fricative and periodicity of surrounding vowels

•LPC spectra show differences between sibilants and non-sibilants (compare dark to light spectra in each pair of spectra)

•LPC spectra show frequency difference in energy concentration for alveolar (/s/) vs. palato-alveolar /ʃ/ fricatives (higher frequency concentration for /s/, lower frequency concentration for /ʃ/). These differences reflect the size of the front cavity in the two fricatives

New cards

/h/ acoustics

•/h/ is a segment usually showing both aperiodic (resulting from turbulent flow at narrowed glottis and/or at edges of ventricular folds or epiglottis) and periodic energy (resulting from weak vibration of the vocal folds)

•Energy in the /h/ interval is usually concentrated at formant locations of surrounding vowels (see circles where /h/ formants are continuous with vowel formants)

New cards

acoustics of stop consonants

•Closure intervals marked by horizontal lines; voiceless closures typically have no energy, voiced closures have voicing energy at the bottom of the spectrogram

•Location of bursts shown by vertical lines

•Voiceless stops have relatively long frication and aspiration intervals (~ 40-70 ms); voiced stops have short frication intervals (< 20 ms).

•Stops preceding stressed vowels, compared with stops following stressed vowels, have longer closure intervals and more intense burst and frication intervals

•Voiced stop burst and frication intervals are less intense than voiceless stop burst and frication intervals

New cards

summary of voice onset time (VOT) data

•In English, VOT boundary for voiced vs. voiceless stops is 20-25 ms (vertical dashed line). A notable exception is when voiceless stops are the second segment of an s+stop cluster (ˈsCV)

•VOT for voiced stops in the utterance-initial position can have negative values, meaning that glottal pulsing begins before the burst (during the closure interval)

•Variables such as position relative to stress (pre- vs. post-stressed), rate, phonetic context, and speaking style primarily affect voiceless VOTs

•VOT varies somewhat by place of articulation, with VOT increasing in the order bilabial, lingua-alveolar, dorsal (also called velar)
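The VOT boundary and the negative-VOT case above can be sketched as a simple decision rule. This is a toy illustration, not a clinical tool: the single 25 ms cutoff is an assumed value picked from the 20-25 ms range on the card, the function name is hypothetical, and the s+stop exception is deliberately not handled.

```python
def classify_stop_by_vot(vot_ms, boundary_ms=25.0):
    """Label an English stop from its VOT alone (ignores the s+stop exception)."""
    if vot_ms < 0:
        # Glottal pulsing starts during the closure, before the burst.
        return "voiced (prevoiced)"
    if vot_ms < boundary_ms:
        return "voiced"
    return "voiceless"


# Hypothetical VOT measurements in ms.
for vot in (-80.0, 10.0, 60.0):
    print(vot, "->", classify_stop_by_vot(vot))
```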

New cards

burst spectra for stop consonant place of articulation

•FFT spectra computed from 20 ms “window” extending from the burst

•/p/ burst spectrum: “diffuse falling”

•/t/ burst spectrum: “diffuse rising”

•/k/ burst spectrum: “compact”

•Burst spectra for voiced stops are essentially the same as voiceless stops, but with lesser intensity and more energy in the low frequencies due to voicing

New cards

Blumstein and Stevens stop-burst spectral templates

•Templates allow for some variability in burst spectrum, but the gross spectral shape is retained; an essential cause of this variability is coarticulation (see inset, both spectra are diffuse rising even with local differences due to different vowels)

•Templates were successful, but not perfectly so, in matching each of the three places of stop articulation in American English

•The success of template matches to each of the three stop places suggests there is sufficient acoustic stability for reliable human identification of place from the speech acoustic signal

New cards

rise time in acoustic terms

A cue in the amplitude envelope: how long the envelope takes to go from 0 to full volume

New cards

rise time and distinguishing between at least 2 manners of articulation

Compare how quickly the amplitude envelope rises for one sound versus another: different manners show different rise times

New cards

modulation depth and acoustic representation

? The change from 0 to full volume; greater depth means greater amplitude change and more power, and depth is greater when the constriction is stronger

New cards

greater modulation depth and vocal tract constriction

The vocal tract is completely constricted, with the articulators held together

New cards

stop consonant and acoustic feature

? silent stop gap

New cards

if silent stop gap is removed

Wouldn’t see the rise; the sound would be flaccid because there is not enough power to build up pressure behind the constriction

New cards

rapid vertical line in waveform

Burst release of consonant sound

New cards

burst release and physiological occurrence

Pressure build up and then release

New cards

2 acoustic landmarks and voice onset time

Release of stop consonant, onset of voice

New cards

VOT important for distinguishing consonants

Measuring that precise timing in the waveform lets you tell consonants apart

New cards

amplitude envelope difference in stop and fricative

? Different amounts of airflow are used, and the airflow is shaped in different ways
fricative- long, smooth amplitude rise time
stop- silent period and then an immediate vertical line

New cards

airflow in fricative and stop

? Type of constriction that airflow has

New cards

2 acoustic features make affricate a combination sound

Rapid onset and then long period of noise

New cards

duration help distinguish affricate from fricative

shorter in duration, more rapid onset time

New cards

nasal sound key characteristics

Sudden decrease in volume, lack of upper formant energy in spectrogram, low-frequency energy, presence of antiresonance

New cards

frequency energy and nasals

high-frequency energy is reduced because some energy is trapped in the mouth

New cards

acoustic cue for place of articulation identification

Frequency cues such as formant transitions: different vocal tract shapes produce different patterns in the frequencies

New cards

waveform and place of articulation

Place of articulation is hard to tell from a waveform; identifying it depends on seeing the changes in resonant frequencies

New cards

/s/ and /ʃ/ differ acoustically in spectrogram

/ʃ/ has energy at lower frequency,

/s/ has energy at higher frequency, different spectral peaks

New cards

/s/ /ʃ/ and place of articulation

Different points of constriction of the place in your mouth

New cards

2 acoustic cues in a voiced consonant

voicing bar, periodic wave form – doesn’t have noisy representation

New cards

acoustic cues and voiceless sounds

No voicing bar and no periodic waveform; has a noisy (aperiodic) representation

New cards

key acoustic signature of /r/

low third formant (F3), with variation depending on the vowel that follows it

New cards

use a spectrogram to identify /r/

Because formants are only shown on a spectrogram

New cards

multiple acoustic cues to identify consonants

no single acoustic cue is fully accurate

New cards

consonants are unique acoustically

involve obstruction or constriction of airflow

less energy, more complex patterns than vowels

New cards

acoustic signal reflects

noise, silence, transitions

New cards

3 key dimensions

manner of articulation

place of articulation

voicing

New cards

stops acoustic feature

silence (stop gap) + burst

New cards

fricatives acoustic features

aperiodic noise- a sound or signal that does not repeat its wave pattern at regular intervals, lacking a consistent, predictable, or periodic structure. It is characterized by random vibrations, a broad range of frequencies, and irregular changes in intensity over time.

New cards

affricates acoustic features

stop + fricative combo

New cards

VOT (voice onset time)

time between release of stop and voicing onset

distinguishes /p/ vs. /b/, /t/ vs. /d/

clinical note: critical for intelligibility- understanding
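Given the definition above, VOT is just the difference between two time landmarks read off a waveform or spectrogram. A minimal sketch, with hypothetical landmark times and an assumed function name:

```python
def voice_onset_time_ms(burst_time_s, voicing_onset_s):
    """VOT in ms: voicing onset minus stop release (negative if voicing leads the burst)."""
    return (voicing_onset_s - burst_time_s) * 1000.0


# Hypothetical landmarks (seconds) for a /p/-like and a prevoiced /b/-like token.
print(voice_onset_time_ms(0.250, 0.310))  # long lag: voicing starts well after the release
print(voice_onset_time_ms(0.250, 0.180))  # voicing lead: pulsing begins during the closure
```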

New cards

stop gap and burst

stop gap- silence before release

burst- brief noise at release

New cards

fricatives and noise

high-frequency energy (e.g., /s/)

lower-frequency noise (e.g., /f/)

New cards

formant transitions

resonant frequencies of vocal tract (F1, F2, F3)

rapid changes into vowels

provide place of articulation info

GPS directions= tell your brain where in the mouth the sound is coming from

New cards

rise time and modulation depth

rise time- speed of amplitude increase

modulation depth- variation in amplitude

important for perception and clarity
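Both measures can be read off an amplitude envelope. The sketch below is one possible operationalization, not a definition from the cards: it assumes rise time is measured between 10% and 90% of the peak, and modulation depth as (max - min) / (max + min). The envelopes are toy values sampled at an assumed 1000 Hz.

```python
def rise_time_ms(envelope, sample_rate_hz, low=0.1, high=0.9):
    """Time for the envelope to climb from low*peak to high*peak (first crossings)."""
    peak = max(envelope)
    i_low = next(i for i, a in enumerate(envelope) if a >= low * peak)
    i_high = next(i for i, a in enumerate(envelope) if a >= high * peak)
    return (i_high - i_low) * 1000.0 / sample_rate_hz


def modulation_depth(envelope):
    """Amplitude variation as (max - min) / (max + min); 1.0 means a drop to silence."""
    hi, lo = max(envelope), min(envelope)
    return (hi - lo) / (hi + lo) if (hi + lo) > 0 else 0.0


# Toy envelopes: silence then an abrupt onset (stop-like) vs. a gradual ramp (fricative-like).
stop_like = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
fricative_like = [i / 10 for i in range(11)]

print(rise_time_ms(stop_like, 1000), rise_time_ms(fricative_like, 1000))
print(modulation_depth(stop_like), modulation_depth([0.8, 1.0, 0.8]))
```

The abrupt onset gives a much shorter rise time than the ramp, and the envelope that drops to silence has the larger modulation depth.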

New cards

silent→ burst→ delayed voicing (what type of sound)

voiceless stop (/p/, /t/, /k/)

New cards

continuous high-frequency noise (what type of sound)

fricative (/s/)

New cards

low-frequency energy + nasal resonance (what type of sound)

nasal (/m/, /n/)

New cards

key cues

VOT

stop gap

noise

transitions

these cues→ speech perception + clinical diagnosis

New cards

frequency

the measurement of how rapidly sound waves oscillate—or vibrate—between high and low pressure

New cards

Hz (hertz)

the unit of frequency; it represents cycles per second.

New cards

formants

specific frequency bands that are amplified by the resonance of the vocal tract (throat, mouth, and nasal cavities) during speech or singing

promine

New cards

harmonics

a series of horizontal lines or bands that represent the integer multiples of a sound's fundamental frequency

They appear as horizontal, often parallel, lines. If the pitch (fundamental frequency) is constant, the lines are straight; if the pitch changes (e.g., singing), the lines follow that movement
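Because harmonics are integer multiples of the fundamental, their expected positions on a spectrogram can be listed directly. A small sketch with an assumed fundamental of 100 Hz (the function name is hypothetical):

```python
def harmonic_frequencies(f0_hz, max_hz):
    """Integer multiples of the fundamental frequency up to max_hz."""
    harmonics = []
    n = 1
    while n * f0_hz <= max_hz:
        harmonics.append(n * f0_hz)
        n += 1
    return harmonics


print(harmonic_frequencies(100, 500))  # [100, 200, 300, 400, 500]
```

Raising the fundamental spreads the harmonics further apart, which is why the lines on a narrowband display move together when pitch changes.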

New cards

periodicity vs noise

Periodicity- signals with predictable, repeating patterns (harmonics) over time or space, such as a musical note or a clock

Noise- unpredictable, random, and lacks a consistent structure, often obscuring the signal.
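The contrast can be checked numerically: a periodic signal looks like itself when shifted by one period, while noise does not. The sketch below uses a normalized autocorrelation at a lag of one period; the 100 Hz sine, the 8 kHz sampling rate, and the seeded random noise are all assumed test signals, not speech data.

```python
import math
import random


def normalized_autocorrelation(signal, lag):
    """Correlation between a signal and itself shifted by `lag` samples (range -1 to 1)."""
    n = len(signal) - lag
    num = sum(signal[i] * signal[i + lag] for i in range(n))
    energy_a = sum(signal[i] ** 2 for i in range(n))
    energy_b = sum(signal[i + lag] ** 2 for i in range(n))
    den = math.sqrt(energy_a * energy_b)
    return num / den if den else 0.0


sample_rate_hz = 8000
f0_hz = 100
period_samples = sample_rate_hz // f0_hz  # 80 samples per cycle

periodic = [math.sin(2 * math.pi * f0_hz * i / sample_rate_hz) for i in range(800)]
rng = random.Random(0)  # fixed seed so the example is repeatable
noise = [rng.uniform(-1.0, 1.0) for _ in range(800)]

print(normalized_autocorrelation(periodic, period_samples))  # close to 1
print(normalized_autocorrelation(noise, period_samples))     # close to 0
```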

New cards

generation of acoustic signal

when a vibrating object creates pressure variations (compressions and rarefactions) in a medium—such as air, water, or solids—propagating as sound waves.

New cards

time (x- axis) (spectrogram)

Moves from left to right, showing how the signal changes over time

duration of signal

New cards

frequency (y-axis) (spectrogram)

Represents the pitch or rate of vibration, with lower frequencies at the bottom and higher frequencies at the top.

New cards

amplitude/intensity (spectrogram)

Indicates the loudness or energy of a particular frequency.

Darker or "warmer" colors (red/yellow) represent higher energy, while lighter or "cooler" colors (blue/green) represent lower energy.

New cards

wideband spectrogram

A type of spectrogram that provides good time resolution (sharp vertical lines for timing), ideal for seeing formants and speech timing.

New cards

narrowband spectrogram

A type of spectrogram that provides high frequency resolution (sharp horizontal lines), ideal for seeing individual harmonics.
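The wideband/narrowband trade-off follows from the analysis window: a short window gives fine time detail but a broad analysis bandwidth, and a long window does the reverse. The sketch below uses the rough rule of thumb bandwidth ≈ 1 / window duration. The 5 ms and 25 ms window lengths and the 120 Hz fundamental are assumed typical values, not figures from the cards.

```python
def approx_bandwidth_hz(window_ms):
    """Rule-of-thumb analysis bandwidth for a window of the given duration."""
    return 1000.0 / window_ms


def resolves_harmonics(window_ms, f0_hz):
    """Harmonics are separable only if the bandwidth is narrower than their spacing (f0)."""
    return approx_bandwidth_hz(window_ms) < f0_hz


f0_hz = 120.0  # hypothetical talker
for name, window_ms in [("wideband", 5.0), ("narrowband", 25.0)]:
    print(name, approx_bandwidth_hz(window_ms), "Hz, harmonics resolved:",
          resolves_harmonics(window_ms, f0_hz))
```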

New cards

transitions (spectrogram)

Rapid changes in formant frequencies, usually indicating movement from a consonant to a vowel.

New cards

silence (spectrogram)

Indicated by a white or blank space on the spectrogram, often seen during the gap of a plosive consonant.

New cards

fricative noise (spectrogram)

Appears as random, chaotic static, representing chaotic airflow (e.g., s, f).

New cards

voicing (spectrogram)

Dark, often vertical striations at the bottom of the spectrogram, indicating vocal fold vibration.

New cards

source-filter theory

consists of two main components:

Source: The sound is generated by the vibration of the vocal folds (vocal cords) when air from the lungs is expelled. This vibration creates a series of pulses that produce sound waves, with the rate of these pulses determining the pitch of the sound.
Filter: The vocal tract acts as a filter, altering the sound produced by the vocal folds. The shape of the vocal tract changes during speech, affecting the sound's characteristics and creating different vowel sounds.

This theory is fundamental in understanding voice production and is widely used in speech synthesis and analysis.
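The source and filter components can be sketched numerically by adding a source spectrum and a filter gain in dB at each harmonic. Everything specific here is an assumption made for illustration: the -12 dB/octave source slope, the simple resonance curve used for each formant, the 100 Hz bandwidth, and the neutral-vowel-like formants at 500, 1500, and 2500 Hz. Lip radiation is left out.

```python
import math


def source_level_db(freq_hz, f0_hz, slope_db_per_octave=-12.0):
    """Glottal source level relative to the fundamental, falling with frequency."""
    return slope_db_per_octave * math.log2(freq_hz / f0_hz)


def filter_gain_db(freq_hz, formants_hz, bandwidth_hz=100.0):
    """Vocal tract gain: sum (in dB) of one resonance curve per formant, 0 dB at 0 Hz."""
    half_bw = bandwidth_hz / 2.0
    gain = 0.0
    for fc in formants_hz:
        num = fc ** 2 + half_bw ** 2
        den = math.sqrt(((freq_hz - fc) ** 2 + half_bw ** 2)
                        * ((freq_hz + fc) ** 2 + half_bw ** 2))
        gain += 20.0 * math.log10(num / den)
    return gain


def output_spectrum_db(f0_hz, formants_hz, max_hz=3000.0):
    """Level of each harmonic after the source passes through the filter."""
    spectrum = {}
    k = 1
    while k * f0_hz <= max_hz:
        f = k * f0_hz
        spectrum[f] = source_level_db(f, f0_hz) + filter_gain_db(f, formants_hz)
        k += 1
    return spectrum


spectrum = output_spectrum_db(100.0, [500.0, 1500.0, 2500.0])
for freq in (400.0, 500.0, 600.0, 1000.0, 1500.0):
    print(freq, round(spectrum[freq], 1))
```

Changing `f0_hz` moves the harmonics (pitch) without moving the formant peaks, while changing `formants_hz` reshapes the envelope without changing pitch, which is the independence of source and filter that the theory describes.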