Speech Perception and Speech Theories
- phonetics: motor aspect; use brackets [ ]
- ex: [ b ]
- phonemics: deals with role of sound in culture/phonology; use slanted lines / /
- ex: ball /bɔl/
- speech perception: means by which acoustic, visual, or tactile signals are mapped onto the language forms (syntax, morphology, etc.)
# Speech Perception
## Infants
- perception begins as soon as hearing begins
- discriminate at phoneme level
- phonemes have no meaning but are important in devt of speech
- beginning to become sensitive to a variety of higher level linguistic structures
- 0-5 is the "sponge era"; crucial part of language devt
## Speech Acoustics
- acoustic properties of speech, such as intonation, intensity and speaking rate
- aka suprasegmentals
- spectrogram: graph of frequency (vertical axis) over time (horizontal axis), with intensity shown by how dark the trace is
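
A rough sketch of what a spectrogram computes: chop the signal into short frames and check which frequencies are strong in each frame. The toy signal, sample rate, and frame size below are made-up values, and the helper names are hypothetical. A real spectrogram (e.g., in Praat) keeps the full intensity at every frequency; this version keeps only the strongest frequency per frame.

```python
import cmath
import math


def dominant_frequency(frame, sample_rate):
    """Return the frequency (Hz) of the strongest DFT bin in one frame."""
    n = len(frame)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):  # skip bin 0 (DC) and stop before Nyquist
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


def spectrogram_peaks(signal, sample_rate, frame_size):
    """Split the signal into frames and report the peak frequency of each."""
    return [
        dominant_frequency(signal[start:start + frame_size], sample_rate)
        for start in range(0, len(signal) - frame_size + 1, frame_size)
    ]


SAMPLE_RATE = 1000  # Hz (assumed)
# toy signal: a 100 Hz tone for 0.2 s, then a 200 Hz tone for 0.2 s
tone_a = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(200)]
tone_b = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(200)]

peaks = spectrogram_peaks(tone_a + tone_b, SAMPLE_RATE, frame_size=100)
print(peaks)  # [100.0, 100.0, 200.0, 200.0]
```

Reading the output left to right is like reading a spectrogram along the time axis: the energy sits at 100 Hz first, then jumps to 200 Hz.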
### vowel characteristics
- tongue height
- mouth/tongue openness
- lip rounding
- tenseness
### consonant characteristics
u know this
- manner
- placement
- voicing
# Theories of Speech Perception
## Issues
- Linearity: stresses the link between a specific sound and its corresponding phoneme
- Segmentation: speech signal is divided into discrete units
- Speaker Normalization: we are able to understand sounds pronounced differently by different people, regardless of accent, age, sex, pitch
- Basic unit of perception: how do we perceive sound? words, syllables, phonemes, allophones #searchuplater , etc.?
## Categories
- Active: stresses link between perception and production
- Passive: sensory aspects of speech perception
- Bottom up: acoustic signal contains all info needed for recognition of sound (hearing the word, then adding meaning)
- Top down: higher level linguistic/cognitive operations (we perceive linguistic information, then auditory)
- Autonomous: signal is processed serially (step by step)
- Interactive: analyze acoustic signal as a whole
## Theories
- Motor theory: links speech perception and production
- we can identify a sound because we can produce that sound
- Acoustic Invariance Theory: each phoneme has its own features
- ex: /p/ can be identified because the lips are used (bilabial) and a plosive burst is heard
- Direct Realism: Interactive theory; perceive the signal as a whole, whether it be a phoneme or a word
- TRACE model: both top down and bottom up; analyzes acoustic and linguistic info at the same time
- Logogen theory: Interactive theory; provides meaning and context to a word once it's heard
- ex: hearing dog gives you the picture of dog, plus you know how to say dog
- Cohort Theory:
- Autonomous stage: start of the word gives a list of all possible words (the cohort)
- ex: "a___" could be apple, alike, amaze
- Interactive stage: eliminates all improper words in context
- Fuzzy Logical Model of Perception (FLMP)
- fuzzy: how sure we are that a sound is present (0 = no sound, 0.5 = ambiguous, 1 = sound present)
- prototype: compares the sound heard to a database of phonemes in one's mind
- pattern classification: determines the best match between the heard sound and the stored prototypes
- Native Language Magnet theory
- "magnet" because we pull what we hear toward the most familiar sounds in our database
- begins in early infancy
- first 10 months
- able to discriminate
- 10-11 months
- phonemes are prioritized based on what is frequently heard in child's language
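
Going back to Cohort Theory above, the two stages can be sketched in a few lines. This is only a toy illustration: the mini-lexicon and its part-of-speech labels are made up, spelling stands in for the actual sound sequence, and "context" is reduced to the part of speech the sentence expects.

```python
# Hypothetical mini-lexicon: word -> part of speech (made up for illustration)
LEXICON = {
    "apple": "noun",
    "alike": "adjective",
    "amaze": "verb",
    "ball": "noun",
    "dog": "noun",
}


def autonomous_stage(onset):
    """Activate every word that starts with what has been heard so far."""
    return sorted(word for word in LEXICON if word.startswith(onset))


def interactive_stage(cohort, expected_pos):
    """Drop candidates that do not fit the sentence context."""
    return [word for word in cohort if LEXICON[word] == expected_pos]


cohort = autonomous_stage("a")
print(cohort)  # ['alike', 'amaze', 'apple']

# context: "She ate an a___" calls for a noun
print(interactive_stage(cohort, "noun"))  # ['apple']
```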
# Acoustic Model of Speech Production
## Acoustic theory of vowel production (source filter theory)
- Source
- voice is made in larynx
- Filter
- Vocal tract acts as filter
- 3 Elements
- glottal sound
- vocal tract resonator
- sound at the lips
- Conditions
- source:
- vibrator
- transmitting medium
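
To see how the vocal tract can act as a filter, one standard textbook simplification (not something these notes derive) treats it as a uniform tube closed at the glottis and open at the lips. Such a tube resonates at odd multiples of c / 4L. The tract length and speed of sound below are assumed round numbers, and the function name is hypothetical.

```python
SPEED_OF_SOUND_CM_S = 35000  # approximate speed of sound in warm, moist air (assumed)
TRACT_LENGTH_CM = 17.5       # assumed adult vocal tract length


def tube_resonances(length_cm, count=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4 * length_cm)
        for n in range(1, count + 1)
    ]


print(tube_resonances(TRACT_LENGTH_CM))  # [500.0, 1500.0, 2500.0]
```

The glottal source supplies energy at many frequencies; the tube passes energy near these resonances more strongly. A shorter tube (e.g., a child's vocal tract) gives higher resonances.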
## Larynx
- control flow of air
- protection
- swallowing
- abdominal fixation
### periods
- quasi periodic signal: has a periodic pattern, but has variations in period and amplitude
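
A small sketch of what "variations in period and amplitude" means in numbers. The cycle measurements are made up, and the function name is hypothetical. The cycle-to-cycle period variation is commonly called jitter and the amplitude variation shimmer; the formula here is one simple way to express them, not necessarily the one used in class.

```python
def mean_relative_variation(values):
    """Mean absolute difference between neighbouring cycles, relative to the mean value."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# made-up measurements for five consecutive glottal cycles
periods_ms = [8.0, 8.1, 7.9, 8.0, 8.0]       # cycle lengths
amplitudes = [1.00, 0.98, 1.02, 1.00, 1.00]  # peak amplitudes

period_variation = mean_relative_variation(periods_ms)
amplitude_variation = mean_relative_variation(amplitudes)
print(f"period variation: {period_variation:.2%}")
print(f"amplitude variation: {amplitude_variation:.2%}")
```

A perfectly periodic signal would give 0 for both; a quasi periodic one gives small nonzero values like these.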
### vocal tract
- cavities act as natural resonators
- some cavities have anti resonant properties
## Acoustic Characteristics
- physical characteristics: what can be measured (need Praat for this)
- perceptual characteristics: what we hear
| Physical | Perceptual |
| ---------------- | ------------------------------ |
| frequency | pitch |
| intensity        | loudness                       |
| spectrum | quality |
| duration or rate | perception of duration or rate |
## Fundamental frequency
- number of oscillations per second
- called "F0"
- lowest frequency produced by a system
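
A minimal sketch of finding F0 in a signal, assuming a toy "voice" made of a 120 Hz fundamental plus two weaker harmonics. It uses autocorrelation (find the shift at which the signal best lines up with itself), which is one common approach and an assumption here, not the method from the notes. The sample rate, pitch range, and function name are also assumptions.

```python
import math


def estimate_f0(signal, sample_rate, f_min=60, f_max=400):
    """Estimate F0 as sample_rate / lag, using the lag with the highest autocorrelation."""
    shortest_lag = sample_rate // f_max
    longest_lag = sample_rate // f_min

    def autocorr(lag):
        return sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))

    best_lag = max(range(shortest_lag, longest_lag + 1), key=autocorr)
    return sample_rate / best_lag


SAMPLE_RATE = 12000  # Hz (assumed)
# toy "voice": 120 Hz fundamental plus weaker harmonics at 240 Hz and 360 Hz
signal = [
    math.sin(2 * math.pi * 120 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 240 * t / SAMPLE_RATE)
    + 0.3 * math.sin(2 * math.pi * 360 * t / SAMPLE_RATE)
    for t in range(1200)
]

f0 = estimate_f0(signal, SAMPLE_RATE)
print(f0)  # 120.0
```

Even though the signal also contains 240 Hz and 360 Hz, the whole pattern repeats 120 times per second, so F0 is the lowest of the three.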
### formants
- used to differentiate vowels
- F1:
- determined by the volume of pharyngeal cavity
- associated with tongue height
- inverse relationship
- F2
- length of oral cavity
- associated with tongue advancement
- direct relationship
- Roundedness
- rounded has lower F2
- unrounded has higher F2
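
The F1/F2 relationships above can be tried out with a small sketch. The prototype values are approximate adult-male averages as commonly cited in textbooks, so treat them as rough assumptions; the 500 Hz and 1500 Hz cut-offs and the function names are also made up for illustration. Measuring distance in raw Hz is crude (F2 dominates), but it is enough to show the idea.

```python
# Approximate (F1, F2) in Hz for adult male speakers; rough textbook values (assumed)
PROTOTYPES = {
    "i": (270, 2290),  # high front unrounded
    "æ": (660, 1720),  # low front unrounded
    "ɑ": (730, 1090),  # low back unrounded
    "u": (300, 870),   # high back rounded
}


def nearest_vowel(f1, f2):
    """Pick the prototype vowel closest to the measured F1/F2 pair."""
    return min(
        PROTOTYPES,
        key=lambda v: (PROTOTYPES[v][0] - f1) ** 2 + (PROTOTYPES[v][1] - f2) ** 2,
    )


def describe(f1, f2, f1_cut=500, f2_cut=1500):
    """Low F1 -> high tongue (inverse); high F2 -> front tongue (direct)."""
    height = "high" if f1 < f1_cut else "low"
    advancement = "front" if f2 > f2_cut else "back"
    return f"{height} {advancement}"


print(nearest_vowel(280, 2250), describe(280, 2250))  # i high front
print(nearest_vowel(700, 1100), describe(700, 1100))  # ɑ low back
```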
# Spectrogram
- use Praat
- no specific numbers, but there's a range
- Periodic vs. Aperiodic
- periodic: there's a pattern
- aperiodic: no pattern
- consonants are mostly aperiodic (e.g., fricative noise, stop bursts)
- vowels are periodic
- diphthong: transition between 2 vowels
## Consonants
- MANNER
- voice bar if voiced consonant
- Stop
- there's a big gap in between
- Fricative
- turbulent look;
- Affricate
- combo of stop and fricative
- "ch" #searchuplater
- nasals
- looks like a vowel a bit
- more faded vowel
- #thoughts i think it needs a vowel to compare to so you can know if it's more faded than the vowel
- Approximants
- bringing articulators close together without friction
- "semivowels"
- there's a dip
- Trills
- lots of gaps; easy to remember
- PLACEMENT
- VOICING
# Segmentals and Suprasegmentals
## Suprasegmentals
- Stress: pointer for emphasis
- Intonation: rise and fall of pitch
- Duration: "ice vs eyes"
- Pausing
# Distinctive features
## Vowels
- sonorant sounds: spontaneous voicing
- consonantal sounds: closed vocal tract
- vocalic sounds: open vocal tract
## Consonants
- consonantal sounds
- vocalic sounds
- sonorant
- continuant: no complete blockage of the vocal tract during production (air keeps flowing)
- strident: fricatives and affricates; generate intense noise (basically annoying sounds)
- ex: /s/ /z/
- stop
- closure: with a stop gap
- voiceless: stop gap is silent
- voiced: stop gap has low freq band of energy
- release: burst of sound
- transition: moving toward production of another sound
- nasal
- nasal murmur: acoustic segment associated with nasal radiation of sound energy
- murmur has a spectrum dominated by low frequencies, with a prominence around 250 Hz
- fricatives
- aperiodic sound waves
- #thoughts wow maybe i should participate in class to keep me awake
- affricates
- approximants
- glides: vowel like
- /w/ is similar to /u/; F3 and F4 are weak due to oral closure
- liquids
- differences in tongue tip (coronal) configuration are reflected in F2 but most obviously in F3
- glottal
- airflow is turbulent
- glottis wide open