Speech Perception and Acoustic Processing

Speech Perception Notes

Speech Perception Process

  • Conceptual, Syntactic, and Orthographic Codes: Representations of meaning, sentence structure, and written form that language understanding draws on.

  • Lexical Selection & Retrieval: Involves choosing the correct word from memory during speech.

  • Prelexical Code: Transition from sound signal to language units.

  • Phonetic Decoding: Breaks down acoustic signals into phonemes.

  • Acoustic Input: Raw sound that initiates speech perception.


Continuous Speech Signal

  • Example: "Mushrooms are an edible fungus."

  • Spectrogram: A visual representation that shows sound amplitude across different frequencies over time.

  • The spectrogram illustrates how continuous speech doesn't have clear boundaries or gaps between words.
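
A spectrogram of this kind can be approximated with a short-time Fourier transform. The sketch below is only illustrative: the function names (`spectrogram`, `peak_frequency`, `tone`), the 8 kHz sample rate, and the frame sizes are assumptions, and two synthetic tones stand in for real speech.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed for this sketch


def tone(freq_hz, seconds):
    """Pure sine tone, used here as a stand-in for a stretch of speech."""
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE)
            for n in range(int(seconds * SAMPLE_RATE))]


def spectrogram(signal, frame_len=256, hop=128):
    """Return one list of magnitudes (bins 0..frame_len//2) per time frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def peak_frequency(mags, frame_len=256):
    """Frequency (Hz) of the bin with the most energy in one frame."""
    k = max(range(len(mags)), key=mags.__getitem__)
    return k * SAMPLE_RATE / frame_len


# Two tones joined with no gap, like adjacent words in continuous speech.
signal = tone(250, 0.1) + tone(1000, 0.1)
spec = spectrogram(signal)
print(peak_frequency(spec[0]), peak_frequency(spec[-1]))
```

The two tones are joined without any silence, so the only sign of the change is a shift in which frequency band carries the energy from one frame to the next.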


Challenges in Speech Perception

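  • Segmentation: The speech signal is continuous, but the linguistic representation built from it is discrete, so listeners must find word boundaries that the signal does not mark.

    • Example: The only silence in “ago” falls inside the word, during the /g/ sound, not at a word boundary.

  • Variability: The same phoneme is realized differently across speakers, accents, and phonetic contexts, which makes it hard to identify accurately.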


Acoustic Properties of Speech

  • Phoneme: The smallest sound unit in a language that signals a difference in meaning.

    • Examples:

      • /b/ in "bad" vs. /p/ in "pad"

      • /d/ in "bad" vs. /t/ in "bat"

  • Phones: Individual speech sounds as they are actually produced. A phoneme groups together all the phones that speakers of a language treat as the same sound.

  • Allophones: Phones that are variations of the same phoneme in different contexts.

    • Example: In English, [k] and [q] are allophones of /k/.
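
The phoneme examples above are minimal pairs: words that differ in exactly one sound. A small sketch of that test follows. The helper names and the broad transcriptions in `LEXICON` are assumptions made for illustration, not part of the notes.

```python
def differing_positions(a, b):
    """Indices where two phoneme sequences differ, or None if lengths differ."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def is_minimal_pair(a, b):
    """True when the two words differ in exactly one phoneme."""
    diff = differing_positions(a, b)
    return diff is not None and len(diff) == 1


# Hypothetical broad transcriptions, for illustration only.
LEXICON = {
    "bad": ("b", "æ", "d"),
    "pad": ("p", "æ", "d"),
    "bat": ("b", "æ", "t"),
}

print(is_minimal_pair(LEXICON["bad"], LEXICON["pad"]))  # True: /b/ vs /p/
print(is_minimal_pair(LEXICON["bad"], LEXICON["bat"]))  # True: /d/ vs /t/
print(is_minimal_pair(LEXICON["pad"], LEXICON["bat"]))  # False: two differences
```

Allophones would not show up in this check, because swapping one allophone for another never produces a different word.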


Structure of the Vocal Tract

  • Anatomical Features: Sphenoidal Sinus, Nasal Meatuses, Pharyngeal Tonsil, Uvula, etc.

  • Components of the Ear (the receiving side, not part of the vocal tract): Malleus, Incus, Stapes (ossicles), Tympanum (eardrum), Cochlea (inner ear).


Acoustic Properties of Sounds

  • Amplitude: Height of sound waves - relates to loudness.

  • Frequency: Cycles per second (Hz) - relates to pitch. Low frequency = low pitch; high frequency = high pitch.

  • Harmonics: Higher frequencies at integer multiples of the fundamental frequency; their relative strengths determine timbre.

    • Example: Middle C (262 Hz) has its next harmonic at 524 Hz (2 × 262), then 786 Hz (3 × 262), etc.
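
The arithmetic behind the Middle C example can be sketched in a few lines. The sketch numbers harmonics so that harmonic n sits at n times the fundamental, with n = 1 being the fundamental itself. That numbering, the function names, and the two weight lists are assumptions chosen for illustration.

```python
import math


def harmonic_series(f0_hz, count):
    """Frequencies n * f0 for n = 1..count (n = 1 is the fundamental)."""
    return [n * f0_hz for n in range(1, count + 1)]


def complex_tone(t, f0_hz, weights):
    """Sample at time t (seconds) of a tone whose n-th harmonic has weight weights[n-1]."""
    return sum(w * math.sin(2 * math.pi * n * f0_hz * t)
               for n, w in enumerate(weights, start=1))


MIDDLE_C = 262  # Hz, as in the example above

print(harmonic_series(MIDDLE_C, 4))  # [262, 524, 786, 1048]

# Same fundamental (same pitch), different harmonic weights (different timbre).
bright = [1.0, 0.8, 0.6, 0.4]
mellow = [1.0, 0.2, 0.05, 0.0]
print(complex_tone(0.001, MIDDLE_C, bright), complex_tone(0.001, MIDDLE_C, mellow))
```

Both tones repeat every 1/262 of a second, so they share a pitch. Only the mix of harmonics differs, and that mix is what the notes call timbre.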


Source-Filter Theory of Speech Production

  • Sources: Vocal folds generate sound.

  • Filters: The vocal tract shapes the sounds generated by the source, modifying them to produce different phonemes.

    • Resonance: Vocal tract shape affects which frequencies are amplified. Formants are enhanced frequency bands (usually 3-4 formants in speech).
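
The source-filter idea can be sketched by passing one source through two different filters. This is a deliberately crude sketch under stated assumptions: an impulse train stands in for the vocal folds, a single two-pole resonator stands in for one formant, and the sample rate, bandwidth, and all names are made up for the example.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz; assumed for this sketch


def glottal_source(f0_hz, seconds):
    """Impulse train: one pulse per cycle, a crude stand-in for vocal fold vibration."""
    period = round(SAMPLE_RATE / f0_hz)
    return [1.0 if n % period == 0 else 0.0
            for n in range(int(seconds * SAMPLE_RATE))]


def resonator(signal, center_hz, bandwidth_hz):
    """Two-pole filter that boosts frequencies near center_hz (one formant)."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    theta = 2 * math.pi * center_hz / SAMPLE_RATE
    a1, a2 = 2 * r * math.cos(theta), -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out


def magnitude_at(signal, freq_hz):
    """Magnitude of the signal's component at freq_hz (single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE)
              for n, x in enumerate(signal))
    return abs(acc) / len(signal)


def strongest_harmonic(signal, f0_hz, count=10):
    """Harmonic of f0_hz carrying the most energy in the last 0.1 s of the signal."""
    steady = signal[-800:]  # skip the filter's start-up transient
    return max((f0_hz * k for k in range(1, count + 1)),
               key=lambda f: magnitude_at(steady, f))


source = glottal_source(100, 0.5)           # same source in both cases
formant_500 = resonator(source, 500, 100)   # filter shape A
formant_700 = resonator(source, 700, 100)   # filter shape B
print(strongest_harmonic(formant_500, 100), strongest_harmonic(formant_700, 100))
```

The source has equally strong harmonics. Which harmonic ends up strongest depends only on where the filter's resonance sits, which is the role the notes assign to vocal tract shape.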


Cues in Speech Perception

  • Formant Transitions: Rapid changes in formant frequencies that occur just before or after a stop, as the vocal tract moves into or out of the closure.

  • Categorical Perception: Listeners discriminate speech sounds that fall in different categories much better than equally large variations within one category (e.g., along a voice onset time continuum).

  • Top-Down Processing: Contextual cues that help listeners process speech even when parts are missing or mispronounced.

    • Example: Phoneme Restoration Effect shows that we can perceive words even when phonemes are obscured.
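
Categorical perception along a voice onset time (VOT) continuum can be illustrated with a toy labeling function. This is a sketch, not a fitted model: the logistic form, the 25 ms boundary, the slope, and the function names are all assumptions chosen for illustration.

```python
import math


def p_voiceless(vot_ms, boundary_ms=25.0, slope=0.5):
    """Toy probability of labeling a stop as voiceless (e.g. /p/ rather than /b/)."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary_ms)))


def predicted_discriminability(vot_a_ms, vot_b_ms):
    """If listeners only compare labels, discrimination tracks the labeling difference."""
    return abs(p_voiceless(vot_a_ms) - p_voiceless(vot_b_ms))


# Two pairs with the same 20 ms physical difference.
across = predicted_discriminability(15, 35)   # straddles the assumed boundary
within = predicted_discriminability(45, 65)   # both on the voiceless side
print(across, within)
```

The two pairs are equally far apart acoustically, but only the pair that straddles the boundary is predicted to be easy to tell apart.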


Neural Mechanisms and Models

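  • Dual-Stream Model (Hickok and Poeppel): Speech is processed along two pathways from auditory cortex: a ventral stream that maps sound onto meaning (comprehension) and a dorsal stream that maps sound onto articulation (production).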

  • TRACE Model: An interactive activation model that explains how features, phonemes, and words are processed in a layered manner, allowing top-down and bottom-up influences.
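
The combination of bottom-up and top-down influence can be sketched with a very small interactive-activation loop. This is a toy in the spirit of TRACE, not the TRACE model itself: it omits the feature layer, decay, and lateral inhibition, and the lexicon, activation values, and names are hypothetical.

```python
def settle(evidence, lexicon, steps=5, feedback=0.2):
    """Alternate bottom-up (phoneme -> word) and top-down (word -> phoneme) updates."""
    phonemes = dict(evidence)
    words = {}
    for _ in range(steps):
        # Bottom-up: a word's activation is the mean activation of its phonemes.
        words = {
            word: sum(phonemes.get((pos, ph), 0.0) for pos, ph in enumerate(segs)) / len(segs)
            for word, segs in lexicon.items()
        }
        # Top-down: each word boosts the phonemes it contains (capped at 1.0).
        for word, segs in lexicon.items():
            for pos, ph in enumerate(segs):
                boosted = phonemes.get((pos, ph), 0.0) + feedback * words[word]
                phonemes[(pos, ph)] = min(1.0, boosted)
    return phonemes, words


# Hypothetical input: the first sound is ambiguous between /g/ and /k/,
# and the lexicon contains "gift" but no word "kift".
lexicon = {"gift": ("g", "ɪ", "f", "t")}
evidence = {(0, "g"): 0.5, (0, "k"): 0.5, (1, "ɪ"): 1.0, (2, "f"): 1.0, (3, "t"): 1.0}

phonemes, words = settle(evidence, lexicon)
print(phonemes[(0, "g")], phonemes[(0, "k")])
```

Both candidate phonemes start with equal acoustic support. Only the one that belongs to a known word receives feedback, so the ambiguity resolves in favor of the lexical item.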


Co-Articulation and Context Effects

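  • Co-articulation refers to how neighboring phonetic segments overlap in production. It hinders perception by blurring segment boundaries and making each sound vary with its context, but it also helps by giving listeners early cues to upcoming sounds.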

  • Plasticity and Age of Acquisition: Neuroplastic changes occur based on language exposure and adaptability.


Final Notes

  • Speech perception is a complex interplay of acoustic signals, cognitive processes, and contextual factors, supported by a biological basis for integration between perception and production.

