Audiovisual Speech Perception and Speechreading Training Notes


  • Lipreading relies solely on visual cues, while speechreading utilizes both auditory and visual signals, including facial expressions and gestures.
  • Even people with normal hearing use speechreading in noisy environments or when watching dubbed films.
  • fMRI studies show the auditory cortex activates when attempting to recognize speech visually.
  • Even infants engage in speechreading; they can visually recognize when familiar words are mispronounced.
  • Individuals with hearing loss depend more on visual signals for speech recognition.

Characteristics of a Good Lipreader

  • Lipreading performance varies widely.
  • Intelligence, education, hearing loss duration, gender, and gaze behavior do not predict lipreading ability.
  • Cognitive skills such as spatial working memory and processing speed predict lipreading scores, accounting for 46% of the variance in some studies.
  • Spatial working memory involves recalling nonverbal items, while processing speed is measured by reaction time tasks.
  • Young adults lipread better than older adults, and people with congenital hearing loss lipread better than those with normal hearing.

How Lipreading Works

  • A talker's face provides cues for speech sounds and prosodic patterns.
  • Eyes scan for phonetic and prosodic cues, using visual fixations and saccades.
  • Lipreaders monitor different facial regions based on information sought: upper face for prosody, lower face for phonetic judgments.
  • In unfavorable conditions (noise), lipreaders fixate on the mouth and use a "saccades-toward-mouth" strategy.

Difficulty of Lipreading

  • Most people recognize fewer than 20% of words through lipreading alone, for several reasons:

    • Visibility of sounds: 60% of speech sounds lack visible mouth movement.
    • Rapidity of speech: Speakers produce 4-7 syllables per second, exceeding the eye's processing capacity.
    • Coarticulation and stress: Sounds vary based on phonetic and linguistic context.
    • Visemes and homophenes: Sounds and words look alike on the face.
    • Talker effects: Different mouth movements among talkers.
  • Consonants with bilabial closure, upper teeth to lower lip contact, or tongue tip to upper teeth contact are more visible.

  • Features like voicing are not visible.

  • Vowels are less visibly distinctive but acoustically salient for those with hearing loss.

  • Coarticulation and stress alter sound appearance based on context.

  • Visemes are groups of sounds that look alike (e.g., /p, b, m/).

  • Homophenes are words that look identical on the mouth; an estimated 47%-56% of English words are homophenous.

  • Context and grammatical cues help reduce confusion.
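The viseme/homophene idea above can be sketched in code: map each phoneme to a viseme class, and any two words whose phoneme sequences collapse to the same viseme sequence are homophenes. The viseme grouping and tiny lexicon below are simplified illustrations, not a standard inventory:

```python
# Simplified, illustrative viseme classes (real inventories vary by study).
VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "t": "alveolar", "d": "alveolar", "n": "alveolar",
    "ae": "open-vowel",
}

def viseme_sequence(phonemes):
    """Collapse a phoneme sequence to the sequence visible on the face."""
    return tuple(VISEME[p] for p in phonemes)

# "bat", "mat", and "pat" differ only in voicing/nasality of the initial
# bilabial, which is not visible; "fat" starts with a visibly different viseme.
words = {
    "bat": ["b", "ae", "t"],
    "mat": ["m", "ae", "t"],
    "pat": ["p", "ae", "t"],
    "fat": ["f", "ae", "t"],
}

# Group words by what they look like; groups with more than one member
# are homophenous.
groups = {}
for word, phones in words.items():
    groups.setdefault(viseme_sequence(phones), []).append(word)

for seq, members in groups.items():
    if len(members) > 1:
        print("homophenous:", sorted(members))  # bat/mat/pat look alike
```

This also shows why context matters: nothing on the face distinguishes /p, b, m/, so the listener must use grammatical and situational cues to pick among the surviving candidates.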

How Speechreading Works

  • Speechreading integrates auditory and visual cues to form a unified percept.
  • In audiovisual integration, the auditory and visual signals are mapped onto a phonetic prototype simultaneously; vision biases the phonetic decision about the auditory signal before a final judgment is made about what was heard.
  • The McGurk effect demonstrates obligatory integration of auditory and visual information in speech perception.
  • The ventriloquism illusion shows that the brain integrates the sensory modalities into a single synchronized, unified percept.
  • Crossmodal enhancement occurs when stimulus presentation in one sensory modality affects perceptual performance in another.
  • Some models posit a distinct stage for audiovisual integration; aging appears to impair lipreading more than the integration process itself.
  • The Neighborhood Activation Model (NAM) suggests lexical candidates compete for matching the incoming stimulus.
  • Words belong to both auditory and visual lexical neighborhoods.
  • Simultaneous activation of acoustic and visual lexical neighborhoods winnows members in the intersection as speech unfolds.
  • Residual hearing and multiple factors influence the process.
  • Crossmodal enhancement can be assessed with speech detection paradigms, showing that even a crude signal that mimics moving lips can affect performance.
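The NAM's winnowing process can be sketched as set intersection: as the audiovisual signal unfolds, surviving lexical candidates must remain consistent with both the auditory and the visual evidence. The tiny lexicon and candidate sets below are a hypothetical illustration, not data from the model:

```python
# Hypothetical neighborhoods for an incoming word. Auditory evidence is
# ambiguous about place of articulation; visual evidence is ambiguous
# about voicing/nasality -- but their intersection is small.
auditory_neighborhood = {"bat", "bad", "ban", "pat", "pad"}
visual_neighborhood = {"bat", "mat", "pat", "ban", "man"}

# Simultaneous activation: only candidates in both neighborhoods survive.
candidates = auditory_neighborhood & visual_neighborhood
print(sorted(candidates))  # ['ban', 'bat', 'pat']

# As the word unfolds (e.g., the final segment looks alveolar, like /t/),
# the intersection is winnowed further.
consistent_with_final_t = {"bat", "mat", "pat"}
candidates &= consistent_with_final_t
print(sorted(candidates))  # ['bat', 'pat']
```

The design point is that neither modality alone narrows the field much; the intersection of the two neighborhoods does, which is why even a degraded auditory signal boosts speechreading.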

Importance of Residual Hearing

  • Even minimal auditory information enhances speechreading.
  • Fundamental frequency changes improve consonant identification.
  • Residual hearing aids in extracting suprasegmental patterns, conveying syllabic structure, word boundaries, syntax, and semantics.
  • A degraded auditory signal may provide segmental information, such as whether a sound is voiced or unvoiced.

Factors Affecting Speechreading

  • Factors include the talker, message, environment, and speechreader.

The Talker

  • Clear speech, characterized by slowed rate and good enunciation, improves intelligibility.

  • Gestures and facial expressions convey prosodic cues.

  • Familiar talkers and talkers with a similar accent are easier to understand. Talker gender matters less for the visual signal, although the higher fundamental frequency of female voices can be harder for listeners with hearing loss to hear than the lower fundamental frequencies of male voices.

  • Facial hair impedes speechreading.

The Message

  • Structure, word frequency, similar-looking words, and context affect speechreading.
  • Situational cues facilitate speechreading.

The Speechreading Environment

  • Viewing angle (frontal better than lateral) and distance affect performance.
  • Lighting is crucial; contrast sensitivity impacts benefit.
  • Background noise impairs performance by masking speech and distracting the speechreader with interfering movement.

The Speechreader

  • Lipreading skill and residual hearing improve speechreading.
  • Listeners with high-frequency hearing loss benefit more from visual cues.
  • Appropriate amplification, eyeglasses, stress levels, fatigue, and attentiveness influence performance.

Assessing Speech Recognition

  • Vision-only and audition-plus-vision tests assess recognition and enhancement.
  • Speechreading enhancement indicates how effectively patients use residual hearing.
  • Testing provides objective evidence of amplification benefits.
  • Visual tests date back to 1913 with Edward Nitchie's film of proverbs.
  • The CUNY Sentences and the Iowa Phoneme and Sentence Test are examples of tests for adults; the Children’s Audiovisual Enhancement Test (CAVET) and the Children’s Build-A-Sentence Test are examples for children.

Calculations

  • Simple difference score: AV% correct − V% correct
  • Normalized ratio score: (AV% correct − V% correct) / (100% − V% correct)
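Both scores are easy to compute directly; a minimal sketch (the function names are illustrative, not from any standard library):

```python
def difference_score(av_correct: float, v_correct: float) -> float:
    """Simple difference score: AV% correct minus V% correct."""
    return av_correct - v_correct

def normalized_ratio_score(av_correct: float, v_correct: float) -> float:
    """Normalized ratio score: enhancement relative to the room left
    for improvement over vision-only performance.

    Scores are percentages on a 0-100 scale; undefined at V% = 100.
    """
    if v_correct >= 100.0:
        raise ValueError("undefined when the vision-only score is 100%")
    return (av_correct - v_correct) / (100.0 - v_correct)

# Example: vision-only 30% correct, audition-plus-vision 65% correct.
print(difference_score(65, 30))                   # 35 percentage points
print(round(normalized_ratio_score(65, 30), 2))   # 0.5
```

The normalized score is useful when comparing patients with very different vision-only baselines: the same 35-point difference counts for more when little room for improvement remained.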

Traditional Training Methods

  • Early 20th-century programs included Bruhn's Mueller-Walle method (analytic), Nitchie's method (synthetic), the Kinzie method (eclectic), and Brauckmann's Jena method (mimetic/kinesthetic).

Analytic Training Objectives

  • Discriminate consonant pairs differing in place/voice.
  • Discriminate consonant pairs differing in manner/voice.
  • Discriminate consonants using four/six-item response sets.
  • Identify words from familiar vocabulary.

Synthetic Training Objectives

  • Discriminate multiword from single-word utterances.
  • Discriminate words with the same number of syllables.
  • Identify picture illustrations from one-sentence descriptions.
  • Answer simple related questions.
  • Focus on message gist with the help of stories and humorous anecdotes; mirror practice (lipreading yourself) also reinforces the link between speech production and speech perception.

Speechreading Training Today

  • Modern training combines analytic and synthetic elements.
  • Classes focus on the speechreading process, habits, and rules.
  • Rules to follow include: watch the talker's face, say something, ensure adequate ambient light, minimize background noise, choose a favorable location, and relax.
    The first rule of speechreading seems like an obvious recommendation, but some people become distracted by watching the talker’s hand gestures or they have a habit of listening with lowered eye gaze, instead of concentrating on the talker’s mouth movements.
  • Computer-based programs like Read My Quips and Seeing and Hearing Speech train comprehension and recognition.
  • The first class of a modern speechreading training program often is informational in nature.
  • There are several ways to improve speechreading; examples include, but are not limited to, assessing speechreading skills, considering the speechreading process, reflecting on one's habits and skills, and following the rules above.

Benefits of Training

  • Efficacy investigations are challenging due to participant heterogeneity and varying methods.
  • Some report improved performance, while others find marginal benefit.
  • Modest improvements of 10-15% are common.
  • Tutored self-instruction is effective.
  • Improvements may stem from better test-taking skills.

Oral Interpreters

  • Oral interpreters silently repeat messages, conveying mood and intent.
  • Certification is available through the Registry of Interpreters for the Deaf (RID).
  • Guidelines include maintaining confidentiality, not changing the message, and avoiding personal opinions.

Case Study: An Exceptional Lipreader

  • Exceptional lipreaders use deliberate strategies: holding information in working memory and filling in or updating misperceived information as context accrues.

Final Remarks

  • Emphasizes effective strategies, such as managing the environment, to minimize speechreading difficulty.
  • Audiovisual integration research may enhance training protocols.

Terms and Concepts to Remember

  • Speechreading
  • Lipreading
  • Functional magnetic resonance imaging (fMRI)
  • Auditory cortex
  • Spatial working memory
  • Processing speed
  • Visual fixation
  • Saccade
  • Audiovisual integration
  • Ventriloquism illusion
  • Crossmodal enhancement
  • High frequency of usage
  • Low frequency of usage
  • Visual lexical neighborhoods
  • Fundamental frequency
  • Favorable seating
  • Luminance
  • Speechreading enhancement
  • Mimetic
  • Variability in individual skill levels
  • Sound visibility
  • Coarticulation and stress effects
  • Visemes
  • Homophenes
  • Models of audiovisual integration
  • Neighborhood Activation Model (NAM)
  • Variables affecting performance
  • Clear speech
  • Frequency of usage
  • Twentieth-century methods
  • Kinesthetic
  • Class handouts
  • Computerized instruction
  • Oral interpreter
  • Oral transliteration