Lecture 11 - Voice Perception

Voice Perception

Presenter: Alice Milne, University of Lancaster
Contact: a.milne1@lancaster.ac.uk

Aims of the Session

Understanding Voice Information: Explore the multifaceted types of information conveyed through the human voice and the cognitive and physiological mechanisms that support these processes.
Perception of Emotions and Social Cues: Investigate how we derive emotional and social insights from voices and the implications these perceptions have on communication.
Brain Regions in Voice Processing: Identify and discuss the specific brain regions involved in processing voice information and how they function.
Correlational and Causal Methods: Use MRI to study correlations between brain activity and voice perception, and apply causal techniques such as Transcranial Magnetic Stimulation (TMS) to test whether specific brain regions are needed for voice processing tasks.

Interactive Exercise

Listening Activity: Participants will close their eyes and listen attentively to a voice, reflecting on the assumptions they make regarding:
- Identity: What can you infer about the speaker's identity based on vocal characteristics?
- Emotions: What emotions does the voice convey?
- Physical Traits: Can you guess any physical characteristics solely from the voice?
- Personality Traits: What aspects of personality can be discerned from vocal cues?

Importance of Voice in Social Interaction

Identification of Humans: Voices are critical for distinguishing humans from other, non-vocal sound sources. This auditory cue is essential for social bonding.
Social Information Extraction: Voices contain vital social information that aids listeners in deciding whether to approach or avoid individuals based on perceived intentions.
Types of Information Conveyed:
- Identity: Recognition of familiar individuals.
- Emotional State: Understanding the speaker's mood and emotional condition.
- Intentions: Assessing credibility and trustworthiness (e.g., are they sincere?).

How Voices Convey Information

Voice Production

The voice is produced in three primary stages:
1. Source (Larynx): Air pushed up from the lungs sets the vocal folds in the larynx vibrating, which produces sound.
  1. The larynx sits above the lungs and houses the vocal folds (vocal cords), allowing for the generation of sound waves that are then shaped into distinct vocal qualities.
2. Filter (Vocal Tract): The unique shape and size of the vocal tract modifies the sound, contributing to individual voice quality.
  1. The sound waves generated in the larynx travel through the vocal tract, where they are modified by the shape of the mouth, tongue, and lips, contributing to the unique timbre and clarity of the voice.
3. Output: The auditory result of this combination is the voice we perceive.
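
The three stages can be illustrated with a toy source-filter synthesiser. This is a minimal sketch under strong simplifying assumptions, not something taken from the lecture: the source is a bare impulse train, each vocal tract resonance is a simple two-pole resonator, and the function names and the resonance values in the usage line are invented for illustration.

```python
import math

def glottal_source(f0_hz, duration_s, sample_rate=16000):
    """Stage 1 (source): one impulse per vocal fold cycle, a crude stand-in for the larynx."""
    n = round(duration_s * sample_rate)
    period = max(1, round(sample_rate / f0_hz))  # samples between pulses
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(signal, centre_hz, bandwidth_hz, sample_rate=16000):
    """Stage 2 (filter): a two-pole resonator standing in for one vocal tract resonance."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1 so the filter is stable
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    a2 = -r * r
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesise(f0_hz, resonances, duration_s=0.05, sample_rate=16000):
    """Stage 3 (output): the source passed through every resonance in turn."""
    signal = glottal_source(f0_hz, duration_s, sample_rate)
    for centre_hz, bandwidth_hz in resonances:
        signal = resonator(signal, centre_hz, bandwidth_hz, sample_rate)
    return signal

# Illustrative values only: a 120 Hz source shaped by two resonances.
voice = synthesise(120.0, [(500.0, 80.0), (1500.0, 120.0)])
```

Changing `f0_hz` alters only the source (perceived pitch), while changing `resonances` alters only the filter (voice quality), which is the separation the source-filter description relies on.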

Physical Characteristics Influencing Voice

Body Size: Larger bodies tend to have longer vocal tracts, which lowers the resonant frequencies of the voice and changes its quality.
Gender Differences: Males typically possess larger vocal folds, resulting in a lower pitch, while females often have higher-pitched voices.
Age Variations: Adults generally have lower-pitched voices than children because their vocal structures are larger; these structures change during maturation.
- Children's voices are higher pitched.
Puberty Changes: In males, puberty brings about elongation and thickening of the vocal folds, significantly altering voice pitch and quality.
- Puberty elongates and thickens the vocal folds.
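
The link between vocal tract length and resonant frequency can be made concrete with a standard textbook idealisation that is not part of the lecture itself: treat the vocal tract as a uniform tube closed at the larynx and open at the lips, so its resonances fall at odd multiples of c / (4L). The sketch below assumes a speed of sound of 350 m/s, and the function name is hypothetical.

```python
def tube_resonances(tract_length_m, count=3, speed_of_sound=350.0):
    """Resonant frequencies (Hz) of a uniform tube closed at one end and open at the other."""
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * tract_length_m)
        for n in range(1, count + 1)
    ]

# A tract of 17.5 cm gives resonances near 500, 1500 and 2500 Hz;
# a shorter tract (here 10 cm) shifts every resonance upwards.
longer_tract = [round(f) for f in tube_resonances(0.175)]
shorter_tract = [round(f) for f in tube_resonances(0.10)]
```

Real vocal tracts are not uniform tubes, so these figures only indicate the direction of the effect: longer tract, lower resonances.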

Emotional Impact on Voice

Influence of Emotions: Emotional states such as stress, anger, and excitement affect the physical state of the vocal apparatus, including the tension, thickness, and flexibility of the laryngeal muscles and vocal folds, leading to distinct voice outputs.
Physiological Indicators: The physical characteristics of a voice reveal substantial information about the speaker's emotional and physiological state, which makes vocal analysis a useful source of evidence about emotion.

Voice Measurement

Voices can be quantitatively assessed through various parameters, including:
- Pitch (Fundamental Frequency, F0)
- Intensity (Amplitude)
- Tempo (Speed/Rhythm)
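
Two of these parameters can be estimated from raw samples with a few lines of code. The sketch below is a simplified illustration, not the measurement procedure used in the lecture: it assumes a clean, steady, voiced signal, estimates F0 from the strongest autocorrelation peak within an assumed 75 to 500 Hz search range, and reports intensity as RMS level in dB relative to an arbitrary reference of 1.0. Tempo is not covered. Function names are hypothetical.

```python
import math

def rms_intensity_db(samples, reference=1.0):
    """Intensity as root-mean-square amplitude, in dB relative to `reference`."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms / reference)

def estimate_f0(samples, sample_rate, f0_min=75.0, f0_max=500.0):
    """Pitch as the lag (within the search range) at which the signal best matches itself."""
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Raw autocorrelation at this lag.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Usage on a synthetic 200 Hz tone sampled at 8 kHz (0.1 s, amplitude 0.5).
tone = [0.5 * math.sin(2 * math.pi * 200 * i / 8000) for i in range(800)]
f0 = estimate_f0(tone, 8000)
level_db = rms_intensity_db(tone)
```

Real speech is noisier and changes over time, so practical tools analyse short overlapping frames and apply voicing checks before trusting a pitch estimate.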

Vocal Emotions and Their Recognition

Distinct Acoustic Profiles: Different emotional states produce unique acoustic profiles. Research indicates that emotions such as elation, sadness, and grief can be distinguished from one another based on variations in pitch and intensity.
Cross-Cultural Recognition: Emotions like joy, sadness, fear, anger, and disgust are recognized across cultures, suggesting a fundamental biological basis for vocal emotional expression.
Cultural Influences: While some universal recognition exists, cultural factors can also shape how vocal emotions are perceived and interpreted.

Social Trait Recognition from Voices

Social Indicator Studies: Research involving 320 participants showed that social traits could be extracted from brief vocal samples, such as greetings.
Consistency in Perceptions: High agreement among participants shows that certain voices are consistently perceived as conveying traits such as attractiveness or confidence.
We associate voice qualities with social traits at a collective level, suggesting that our perceptions are influenced by shared cultural norms and experiences.
This indicates that voice perception is not solely an individual experience, but also reflects broader societal patterns that shape how we interpret vocal characteristics.

Influence of Vocal Traits on Perception

Perception Formation: Voice tone and style have profound effects on perceptions of authority, reliability, and sincerity.
Environmental Context: The auditory environment also plays a role in voice modulation, affecting how messages are conveyed and received in different contexts.

Summary of Key Points

Voices transmit essential communicative information regarding physical, emotional, and social states.
Specialized neural substrates are responsible for the processing of vocal information, with implications for social interactions and emotional understanding.

Neural Mechanisms of Voice Processing

Temporal Brain Areas: Voice processing is primarily localized in temporal lobe regions, particularly the:
- Superior Temporal Gyrus (STG): This region shows increased activity during voice recognition tasks, linking it to the processing of vocal identity and emotional content.
Study Design: Block designs help match conditions for auditory complexity.

Testing Causal Roles in Voice Processing

Neuropsychological Investigations: Studies of brain injuries provide insights into the causal roles of specific brain areas in voice processing. Limitations:
- Patients are difficult to find, so sample sizes are limited.
- The location and extent of brain injuries are difficult to control.
- Neural plasticity: the brain rewires itself to compensate for cognitive impairment.
Functional Imaging Studies: Techniques such as fMRI and PET scans have been used to observe brain activity during voice perception tasks, revealing the involvement of various temporal regions and their interactions. These methods are correlational rather than causal.
Transcranial Magnetic Stimulation (TMS): TMS temporarily disrupts neuron functioning, allowing researchers to investigate causal relationships between brain areas and their roles in voice recognition and processing tasks.
- Induces electric currents in the targeted brain area
- Disrupts the functioning of neurons in that area
- Tests the behavioral outcome of stimulating the area
- Demonstrates a causal relationship between the area stimulated and the function

Advanced Studies of Voice Processing

Current research utilizing TMS emphasizes the brain's specialization in tasks related to voice recognition, such as discrimination between voice and non-voice stimuli.
TMS to the right temporal voice area impairs voice recognition.
Methods:
- 1) Use fMRI to identify the area that responds most to voices in individuals (localisation)
- 2) Apply TMS to that area
- Voice Task: Voice / Non-voice discrimination
- Control Task: Loud / Quiet discrimination
- Stimulation Sites: Voice Area / Control Area
TMS to individual Voice Areas impairs accuracy in voice/non-voice discrimination compared to TMS to the Control Site.
Low-level loudness discrimination is not affected by TMS to the Voice Areas.
Temporal voice areas are specialised for voice processing.

Emotional and Social Information Encoded in Voices

Intonation Representation: Emotional intonation manifests in distinct neural activation patterns, which can be mapped to vocal characteristics.
Identity and Emotion Encoding: Different voice features provide information about both the speaker's identity and their emotional state, contributing to effective interpersonal communication.

Advanced Studies of Voice Processing

Current research utilizing TMS (Transcranial Magnetic Stimulation) emphasizes the brain's specialization in tasks related to voice recognition, particularly in distinguishing between voice and non-voice stimuli.

TMS Impact on Voice Recognition: TMS applied to the right temporal voice area significantly impairs voice recognition abilities. This indicates that this area is crucial for processing voice information.

Methods:

fMRI Utilization:
- Researchers first use fMRI (functional Magnetic Resonance Imaging) to identify the brain area that responds most to voices in individuals, which aids in localization of voice processing regions.
TMS Application:
- After localization, TMS is applied to disrupt neuron functioning within the voice-sensitive area.

Voice Task Experiments:

Voice vs. Non-Voice Discrimination:
- The primary task requires participants to discriminate between voice and non-voice stimuli.
Control Task:
- A control task involving loud versus quiet discrimination is employed to compare the effects of TMS on both voice tasks and non-voice tasks.

Results of TMS Application:

Impaired Accuracy:
- TMS applied to individual voice areas results in decreased accuracy in the voice/non-voice discrimination task compared to when TMS is applied to a control site.
Loudness Discrimination:
- Importantly, loudness discrimination, a low-level auditory task, is not affected by TMS to the voice areas, which points to the specialization of temporal voice areas for voice processing.
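
The logic of this result is a dissociation: TMS to the voice area should cost accuracy on the voice task but not on the loudness task. The sketch below shows how that comparison could be computed per task. The accuracy values are made-up placeholders chosen only to illustrate the pattern; they are not data from the study, and the function name is hypothetical.

```python
from statistics import mean

def tms_cost(accuracy_control_site, accuracy_voice_area):
    """Mean per-participant drop in accuracy when TMS targets the voice area
    instead of the control site (positive values mean impairment)."""
    drops = [c - v for c, v in zip(accuracy_control_site, accuracy_voice_area)]
    return mean(drops)

# PLACEHOLDER proportions correct for four imaginary participants (not real data).
voice_task = {
    "control_site": [0.90, 0.88, 0.92, 0.86],
    "voice_area": [0.82, 0.80, 0.85, 0.79],
}
loudness_task = {
    "control_site": [0.84, 0.86, 0.83, 0.87],
    "voice_area": [0.85, 0.85, 0.83, 0.87],
}

voice_cost = tms_cost(voice_task["control_site"], voice_task["voice_area"])
loudness_cost = tms_cost(loudness_task["control_site"], loudness_task["voice_area"])
```

A clear cost on the voice task alongside a cost near zero on the loudness task is the pattern that supports specialization; a real analysis would also test the difference statistically.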

Right temporal voice area responds to angry vocalizations

Acoustic Characteristics of Anger:

Anger is often conveyed through specific changes in the acoustic properties of the voice, which can be quantified through various parameters.
Common features associated with anger include increased pitch, intensity (loudness), and tempo (speed of speech).
The pitch may become higher as emotional arousal increases, while intensity can be amplified, making the voice louder and more forceful.

Neural Mechanisms:

Anger recognition may involve increased activation of specific brain regions associated with emotional processing.
Research suggests that the amygdala plays a crucial role in the processing of angry vocalizations, facilitating the immediate emotional response to perceived threats.

Cultural and Contextual Influences:

Recognition of anger in voices can be influenced by cultural factors.
Individuals from different cultural backgrounds may interpret vocal cues related to anger differently based on their social norms and experiences.

Social Implications:

Accurate anger recognition is important for social interactions and communication.
Misinterpretation of an angry voice can lead to conflicts or miscommunication in social contexts.

Cross-Cultural Studies:

Research indicates that while some aspects of anger recognition are universal, the nuances can vary across cultures.
Studies have shown that participants are generally accurate at identifying anger in voices, but the degree of accuracy can fluctuate based on cultural familiarity with the speaker's background.

Emotional Context in Communication:

Anger can serve as a crucial signal during communication, indicating strong feelings that might influence others' responses and actions.
Understanding the dynamics of anger recognition aids in maintaining effective interpersonal relationships and adapting communication styles accordingly.

Right anterior temporal voice area responds to changes in speaker identity

Speaker Identity: Knowing who is speaking is essential for accurately interpreting emotional cues, as it allows listeners to assess the context and intent behind the message being conveyed.
- Neural adaptation is a gradual decrease over time in the responsiveness of the sensory system to a constant stimulus.
- This can be detected with fMRI, as activity is reduced in the adapting area.
- Some brain regions adapt to a repeated speaker identity in the voice.
- Some brain regions adapt to a repeated syllable in the voice.
- The right anterior temporal voice area is less active when the speaker stays the same, that is, when it is adapting to an identity.
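
The adaptation logic can be illustrated with a toy model. This is a sketch under invented assumptions, not a model from the lecture: a region tuned to speaker identity gives a full response to a new speaker, and its response shrinks by a fixed factor each time the same speaker repeats. The decay value of 0.7 is arbitrary and the function name is hypothetical.

```python
def simulate_adaptation(speakers, full_response=1.0, decay=0.7):
    """Response of an identity-sensitive region to a sequence of speakers.

    The response shrinks by `decay` on each repetition of the same speaker
    and recovers fully when the speaker changes.
    """
    responses = []
    previous = None
    current = full_response
    for speaker in speakers:
        if speaker == previous:
            current *= decay          # same speaker again: adapted, weaker response
        else:
            current = full_response   # new speaker: release from adaptation
        responses.append(current)
        previous = speaker
    return responses

# Three repetitions of speaker A, then a switch to speaker B.
trace = simulate_adaptation(["A", "A", "A", "B"])
```

In the trace, the response falls across the repeated speaker and returns to full strength at the switch. A region that instead adapts to the syllable would show the same pattern when the input sequence lists syllables rather than speakers.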

Concluding Session Summary

Specialized Brain Areas: Voice perception heavily relies on specialized brain regions known as Temporal Voice Areas.
Distinct Activation Patterns: These areas exhibit unique activation patterns linked to vocal emotions and the identities of speakers. Advancements continue in researching how voice characteristics are encoded and interpreted in the brain, promising further insights into the complexities of human communication and interaction.