Introduction to Speech Production
Definition and Purpose
Speech production is the multi-stage cognitive and physical choreography required to transform a non-linguistic idea into an acoustic signal.
It involves:
Conceptualization: Determining what to say (the preverbal message).
Formulation: Translating the message into linguistic form (grammar and phonology).
Articulation: Sending motor commands to the vocal tract to execute speech.
Conceptualization Before Speech
Thinking Process
Individuals select information from their memory and environment to construct a mental representation of their intent.
Examples of Speech Planning:
Highly structured situations, such as ordering at a restaurant, require less complex planning than high-stakes ones, such as negotiating a salary or explaining complex emotions to a partner.
Complexity of Speech Production
Factors Affecting Speech Initiation
The Bottleneck Effect: We can think much faster than we can speak, leading to a planning queue.
Syntactic Complexity: Sentences with nested clauses or passive voice require longer "lead times" for the brain to organize before the first word is uttered.
Two Primary Dimensions:
Syntactic Structure: Establishing the relationship between the subject, object, and verb (e.g., active vs. passive).
Phonological Elements: Retrieving the segment-by-segment sound structure of words (phonological encoding).
Distraction and Speech Production
Types of Distraction
Visual-Spatial Processing: Tasks like mental rotation or navigating a map compete for general cognitive resources, leading to increased pauses or simplified syntax.
Verbal Interference: Attempting to talk while reading or listening to other speech causes significant interference, as the "phonological loop" in working memory is overloaded.
Emotional and Contextual Elements
Expression of Emotion through Speech
Prosodic Cues: Stress, pitch, and duration change based on mood (e.g., high pitch often signals excitement or anxiety).
Persona Management: Speakers adjust their register (formal vs. informal) based on the social hierarchy of the listener to manage social standing.
Listener Perception: Anxiety can manifest as "speech disfluency" (fillers like "um" or "uh"), which listeners use as data to judge the speaker's confidence.
Phonetics and Motor System in Speech
Phonetic Production
Speech involves the coordination of over 100 muscles.
Biological Systems:
Respiratory: Controlling airflow from the lungs.
Laryngeal: Producing vibration in the vocal folds.
Supralaryngeal: Shaping sound using the tongue, lips, and soft palate.
Physiological Factors: Colds cause nasal resonance changes, while dental hardware (braces) alters the spatial targets for the tongue tip.
Distinction Between Speech and Language
Speech vs. Language
Speech: The physical medium (sound waves, articulation).
Language: The abstract system of rules (semantics, syntax, morphology) used to communicate.
A person can have a language disorder (e.g., aphasia) without a speech disorder, or a speech disorder (e.g., dysarthria) without a language disorder.
Visual Representation of Speech Production
Observation of Speech Mechanics
Techniques such as electropalatography (EPG), which records contact between the tongue and the roof of the mouth, and ultrasound tongue imaging capture rapid, overlapping articulatory movements, revealing that speech is a continuous stream rather than a series of isolated sounds.
Tongue Twisters and Speech Errors
Challenges of Tongue Twisters
Tongue twisters capitalize on "phonetic similarity": the brain's plan for the next sound overlaps with the current sound, so the two similar sounds interfere with each other and are easily swapped.
Speech Error Statistics
The average speaker makes about 1 to 2 errors per 1,000 words. These errors are not random: they respect phonotactic constraints, meaning even errors adhere to the sound-combination rules of the speaker's language.
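As a rough check on what that rate means in practice, the sketch below converts it into slips per hour of continuous talking. The speaking rate of 150 words per minute is an assumption added for illustration (it is not from these notes), and `expected_errors` is a hypothetical helper name.

```python
# Assumed conversational speaking rate; not a figure from the notes.
WORDS_PER_MINUTE = 150

def expected_errors(minutes, errors_per_1000, wpm=WORDS_PER_MINUTE):
    """Expected number of slips for a given amount of continuous speech."""
    words_spoken = minutes * wpm
    return words_spoken * errors_per_1000 / 1000

# One hour of talking at the low and high ends of the quoted range.
low = expected_errors(60, 1)
high = expected_errors(60, 2)
print(f"{low:.0f} to {high:.0f} errors per hour")  # 9 to 18 errors per hour
```

Under this assumption, even a fluent speaker would produce roughly one slip every few minutes, which is why large error corpora can be collected from everyday speech.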
Types of Speech Errors
Freudian Slip
Classically interpreted as the intrusion of repressed thoughts, though modern psycholinguistics often views such slips as semantic activation errors.
Detailed Categories:
Semantic Substitution: Replacing a word with another from the same semantic field (e.g., saying "pass the salt" when "pass the pepper" was intended).
Word Exchange Error: Entire words swap places, usually within the same grammatical category (e.g., nouns swap with nouns).
Morpheme Exchange Errors: The stems swap places while the affixes stay in their original positions (e.g., "I'm thinly slicing" becomes "I'm slicely thinning").
Spoonerism: Named after William Archibald Spooner; involves exchanging the initial sounds of two words (e.g., "a blushing crow" instead of "a crushing blow").
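The spoonerism can be sketched as a simple string operation: split each word into its initial consonant run and the remainder, then swap the consonant runs. This is a minimal illustration that works on spelling rather than real phonemes; `split_onset` and `spoonerize` are hypothetical names, and the function only swaps the first and last word of the phrase it is given.

```python
VOWELS = "aeiou"

def split_onset(word):
    """Split a word into (leading consonant letters, rest of the word)."""
    for i, ch in enumerate(word):
        if ch.lower() in VOWELS:
            return word[:i], word[i:]
    return word, ""  # no vowel letter found: treat the whole word as onset

def spoonerize(phrase):
    """Exchange the initial consonant runs of the first and last words."""
    words = phrase.split()
    first_onset, first_rest = split_onset(words[0])
    last_onset, last_rest = split_onset(words[-1])
    words[0] = last_onset + first_rest
    words[-1] = first_onset + last_rest
    return " ".join(words)

print(spoonerize("crushing blow"))  # blushing crow
```

Note that the swapped units here are whole consonant clusters ("cr" and "bl"), and the result is still a pair of pronounceable English words.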
Mechanisms of Speech Errors
Systematic Nature
Errors rarely result in sound combinations that are illegal in the speaker's language, suggesting a "monitor" checks for phonotactic validity before execution.
Error Categories:
Anticipation: A later sound appears too early (e.g., "reading a list" becomes "leading a list").
Perseveration: An earlier sound persists into a later word (e.g., "pull a punch" becomes "pull a pull").
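The direction of a slip can be made concrete with a small classifier that compares the intended and spoken phrases word by word. This is only a sketch under strong assumptions: it looks at word-initial consonant letters (a spelling-based stand-in for the onset), the names `onset` and `classify` are hypothetical, and the perseveration phrase "reading a rist" is a made-up example for illustration.

```python
VOWELS = set("aeiou")

def onset(word):
    """Leading consonant letters of a word (rough stand-in for its onset)."""
    i = 0
    while i < len(word) and word[i].lower() not in VOWELS:
        i += 1
    return word[:i]

def classify(intended, spoken):
    """Label a slip by comparing word onsets position by position."""
    target = [onset(w) for w in intended.split()]
    actual = [onset(w) for w in spoken.split()]
    if len(target) != len(actual):
        return "other"
    changed = [i for i, (t, a) in enumerate(zip(target, actual)) if t != a]
    if len(changed) == 2:
        i, j = changed
        # Two positions traded onsets with each other.
        if actual[i] == target[j] and actual[j] == target[i]:
            return "exchange"
    if len(changed) == 1:
        i = changed[0]
        # The intruding onset belongs to a later word: it arrived too early.
        if actual[i] and actual[i] in target[i + 1:]:
            return "anticipation"
        # The intruding onset belongs to an earlier word: it lingered.
        if actual[i] and actual[i] in target[:i]:
            return "perseveration"
    return "other"

print(classify("reading a list", "leading a list"))  # anticipation
print(classify("reading a list", "reading a rist"))  # perseveration
```

Because the sketch only inspects word-initial consonants, slips that affect other parts of a word fall into the "other" category.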
Causes of Speech Errors
General Rules of Language
Over-regularization (e.g., saying "goed" instead of "went") applies a regular rule to an irregular form, showing the brain's reliance on grammatical templates.
Competing Thoughts
The Lemma Selection Process: When two words have similar meanings, both are "activated" in the brain; if the competition isn't resolved, a blend or substitution occurs.
Monitoring and Adjusting in Speech Communication
Common Ground
Successful communication requires "audience design," where the speaker estimates what the listener already knows.
Egocentric Heuristic
The tendency for speakers to assume their internal state or knowledge is obvious to the listener, leading to ambiguous pronouns like "it" or "that."
Speech Production Processes Model
Levelt’s Model (1989)
Stage 1: Conceptualization: Generating the intention to speak.
Stage 2: Formulation:
Lexical Selection: Choosing the lemma (the abstract word representation carrying meaning and syntactic properties).
Morpho-phonological Encoding: Choosing the lexeme (the word's phonological form).
Stage 3: Articulation: Executing the motor plan to produce speech sounds.
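The staged flow of the model can be pictured as a pipeline of three functions, each consuming the previous stage's output. The sketch below is a toy illustration, not an implementation of Levelt's model: the tiny lexicon, the function names, and the use of spelled segments in place of phonemes are all assumptions, and inflection is left out entirely.

```python
# Toy lexicon (illustrative assumption): concept -> lemma -> lexeme.
LEMMAS = {"DOG": "dog", "CHASE": "chase", "CAT": "cat"}
LEXEMES = {
    "dog": ["d", "o", "g"],
    "chase": ["ch", "a", "s", "e"],  # spelled segments, not real phonemes
    "cat": ["c", "a", "t"],
}

def conceptualize(agent, action, patient):
    """Stage 1: build a preverbal message made of concepts, not words."""
    return {"agent": agent, "action": action, "patient": patient}

def formulate(message):
    """Stage 2: lexical selection, then morpho-phonological encoding."""
    # Lexical selection: one lemma per concept, in active-voice order.
    lemmas = [LEMMAS[message[role]] for role in ("agent", "action", "patient")]
    # Morpho-phonological encoding: retrieve each lemma's sound form.
    return [LEXEMES[lemma] for lemma in lemmas]

def articulate(plan):
    """Stage 3: 'execute' the plan; here the segments are simply joined."""
    return " ".join("".join(segments) for segments in plan)

utterance = articulate(formulate(conceptualize("DOG", "CHASE", "CAT")))
print(utterance)  # dog chase cat
```

Keeping the lemma and lexeme tables separate mirrors the model's claim that a word's meaning and syntax are selected before its sound form is retrieved.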
Alternative Theories of Speech Production
Spreading Activation Theory (Dell, 1986)
Suggests that activation spreads through a network of nodes at several levels (semantic, morphological, phonological). Errors occur when an incorrect node has more "activation" than the target node at the moment of selection.
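The core idea, that the most active node wins and that noise can let a competitor win, can be simulated in a few lines. This is a heavily simplified sketch and not Dell's actual model: it has a single step from semantic features to words, and the lexicon, feature sets, noise level, and function names are all assumptions chosen for illustration.

```python
import random
from collections import Counter

# Each word node is linked to a set of semantic feature nodes (assumed values).
LEXICON = {
    "cat": {"animal", "pet", "meows"},
    "dog": {"animal", "pet", "barks"},
    "table": {"furniture"},
}

def select_word(intended_features, noise_sd, rng):
    """Spread activation from features to words, add noise, pick the maximum."""
    activation = {
        word: len(features & intended_features) + rng.gauss(0.0, noise_sd)
        for word, features in LEXICON.items()
    }
    return max(activation, key=activation.get)

def simulate(trials=1000, noise_sd=1.0, seed=0):
    """Repeatedly try to say 'cat' and count which word is selected."""
    rng = random.Random(seed)
    target = LEXICON["cat"]
    return Counter(select_word(target, noise_sd, rng) for _ in range(trials))

counts = simulate()
print(counts)
```

In runs of this sketch the semantically related competitor ("dog") is selected far more often than the unrelated one ("table"), which parallels the observation that substitutions tend to come from the same semantic field. With `noise_sd=0.0` the target is always selected.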
Speech and Communication Nuances
Prosody and Discourse Markers
Prosody: The "melody" of speech (rhythm, stress, and pitch), which provides emotional context or distinguishes a question from a statement.
Discourse Markers: Words like "well," "so," or "actually" that manage the flow of conversation and indicate how the listener should interpret the following message.
The Impact of Accents
Influence and Perception
Standard vs. Non-standard: Society often unfairly correlates standard accents with higher intelligence or authority.
Implicit Bias: Research suggests listeners may comprehend and remember less when listening to an unfamiliar accent, because decoding it increases "cognitive load."
Conclusion and Reflection
Understanding speech production reveals it to be one of the most complex human behaviors, requiring the seamless integration of high-level cognition and ultra-fast motor control. Continued study helps in treating speech pathologies and improving human-computer interaction.