Chapter 16: Language and Computers

Speech Synthesis

Speech synthesis- the use of a machine, usually a computer, to produce human-like speech
Canned speech- prerecorded utterances and phrases
Synthesized speech- piecing together smaller recorded units of speech into new utterances
Intelligibility- how well listeners can recognize and understand the individual sounds or words generated by the synthesis system
Naturalness- how much the synthesized speech sounds like the speech of an actual person
Articulatory synthesis- a synthesis technique that generates speech “from scratch” based on computational models of the shape of the human vocal tract and the articulation processes
Source-filter theory- there are two independent parts to the production of speech sounds
- Source- the mechanism that creates a basic sound
- Filter- shapes the sound created by the source into the different sounds we recognize as speech sounds
Concatenative synthesis- a synthesis technique that strings together pieces of recorded speech and then smooths the boundaries between them
- Unit selection synthesis- takes large samples of speech and builds a database of smaller units from these speech samples
- Diphone synthesis- concatenates diphones, units that capture the transition between two adjacent sounds, running from the middle of one phone to the middle of the next
- Domain-specific synthesis- creates utterances from prerecorded words and phrases that closely match the words and phrases that will be synthesized
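
The "string together, then smooth the boundaries" idea behind concatenative synthesis can be sketched as follows. This is only an illustration under stated assumptions: each recorded unit is represented as a plain list of audio samples, and a linear crossfade stands in for the smoothing step (the notes do not say which smoothing method real systems use). The name crossfade_concat and the overlap parameter are hypothetical.

```python
def crossfade_concat(units, overlap):
    """Join recorded units end to end, blending `overlap` samples at each boundary.

    `units` is a list of sample lists (an assumed representation).
    A linear crossfade is one simple way to smooth a boundary, not the only one.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit rises across the overlap
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        # Append the part of the incoming unit that was not blended.
        out.extend(unit[n:])
    return out
```

With overlap set to 0 the units are simply placed back to back, which is where audible discontinuities at the joins would come from.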
Text-to-speech synthesis- speech generated directly from text entered with normal orthography

Automatic Speech Recognition

Automatic Speech Recognition- the conversion of an acoustic speech waveform into text
Noisy channel model- treats speech input as if it has been passed through a communication channel that garbles the speech waveform
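
The noisy channel model is commonly written as a search for the most probable word sequence given the garbled signal. The formulation below is the standard textbook one rather than something stated in these notes; the symbols O (the observed acoustic signal) and W (a candidate word sequence) are introduced here for illustration.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid O)
        \;=\; \arg\max_{W} \frac{P(O \mid W)\,P(W)}{P(O)}
        \;=\; \arg\max_{W} P(O \mid W)\,P(W)
```

The second step is Bayes' rule, and P(O) can be dropped because it is the same for every candidate W. Under this reading, P(O | W) is estimated by the acoustic and pronunciation models and P(W) by the language model.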
Components of an Automatic Speech Recognition System:
- Signal processing- recording the speech waveform with a microphone and storing it in a manner that is suitable for further processing by a computer
- Acoustic modeling- mapping the energy values extracted during signal processing onto symbols for phones
- Pronunciation modeling- filtering out unlikely sound sequences
- Language modeling- calculating the probability of word sequences
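
As a rough illustration of the language modeling step, the sketch below estimates the probability of a word sequence from bigram counts. It assumes a bigram model with add-one smoothing, which is one simple choice and not necessarily what any particular recognizer uses; the function names and the training sentences in the usage note are made up for the example.

```python
from collections import Counter

def train_bigram(sentences):
    """Count words and adjacent word pairs in a list of training sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for sentence in sentences:
        words = sentence.lower().split()
        vocab.update(words)
        tokens = ["<s>"] + words + ["</s>"]   # sentence boundary markers
        contexts.update(tokens[:-1])          # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def sequence_probability(sentence, model):
    """Probability of a word sequence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    p = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing keeps unseen pairs from getting probability zero.
        p *= (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
    return p
```

Trained on the toy sentences "recognize speech", "recognize speech", and "wreck a nice beach", this model assigns "recognize speech" a higher probability than "speech recognize", which is the kind of preference a recognizer uses to choose among acoustically similar candidates.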
Parameters of Speech Recognition Systems
- Speaking mode- whether the system accepts only isolated-word input or also continuous speech input
- Vocabulary size- the number of words the system can recognize; larger vocabularies generally make accurate recognition harder
- Speaker Enrollment- whether the system must first be trained on a specific speaker's voice or can be used by any speaker without training

Communicating with Computers

Interactive Text-Based Systems- dialogue carried on between computer and user via text
- Word spotting- a program focuses on words it knows and ignores ones it doesn’t
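
Word spotting can be sketched in a few lines. This is a minimal illustration, assuming the program's known words are stored in a set or as the keys of a dictionary of canned replies; the names spot_words and respond, and the default reply, are hypothetical.

```python
import re

def spot_words(utterance, known_words):
    """Return the known words found in the utterance, in order; ignore the rest."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    return [t for t in tokens if t in known_words]

def respond(utterance, responses, default="Please go on."):
    """Reply based on the first spotted keyword, or fall back to a default."""
    hits = spot_words(utterance, responses)
    return responses[hits[0]] if hits else default
```

For example, spot_words("I would like to book a flight to Boston, please", {"flight", "boston", "hotel"}) keeps only "flight" and "boston"; every other word is ignored rather than treated as an error.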
Spoken-Language Dialogue Systems- dialogue carried on between computer and user via speech
- Isolated speech- the user speaks the input clearly and without extraneous words
- Continuous speech- input can be more like normal speech
Components of a Spoken-language Dialogue System
- Automatic Speech Recognition- combining levels of linguistic knowledge in order to allow speaker-independent understanding of continuous speech
- Language Processing and Understanding- the system must decipher not only individual words, but also the intention of the speaker
- Dialogue Management- the system needs to understand the intentional structure of the conversation
- Text Generation- the use of computers to respond to humans using natural language by creating sentences that convey the relevant information
- Speech Synthesis- the words that make up the generated text must be converted into a sequence of sounds
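
The way these five components hand data to one another can be sketched as a single dialogue turn. Everything here is a toy stand-in: the intent labels, the canned replies, and the function names are hypothetical, and the recognizer and synthesizer are passed in as functions because real ones are far beyond a short example.

```python
def understand(text):
    """Language processing: map recognized words to a (hypothetical) intent."""
    words = text.lower().split()
    return {"intent": "get_weather" if "weather" in words else "unknown"}

def manage_dialogue(meaning, state):
    """Dialogue management: record the turn and choose the system's next move."""
    state.append(meaning["intent"])
    return "report_weather" if meaning["intent"] == "get_weather" else "ask_rephrase"

def generate_text(action):
    """Text generation: turn the chosen move into a sentence (placeholder wording)."""
    replies = {
        "report_weather": "Here is the forecast you asked for.",
        "ask_rephrase": "Could you rephrase that?",
    }
    return replies[action]

def dialogue_turn(audio, state, recognize, synthesize):
    """One turn: speech in, speech out, passing through all five components."""
    text = recognize(audio)                    # automatic speech recognition
    meaning = understand(text)                 # language processing and understanding
    action = manage_dialogue(meaning, state)   # dialogue management
    reply = generate_text(action)              # text generation
    return synthesize(reply)                   # speech synthesis
```

Keeping the conversation history in state is what lets the dialogue manager, in a fuller system, interpret a new utterance in light of earlier turns.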

Machine Translation

Translation- the task of converting the contents of a text written in one language into a text in another language
Machine translation- the use of computers to carry out translation
Problems:
- Context can often be removed
- Lexical ambiguity
Partial Automation- the source language text can first be pre-edited by a person so as to “prime” it for a machine translation system

Corpus Linguistics

Corpus- a collected body of text
Corpus linguistics- involves the design and the annotation of corpus materials that are required for specific purposes
A corpus can be composed of spoken, signed, or written language
Corpora can be classified by the genre of the source material
Balanced corpora- corpora that try to remain balanced among different genres
Reference corpus- a corpus of fixed size, consisting of a specified amount of text that has been collected and annotated
Monitor corpus- a corpus that keeps growing: as new texts continue to be written or spoken, more data is gathered