Chapter 16: Language and Computers

Speech Synthesis

  • Speech synthesis- the use of a machine, usually a computer, to produce human-like speech
  • Canned speech- prerecorded utterances and phrases
  • Synthesized speech- piecing together smaller recorded units of speech into new utterances
  • Intelligibility- how well listeners can recognize and understand the individual sounds or words generated by the synthesis system
  • Naturalness- how much the synthesized speech sounds like the speech of an actual person
  • Articulatory synthesis- a synthesis technique that generates speech “from scratch” based on computational models of the shape of the human vocal tract and the articulation processes
  • Source-filter theory- the theory that the production of speech sounds has two independent parts
    • Source- the mechanism that creates a basic sound
    • Filter- shapes the sound created by the source into the different sounds we recognize as speech sounds
  • Concatenative Synthesis- uses recorded speech by stringing together pieces of the recorded speech and then smoothing the boundaries between them
    • Unit selection synthesis- takes large samples of speech and builds a database of smaller units from these speech samples
    • Diphone synthesis- uses diphones as its units: pairs of adjacent sounds, made up of the end of one phone attached to the beginning of the next
    • Domain-specific synthesis- creates utterances from prerecorded words and phrases that closely match the words and phrases that will be synthesized
  • Text-to-speech synthesis- speech generated directly from text entered with normal orthography
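
The diphone idea in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `to_diphones`, the `#` silence marker, and the phone labels are hypothetical, and a real synthesizer would look up a recorded audio unit for each diphone and smooth the joins rather than return strings.

```python
def to_diphones(phones, boundary="#"):
    """Split a phone sequence into diphone labels.

    Each label names the transition from one phone to the next.
    The boundary symbol (an assumption of this sketch) stands for
    the silence before and after the utterance.
    """
    padded = [boundary] + list(phones) + [boundary]
    # Pair every phone with its right-hand neighbor.
    return [f"{left}-{right}" for left, right in zip(padded, padded[1:])]


# The word "cat" as three phones (labels are illustrative only).
print(to_diphones(["k", "ae", "t"]))
# → ['#-k', 'k-ae', 'ae-t', 't-#']
```

A concatenative system would then string together the recordings for these four units, which is why the database must cover every phone-to-phone transition the language allows.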

Automatic Speech Recognition

  • Automatic Speech Recognition- the conversion of an acoustic speech waveform into text
  • Noisy channel model- treats speech input as if it has been passed through a communication channel that garbles the speech waveform
  • Components of an Automatic Speech Recognition System:
    • Signal processing- recording the speech waveform with a microphone and storing it in a manner that is suitable for further processing by a computer
    • Acoustic modeling- mapping the energy values extracted during signal processing onto symbols for phones
    • Pronunciation modeling- filtering out sound sequences that are unlikely in the language
    • Language modeling- calculating the probability of sequences of words
  • Parameters of Speech Recognition Systems:
    • Speaking mode- whether the system accepts only isolated word input or also continuous speech input
    • Vocabulary size- the number of words the system can recognize, which affects its accuracy (larger vocabularies generally make recognition harder)
    • Speaker Enrollment- the system may or may not need to be trained to a specific voice
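
As a rough illustration of how the noisy channel model and language modeling fit together, the sketch below scores competing transcriptions by adding an acoustic score to a bigram language-model score and keeps the best one. Every number, the `BIGRAM` table, and the function names are invented for this example; a real recognizer estimates these probabilities from large amounts of data.

```python
import math

# Hypothetical bigram probabilities P(word | previous word).
BIGRAM = {
    ("<s>", "recognize"): 0.01,
    ("recognize", "speech"): 0.2,
    ("<s>", "wreck"): 0.005,
    ("wreck", "a"): 0.1,
    ("a", "nice"): 0.05,
    ("nice", "beach"): 0.02,
}


def lm_logprob(words, floor=1e-6):
    """Log probability of a word sequence under the toy bigram model.

    Unseen bigrams get a small floor probability (an assumption of
    this sketch, standing in for proper smoothing).
    """
    total = 0.0
    for prev, word in zip(["<s>"] + list(words), words):
        total += math.log(BIGRAM.get((prev, word), floor))
    return total


def decode(candidates):
    """Pick the word sequence maximizing acoustic score + language score.

    `candidates` maps a tuple of words to its acoustic log-likelihood,
    i.e. how well the sequence matches the garbled signal.
    """
    return max(candidates, key=lambda ws: candidates[ws] + lm_logprob(ws))


candidates = {
    ("recognize", "speech"): math.log(0.4),
    ("wreck", "a", "nice", "beach"): math.log(0.6),
}
print(decode(candidates))
# → ('recognize', 'speech')
```

Here the second candidate matches the sound slightly better, but the language model finds its word sequence far less probable, so the combined score favors the first.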

Communicating with Computers

  • Interactive Text-Based Systems- dialogue carried between computer and user via text
    • Word spotting- a program focuses on words it knows and ignores ones it doesn’t
  • Spoken-Language Dialogue Systems- dialogue carried between computer and user via speech
    • Isolated speech- the user speaks the input clearly and without extraneous words
    • Continuous speech- input can be more like normal speech
  • Components of a Spoken-language Dialogue System
    • Automatic Speech Recognition- combining levels of linguistic knowledge in order to allow speaker-independent understanding of continuous speech
    • Language Processing and Understanding- the system must decipher not only individual words, but also the intention of the speaker
    • Dialogue Management- the system needs to understand the intentional structure of the conversation
    • Text Generation- the use of computers to respond to humans using natural language by creating sentences that convey the relevant information
    • Speech Synthesis- the words that make up the generated text must be converted into a sequence of sounds
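
Word spotting, as used by interactive text-based systems in the list above, can be sketched as follows. The keyword table, the replies, and the function names are all hypothetical; the point is only that the program reacts to the words it knows and ignores everything else.

```python
import re

# Hypothetical keyword-to-reply table; a real system would have a far larger one.
REPLIES = {
    "hours": "We are open from 9 to 5.",
    "price": "Tickets cost ten dollars.",
}
DEFAULT_REPLY = "Sorry, I did not understand. Could you rephrase?"


def spot_words(user_input):
    """Return the known keywords found in the input, in order of appearance."""
    tokens = re.findall(r"[a-z']+", user_input.lower())
    return [token for token in tokens if token in REPLIES]


def respond(user_input):
    """Answer based on the first spotted keyword; unknown words are ignored."""
    spotted = spot_words(user_input)
    return REPLIES[spotted[0]] if spotted else DEFAULT_REPLY


print(respond("Um, what are your HOURS today?"))
# → We are open from 9 to 5.
```

Because nothing but the keywords is analyzed, such a system can appear to understand while missing the speaker's intention entirely, which is the gap the language processing and dialogue management components are meant to fill.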

Machine Translation

  • Translation- the task of converting the contents of a text written in one language into a text in another language
  • Machine translation- the use of computers to carry out translation
  • Problems:
    • Context is often missing from the text being translated
    • Lexical ambiguity
  • Partial Automation- the source language text can first be pre-edited by a person so as to “prime” it for a machine translation system

Corpus Linguistics

  • Corpus- a collected body of text
  • Corpus linguistics- the design and annotation of corpus materials that are required for specific purposes
  • A corpus can be composed of spoken, signed, or written language
  • Corpora can be classified by the genre of the source material
  • Balanced corpora- corpora that try to remain balanced among different genres
  • Reference corpus- a corpus of fixed size, made up of a specified amount of text that has been collected and annotated
  • Monitor corpus- a corpus that keeps growing: as new texts continue to be written or spoken, more data is gathered and added
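
The difference between a fixed reference corpus and a growing monitor corpus can be pictured with a small word-frequency sketch. The class name `MonitorCorpus`, its methods, and the sample sentences are assumptions made for illustration; real corpora also carry annotation, which is left out here.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class MonitorCorpus:
    """A toy corpus that keeps growing as new texts are added."""

    def __init__(self):
        self.counts = Counter()  # word -> frequency across all texts
        self.n_texts = 0

    def add_text(self, text):
        """Add one more text and update the running word frequencies."""
        self.counts.update(tokenize(text))
        self.n_texts += 1


corpus = MonitorCorpus()
corpus.add_text("The cat sat.")
corpus.add_text("The dog sat down.")
print(corpus.counts["the"], corpus.counts["sat"], corpus.n_texts)
# → 2 2 2
```

A reference corpus would correspond to calling `add_text` a set number of times and then freezing the counts, while a monitor corpus keeps accepting new material.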
