Q: Abjads do not usually represent vowels, only consonants
A: True
Q: How many bits does ASCII use to store English text?
A: 7 bits
Q: Automatic spelling checkers run on a whole document, find errors, and make corrections
A: True
Q: When TurnItIn flags a submitted homework as possible plagiarism, what type of machine learning did it probably use to do this?
A: Supervised machine learning (classification)
Q: Chatbots mimic informal human chatting while dialogue agents act as personal assistants to complete some task(s)
A: True
Q: ____ is when a speaker alternates between two or more languages, or language varieties, in the context of a single conversation or situation
A: Codeswitching
Q: Late assignments will be accepted with a medical or official accommodation letter
A: False
Q: Which of the following is NOT a cause for spelling errors?
A: Technological
Q: Machine learning is the main "engine" of modern artificial intelligence
A: True
Q: The earliest chatbot, ELIZA, was designed to mimic a paranoid schizophrenic
A: False
Q: A Wikipedia page is an example of structured data
A: False
Q: The phones that speakers hear are always pronounced exactly the same by every speaker in every word
A: False
Q: Machine learning involves a little bit of probability and statistics but relies a lot on world knowledge
A: False
Q: Consider the following example of learner language: they are very kind and friendship. Based on its distribution (word/linear order), what is the part-of-speech (POS) of friendship?
A: Adjective
Q: A grammatical constituent is a group of words that act as one unit in a sentence and can be substituted in a sentence by a constituent of the same category without violating the language's grammatical rules
A: True
Q: What is an example of structured data?
A: Excel spreadsheets
Q: Unicode is an improvement over ASCII because Unicode can encode all characters of all writing systems
A: True
Q: Match the type of strategy of doing feature engineering with its definition
a. Kitchen sink strategy
b. Hand-crafted strategy
A: (a) Use lots of features in the hope that some will be relevant and useful
(b) Use careful thought to try to identify, ahead of time, a small set of features that are likely to be relevant
Q: In multilingual NLP, the "vocabulary challenge" refers to the fact that some speakers use fewer words than speakers of other languages
A: False
Q: Consider the following example of learner language: they are very kind and friendship. Based on morphological evidence, what is the POS of friendship (hint: consider the POS of English words that are morphologically similar)?
A: Noun
Q: The basic principle of encoding writing in binary (such as with Unicode or ASCII) is that every ___ is encoded as a specific ___
A: Character, numeric value
Q: Which steps are involved in correcting the spelling of a word? Select all that apply
A: Rank candidates, detect an error
Q: Which one of these sentences is more likely to be generated by a Unigram language model?
A: Hill he late speaks; or! A more to leg less first you enter
Q: Which of these statements is a performative utterance (speech act)?
A: I forgive you
Q: What are examples of stop words?
A: a, the, in, but
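A minimal sketch of how stop words might be dropped before a search, using only the four examples from this card. The function name `remove_stop_words` and the whitespace tokenization are assumptions made for illustration; real systems use much longer stop lists.

```python
# Stop list taken from the card above; real stop lists are far longer.
STOP_WORDS = {"a", "the", "in", "but"}


def remove_stop_words(text: str) -> list[str]:
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


print(remove_stop_words("The cat sat in a box"))
```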
Q: Which citation style is required for the final writing requirement in this course?
A: APA
Q: A full representation of Unicode (UTF-32) takes __ bytes to encode one character
A: 4
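A quick check of the 4-byte claim in Python. The helper name `utf32_bytes` is hypothetical, and the `utf-32-be` codec is chosen here so that no byte-order mark is added to the count.

```python
def utf32_bytes(character: str) -> int:
    """Number of bytes UTF-32 uses for one character (no byte-order mark)."""
    return len(character.encode("utf-32-be"))


# Every character takes 4 bytes in UTF-32, whatever the writing system.
for character in ["A", "é", "中"]:
    print(character, utf32_bytes(character), len(character.encode("utf-8")))
```

The second number printed is the UTF-8 size, which varies by character, while the UTF-32 size stays fixed.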
Q: Hoow mnay isolated ssspeliing errors are in this questions?
A: 3
Q: Match the term with its characteristic
a. Supervised learning
b. Unsupervised learning
A: (a) The training data and the test data have been labeled with the desired "correct answers"
(b) There are no prespecified categories in the training data and the test data
Q: You notice over the course of a month that your spam filter correctly flagged 1300 spam messages, but it missed 25, and it incorrectly said that there were 100 spam messages (which were actually ham). This was out of a total of 2100 messages that you received. What is the precision of the spam filter?
A: 1300/1400
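The arithmetic behind this answer, as a small sketch. Precision is the share of flagged messages that really were spam; the variable names are chosen here for illustration.

```python
true_positives = 1300   # spam correctly flagged as spam
false_positives = 100   # ham incorrectly flagged as spam

# Precision = TP / (TP + FP): of everything flagged, how much was spam?
precision = true_positives / (true_positives + false_positives)
print(round(precision, 4))
```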
Q: Match the technique to its application in spellchecking.
a. Dictionary look-up
b. N-gram analysis (character-level)
c. Transition and confusion probabilities
A: (a) error detection
(b) error detection
(c) ranking correction candidates
Q: Sentiment analysis uses supervised machine learning to detect the attitude of a writer
A: True
Q: Speech is a kind of _______ that can be accomplished directly or indirectly.
A: Action
Q: In the translation triangle, the bottom corners are:
A: the source and target languages
Q: What are the two basic types of writing systems?
A: Meaning-based (logographic), Sound-based (letters)
Q: In natural language processing, the task of identifying dates, addresses, or names of people or companies in a text is referred to as __________.
A: Named entity recognition
Q: The first step used in probabilistic methods is:
A: count misspellings in the text
Q: "Erde" in German and "Earth" in English are examples of cognates.
A: True
Q: Which of the following is NOT a type of ASR system?
A: Language dependent
Q: What does the following equation represent? P(B|A) = P(A and B) / P(A)
A: Conditional probability
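A worked instance of the formula, assuming a fair six-sided die (an example not taken from the card): A is "the roll is even" and B is "the roll is greater than 3".

```python
from fractions import Fraction

outcomes = range(1, 7)                            # fair six-sided die
event_a = {o for o in outcomes if o % 2 == 0}     # even rolls: {2, 4, 6}
event_b = {o for o in outcomes if o > 3}          # rolls above 3: {4, 5, 6}

p_a = Fraction(len(event_a), 6)
p_a_and_b = Fraction(len(event_a & event_b), 6)   # {4, 6}

# P(B|A) = P(A and B) / P(A)
p_b_given_a = p_a_and_b / p_a
print(p_b_given_a)
```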
Q: Large Language Models (LLMs) like ChatGPT are corpus-based dialogue systems that require massive amounts of training to work successfully.
A: True
Q: The final essay for the class is graded as:
A: Satisfactory/Unsatisfactory/Fail
Q: Which pair of words are homophones?
A: kernel/colonel
Q: The basic set-up for machine learning is to train the statistical parameters of the model on some data (i.e., training set) and then use this trained model to compute probabilities of some new data (i.e., test set). This new data set should not be exactly the same data we trained on.
A: True
Q: Search engines rank search results based on how many other pages link to the pages in the results.
A: True
Q: The format of the final version of the essay should be:
A: PDF
Q: Match the technique to its application in spellchecking.
a. Similarity key
b. Minimum edit distance
A: (a) candidate generation
(b) ranking correction candidates
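A sketch of ranking candidates by minimum edit distance. It assumes the plain Levenshtein variant (insertion, deletion, and substitution, each costing 1), which may differ from the operation set or weights used in the course; the candidate list is invented for illustration.

```python
def min_edit_distance(source: str, target: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, start=1):
        current = [i]
        for j, target_char in enumerate(target, start=1):
            substitution_cost = 0 if source_char == target_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical candidate list for one misspelling.
candidates = ["creations", "corrections", "corrosion"]
ranked = sorted(candidates, key=lambda word: min_edit_distance("corretions", word))
print(ranked[0])
```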
Q: What is grounding in a conversation?
A: The hearer/listener acknowledges they understand
Q: A search engine does not actually search the whole Internet.
A: True
Q: What do ASR and TTS do?
A: ASR maps sound to text / TTS maps text to sound
Q: In binary, the positions in an eight-digit number encode:
A: 128s, 64s, 32s, 16s, 8s, 4s, 2s, 1s
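The place values can be checked with a short sketch. The function name `binary_to_decimal` is made up here; Python's built-in `int(bits, 2)` does the same conversion.

```python
PLACE_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]


def binary_to_decimal(bits: str) -> int:
    """Add up the place value of every position that holds a 1."""
    return sum(value for value, bit in zip(PLACE_VALUES, bits) if bit == "1")


print(binary_to_decimal("01000001"))  # one 64 plus one 1
```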
Q: Nothing is due on the writing requirement until the last day of class
A: False
Q: What type of error occurs in the following sentence?
They are leaving in about fifteen minuets to go to her house
A: Semantic error
Q: What does the "low-resource challenge" refer to?
A: Many languages have limited digital written data available
Q: What is the definition of intonation?
A: The rise and fall in a speaker's pitch (frequency)
Q: Which are THREE articulatory features that we use to describe consonants?
A: Place of articulation, voicing, manner of articulation
Q: Which is NOT one of the string edit operations that are important to count when building a spellchecker?
A: Redundancy
Q: Which type of analysis or model is the basis for probabilistic grammar checkers, large language models (LLM) such as ChatGPT, and predictive text applications?
A: n-grams
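A minimal sketch of extracting n-grams from a tokenized sentence. The helper name `ngrams` and the example sentence are assumptions; a real language model would go on to count these over a large corpus.

```python
def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return every run of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "please wash the dishes".split()
print(ngrams(tokens, 1))  # unigrams
print(ngrams(tokens, 2))  # bigrams
```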
Q: How many possible characters does ASCII have?
A: 128
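The count follows from the 7-bit width: 2 to the power of 7 is 128. A short check, with "A" chosen only as an example character:

```python
ascii_bits = 7
ascii_size = 2 ** ascii_bits          # number of distinct 7-bit patterns
print(ascii_size)

code_point = ord("A")                 # numeric value assigned to "A"
print(code_point, format(code_point, "07b"))

# Characters outside the 0-127 range are not ASCII.
print("A".isascii(), "é".isascii())
```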
Q: Match the term with its best example
a. Inflectional affix
b. Derivational affix
A: (a) walk - walks
(b) catch - catcher
Q: The simplest policy of classifying documents is to pretend that we are dealing with a completely unstructured collection of words (bag-of-words assumption), because it is realistic about how language is used
A: False
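A small illustration of why the bag-of-words assumption is unrealistic: two sentences with opposite meanings get the same representation once word order is thrown away. The example sentences are invented for this sketch.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count words and ignore their order."""
    return Counter(text.lower().split())


first = bag_of_words("dog bites man")
second = bag_of_words("man bites dog")
print(first == second)  # same bag, different meaning
```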
Q: In general, ASR systems go through these steps:
A: Information loss; acoustic signal processing; the recognition of sounds, group of sounds, and words
Q: Which of the following are properties of human conversation?
A: Turns, utterances, grounding
A language that is no longer being passed to younger generations
Match the Gricean conversational maxim to its definition:
quality
quantity
relevance/relation
manner
be truthful, be exactly as informative as required, be relevant to the purpose and direction of the exchange, be clear
What are two advantages searching unstructured data has over searching structured data?
Possibility to search any page that is indexed on the web, does not require as much human effort to prepare page
You notice over the course of a month that your spam filter correctly flagged 1300 spam messages, but it missed 25. Also, it incorrectly said that there were 100 spam messages (which were actually ham). This was out of a total of 2100 messages that you received. What is the recall of the spam filter?
1300/1325
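The arithmetic behind this answer, as a small sketch. Recall is the share of actual spam that the filter caught; the variable names are chosen here for illustration.

```python
total_messages = 2100
true_positives = 1300   # spam correctly flagged
false_negatives = 25    # spam the filter missed
false_positives = 100   # ham incorrectly flagged as spam

# Recall = TP / (TP + FN): of all real spam, how much was caught?
recall = true_positives / (true_positives + false_negatives)
print(round(recall, 4))

# The remaining messages are ham that was correctly let through.
true_negatives = total_messages - true_positives - false_negatives - false_positives
print(true_negatives)
```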
The four Gricean maxims, or rules, of conversation are based on what principle?
The cooperative principle: the speakers are trying to cooperate in the conversation
Typical applications of document classification include all of the following except:
authorship attribution
grammar checking
sentiment analysis
spam filtering
Grammar checking
A chatbot is different from a voice assistant because a voice assistant...
is usually a limited-domain dialogue system that is not designed to accomplish a particular task
We encounter an English learner who says "France in" and "the part at", e.g., "I saw him the party at." In other words, they are treating this as a postposition (found in other languages), instead of a preposition. What is the correct phrase structure that would catch this grammar error?
PP -> P NP
When you have four pencils but you answer "Two" when someone asks "How many pencils do you have?" this is an example of an uncooperative violation of the Maxim of Quantity or Quality.
True
Sentiment analysis is a kind of supervised machine learning (classification) that can automatically insert positive and negative sentiments into written texts.
False
One of the biggest challenges facing modern AI language technology is that most languages do not have sufficient data available to train state-of-the-art algorithms.
True
A corpus is an example of unstructured or semi-structured data.
True
In order for a computer to do anything with a language, it needs a way of representing language.
True
Interlingua
a language independent representation of a meaning in MT
Deskilling
An effect of introducing new technology: certain jobs can be carried out by less educated employees, earning lower wages
Stop words
Words which are ignored in searches
Parsing
Determining or annotating the structure of a sentence
What are two ways that a computer must be able to represent natural human languages?
audio, text
Which of these speech acts is direct?
can you wash the dishes?
wash the dishes.
please wash the dishes.
I'd like if you could wash the dishes
Wash the dishes, please wash the dishes
Which elements would an interlingua system ignore to represent the following sentence for both Spanish-to-English and English-to-Spanish translation? Did you see the white cow? Viste una vaca blanca?
Language specific structure
Which is a direct speech act?
It seems like it'd be cooler if the windows were open
could you open the window?
open the window
I like the window open
open the window
What are examples of language usage that go beyond the functional use of conveying information and voice requests?
expressing individual and social identity
The Latin word hospes can be translated into English as host, guest, friend, or stranger. Describe the relationship between the Latin word and the four English words in terms of hypernymy and hyponymy
Hospes can be considered a hypernym of host, guest, friend, and stranger; host, guest, friend, and stranger can be considered hyponyms of hospes
The earliest example of a dialogue system was called ELIZA
True
One of the earliest applications of AI was computer-assisted language learning
False
A bilingual speaker is defined as someone equally conversant in all domains in two languages
False
Modern state-of-the-art machine learning algorithms are called "neural networks" because they are an exact replica of human brain neuron pathways
False
Because the cooperative principle holds true for all conversations, its sub-principles known as the Gricean conversational maxims are never violated
False
Microtransistors
Basic electrical component of computers that store data in either 1s or 0s (binary), with each transistor being 1 bit of information
Bit
single unit of computer information
Byte
unit of information with 8 bits
2 main types of writing systems
Logographic (meaning) and letters (sound)
What are the types of logographic writing systems?
Pictographs (pictures), ideographs (ideas), semantic-phonetic (symbols)
What are the main types of letter writing systems?
alphabetic and syllabic
What are the types of alphabetic writing systems?
alphabet (phonemic alphabets, representing all sounds) and abjad (consonant alphabets)
What are the types of syllabic writing systems?
abugida (symbols represent a consonant+vowel, but vowel or consonant can be changed by changing the symbol or adding diacritics) and syllabary (separate symbol for each syllable of a language)