
Question Answering Notes

Information Retrieval vs. Question Answering

Information retrieval is the standard name, but traditionally it involved document retrieval, leaving further analysis to the user. Modern search engines have improved by combining information retrieval with knowledge graphs, inferencing, query history, location data, and natural language processing. Question Answering (QA) focuses on building systems that automatically answer questions posed by humans in natural language.

Examples of User Questions

Around 10-20% of query logs consist of questions like:

  • how much should I weigh
  • what does my name mean
  • how to get pregnant
  • where can I find pictures of hairstyles
  • who is the richest man in the world
  • what is the meaning of life
  • why is the sky blue
  • what is the difference between white eggs and brown eggs
  • can you drink milk after the expiration date
  • what is true love
  • what is the jonas brothers address

Historical Context: Google's Initial Approach

In the past, Google's approach involved finding the question as a string on the web and returning the subsequent sentence as the answer. This worked effectively for FAQ-style questions but often failed otherwise. A more sophisticated version combines knowledge graphs, N-grams, WordNet, and NLP techniques.

Example: Question: Who was the prime minister of Australia during the Great Depression? Answer: James Scullin (Labor) 1929–31

Semantic Difficulties in Question Answering

Many questions present semantic challenges, such as:

  • Who is Michael Jordan? (basketball player or machine learning expert?)

Entity identification and disambiguation are crucial.

The Necessity of Natural Language Processing (NLP)

NLP is essential because keyword matching is insufficient. Consider the question: "When was Wendy’s founded?" A passage might mention Wendy Moonan and the founding of the Murano glassmaking industry in 1291, leading to an incorrect answer.

Example: The renowned Murano glassmaking industry, on an island in the Venetian lagoon, has gone through several reincarnations since it was founded in 1291. Three exhibitions of 20th-century Murano glass are coming up in New York. By Wendy Moonan.

NLP Challenges: Predicate-Argument Structure

Identifying the relationship between entities is crucial. For instance, with the question "When was Microsoft established?", the system needs to differentiate between Microsoft establishing partnerships and the establishment of Microsoft itself. A correct answer might not even include the query term.

Example: Microsoft Corp was founded in the US in 1975, incorporated in 1981, and established in the UK in 1982.

Questions Requiring Inference

Some questions necessitate inference, posing a challenge for search engines.

Example: What is the distance between the largest city in California and the largest city in Nevada?

Data Limitations

Sometimes, the required data may not exist or be readily accessible.

Example: how many Ph.D. degrees in mathematics were granted by European universities in 1986?

Popular Question/Answering Products

  • Siri: Maps queries to known entities and uses existing internet databases.
  • Ask.com: Detects question type and uses search engine snippets.
  • IBM’s Watson: Combines the approaches used by Siri and Ask.com.
  • Google's Knowledge Graph: Uses an in-house entity-relationship graph and infers the answer.

Siri's Knowledge-Based Approach

Siri, initially a DARPA project, employs a multi-step process:

  1. Voice query recognition and language model application.
  2. Semantic representation of the query, extracting relevant data.
  3. Mapping semantics to structured data and resources like geospatial databases, ontologies (Wikipedia infoboxes, Freebase, WordNet, Yago), restaurant reviews (Yelp), scientific databases (Wolfram Alpha), and conventional search engines (Google, Bing).
  4. Transformation of the output into natural language and then back to speech.

Context and Conversation in Virtual Assistants

Coreference resolution helps resolve ambiguities, and clarification questions are used to understand the user's intent.

Examples: U: “book a table at Il Fornaio at 7:00 with my mom”. U: “also send her an email reminder”

U: “chicago pizza”. S: “Did you mean pizza restaurants in Chicago or Chicago-style pizza?”

IBM’s WATSON System

Watson is a QA computing system that applies natural language processing, information retrieval, knowledge representation, automated reasoning, and machine learning. Although it won a Jeopardy contest, it has faced challenges in meeting earlier expectations. However, it performs well on standard natural language tasks.

AskJeeves (Ask.com)

Previously known for specializing in Q&A, AskJeeves is now less effective than Google.

Question Types

Questions fall into distinct categories:

  • Who: Person, Organization
  • When: Date, Year
  • Where: Location
  • In What: Location
  • How many: Number
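
As a rough illustration (not part of the original notes), this mapping from question words to expected answer types can be written as a few pattern rules; the rule table below is a hypothetical simplification.

  import re

  # Hypothetical rule table covering the categories listed above.
  ANSWER_TYPE_RULES = [
      (r"^who\b", "PERSON_OR_ORGANIZATION"),
      (r"^when\b", "DATE_OR_YEAR"),
      (r"^where\b|^in what\b", "LOCATION"),
      (r"^how many\b", "NUMBER"),
  ]

  def detect_answer_type(question):
      q = question.strip().lower()
      for pattern, answer_type in ANSWER_TYPE_RULES:
          if re.search(pattern, q):
              return answer_type
      return "UNKNOWN"

  print(detect_answer_type("Who is the richest man in the world?"))            # PERSON_OR_ORGANIZATION
  print(detect_answer_type("How many chromosomes does a human zygote have?"))  # NUMBER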

Three Main Phases for Question/Answering

  1. QUESTION PROCESSING: Detect question type, identify entities, and formulate search engine queries.
  2. PASSAGE RETRIEVAL: Retrieve ranked documents (snippets), break them into passages, and match against entities.
  3. ANSWER PROCESSING: Extract and rank candidate answers using evidence from text and external sources.
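
A minimal, self-contained sketch of these three phases over a toy in-memory corpus is shown below; a real system would query a search engine in the retrieval phase, and the passages, stopword list, and answer patterns are illustrative assumptions.

  import re

  CORPUS = [
      "Microsoft Corp was founded in the US in 1975, incorporated in 1981, "
      "and established in the UK in 1982.",
      "The renowned Murano glassmaking industry was founded in 1291.",
  ]

  def question_processing(question):
      # Phase 1: detect the answer type and keep non-stopword keywords.
      answer_type = "DATE_OR_YEAR" if question.lower().startswith("when") else "UNKNOWN"
      stop = {"when", "was", "is", "the", "a", "an", "of"}
      keywords = [w for w in re.findall(r"\w+", question.lower()) if w not in stop]
      return answer_type, keywords

  def passage_retrieval(keywords):
      # Phase 2: rank passages by how many keywords they contain.
      scored = [(sum(k in p.lower() for k in keywords), p) for p in CORPUS]
      return [p for score, p in sorted(scored, reverse=True) if score > 0]

  def answer_processing(passages, answer_type):
      # Phase 3: extract candidates of the expected type (here: 4-digit years).
      if answer_type == "DATE_OR_YEAR" and passages:
          return re.findall(r"\b(1\d{3}|20\d{2})\b", passages[0])
      return []

  answer_type, keywords = question_processing("When was Microsoft established?")
  print(answer_processing(passage_retrieval(keywords), answer_type))
  # e.g. ['1975', '1981', '1982']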

Question Answering Block Architecture

This architecture includes question processing, passage retrieval, and answer extraction, utilizing tools like WordNet, NER (Named Entity Recognition), and POS (Part-of-Speech) parsers.

Question Taxonomy

Questions can be organized into taxonomies (e.g., reason, number, manner, location). Factoid questions (who, where, when, how many) have predictable answer categories.

General Capabilities for Question-Answering Systems

  1. Part-of-Speech Tagging: Assigns parts of speech (noun, verb, adjective, etc.), typically with hidden Markov models decoded by the Viterbi algorithm and trained with Baum-Welch, or with rule-based taggers such as the Brill tagger.
  2. Named Entity Extraction: Locates and classifies named entities (persons, organizations, locations, times, quantities).
  3. Determining Semantic Relations: Identifies meanings between entities using WordNet and dictionaries/thesauri.
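
The snippet below sketches these three capabilities with NLTK (the choice of toolkit is an assumption, and the relevant NLTK data packages must be downloaded first).

  import nltk
  from nltk.corpus import wordnet as wn

  # Requires NLTK data: the tokenizer, POS tagger, NE chunker, and WordNet
  # (install via nltk.download(...)).
  sentence = "Microsoft Corp was founded in the US in 1975."

  # 1. Part-of-speech tagging
  tokens = nltk.word_tokenize(sentence)
  tagged = nltk.pos_tag(tokens)         # e.g. [('Microsoft', 'NNP'), ('Corp', 'NNP'), ...]

  # 2. Named entity extraction (a tree with ORGANIZATION/GPE/... chunks)
  entities = nltk.ne_chunk(tagged)

  # 3. Semantic relations from WordNet: hypernyms of one sense of "engine"
  engine = wn.synsets("engine")[0]
  print(tagged)
  print(entities)
  print(engine.hypernyms())             # e.g. [Synset('motor.n.01')]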

Question Processing Tools

Tools include part-of-speech recognizers and named entity recognizers to identify information units (names, locations, numeric expressions).

NLP Extraction and Knowledge Graphs

Nouns typically map to entities, verbs to relationships, and adjectives to attributes of entities.

Extracting Candidate Answers from Triple Stores

After extracting a relation from the question, information sources (Wikipedia infoboxes, DBpedia, FreeBase) can be queried via a triple store.
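
For example, a relation such as (Microsoft, founded, ?year) could be looked up in DBpedia with SPARQLWrapper; the endpoint URL and the dbo:foundingYear property below are assumptions about DBpedia's schema, not something fixed by the notes.

  from SPARQLWrapper import SPARQLWrapper, JSON

  sparql = SPARQLWrapper("https://dbpedia.org/sparql")
  sparql.setQuery("""
      PREFIX dbr: <http://dbpedia.org/resource/>
      PREFIX dbo: <http://dbpedia.org/ontology/>
      SELECT ?year WHERE { dbr:Microsoft dbo:foundingYear ?year . }
  """)
  sparql.setReturnFormat(JSON)
  results = sparql.query().convert()

  for row in results["results"]["bindings"]:
      print(row["year"]["value"])   # expected: a founding year such as 1975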

General Keyword Selection Algorithm

  1. Identify nouns, verbs, non-stopwords in quotations, NNP words in recognized named entities, and complex nominals with modifiers.
  2. Select the answer type word.
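
A rough sketch of step 1, using NLTK POS tags and a stopword list (one possible realization, not the exact algorithm prescribed by the notes):

  import nltk
  from nltk.corpus import stopwords

  def select_keywords(question):
      # Keep nouns (NN*), verbs (VB*), and proper nouns, dropping stopwords.
      tagged = nltk.pos_tag(nltk.word_tokenize(question))
      stop = set(stopwords.words("english"))
      return [word for word, tag in tagged
              if tag.startswith(("NN", "VB")) and word.lower() not in stop]

  print(select_keywords("When was the internal combustion engine invented?"))
  # e.g. ['combustion', 'engine', 'invented']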

Expanding the Keyword Set Using Variants

  1. Morphological variants: invented, inventor, inventions
  2. Lexical variants: killer, assassin; far, distance
  3. Semantic variants: like, prefer

Incorporating Lexical Variants Using Hypernyms and Hyponyms

WordNet provides hypernyms (superordinate groupings) and hyponyms (more specific terms).

Examples: question: When was the internal combustion engine invented? Answer: The first internal combustion engine was built in 1867.

Lexical chains: (1) invent:v#1 → HYPERNYM → create_by_mental_act:v#1 → HYPERNYM → create:v#1 → HYPONYM → build:v#1

question: How many chromosomes does a human zygote have? Answer: 46 chromosomes lie in the nucleus of every normal human cell.

Lexical chains: (1) zygote:n#1 → HYPERNYM → cell:n#1 → HAS.PART → nucleus:n#1
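
These chains can be explored with NLTK's WordNet interface; the snippet below is a sketch, and the exact synsets returned depend on the WordNet sense inventory.

  from nltk.corpus import wordnet as wn

  invent = wn.synsets("invent", pos=wn.VERB)[0]
  build = wn.synsets("build", pos=wn.VERB)[0]

  # Walk upward from "invent" through its hypernym paths ...
  for path in invent.hypernym_paths():
      print([s.name() for s in path])

  # ... and find a shared ancestor that links the question term "invented"
  # to the answer term "built" (may be empty for some verb pairs).
  print(invent.lowest_common_hypernyms(build))   # e.g. [Synset('make.v.03')]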

Using WordNet for Type Identification

WordNet refines the expected answer type by linking named-entity classes into its hypernym hierarchy.

Semantic Similarity

One measure is Leacock-Chodorow similarity, defined as:

sim_{LC}(c_1, c_2) = -\ln\left( \frac{\mathrm{len}(c_1, c_2)}{2D} \right)

where len(c_1, c_2) is the length of the shortest path between the two concepts in the WordNet hierarchy and D is the maximum depth of the taxonomy.
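
NLTK exposes this measure directly as lch_similarity; a minimal sketch (the synset choices are assumptions):

  from nltk.corpus import wordnet as wn

  zygote = wn.synset("zygote.n.01")
  cell = wn.synset("cell.n.02")        # sense choice is an assumption

  print(zygote.lch_similarity(cell))                       # larger = more similar
  print(zygote.lch_similarity(wn.synset("engine.n.01")))   # noticeably smaller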

Passage Retrieval

After formulating queries, send them to a search engine and retrieve snippets. Filter results by answer type and rank passages based on a trained classifier.

Features: Question keywords, Named Entities, Longest overlapping sequence, Shortest keyword-covering span, N-gram overlap.

Passage Scoring Method

Passage ordering involves:

  1. Number of words from the question in sequence in the snippet window.
  2. Number of words separating the most distant keywords.
  3. Number of unmatched keywords.
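
A small sketch computing these three signals for one passage (windowing details vary between systems and are simplified here):

  def passage_scores(keywords, passage):
      words = passage.lower().split()
      kw = [k.lower() for k in keywords]
      positions = [i for i, w in enumerate(words) if w in kw]

      # 1. Longest run of matched keywords appearing contiguously in the passage.
      longest_run, run = 0, 0
      for w in words:
          run = run + 1 if w in kw else 0
          longest_run = max(longest_run, run)

      # 2. Distance (in words) between the most distant matched keywords.
      span = (positions[-1] - positions[0]) if positions else None

      # 3. Keywords from the question that the passage does not contain.
      unmatched = len([k for k in kw if k not in words])

      return longest_run, span, unmatched

  print(passage_scores(["microsoft", "established"],
                       "Microsoft Corp was founded in the US in 1975"))
  # (1, 0, 1): one-keyword run, zero span, "established" unmatched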

Ranking Candidate Answers

Ranking considers factors like answer type, text passage content, and word proximity.

Local Alignment Example

Local alignment identifies relationships between question head words and anchor words in candidate answer passages.

Refined Ranking Scheme

Supervised machine learning ranks passages based on:

  1. Number of named entities of the right type.
  2. Number of question keywords.
  3. Longest exact sequence of keywords.
  4. Rank of the document.
  5. Proximity of keywords.
  6. N-gram overlap.
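
A sketch of this re-ranking step with scikit-learn as the learner (an assumed choice; the feature rows below are toy numbers, not real training data):

  from sklearn.linear_model import LogisticRegression

  # Each row: [right-type NEs, question keywords, longest keyword run,
  #            document rank, keyword proximity, n-gram overlap]
  X_train = [
      [2, 4, 3, 1, 2, 5],   # passage that did contain the answer
      [0, 1, 1, 9, 15, 1],  # passage that did not
      [1, 3, 2, 2, 4, 3],
      [0, 2, 1, 7, 20, 0],
  ]
  y_train = [1, 0, 1, 0]

  ranker = LogisticRegression().fit(X_train, y_train)

  # Rank new passages by the probability that they contain the answer.
  candidates = [[1, 3, 3, 1, 3, 4], [0, 2, 1, 5, 12, 1]]
  scores = ranker.predict_proba(candidates)[:, 1]
  print(sorted(zip(scores, ["passage A", "passage B"]), reverse=True))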

BERT (Bidirectional Encoder Representations from Transformers)

BERT helps computers understand language by using the text surrounding a word as context. It achieves strong results in tasks such as sentiment analysis, semantic role labeling, and word sense disambiguation, and Google applies it in its search algorithms. BERT is pre-trained on unlabeled text and can then be fine-tuned on labeled, task-specific data.

Why BERT is Needed

BERT captures relationships in a bidirectional way, unlike context-free models such as word2vec or GloVe, which assign each word a single embedding regardless of its context.

How BERT Works

BERT reads bidirectionally, accounting for the effect of all other words in a sentence on the focus word.
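
For extractive question answering, a BERT-style model can be exercised through Hugging Face's transformers pipeline; this is a convenient off-the-shelf wrapper (the first call downloads a default model), not the specific system described in these notes.

  from transformers import pipeline

  qa = pipeline("question-answering")

  result = qa(
      question="When was Microsoft established?",
      context="Microsoft Corp was founded in the US in 1975, incorporated in "
              "1981, and established in the UK in 1982.",
  )
  print(result)   # a span from the context plus a confidence score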

BERT Pre-Trained Models

Variants of BERT are pre-trained on specialized corpora (patentBERT, docBERT, bioBERT, VideoBERT).

Microsoft’s AskMSR Answering System

AskMSR relies on the redundancy of information scattered across the web rather than deep linguistic analysis, using deliberately simple methods.

AskMSR Steps

  1. Query Rewriting: Classify question categories and apply transformation rules.
  2. Query Search Engine: Send rewrites and retrieve top answers (snippets).
  3. Mining N-Grams: Enumerate N-grams in retrieved snippets and weight them by reliability.
  4. Filtering N-Grams: Use data-type filters (regular expressions) to boost or lower scores.
  5. Tiling the Answers: Merge overlapping N-grams into longer answers, keeping the highest-scoring tiles and repeating until no further overlap remains.
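
A toy sketch of the n-gram mining, filtering, and scoring steps (the weighting and the tiling step are heavily simplified relative to the real AskMSR system):

  from collections import Counter

  # Snippets a search engine might return for "Who invented the light bulb?"
  snippets = [
      "Thomas Edison invented the light bulb in 1879",
      "the light bulb was invented by Thomas Edison",
      "Thomas Edison patented the phonograph",
  ]

  def ngrams(words, n):
      return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

  # Step 3: mine unigrams, bigrams, and trigrams from the snippets and count them.
  counts = Counter()
  for s in snippets:
      words = s.lower().split()
      for n in (1, 2, 3):
          counts.update(ngrams(words, n))

  # Step 4: filter out n-grams made up only of question words and stopwords.
  question_and_stop = {"who", "invented", "the", "light", "bulb", "was", "by", "in"}
  candidates = {g: c for g, c in counts.items()
                if not all(w in question_and_stop for w in g.split())}

  # Step 5 (simplified): weight counts by n-gram length as a crude stand-in for tiling.
  best = max(candidates, key=lambda g: candidates[g] * len(g.split()))
  print(best)   # "thomas edison"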