Optimizing Query Evaluation

Vocabulary flashcards related to optimizing query evaluation techniques.

18 Terms

1

Distributed Query Evaluation

A process that speeds up query processing by sending queries to a director machine, which then distributes them to multiple index servers for processing.

2

Document Distribution

A distributed query evaluation approach where each index server acts as a search engine for a small fraction of the total collection of documents.
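
A minimal sketch of document distribution, assuming a toy scoring function (counting query-term occurrences) and in-memory shards. The names `search_shard` and `director` are hypothetical and only illustrate the idea: each shard ranks its own documents, and the director merges the partial rankings.

```python
import heapq

def search_shard(shard, query_terms, k):
    """Rank the documents of one shard by counting query-term occurrences."""
    scores = []
    for doc_id, text in shard.items():
        words = text.lower().split()
        score = sum(words.count(t) for t in query_terms)
        if score > 0:
            scores.append((score, doc_id))
    return heapq.nlargest(k, scores)

def director(shards, query, k=2):
    """Send the query to every shard and merge the per-shard top-k lists."""
    terms = query.lower().split()
    partial = [hit for shard in shards for hit in search_shard(shard, terms, k)]
    return heapq.nlargest(k, partial)

# Two shards, each holding a fraction of the collection (toy data).
shards = [
    {1: "tropical fish tank", 2: "fish fish food"},
    {3: "tropical island", 4: "fish market"},
]
top = director(shards, "tropical fish")  # list of (score, doc_id) pairs
```

Each shard only needs to return its own top k results, because no document outside a shard's top k can appear in the global top k.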

3

Term Distribution

A distributed query evaluation approach where a single index is built for the entire cluster of machines, and each inverted list within that index is assigned to one index server.
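
One way to picture term distribution is as a mapping from each term's inverted list to a server. The sketch below assumes a simple hash-based assignment, which is an illustrative choice and not something the card specifies; `assign_terms` and `servers_for_query` are hypothetical names.

```python
import zlib

def assign_terms(terms, num_servers):
    """Assign each term's inverted list to one server using a stable hash."""
    return {t: zlib.crc32(t.encode("utf-8")) % num_servers for t in terms}

def servers_for_query(assignment, query_terms):
    """Return the set of servers that hold a list needed by the query."""
    return {assignment[t] for t in query_terms if t in assignment}
```

A query then touches only the servers that hold the lists for its terms, in contrast to document distribution, where every server takes part in every query.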

4

Query Caching

A technique to improve efficiency by caching popular query results and common inverted lists. Caching can help even when every query is unique, because distinct queries still share terms.
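
A result cache can be sketched with a small least-recently-used (LRU) structure. The eviction policy, the `LRUCache` class, and the stand-in `evaluate` function are all assumptions made for illustration; the card does not name a policy.

```python
from collections import OrderedDict

class LRUCache:
    """Keep at most `capacity` entries, evicting the least recently used."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the oldest entry

calls = []  # records every query that had to be evaluated

def evaluate(query):
    """Hypothetical stand-in for full query evaluation."""
    calls.append(query)
    return [f"result for {query}"]

result_cache = LRUCache(capacity=2)

def cached_search(query):
    """Answer from the cache when possible, otherwise evaluate and store."""
    hit = result_cache.get(query)
    if hit is None:
        hit = evaluate(query)
        result_cache.put(query, hit)
    return hit
```

The same structure could hold inverted lists keyed by term instead of results keyed by query.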

5

Hapax Legomena

Words that occur only once in a corpus.
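
As a small illustration, hapax legomena can be found by counting word frequencies. Whitespace tokenization and lowercasing are simplifying assumptions here, and `hapax_legomena` is a hypothetical helper name.

```python
from collections import Counter

def hapax_legomena(text):
    """Return the words that occur exactly once in the text, sorted."""
    counts = Counter(text.lower().split())
    return sorted(word for word, count in counts.items() if count == 1)
```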

6

Document-at-a-time

A query processing approach that calculates complete scores for documents by processing all term lists, one document at a time.
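
A minimal sketch of document-at-a-time scoring, assuming each inverted list is a list of `(document, count)` pairs sorted by document number and that a document's score is the sum of its query-term counts. The function name and the scoring rule are assumptions for illustration.

```python
import heapq

def document_at_a_time(index, query_terms, k):
    """Score one document completely before moving on to the next."""
    lists = [index.get(t, []) for t in query_terms]
    positions = [0] * len(lists)  # one read pointer per inverted list
    top = []                      # min-heap holding the best k (score, doc)
    while True:
        current = [lst[p][0] for lst, p in zip(lists, positions) if p < len(lst)]
        if not current:
            break                 # every list is exhausted
        d = min(current)          # next document in any list
        score = 0
        for i, lst in enumerate(lists):
            p = positions[i]
            if p < len(lst) and lst[p][0] == d:
                score += lst[p][1]
                positions[i] += 1
        heapq.heappush(top, (score, d))
        if len(top) > k:
            heapq.heappop(top)    # discard the lowest-scoring document
    return sorted(top, reverse=True)

index = {
    "tropical": [(1, 1), (3, 2)],
    "fish": [(1, 2), (2, 1), (3, 1)],
}
```

Only the k best documents are held in memory at any time, which is the main attraction of this approach.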

7

Term-at-a-time

A query processing approach that accumulates scores for documents by processing term lists one at a time.
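
A minimal sketch of term-at-a-time scoring, under the same illustrative assumptions: inverted lists of `(document, count)` pairs and a score equal to the sum of query-term counts. Partial scores live in a table of accumulators until every list has been read.

```python
from collections import defaultdict
import heapq

def term_at_a_time(index, query_terms, k):
    """Read each inverted list in full, adding to per-document accumulators."""
    accumulators = defaultdict(int)
    for term in query_terms:
        for doc, count in index.get(term, []):
            accumulators[doc] += count  # partial score so far
    ranked = ((score, doc) for doc, score in accumulators.items())
    return heapq.nlargest(k, ranked)

index = {
    "tropical": [(1, 1), (3, 2)],
    "fish": [(1, 2), (2, 1), (3, 1)],
}
```

Each list is read sequentially, at the cost of keeping one accumulator per candidate document in memory.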

8

getCurrentDocument()

A pseudocode function that returns the document number of the current posting of the inverted list.

9

skipForwardToDocument(d)

A pseudocode function that moves forward in the inverted list until getCurrentDocument() >= d.

10

movePastDocument(d)

A pseudocode function that moves forward in the inverted list until getCurrentDocument() > d.

11

moveToNextDocument()

A pseudocode function equivalent to movePastDocument(getCurrentDocument()).
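
The four list-pointer functions can be sketched as methods on a small class. This is an illustrative rendering, not the original pseudocode: the class name, the snake_case method names, and the `END` sentinel returned past the last posting are assumptions. Skipping stops at the first posting whose document number is at least d, and moving past stops at the first one greater than d.

```python
class InvertedList:
    """A sorted list of document numbers with a forward-only read pointer."""
    END = float("inf")  # assumed sentinel for "no more postings"

    def __init__(self, doc_ids):
        self.doc_ids = sorted(doc_ids)
        self.pos = 0

    def get_current_document(self):
        """Document number of the current posting, or END when exhausted."""
        if self.pos < len(self.doc_ids):
            return self.doc_ids[self.pos]
        return self.END

    def skip_forward_to_document(self, d):
        """Advance until the current document number is >= d."""
        while self.pos < len(self.doc_ids) and self.doc_ids[self.pos] < d:
            self.pos += 1

    def move_past_document(self, d):
        """Advance until the current document number is > d."""
        while self.pos < len(self.doc_ids) and self.doc_ids[self.pos] <= d:
            self.pos += 1

    def move_to_next_document(self):
        """Advance past the current posting."""
        self.move_past_document(self.get_current_document())
```

A real index would use skip pointers to jump ahead instead of scanning one posting at a time.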

12

getNextAccumulator(d)

A pseudocode function that returns the first document number d' >= d that already has an accumulator.

13

removeAccumulatorsBetween(a, b)

A pseudocode function that removes all accumulators for document numbers between a and b.
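
The two accumulator functions can be sketched over a sorted list of document numbers. The `Accumulators` class and its `add` method are hypothetical, and treating "between a and b" as exclusive of both endpoints (a < d < b) is an assumption stated here because the card does not spell out the bounds.

```python
import bisect

class Accumulators:
    """Partial scores keyed by document number, kept in sorted order."""
    def __init__(self):
        self.docs = []    # sorted document numbers that have an accumulator
        self.scores = {}  # document number -> partial score

    def add(self, d, score):
        """Create the accumulator for d if needed, then add to it."""
        if d not in self.scores:
            bisect.insort(self.docs, d)
            self.scores[d] = 0
        self.scores[d] += score

    def get_next_accumulator(self, d):
        """First document number >= d with an accumulator, or None."""
        i = bisect.bisect_left(self.docs, d)
        return self.docs[i] if i < len(self.docs) else None

    def remove_accumulators_between(self, a, b):
        """Remove accumulators for documents d with a < d < b (assumed exclusive)."""
        lo = bisect.bisect_right(self.docs, a)
        hi = bisect.bisect_left(self.docs, b)
        for d in self.docs[lo:hi]:
            del self.scores[d]
        del self.docs[lo:hi]
```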

14

Conjunctive Processing

A type of query optimization in which every returned document must contain all query terms. It works best when one of the query terms is rare.
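
A minimal sketch of why a rare term helps: the shortest inverted list drives the loop, and the longer lists are only probed for the few documents it contains. Binary search stands in for skip pointers here, and the function names are hypothetical.

```python
import bisect

def contains(sorted_docs, d):
    """Binary search for d in a sorted list of document numbers."""
    i = bisect.bisect_left(sorted_docs, d)
    return i < len(sorted_docs) and sorted_docs[i] == d

def conjunctive_match(index, query_terms):
    """Return the documents that contain every query term."""
    lists = sorted((index.get(t, []) for t in query_terms), key=len)
    if not lists or not lists[0]:
        return []  # no terms, or some term matches no document
    rarest, others = lists[0], lists[1:]
    return [d for d in rarest if all(contains(lst, d) for lst in others)]

index = {"rare": [4, 90], "common": list(range(1, 101))}
```

With this toy index, only two probes into the 100-entry list are needed, regardless of the order of the terms in the query.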

15

Threshold Methods

Query processing optimization techniques that use the number of top-ranked documents needed (k) to estimate a threshold score (τʹ). Documents that cannot score above the threshold are ignored.
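
One common way to obtain the estimate is to use the k-th best score seen so far, which can only underestimate the final k-th best score. The sketch below assumes documents arrive with complete scores, as in document-at-a-time processing; the function name and the `skipped` counter are illustrative.

```python
import heapq

def top_k_with_threshold(scored_docs, k):
    """Keep the best k documents; ignore any that cannot beat the threshold."""
    heap = []    # min-heap of (score, doc); heap[0] is the k-th best so far
    skipped = 0
    for doc, score in scored_docs:
        # Until k documents have been seen, no threshold can be estimated.
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        if score <= threshold:
            skipped += 1
            continue
        heapq.heappush(heap, (score, doc))
        if len(heap) > k:
            heapq.heappop(heap)
    return sorted(heap, reverse=True), skipped
```

In this toy form the score is already known when the comparison happens, so nothing is saved; the benefit appears when an upper bound on the score is available before the full score is computed.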

16

MaxScore Method

Compares the maximum score a remaining document could have to the estimated threshold and ignores parts of inverted lists that will not generate document scores above the threshold.
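
The core comparison can be sketched as an upper-bound check. This is a simplification under stated assumptions, not a full MaxScore implementation: it assumes an additive scoring function and a precomputed maximum score per term, and the terms and numbers in the example are made up.

```python
def can_skip(doc_terms, max_term_score, threshold):
    """True if a document containing only doc_terms cannot beat the threshold."""
    upper_bound = sum(max_term_score[t] for t in doc_terms)
    return upper_bound <= threshold

# Hypothetical per-term maximum scores for a two-term query.
max_term_score = {"tree": 0.3, "eucalyptus": 1.2}
```

With a threshold of 0.5, a document that contains only the common term "tree" can be skipped, so postings in the "tree" list need to be visited only for documents that also appear in the "eucalyptus" list.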

17

Early Termination

An approach to query processing that improves performance but may sacrifice result quality.

18

List Ordering

An approach that orders inverted lists by a quality metric (e.g., PageRank) or by partial score, so that good documents appear early in each list.
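
List ordering pairs naturally with early termination: if each list is stored best-first, reading only a prefix still finds the strongest candidates. The sketch below assumes a per-document `quality` score (standing in for something like PageRank) and a fixed number of postings read per list; both function names are hypothetical.

```python
def order_by_quality(index, quality):
    """Reorder every inverted list so higher-quality documents come first."""
    return {
        term: sorted(docs, key=lambda d: quality[d], reverse=True)
        for term, docs in index.items()
    }

def candidates_with_early_termination(ordered_index, query_terms, postings_per_list):
    """Read only the first few postings of each list and stop."""
    candidates = set()
    for term in query_terms:
        candidates.update(ordered_index.get(term, [])[:postings_per_list])
    return candidates
```

Documents further down a list are never examined, which is where the possible loss in result quality comes from.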