CS121 Quiz 4

Last updated 3:39 AM on 5/24/26

21 Terms

MapReduce

distributed programming tool used for indexing and analysis


mapper

transforms a list of items into another list of items of the same length


reducer

transforms a list of items into a single item
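
The mapper and reducer cards can be illustrated with Python's built-in `map` and `functools.reduce`. This is only a single-machine sketch of the two roles, not the distributed MapReduce framework itself; the `words` list is made-up sample data.

```python
from functools import reduce

# Made-up sample input (not from the course material).
words = ["map", "reduce", "index"]

# Mapper: one output item per input item, so the list length is unchanged.
lengths = list(map(len, words))

# Reducer: collapse the whole list into a single item (here, a sum).
total = reduce(lambda acc, n: acc + n, lengths, 0)
```

Here `lengths` has the same number of items as `words`, while `total` is a single value.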


distributed processing

uses a large number of inexpensive servers; driven by the need to index and analyze big data


director server

distributes the query to multiple indexing machines


index server

only processes part of the query


director machine

organizes the results and returns them to the user
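
The director / index server split can be sketched in a few lines. Everything here is hypothetical: `SHARDS`, `index_server`, and `director` are illustrative names, and the scores are hard-coded instead of being computed from a real query.

```python
import heapq

# Hypothetical shards: each index server holds scores for its part of the collection only.
SHARDS = [
    {"d1": 0.9, "d2": 0.1},
    {"d3": 0.5},
    {"d4": 0.7, "d5": 0.3},
]

def index_server(shard, k):
    """Return the local top-k (score, doc_id) pairs for one shard."""
    return heapq.nlargest(k, ((score, doc) for doc, score in shard.items()))

def director(k):
    """Send the query to every index server, then merge the partial results."""
    partial = [hit for shard in SHARDS for hit in index_server(shard, k)]
    return [doc for _, doc in heapq.nlargest(k, partial)]
```

With this toy data, `director(2)` merges the per-shard results and keeps the two highest-scoring documents overall.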


Jaccard coefficient

Jaccard(A, B) = |A ∩ B| / |A ∪ B|
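
A minimal sketch of the Jaccard coefficient on token sets. Returning 0 for two empty sets is an assumption made here to avoid division by zero; the card does not define that case.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets score 0
    return len(a & b) / len(union)

query = "ides of march".split()
doc = "caesar died in march".split()
score = jaccard(query, doc)  # 1 shared term out of 6 distinct terms
```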


bag of words model

each document is stored as a vector of word occurrence counts, ignoring the order of the words.
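
A small sketch of the bag-of-words idea using `collections.Counter`. Lowercasing and splitting on whitespace are simplifying assumptions, not a full tokenizer.

```python
from collections import Counter

def bag_of_words(text):
    """Count word occurrences; word order is discarded."""
    return Counter(text.lower().split())

a = bag_of_words("John is quicker than Mary")
b = bag_of_words("Mary is quicker than John")
same_bag = (a == b)  # the two sentences differ only in word order
```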


term frequency

number of times term t occurs in document d


score for the doc-query pair

the sum of the term weights (e.g. tf-idf) over all terms t that appear in both q and d


inverse document frequency (IDF)

idf_t = log(N / df_t), where N is the number of documents in the collection

has no effect on the ranking for one-term queries


document frequency (df)

number of documents that contain term t


collection frequency

number of occurrences of t in the collection including duplicates
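
Document frequency and collection frequency are easy to mix up, so here is a sketch that computes both on a made-up three-document collection (the documents and helper names are illustrative only).

```python
# Made-up toy collection: each document is a list of tokens.
docs = [
    ["try", "insurance", "try"],
    ["insurance"],
    ["try", "try", "try"],
]

def document_frequency(term, docs):
    """Number of documents that contain the term at least once."""
    return sum(1 for d in docs if term in d)

def collection_frequency(term, docs):
    """Total number of occurrences of the term, counting repeats."""
    return sum(d.count(term) for d in docs)
```

In this toy collection both terms appear in two documents, yet "try" occurs more often overall than "insurance", so df and cf can disagree.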


TF-IDF

tf-idf = (1 + log(tf)) * log(N/df) when tf > 0, and 0 otherwise
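
A minimal sketch of the tf-idf weight. The card does not state the log base; base 10 is assumed here, and the function name and arguments are illustrative.

```python
import math

def tf_idf(tf, df, n_docs):
    """(1 + log10 tf) * log10(N / df); zero when the term is absent."""
    if tf == 0 or df == 0:
        return 0.0
    return (1 + math.log10(tf)) * math.log10(n_docs / df)

# Hypothetical numbers: a term occurring 10 times in a document,
# found in 100 of 1,000,000 documents.
weight = tf_idf(10, 100, 1_000_000)
```

With these numbers the tf part is 1 + 1 = 2 and the idf part is 4, so the weight comes out at about 8.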


documents as vectors: the terms

axes of the space


documents as vectors: the documents

points in the space


cosine(query, document)

cos(q, d) = (q · d) / (|q| |d|)
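
A sketch of cosine similarity for two equal-length weight vectors: the dot product divided by the product of the two vector lengths. Returning 0 for a zero-length vector is an assumption made here to avoid division by zero.

```python
import math

def cosine(q, d):
    """Dot product of q and d divided by the product of their lengths."""
    dot = sum(qi * di for qi, di in zip(q, d))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    norm_d = math.sqrt(sum(di * di for di in d))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # assumption: an all-zero vector matches nothing
    return dot / (norm_q * norm_d)
```

Vectors pointing in the same direction score close to 1 regardless of their lengths, and vectors with no shared terms score 0.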


normalize vector by length

divide each component by the vector's L2 norm: ||x|| = sqrt(Σ_i x_i²)


vector space ranking steps

represent the query and each document as weighted tf-idf vectors
compute the cosine similarity score between the query vector and each document vector
rank documents with respect to the query by score
return the top k to the user
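
The ranking steps above can be sketched end to end on a toy collection. This is an illustrative sketch, not a reference implementation: base-10 logs, sparse dict vectors, and the names `tfidf_vector`, `cosine`, and `rank` are all assumptions, and the documents are made up.

```python
import math
from collections import Counter

def tfidf_vector(tokens, df, n_docs):
    """Sparse tf-idf vector {term: weight}; terms unseen in the collection are skipped."""
    counts = Counter(tokens)
    return {t: (1 + math.log10(tf)) * math.log10(n_docs / df[t])
            for t, tf in counts.items() if t in df}

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, docs, k):
    """Score every document against the query and return the top-k names."""
    n = len(docs)
    df = Counter(t for tokens in docs.values() for t in set(tokens))
    q_vec = tfidf_vector(query, df, n)
    scored = sorted(((cosine(q_vec, tfidf_vector(toks, df, n)), name)
                     for name, toks in docs.items()), reverse=True)
    return [name for _, name in scored[:k]]

# Made-up toy collection and query.
docs = {
    "d1": ["cheap", "car", "insurance"],
    "d2": ["car", "repair"],
    "d3": ["best", "pizza"],
}
top = rank(["car", "insurance"], docs, k=2)
```

Here d1 matches both query terms, d2 matches only the common term "car", and d3 matches nothing, so d1 ranks ahead of d2.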


cosine for length-normalized vectors

cos(q, d) = q · d = Σ_{i=1}^{|V|} q_i d_i
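
A short sketch of why the length-normalized case is simpler: once both vectors have length 1, the cosine is just the dot product. The helper names and the sample vectors are illustrative.

```python
import math

def normalize(x):
    """Divide each component by the vector's L2 norm (zero vectors are returned unchanged)."""
    length = math.sqrt(sum(xi * xi for xi in x))
    return [xi / length for xi in x] if length else list(x)

def dot(q, d):
    """Sum of component-wise products."""
    return sum(qi * di for qi, di in zip(q, d))

q = normalize([3, 4])  # length 5 before normalization
d = normalize([4, 3])
similarity = dot(q, d)  # equals the cosine because both vectors now have length 1
```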