What is the goal of Web Search?
Find small relevant data from massive web data
What are key components of a search engine?
Crawler, indexer, ranking, query processing
What is a crawler?
Program that traverses web pages and collects data
Toy crawler vs. real crawler
Simple traversal vs handling scale, politeness, duplicates, dynamic pages
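A minimal sketch of the "toy" side of that contrast. It assumes a hypothetical in-memory link graph (`TOY_WEB`) in place of real HTTP fetches and does a breadth-first traversal with a visited set for duplicate handling. Politeness delays, robots.txt, scale, and dynamic pages are what a real crawler would add; they are deliberately left out here.

```python
from collections import deque

# Hypothetical in-memory "web": page -> outgoing links (stands in for HTTP fetches).
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c", "a"],
    "c": ["d"],
    "d": [],
}

def toy_crawl(seed, web, max_pages=100):
    """Breadth-first traversal from seed; returns pages in visit order."""
    visited = {seed}          # duplicate detection
    frontier = deque([seed])  # pages waiting to be fetched
    order = []
    while frontier and len(order) < max_pages:
        page = frontier.popleft()
        order.append(page)
        for link in web.get(page, []):
            if link not in visited:
                visited.add(link)
                frontier.append(link)
    return order

print(toy_crawl("a", TOY_WEB))  # ['a', 'b', 'c', 'd']
```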
What challenges exist in web search?
Scale, spam, dynamic content, ranking quality, latency
What is indexing?
Organizing documents for efficient retrieval
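One common way to organize documents for retrieval is an inverted index (an assumption here; the card does not name a structure). The sketch below maps each term to the documents containing it and answers boolean AND queries. The function names and sample documents are illustrative only.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def lookup(index, query):
    """Return IDs of documents containing every query term (boolean AND)."""
    postings = [set(index.get(t, [])) for t in query.lower().split()]
    return sorted(set.intersection(*postings)) if postings else []

# Hypothetical sample documents.
docs = {1: "web search engine", 2: "search ranking", 3: "web crawler"}
index = build_inverted_index(docs)
print(lookup(index, "web search"))  # [1]
```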
What is link analysis?
Using hyperlinks to determine importance of pages
What is PageRank intuition?
Important pages are linked to by other important pages
What is the random surfer model?
User randomly follows links or jumps to random pages
What is α (alpha) in PageRank?
Probability of random jump
What happens with probability (1 − α)?
Follows links
What is PageRank score?
Probability of visiting a page
What is the transition matrix M?
Represents link probabilities between pages
What is the PageRank equation?
$p = (\alpha I + (1 - \alpha) M)^{T} p$ (here I is the N×N matrix with every entry 1/N, not the identity)
How is PageRank computed?
Power iteration until convergence
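A sketch of power iteration for the random surfer model, written per page as p_j = α/N + (1 − α) Σ_i M_ij p_i. The value α = 0.15, the tolerance, and the sample graph are assumptions for illustration. A page with no outgoing links is treated as linking to every page, so its probability is spread uniformly.

```python
def pagerank(links, alpha=0.15, tol=1e-10, max_iter=1000):
    """Power iteration for p_j = alpha/N + (1 - alpha) * sum_i M[i][j] * p_i.

    links: dict page -> list of pages it links to (every target must be a key).
    """
    pages = sorted(links)
    n = len(pages)
    p = {page: 1.0 / n for page in pages}            # start from the uniform distribution
    for _ in range(max_iter):
        new_p = {page: alpha / n for page in pages}  # random-jump share
        for src in pages:
            targets = links[src] or pages            # dead end: spread uniformly
            share = (1 - alpha) * p[src] / len(targets)
            for dst in targets:
                new_p[dst] += share
        change = sum(abs(new_p[page] - p[page]) for page in pages)
        p = new_p
        if change < tol:                             # converged
            break
    return p

# Hypothetical four-page graph; "d" has no incoming links.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

Page "c" collects links from three pages and ends up with the highest score, while "d" receives only the random-jump share α/N.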
Why is PageRank an eigenvector problem?
The equation has the form p = Ap, so p is an eigenvector of A with eigenvalue 1
What problems does the random jump solve?
Dead ends + spider traps
What happens if a page has no outgoing links?
It distributes probability uniformly
What are hubs?
Pages that link to many authoritative pages
What are authorities?
Pages linked to by many hubs
What is the idea behind HITS?
Mutual reinforcement between hubs and authorities
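The mutual reinforcement can be sketched as two alternating updates: a page's authority score is the sum of the hub scores of pages linking to it, and its hub score is the sum of the authority scores of pages it links to. The L2 normalization, the fixed iteration count, and the sample graph are assumptions for illustration.

```python
import math

def hits(links, iterations=50):
    """Iterate hub/authority updates with L2 normalization."""
    pages = sorted(set(links) | {d for dsts in links.values() for d in dsts})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, dsts in links.items():
            for dst in dsts:
                auth[dst] += hub[src]
        # Hub: sum of authority scores of the pages linked to.
        hub = {p: sum(auth[d] for d in links.get(p, [])) for p in pages}
        # Normalize so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# Hypothetical graph: three hub-like pages pointing at two candidate authorities.
graph = {"h1": ["x", "y"], "h2": ["x", "y"], "h3": ["x"]}
hub, auth = hits(graph)
```

Here "x" gets the highest authority score because all three hubs link to it, and "h1" and "h2" outrank "h3" as hubs because they point to both authorities.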
What is the difference between PageRank and HITS?
PageRank is global (query-independent); HITS is query-dependent
What is Learning to Rank?
ML approach to combine ranking features
Why use Learning to Rank?
Combines signals like TF-IDF, PageRank
What types of features are used in Learning to Rank?
Content, link, user behavior
What are the approaches for Learning to Rank?
Pointwise, pairwise, listwise
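A rough sketch of how the first two approaches differ in what they penalize, assuming a linear scoring function over two hypothetical features (TF-IDF and PageRank). The squared-error and hinge losses are common choices picked for illustration, not ones named in the cards. Listwise losses, which score the whole ranked list at once, are omitted.

```python
def score(weights, features):
    """Linear ranking function: weighted sum of feature values."""
    return sum(w * f for w, f in zip(weights, features))

def pointwise_loss(weights, docs):
    """Pointwise: squared error between each score and its relevance label."""
    return sum((score(weights, f) - rel) ** 2 for f, rel in docs)

def pairwise_loss(weights, docs, margin=1.0):
    """Pairwise: hinge penalty when a more relevant document does not
    outscore a less relevant one by at least `margin`."""
    loss = 0.0
    for f_i, rel_i in docs:
        for f_j, rel_j in docs:
            if rel_i > rel_j:
                gap = score(weights, f_i) - score(weights, f_j)
                loss += max(0.0, margin - gap)
    return loss

# Hypothetical (features, relevance) pairs; features = [tf_idf, pagerank].
docs = [([2.0, 0.5], 2), ([1.0, 0.1], 1), ([0.2, 0.9], 0)]
print(pointwise_loss([1.0, 0.0], docs))
print(pairwise_loss([1.0, 0.0], docs))
```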
What is text mining?
Extracting actionable knowledge from text
What is the difference between text mining and retrieval?
Discover knowledge vs. find docs
What are 4 text mining perspectives?
Language, content, user, prediction
Why is text representation important?
Determines what analysis is possible
What is paradigmatic relation?
Words substitutable (same class) (e.g., cat and dog)
What is syntagmatic relation?
Words co-occur (semantic relation) (e.g., eat and food)
How do you detect paradigmatic?
Context similarity
How do you detect syntagmatic?
Co-occurrence frequency
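Both detection methods can be sketched on a hypothetical toy corpus, treating each sentence as one context. Cosine similarity over context counts stands in for "context similarity", and a count of shared sentences stands in for "co-occurrence frequency". Both measures and the corpus are assumptions for illustration.

```python
from collections import Counter

# Hypothetical toy corpus; each sentence is one context unit.
sentences = [
    "my cat eats fish",
    "my dog eats meat",
    "the cat sleeps",
    "the dog sleeps",
]
tokenized = [s.split() for s in sentences]

def context_vector(word):
    """Count the words appearing in the same sentence as `word`."""
    ctx = Counter()
    for sent in tokenized:
        if word in sent:
            ctx.update(w for w in sent if w != word)
    return ctx

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = (sum(v * v for v in a.values()) * sum(v * v for v in b.values())) ** 0.5
    return dot / norm if norm else 0.0

def cooccurrence(w1, w2):
    """Number of sentences containing both words."""
    return sum(1 for sent in tokenized if w1 in sent and w2 in sent)

# Paradigmatic: "cat" and "dog" share contexts but never appear together.
print(cosine(context_vector("cat"), context_vector("dog")), cooccurrence("cat", "dog"))
# Syntagmatic: "cat" and "eats" appear in the same sentence.
print(cooccurrence("cat", "eats"))
```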
Why are word associations useful?
Improves NLP + retrieval tasks
$\sigma(x) = \frac{1}{1 + e^{-x}}$
What defines the decision boundary?
$w \cdot x + b = 0$
$y = \sigma(w \cdot x + b)$
What are components of a neuron?
Weights, bias, activation function
Examples of activation functions
Sigmoid, ReLU, tanh
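A minimal sketch of a single neuron built from those three components: a weighted sum plus bias, passed through an activation function. The weights and input are hypothetical, chosen so that the input lies exactly on the decision boundary w · x + b = 0.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    return max(0.0, z)

def neuron(x, w, b, activation=sigmoid):
    """Weighted sum of inputs plus bias, passed through an activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return activation(z)

# Hypothetical weights and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.5
x = [1.0, 2.5]   # w . x + b = 2.0 - 2.5 + 0.5 = 0.0, on the decision boundary
print(neuron(x, w, b))             # sigmoid(0) = 0.5
print(neuron(x, w, b, relu))       # relu(0) = 0.0
print(neuron(x, w, b, math.tanh))  # tanh(0) = 0.0
```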
What are the three main LSTM gates?
Forget, input, and output gates
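How the three gates interact can be sketched with one step of a scalar LSTM cell using the standard update equations. The `params` layout (one `(w_x, w_h, bias)` triple per gate, with `"g"` for the candidate value) is a hypothetical convention for this example, and real cells use vectors and weight matrices rather than scalars.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each of "f", "i", "o", "g" to a (w_x, w_h, bias) triple.
    """
    def pre(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("f"))    # forget gate: how much of the old cell state to keep
    i = sigmoid(pre("i"))    # input gate: how much of the candidate to write
    o = sigmoid(pre("o"))    # output gate: how much of the cell state to expose
    g = math.tanh(pre("g"))  # candidate cell value
    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c

# Hypothetical parameters: all zeros, so every gate sits at 0.5.
zeros = {name: (0.0, 0.0, 0.0) for name in "fiog"}
print(lstm_step(1.0, 0.0, 2.0, zeros))
```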
Faster vs. more expressive
Why are RNNs slow?
They process tokens one at a time, so a sequence of length n needs O(n) sequential steps
What are Q, K, and V in attention?
Query, key, and value vectors
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$
Q, K, and V come from the same sequence
Compute Q, K, and V; compute similarity; apply softmax; compute weighted sum
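Those steps can be sketched as scaled dot-product attention in plain Python. The usage line reuses one hypothetical input `X` as Q, K, and V (self-attention with identity projections, an assumption made to keep the example short). A real model would first multiply `X` by learned projection matrices.

```python
import math

def softmax(row):
    m = max(row)                           # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention on lists of row vectors."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

# Hypothetical two-token input used as Q, K, and V.
X = [[1.0, 0.0], [0.0, 1.0]]
print(attention(X, X, X))
```

Each output row is a weighted average of the value rows, with the largest weight on the key most similar to that row's query.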
What problems does attention have?
Loss of order, loss of nonlinearity, loss of sequential constraints