CS 410 Exam 2

Last updated 2:50 AM on 4/8/26
105 Terms

1
New cards

What is the goal of Web Search?

Find a small amount of relevant information within massive amounts of web data

2
New cards

What are key components of a search engine?

Crawler, indexer, ranking, query processing

3
New cards

What is a crawler?

Program that traverses web pages and collects data

4
New cards

Toy crawler vs real crawler

Simple traversal vs handling scale, politeness, duplicates, dynamic pages

5
New cards

What challenges exist in web search?

Scale, spam, dynamic content, ranking quality, latency

6
New cards

What is indexing?

Organizing documents for efficient retrieval

7
New cards

What is link analysis?

Using hyperlinks to determine importance of pages

8
New cards

What is PageRank intuition?

Important pages are linked by other important pages

9
New cards

What is the random surfer model?

User randomly follows links or jumps to random pages

10
New cards

What is α (alpha) in PageRank?

Probability of random jump

11
New cards

What happens with probability (1 − α)?

Follows links

12
New cards

What is PageRank score?

Probability of visiting a page

13
New cards

What is the transition matrix M?

Represents link probabilities between pages

14
New cards

What is the PageRank equation?

p=\left(\alpha I+(1-\alpha)M\right)^{T}p

15
New cards

How is PageRank computed?

Power iteration until convergence
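A minimal sketch of the power iteration above. The 3-page graph, the α value, and the iteration cap are illustrative assumptions, not from the course:

```python
import numpy as np

# PageRank power-iteration sketch on a hypothetical 3-page graph.
# M[i][j] = probability of following a link from page i to page j.
M = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
alpha = 0.15              # probability of a random jump
n = M.shape[0]
p = np.full(n, 1.0 / n)   # start from the uniform distribution

for _ in range(100):      # power iteration until convergence
    p_next = alpha / n + (1 - alpha) * (M.T @ p)
    if np.allclose(p, p_next, atol=1e-12):
        break
    p = p_next

print(p.round(4))         # scores form a probability distribution
```

Each update mixes a uniform random jump (weight α) with one step of link-following (weight 1 − α), so p stays a probability distribution throughout.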

16
New cards

Why is PageRank an eigenvector problem?

It can be written as p = Ap, so p is an eigenvector of A with eigenvalue 1

17
New cards

What problem does the random jump solve?

Dead ends + spider traps

18
New cards

What happens if a page has no outgoing links?

Its probability mass is redistributed uniformly across all pages (dead-end fix)

19
New cards

What are hubs?

Pages that link to many authoritative pages

20
New cards

What are authorities?

Pages linked by many hubs

21
New cards

What is HITS idea?

Mutual reinforcement between hubs and authorities
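The mutual reinforcement can be sketched as alternating updates (the adjacency matrix here is a toy assumption):

```python
import numpy as np

# HITS sketch: hub and authority scores reinforce each other iteratively.
# A is a hypothetical adjacency matrix: A[i][j] = 1 if page i links to page j.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)

hubs = np.ones(3)
auths = np.ones(3)
for _ in range(50):
    auths = A.T @ hubs                  # authorities: linked to by good hubs
    hubs = A @ auths                    # hubs: link to good authorities
    auths /= np.linalg.norm(auths)      # normalize so scores stay bounded
    hubs /= np.linalg.norm(hubs)

print(auths.round(3), hubs.round(3))    # page 2 is the top authority, page 0 the top hub
```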

22
New cards

What is the difference between PageRank and HITS?

PageRank is global (query-independent); HITS is query-dependent

23
New cards

What is Learning to Rank?

ML approach to combine ranking features

24
New cards

Why use Learning to Rank?

Combines signals like TF-IDF, PageRank

25
New cards

What types of features are used in Learning to Rank?

Content, link, user behavior

26
New cards

What are the approaches for Learning to Rank?

Pointwise, pairwise, listwise
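As one sketch of the pairwise approach: learn feature weights so each relevant document outscores its non-relevant partner. The feature values and the hinge-style (perceptron-like) update are illustrative assumptions, not the course's specific algorithm:

```python
# Pairwise Learning-to-Rank sketch: each pair is (features of a relevant doc,
# features of a non-relevant doc), e.g. hypothetical TF-IDF and PageRank signals.
pairs = [
    ([3.0, 0.8], [1.0, 0.2]),
    ([2.0, 0.5], [0.5, 0.9]),
]
w = [0.0, 0.0]
lr = 0.1
for _ in range(100):
    for good, bad in pairs:
        margin = sum(wi * (g - b) for wi, g, b in zip(w, good, bad))
        if margin < 1:   # hinge-style: update only when the pair is (nearly) mis-ordered
            w = [wi + lr * (g - b) for wi, g, b in zip(w, good, bad)]

score = lambda f: sum(wi * fi for wi, fi in zip(w, f))
print(score([3.0, 0.8]) > score([1.0, 0.2]))   # relevant doc now ranks higher
```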

27
New cards

What is text mining?

Extracting actionable knowledge from text

28
New cards

What is the difference between text mining and retrieval?

Discover knowledge vs. find docs

29
New cards

What are 4 text mining perspectives?

Language, content, user, prediction

30
New cards

Why is text representation important?

Determines what analysis is possible

31
New cards

What is paradigmatic relation?

Words that can substitute for each other (same class) (e.g., cat ↔ dog)

32
New cards

What is syntagmatic relation?

Words that tend to co-occur (semantic relation) (e.g., eat ↔ food)

33
New cards

How do you detect paradigmatic relations?

Context similarity

34
New cards

How do you detect syntagmatic relations?

Co-occurrence frequency
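A minimal sketch of counting sentence-level co-occurrences (the toy corpus is an illustration; real systems would use windowed counts and association measures such as mutual information):

```python
from collections import Counter
from itertools import combinations

# Count how often word pairs appear in the same sentence (syntagmatic signal).
corpus = [
    "the cat eats food",
    "the dog eats food",
    "the cat sleeps",
]
pairs = Counter()
for sentence in corpus:
    words = sorted(set(sentence.split()))        # unique words, canonical order
    for a, b in combinations(words, 2):
        pairs[(a, b)] += 1

print(pairs[("eats", "food")])   # "eats" and "food" co-occur in 2 sentences
```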

35
New cards

Why are word associations useful?

Improves NLP + retrieval tasks

36
New cards
What is logistic regression?
Classification model that outputs probabilities
37
New cards
What does logistic regression output?
Probability between 0 and 1
38
New cards
What is logit?
Log odds of probability
39
New cards
What is sigmoid function?

\frac{1}{1+e^{-x}}
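As a direct translation of the formula (the test inputs are arbitrary illustrations):

```python
import math

# Sigmoid squashes any real number into (0, 1), so it can act as a probability.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))    # 0.5: exactly on the decision boundary
print(sigmoid(5))    # close to 1
print(sigmoid(-5))   # close to 0
```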

40
New cards
Why use logistic instead of linear regression?
It outputs probabilities for classification
41
New cards
What is a decision boundary?
Surface that separates different classes
42
New cards
What is a hyperplane?
Generalized linear boundary separating classes in higher dimensions
43
New cards
What is relationship between decision boundary and hyperplane?
The decision boundary is represented by a hyperplane
44
New cards

What defines the decision boundary?

w\cdot x+b=0

45
New cards
What is an artificial neuron?
Weighted sum of inputs plus activation function
46
New cards
Neuron formula

y=\phi\left(wx+b\right)
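The formula as code, with sigmoid standing in for φ (the input, weight, and bias values are arbitrary illustrations):

```python
import math

# A single artificial neuron: weighted sum of inputs, plus bias, through an activation.
def neuron(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted sum + bias
    return 1.0 / (1.0 + math.exp(-z))              # sigmoid activation (phi)

y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.1)
print(y)   # z = 0.1, so the output is slightly above 0.5
```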

47
New cards

What are components of a neuron?

Weights, bias, activation function

48
New cards
What is an activation function?
Non-linear transformation applied to a neuron's weighted sum
49
New cards

Examples of activation functions

Sigmoid, ReLU, tanh

50
New cards
What is a hidden layer?
Layer that transforms inputs into new features
51
New cards
What is a deep neural network?
Neural network with multiple hidden layers
52
New cards
What is backpropagation?
Method to compute gradients and update weights
53
New cards
Steps of backpropagation
Forward pass → compute loss → backward pass → update weights
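The steps above can be sketched on a single linear neuron with squared loss (the values and learning rate are toy assumptions; real networks apply the same chain rule layer by layer):

```python
# One-neuron backpropagation sketch: y = w*x + b, loss = (y - target)^2.
w, b = 0.5, 0.0
x, target = 2.0, 3.0
lr = 0.1

for _ in range(50):
    y = w * x + b                 # forward pass
    loss = (y - target) ** 2      # compute loss
    dy = 2 * (y - target)         # backward pass: dloss/dy
    w -= lr * dy * x              # dloss/dw = dy * x (chain rule)
    b -= lr * dy                  # dloss/db = dy
print(w * x + b)                  # converges to the target, 3.0
```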
54
New cards
Why is backpropagation needed?
To learn model parameters
55
New cards
What is an RNN?
Neural network with loops for sequential data
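A minimal sketch of the loop: the hidden state at each step is computed from the previous hidden state and the current input (dimensions and random weights are toy assumptions):

```python
import numpy as np

# Minimal RNN cell: h_t = tanh(W_h @ h_{t-1} + W_x @ x_t).
np.random.seed(0)
W_h = np.random.randn(4, 4) * 0.1   # hidden-to-hidden weights
W_x = np.random.randn(4, 3) * 0.1   # input-to-hidden weights
h = np.zeros(4)                     # initial hidden state ("memory")

sequence = [np.ones(3), np.zeros(3), np.ones(3)]
for x in sequence:                  # inherently sequential: step t needs h from t-1
    h = np.tanh(W_h @ h + W_x @ x)
print(h.shape)                      # (4,): the state summarizing the sequence so far
```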
56
New cards
What does RNN capture?
Sequential dependencies
57
New cards
What is hidden state?
Memory of previous inputs
58
New cards
What problem do RNNs have?
Vanishing gradients
59
New cards
Why were LSTMs introduced?
To solve vanishing gradient problem
60
New cards
What are LSTM gates?
Mechanisms that control information flow
61
New cards

What are the three main LSTM gates?

Forget, input, and output gates

62
New cards
What is cell state in LSTM?
Long term memory
63
New cards
Key idea of LSTM
Selective memory control using gates
64
New cards
What is GRU?
Simplified version of LSTM
65
New cards
What gates does GRU have?
Update and reset gates
66
New cards
Difference between GRU and LSTM

GRU is simpler and faster; LSTM is more expressive

67
New cards

Why are RNNs slow?

They process sequences sequentially, taking O(n) steps

68
New cards
Why do RNNs struggle with long text?
Long dependencies and gradient issues
69
New cards
Why are RNNs hard to parallelize?
Each step depends on previous step
70
New cards
Why were transformers introduced?
To solve RNN limitations
71
New cards
Key idea of transformers
Use attention instead of sequence processing
72
New cards
Main advantage of transformers
Parallelization and long range dependencies
73
New cards
What is attention?
Mechanism that focuses on relevant words
74
New cards
Key idea of attention
Query relevant information from sequence
75
New cards
Attention can be viewed as what?
A fuzzy lookup table
76
New cards

What are Q K V in attention?

Query, key, and value vectors

77
New cards
How is attention computed?

Attention(Q,K,V)=softmax\left(\frac{QK^{T}}{\sqrt{d_{k}}}\right)V
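Scaled dot-product attention as code (the tiny Q, K, V matrices are toy illustrations):

```python
import numpy as np

# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarity
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

Q = np.array([[1.0, 0.0]])                 # one query, most similar to the first key
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[10.0, 0.0], [0.0, 10.0]])
out = attention(Q, K, V)
print(out)   # output leans toward V[0], the value of the better-matching key
```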

78
New cards
What is self attention?

Q, K, and V come from the same sequence

79
New cards
Steps of self attention

Compute Q, K, V; compute similarity scores; apply softmax; compute the weighted sum

80
New cards
What are the main components of the transformer architecture?
Self-attention, feed-forward network, residual connections, layer normalization
81
New cards
What is multi-head attention?
Multiple attention mechanisms in parallel
82
New cards
Why use multi-head attention?
To capture different relationships
83
New cards
What is the feed-forward network in a transformer?
Adds a non-linear transformation
84
New cards
Why use residual connections?
Stabilize training
85
New cards
Why use layer normalization?
Improve training stability
86
New cards
Why is positional encoding needed?
Transformers do not encode order naturally
87
New cards
How is positional encoding implemented?
Using sine and cosine functions
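A sketch of the sinusoidal scheme: even dimensions use sine, odd dimensions use cosine, with wavelengths growing geometrically across dimensions (the 10000 base follows the standard transformer convention; sizes here are toy values):

```python
import numpy as np

# Sinusoidal positional encoding: PE[pos, 2i] = sin(...), PE[pos, 2i+1] = cos(...).
def positional_encoding(num_positions, d_model):
    pos = np.arange(num_positions)[:, None]      # (num_positions, 1)
    i = np.arange(0, d_model, 2)[None, :]        # even dimension indices
    angles = pos / (10000 ** (i / d_model))      # geometric wavelengths
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(num_positions=10, d_model=8)
print(pe.shape)   # (10, 8): one encoding vector per position, added to embeddings
```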
88
New cards

What problems does attention have?

Loss of order, loss of non-linearity, loss of sequential constraints

89
New cards
How do transformers fix loss of order?
Positional encoding
90
New cards
How do transformers fix loss of non-linearity?
Add feed forward layers
91
New cards
How do transformers fix sequential prediction?
Mask future tokens
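A sketch of a causal (look-ahead) mask: position i may only attend to positions ≤ i, so future tokens cannot influence the prediction at step i (sequence length and random scores are toy values):

```python
import numpy as np

# Build a causal mask and apply it to attention scores before the softmax.
n = 4
mask = np.triu(np.ones((n, n)), k=1).astype(bool)  # True above the diagonal = future
scores = np.random.randn(n, n)                     # toy attention scores
scores[mask] = -np.inf                             # softmax maps -inf to weight 0

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
print(weights.round(2))                            # upper triangle is all zeros
```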
92
New cards
What type of model is GPT?
Decoder only transformer
93
New cards
What direction does GPT use?
Left to right
94
New cards
What is GPT used for?
Text generation
95
New cards
What type of model is BERT?
Encoder only transformer
96
New cards
What direction does BERT use?
Bidirectional
97
New cards
What is BERT used for?
Text understanding
98
New cards
What is the CLS token?
Special token representing whole sequence
99
New cards
What is the SEP token?
Separator between sentences
100
New cards
What is the MASK token?
Used to hide words during training