CS 410 Exam 2

Last updated 2:50 AM on 4/8/26
105 Terms

1
New cards

What is the goal of Web Search?

Find a small amount of relevant information within massive amounts of web data

2
New cards

What are key components of a search engine?

Crawler, indexer, ranking, query processing

3
New cards

What is a crawler?

Program that traverses web pages and collects data

4
New cards

Toy crawler vs real crawler

Simple traversal vs handling scale, politeness, duplicates, dynamic pages

5
New cards

What challenges exist in web search?

Scale, spam, dynamic content, ranking quality, latency

6
New cards

What is indexing?

Organizing documents for efficient retrieval

7
New cards

What is link analysis?

Using hyperlinks to determine importance of pages

8
New cards

What is PageRank intuition?

Important pages are linked by other important pages

9
New cards

What is the random surfer model?

User randomly follows links or jumps to random pages

10
New cards

What is α (alpha) in PageRank?

Probability of random jump

11
New cards

What happens with probability (1 − α)?

Follows links

12
New cards

What is PageRank score?

Probability of visiting a page

13
New cards

What is the transition matrix M?

Represents link probabilities between pages

14
New cards

What is the PageRank equation?

p=\left(\alpha I+(1-\alpha)M\right)^{T}p

15
New cards

How is PageRank computed?

Power iteration until convergence
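A minimal sketch of the power iteration above. The 3-page graph, the α value, and the iteration cap are illustrative assumptions, not from the course:

```python
import numpy as np

# PageRank power-iteration sketch on a hypothetical 3-page graph.
# M[i][j] = probability of following a link from page i to page j.
M = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
alpha = 0.15              # probability of a random jump
n = M.shape[0]
p = np.full(n, 1.0 / n)   # start from the uniform distribution

for _ in range(100):      # power iteration until convergence
    p_next = alpha / n + (1 - alpha) * (M.T @ p)
    if np.allclose(p, p_next, atol=1e-12):
        break
    p = p_next

print(p.round(4))         # scores form a probability distribution
```

Each update mixes a uniform random jump (weight α) with one step of link-following (weight 1 − α), so p stays a probability distribution throughout.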

16
New cards

Why is PageRank an eigenvector problem?

It can be written as p = Ap, so p is an eigenvector of A with eigenvalue 1

17
New cards

What problem does the random jump solve?

Dead ends + spider traps

18
New cards

What happens if a page has no outgoing links?

Its probability mass is redistributed uniformly across all pages (dead-end fix)

19
New cards

What are hubs?

Pages that link to many authoritative pages

20
New cards

What are authorities?

Pages linked by many hubs

21
New cards

What is HITS idea?

Mutual reinforcement between hubs and authorities
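The mutual reinforcement can be sketched as alternating updates (the adjacency matrix here is a toy assumption):

```python
import numpy as np

# HITS sketch: hub and authority scores reinforce each other iteratively.
# A is a hypothetical adjacency matrix: A[i][j] = 1 if page i links to page j.
A = np.array([[0, 1, 1],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)

hubs = np.ones(3)
auths = np.ones(3)
for _ in range(50):
    auths = A.T @ hubs                  # authorities: linked to by good hubs
    hubs = A @ auths                    # hubs: link to good authorities
    auths /= np.linalg.norm(auths)      # normalize so scores stay bounded
    hubs /= np.linalg.norm(hubs)

print(auths.round(3), hubs.round(3))    # page 2 is the top authority, page 0 the top hub
```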

22
New cards

What is the difference between PageRank and HITS?

PageRank is global (query-independent); HITS is query-dependent

23
New cards

What is Learning to Rank?

ML approach to combine ranking features

24
New cards

Why use Learning to Rank?

Combines signals like TF-IDF, PageRank

25
New cards

What types of features are used in Learning to Rank?

Content, link, user behavior

26
New cards

What are the approaches for Learning to Rank?

Pointwise, pairwise, listwise
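As one sketch of the pairwise approach: learn feature weights so each relevant document outscores its non-relevant partner. The feature values and the hinge-style (perceptron-like) update are illustrative assumptions, not the course's specific algorithm:

```python
# Pairwise Learning-to-Rank sketch: each pair is (features of a relevant doc,
# features of a non-relevant doc), e.g. hypothetical TF-IDF and PageRank signals.
pairs = [
    ([3.0, 0.8], [1.0, 0.2]),
    ([2.0, 0.5], [0.5, 0.9]),
]
w = [0.0, 0.0]
lr = 0.1
for _ in range(100):
    for good, bad in pairs:
        margin = sum(wi * (g - b) for wi, g, b in zip(w, good, bad))
        if margin < 1:   # hinge-style: update only when the pair is (nearly) mis-ordered
            w = [wi + lr * (g - b) for wi, g, b in zip(w, good, bad)]

score = lambda f: sum(wi * fi for wi, fi in zip(w, f))
print(score([3.0, 0.8]) > score([1.0, 0.2]))   # relevant doc now ranks higher
```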

27
New cards

What is text mining?

Extracting actionable knowledge from text

28
New cards

What is the difference between text mining and retrieval?

Discover knowledge vs. find docs

29
New cards

What are 4 text mining perspectives?

Language, content, user, prediction

30
New cards

Why is text representation important?

Determines what analysis is possible

31
New cards

What is paradigmatic relation?

Words that can substitute for each other (same class) (e.g., cat ↔ dog)

32
New cards

What is syntagmatic relation?

Words that tend to co-occur (semantic relation) (e.g., eat ↔ food)

33
New cards

How do you detect paradigmatic relations?

Context similarity

34
New cards

How do you detect syntagmatic relations?

Co-occurrence frequency
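A minimal sketch of counting sentence-level co-occurrences (the toy corpus is an illustration; real systems would use windowed counts and association measures such as mutual information):

```python
from collections import Counter
from itertools import combinations

# Count how often word pairs appear in the same sentence (syntagmatic signal).
corpus = [
    "the cat eats food",
    "the dog eats food",
    "the cat sleeps",
]
pairs = Counter()
for sentence in corpus:
    words = sorted(set(sentence.split()))        # unique words, canonical order
    for a, b in combinations(words, 2):
        pairs[(a, b)] += 1

print(pairs[("eats", "food")])   # "eats" and "food" co-occur in 2 sentences
```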

35
New cards

Why are word associations useful?

Improves NLP + retrieval tasks

36
New cards
What is logistic regression?
Classification model that outputs probabilities
37
New cards
What does logistic regression output?
Probability between 0 and 1
38
New cards
What is logit?
Log odds of probability
39
New cards
What is sigmoid function?

\frac{1}{1+e^{-x}}
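As a direct translation of the formula (the test inputs are arbitrary illustrations):

```python
import math

# Sigmoid squashes any real number into (0, 1), so it can act as a probability.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))    # 0.5: exactly on the decision boundary
print(sigmoid(5))    # close to 1
print(sigmoid(-5))   # close to 0
```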

40
New cards
Why use logistic instead of linear regression?
It outputs probabilities for classification
41
New cards
What is a decision boundary?
Surface that separates different classes
42
New cards
What is a hyperplane?
Generalized linear boundary separating classes in higher dimensions
43
New cards
What is relationship between decision boundary and hyperplane?
The decision boundary is represented by a hyperplane
44
New cards

What defines the decision boundary?

w\cdot x+b=0

45
New cards
What is an artificial neuron?
Weighted sum of inputs plus activation function
46
New cards
Neuron formula

y=\phi\left(wx+b\right)
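The formula as code, with sigmoid standing in for φ (the input, weight, and bias values are arbitrary illustrations):

```python
import math

# A single artificial neuron: weighted sum of inputs, plus bias, through an activation.
def neuron(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # weighted sum + bias
    return 1.0 / (1.0 + math.exp(-z))              # sigmoid activation (phi)

y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.1)
print(y)   # z = 0.1, so the output is slightly above 0.5
```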

47
New cards

What are components of a neuron?

Weights, bias, activation function

48
New cards
What is an activation function?
Non-linear transformation applied to a neuron's weighted sum
49
New cards

Examples of activation functions

Sigmoid, ReLU, tanh

50
New cards
What is a hidden layer?
Layer that transforms inputs into new features
51
New cards
What is a deep neural network?
Neural network with multiple hidden layers
52
New cards
What is backpropagation?
Method to compute gradients and update weights
53
New cards
Steps of backpropagation
Forward pass → compute loss → backward pass → update weights
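The steps above can be sketched on a single linear neuron with squared loss (the values and learning rate are toy assumptions; real networks apply the same chain rule layer by layer):

```python
# One-neuron backpropagation sketch: y = w*x + b, loss = (y - target)^2.
w, b = 0.5, 0.0
x, target = 2.0, 3.0
lr = 0.1

for _ in range(50):
    y = w * x + b                 # forward pass
    loss = (y - target) ** 2      # compute loss
    dy = 2 * (y - target)         # backward pass: dloss/dy
    w -= lr * dy * x              # dloss/dw = dy * x (chain rule)
    b -= lr * dy                  # dloss/db = dy
print(w * x + b)                  # converges to the target, 3.0
```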
54
New cards
Why is backpropagation needed?
To learn model parameters
55
New cards
What is an RNN?
Neural network with loops for sequential data
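A minimal sketch of the loop: the hidden state at each step is computed from the previous hidden state and the current input (dimensions and random weights are toy assumptions):

```python
import numpy as np

# Minimal RNN cell: h_t = tanh(W_h @ h_{t-1} + W_x @ x_t).
np.random.seed(0)
W_h = np.random.randn(4, 4) * 0.1   # hidden-to-hidden weights
W_x = np.random.randn(4, 3) * 0.1   # input-to-hidden weights
h = np.zeros(4)                     # initial hidden state ("memory")

sequence = [np.ones(3), np.zeros(3), np.ones(3)]
for x in sequence:                  # inherently sequential: step t needs h from t-1
    h = np.tanh(W_h @ h + W_x @ x)
print(h.shape)                      # (4,): the state summarizing the sequence so far
```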
56
New cards
What does RNN capture?
Sequential dependencies
57
New cards
What is hidden state?
Memory of previous inputs
58
New cards
What problem do RNNs have?
Vanishing gradients
59
New cards
Why were LSTMs introduced?
To solve vanishing gradient problem
60
New cards
What are LSTM gates?
Mechanisms that control information flow
61
New cards

What are the three main LSTM gates?

Forget, input, and output gates

62
New cards
What is cell state in LSTM?
Long term memory
63
New cards
Key idea of LSTM
Selective memory control using gates
64
New cards
What is GRU?
Simplified version of LSTM
65
New cards
What gates does GRU have?
Update and reset gates
66
New cards
Difference between GRU and LSTM

GRU is simpler and faster; LSTM is more expressive

67
New cards

Why are RNNs slow?

They process sequences sequentially, taking O(n) steps

68
New cards
Why do RNNs struggle with long text?
Long dependencies and gradient issues
69
New cards
Why are RNNs hard to parallelize?
Each step depends on previous step
70
New cards
Why were transformers introduced?
To solve RNN limitations
71
New cards
Key idea of transformers
Use attention instead of sequence processing
72
New cards
Main advantage of transformers
Parallelization and long range dependencies
73
New cards
What is attention?
Mechanism that focuses on relevant words
74
New cards
Key idea of attention
Query relevant information from sequence
75
New cards
Attention can be viewed as what?
A fuzzy lookup table
76
New cards

What are Q K V in attention?

Query, key, and value vectors

77
New cards
How is attention computed?

Attention(Q,K,V)=softmax\left(\frac{QK^{T}}{\sqrt{d_{k}}}\right)V
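Scaled dot-product attention as code (the tiny Q, K, V matrices are toy illustrations):

```python
import numpy as np

# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query-key similarity
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                               # weighted sum of values

Q = np.array([[1.0, 0.0]])                 # one query, most similar to the first key
K = np.array([[1.0, 0.0], [0.0, 1.0]])
V = np.array([[10.0, 0.0], [0.0, 10.0]])
out = attention(Q, K, V)
print(out)   # output leans toward V[0], the value of the better-matching key
```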

78
New cards
What is self attention?

Q, K, and V come from the same sequence

79
New cards
Steps of self attention

Compute Q, K, V; compute similarity scores; apply softmax; compute the weighted sum

80
New cards
What are the main components of the transformer architecture?
Self-attention, feed-forward network, residual connections, layer normalization
81
New cards
What is multi-head attention?
Multiple attention mechanisms in parallel
82
New cards
Why use multi-head attention?
To capture different relationships
83
New cards
What is the feed-forward network in a transformer?
Adds a non-linear transformation
84
New cards
Why use residual connections?
Stabilize training
85
New cards
Why use layer normalization?
Improve training stability
86
New cards
Why is positional encoding needed?
Transformers do not encode order naturally
87
New cards
How is positional encoding implemented?
Using sine and cosine functions
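A sketch of the sinusoidal scheme: even dimensions use sine, odd dimensions use cosine, with wavelengths growing geometrically across dimensions (the 10000 base follows the standard transformer convention; sizes here are toy values):

```python
import numpy as np

# Sinusoidal positional encoding: PE[pos, 2i] = sin(...), PE[pos, 2i+1] = cos(...).
def positional_encoding(num_positions, d_model):
    pos = np.arange(num_positions)[:, None]      # (num_positions, 1)
    i = np.arange(0, d_model, 2)[None, :]        # even dimension indices
    angles = pos / (10000 ** (i / d_model))      # geometric wavelengths
    pe = np.zeros((num_positions, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = positional_encoding(num_positions=10, d_model=8)
print(pe.shape)   # (10, 8): one encoding vector per position, added to embeddings
```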
88
New cards

What problems does attention have?

Loss of order, loss of non-linearity, loss of sequential constraints

89
New cards
How do transformers fix loss of order?
Positional encoding
90
New cards
How do transformers fix loss of non-linearity?
Add feed forward layers
91
New cards
How do transformers fix sequential prediction?
Mask future tokens
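A sketch of a causal (look-ahead) mask: position i may only attend to positions ≤ i, so future tokens cannot influence the prediction at step i (sequence length and random scores are toy values):

```python
import numpy as np

# Build a causal mask and apply it to attention scores before the softmax.
n = 4
mask = np.triu(np.ones((n, n)), k=1).astype(bool)  # True above the diagonal = future
scores = np.random.randn(n, n)                     # toy attention scores
scores[mask] = -np.inf                             # softmax maps -inf to weight 0

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)
print(weights.round(2))                            # upper triangle is all zeros
```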
92
New cards
What type of model is GPT?
Decoder only transformer
93
New cards
What direction does GPT use?
Left to right
94
New cards
What is GPT used for?
Text generation
95
New cards
What type of model is BERT?
Encoder only transformer
96
New cards
What direction does BERT use?
Bidirectional
97
New cards
What is BERT used for?
Text understanding
98
New cards
What is the CLS token?
Special token representing whole sequence
99
New cards
What is the SEP token?
Separator between sentences
100
New cards
What is the MASK token?
Used to hide words during training