CS4210 Final Exam Fall 2025

77 Terms

1

What is clustering?

Unsupervised grouping of data points based on similarity.

2

Why is clustering unsupervised?

Labels are unknown; structure must be discovered.

3

Key challenge in clustering

Clusters may not match meaningful categories.

4

Visual cluster identification

"Look for dense groups, separation gaps, shape."

5

Intuition behind k-means

Use centroids and iteratively improve assignments.

6

Why use centroids?

Centroids summarize the center of a group.

7

Four steps of k-means

Initialize centroids → assign each point to its nearest centroid → update centroids to the mean of their points → repeat until assignments stop changing.
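
A minimal NumPy sketch of that loop (a sketch only; the data X, the choice of k, the Euclidean distance, and the iteration cap are assumptions, and it assumes no cluster ever empties out):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize: pick k random points as the starting centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assign: each point joins its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update: move each centroid to the mean of its assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Repeat until the centroids stop moving
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```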

8

Distance effect in k-means

Points join nearest centroid; boundaries depend on distance metric.

9

Manhattan vs Euclidean distance

Manhattan (L1) gives diamond-shaped distance contours; Euclidean (L2) gives circular ones.

10

Choosing k—Elbow

Plot SSE against k and pick the k where further SSE reduction slows (the elbow).

11

Choosing k—Silhouette

Compute the average silhouette score for several values of k and pick the k with the highest average.
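
A short scikit-learn sketch covering both heuristics above: the model's inertia_ (SSE) feeds the elbow plot, and silhouette_score gives the average silhouette for each candidate k (the placeholder data and the candidate range are assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

X = np.random.RandomState(0).rand(200, 2)    # placeholder data

for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse = km.inertia_                         # within-cluster SSE for the elbow plot
    sil = silhouette_score(X, km.labels_)     # average silhouette, higher is better
    print(f"k={k}  SSE={sse:.2f}  silhouette={sil:.3f}")
```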

12

Choosing k—Domain knowledge

Choose based on real-world expectations.

13

Poor k-means cases

"Non-spherical clusters, varying sizes, outliers."

14

Maximal margin classifier

Hyperplane maximizing margin to nearest points.

15

Support vectors

Points that define the margin and boundary.

16

Meaning of margin

Distance from boundary to closest points.

17

Why maximize margin?

"Better generalization, less overfitting."

18

Hard-margin SVM objective

Minimize ||w||² with perfect separation constraints.

19

Purpose of constraints

Enforce correct classification with separation.

20

Soft-margin SVM

Allows violations with slack variables.
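
Putting cards 18-20 together, the standard objectives (with hyperplane parameters w, b, slack variables ξ_i, and penalty C) are:

Hard margin: minimize (1/2)||w||² subject to y_i(w·x_i + b) ≥ 1 for all i.
Soft margin: minimize (1/2)||w||² + C Σ ξ_i subject to y_i(w·x_i + b) ≥ 1 − ξ_i and ξ_i ≥ 0.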

21

When soft-margin needed

Noisy or overlapping data.

22

Kernel trick

Implicit high-dimensional mapping via kernels.

23

Why kernels help

Enable nonlinear boundaries with linear models.

24

Kernel comparison

"Linear=simple, Polynomial=interactions, RBF=complex local."

25

Dual vs primal

The dual involves the data only through inner products, which is what makes kernelization possible.

26

SVM decision function

"Sign(sum α_i y_i K(x_i, x) + b)."

27

Neural network definition

Layered function approximator with neurons.

28

NN structure

"Neurons, layers, weights."

29

NN data flow

Input → weighted sum → activation → output.
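
A one-hidden-layer forward pass in NumPy showing the input → weighted sum → activation → output flow (the layer sizes, random weights, and sigmoid output are illustrative assumptions):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=3)                           # 3 features -> 3 input neurons
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)    # hidden layer with 4 neurons
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)    # single output neuron

h = relu(W1 @ x + b1)          # weighted sum, then activation (hidden layer)
y_hat = sigmoid(W2 @ h + b2)   # weighted sum, then activation (output)
print(y_hat)
```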

30

Forward propagation

Computing predictions layer by layer.

31

Loss function purpose

Quantify prediction error.

32

Purpose of hidden layers

Model nonlinear patterns.

33

What are weights?

Trainable parameters shaping the model.

34

Activation function

Nonlinear mapping enabling complexity.

35

Common activations

"ReLU, Sigmoid."

36

Input neurons with 3 features

Three (one input neuron per feature).

37

What is training?

Forward + backward passes updating weights.

38

Why deeper networks help

More hierarchical representations.

39

Chain rule purpose

Differentiate composite functions.

40

Outer vs inner function

Outer wraps inner in composite expressions.

41

First chain rule step

Differentiate outer at inner.
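
A concrete instance (the function is just an illustration): for f(x) = (3x² + 1)⁴, the outer function is u⁴ and the inner is u = 3x² + 1, so f′(x) = 4(3x² + 1)³ · 6x = 24x(3x² + 1)³: the outer differentiated at the inner, times the inner's derivative.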

42

Chain rule in NN

Propagates derivatives through the layer-by-layer composition of functions.

43

Backpropagation meaning

Gradient computation backward through layers.

44

Backward pass

Compute gradients for all weights.

45

How weights know what to change

Gradients show contribution to error.

46

Chain rule importance

Essential for deep gradient flow.

47

Updating weight steps

Compute error → backprop → gradient → update.
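
A minimal NumPy sketch of one such step on a tiny one-hidden-layer network: forward pass, squared-error loss, chain-rule gradients, then a gradient-descent update (the sizes, data, and learning rate are assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(1)
x, y = rng.normal(size=3), 1.0                   # one example, one target
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
lr = 0.1

# Forward pass and error
h = sigmoid(W1 @ x + b1)
y_hat = (W2 @ h + b2)[0]
loss = 0.5 * (y_hat - y) ** 2

# Backward pass: chain rule, layer by layer
d_yhat = y_hat - y                # dL/dy_hat
dW2 = d_yhat * h[None, :]         # dL/dW2 = error signal x input activation
db2 = np.array([d_yhat])
d_h = d_yhat * W2[0]              # push the error back to the hidden layer
d_z1 = d_h * h * (1 - h)          # through the sigmoid: s'(z) = s(z)(1 - s(z))
dW1 = np.outer(d_z1, x)
db1 = d_z1

# Gradient-descent update
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```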

48

Convolution definition

Sliding filter computing weighted sums.

49

How CNNs learn

"Filters adapt to edges, textures, shapes."

50

Main CNN layers

"Conv, ReLU, Pooling, Fully-connected."

51

Purpose of max pooling

Reduce size while keeping key activations.

52

Pooling effect

Reduces spatial dimensions.

53

How RNN works

Uses hidden state for sequence memory.
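
A bare-bones sketch of that recurrence; tanh is the classic choice, and all sizes and random weights here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
seq = rng.normal(size=(5, 3))          # 5 time steps, 3 features each
Wx = rng.normal(size=(8, 3))           # input-to-hidden weights
Wh = rng.normal(size=(8, 8))           # hidden-to-hidden (recurrent) weights
b = np.zeros(8)

h = np.zeros(8)                        # hidden state starts empty
for x_t in seq:
    # Each step mixes the new input with the previous hidden state,
    # so h carries sequence memory forward through time.
    h = np.tanh(Wx @ x_t + Wh @ h + b)
print(h)
```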

54

Hidden state importance

Captures temporal dependencies.

55

Vanishing/exploding gradients

Gradients shrink or blow up across time.

56

Why LSTM/GRU help

Gates control memory retention.

57

CNN vs RNN on sequences

CNN = local features; RNN = temporal flow.

58

Model strengths—CNN vs RNN

CNN for spatial; RNN for sequence.

59

Convolution output formula

Output size = ((W - F + 2P) / S) + 1, where W = input size, F = filter size, P = padding, S = stride.

60

AlexNet conv output

Apply conv formula for 227x227 input.

61

Number of conv units

Number of filters × output width × output height.

62

Conv parameters with sharing

(Filter width × filter height × input channels) × number of filters, plus one bias per filter.
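
A quick worked computation for cards 59-62, assuming the commonly cited AlexNet first layer (227×227×3 input, 96 filters of size 11×11, stride 4, no padding):

```python
W, F, P, S = 227, 11, 0, 4
out = (W - F + 2 * P) // S + 1        # ((W - F + 2P) / S) + 1 = 55
units = 96 * out * out                # filters x width x height = 290,400 units
params = (11 * 11 * 3) * 96 + 96      # weights per filter x filters + biases = 34,944
print(out, units, params)
```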

63

FF parameter explosion

Flattening produces huge parameter count.

64

Max pooling effect

Downsamples by choosing max value.
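
A tiny worked example of 2×2 max pooling with stride 2 (the input values are made up):

```python
import numpy as np

x = np.array([[1, 3, 2, 0],
              [5, 6, 1, 2],
              [7, 2, 4, 9],
              [0, 1, 3, 8]])

# Max of each non-overlapping 2x2 window: 4x4 -> 2x2
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6 2]
                #  [7 9]]
```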

65

Kernel feature types

"Edges, diagonals, sharpening."

66

Gradient flow effect

Gradients reach all layers.

67

Backprop layer relation

dL/dW for a layer = that layer's error signal (delta) × the activation feeding into it.

68

Why backward pass needed

Forward predicts; backward updates.

69

Full CNN architecture example

Conv → Conv → Pool → Dense → Dense → Output.
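
A PyTorch sketch of that Conv → Conv → Pool → Dense → Dense → Output pattern; the channel counts, the 28×28 single-channel input, and the 10-class output are illustrative assumptions:

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),    # Conv: 1x28x28 -> 8x28x28
    nn.ReLU(),
    nn.Conv2d(8, 16, kernel_size=3, padding=1),   # Conv: 8x28x28 -> 16x28x28
    nn.ReLU(),
    nn.MaxPool2d(2),                              # Pool: 16x28x28 -> 16x14x14
    nn.Flatten(),                                 # 16 * 14 * 14 = 3136 features
    nn.Linear(3136, 64),                          # Dense
    nn.ReLU(),
    nn.Linear(64, 10),                            # Dense -> Output (10 class scores)
)
```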

70

Why stack conv layers

Deeper abstractions.

71

Purpose of downsampling

Reduce compute and focus on patterns.

72

Most fundamental NN idea

Composition of linear + nonlinear functions.

73

Essence of backprop

Backward gradient flow updates weights.

74

CNN vs FF

CNN uses shared filters; FF uses all connections.

75

SVM vs NN

Margin maximization vs error minimization.

76

Hard vs soft margin

Perfect separation vs tolerance.

77

Common k-means mistake

Using it on non-spherical clusters.