COMP5270 Assignment and Sample Exam Terminology Glossary

0.0(0)

Studied by 0 people

Call Kai

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Knowt Play

Card Sorting

1/87

Earn XP

Description and Tags

Vocabulary terms and definitions from the COMP5270 assignment and exam terminology glossary version 2, covering notation, probability, randomized algorithms, hashing, graph optimization, linear programming, LSH, and streaming.

Last updated 4:49 AM on 6/17/26

Name	Mastery	Learn	Test	Matching	Spaced	Call with Kai

No analytics yet

Send a link to your students to track their progress

88 Terms

1

New cards

$[n]$

The set $\{1, 2, ..., n\}$ . Used as the item universe, element universe, vertex labels, and domains for SetCover and sampling.

2

New cards

$S \subseteq T$

S is a subset of T, meaning every element of S is also in T.

3

New cards

$S \cup T$

Union of sets S and T; contains elements in S or T or both.

4

New cards

$S \setminus T$

Set difference; contains elements in S but not in T.

5

New cards

$\|S\|$

Cardinality, representing the number of elements in set S.

6

New cards

$x \in \{0,1\}$

A binary variable, usually representing yes/no or chosen/not chosen.

7

New cards

$Pr[A]$

The probability that event A occurs.

8

New cards

$E[X]$

Expectation, or average value, of random variable X.

9

New cards

$Var(X)$

Variance, a measure of spread around $E[X]$ .

10

New cards

$1\{\text{event}\}$

Indicator variable equal to 1 when the event happens and 0 otherwise.

11

New cards

$X \sim Bern(p)$

Bernoulli random variable equal to 1 with probability p.

12

New cards

$U \sim Unif(0,1)$

A uniformly random real number in the interval $(0,1)$ .

13

New cards

$G \sim N(0, Id)$

A standard Gaussian random vector in d dimensions.

14

New cards

u.a.r.

Uniformly at random; every possible choice has the same probability.

15

New cards

with replacement

Samples are independent and the same object can be sampled again.

16

New cards

independently

Random choices do not influence each other.

17

New cards

$O(f(n))$

An asymptotic upper bound used for time, space, sample, and query complexity.

18

New cards

$\Omega(f(n))$

An asymptotic lower bound.

19

New cards

$\Theta(f(n))$

A tight asymptotic bound, both $O$ and $\Omega$ .

20

New cards

$\epsilon$

Accuracy parameter; controls additive or multiplicative error in approximation.

21

New cards

$\delta$

Failure probability parameter, usually appearing inside $\log(1/\delta)$ repetitions or sketch size.

22

New cards

OPT

The value of the true optimal solution.

23

New cards

$OPT\_{LP}$

The optimum value of the LP relaxation, acting as a fractional benchmark before rounding.

24

New cards

x^\*

An optimal fractional LP solution.

25

New cards

$\langle x,g \rangle$

Inner product of vectors x and g.

26

New cards

$\|x-y\|\_2$

Euclidean distance between vectors x and y, used to distinguish near and far points.

27

New cards

hash function

A function mapping a large universe to a smaller range of buckets or values.

28

New cards

collision

Event where two distinct inputs receive the same hash value.

29

New cards

hash family

A set of hash functions from which one function is chosen randomly.

30

New cards

Bloom filter

A randomized dictionary structure with compact space and possible false positives, but no false negatives.

31

New cards

false positive

Occurs when the data structure says 'yes' even though the item is absent.

32

New cards

false negative

Occurs when the data structure says 'no' even though the item is present.

33

New cards

randomized algorithm

A randomized algorithm that may output a wrong answer with small probability.

34

New cards

Las Vegas algorithm

A randomized algorithm that always returns a correct answer but has random running time.

35

New cards

streaming algorithm

An algorithm that processes input items one at a time using space much smaller than the total stream.

36

New cards

one-pass model

The algorithm sees each stream item exactly once and cannot rewind.

37

New cards

reservoir sampling

Technique to maintain a uniform sample from a stream without knowing its final length in advance.

38

New cards

Bernoulli replacement rule

At time t, replacing a stored sample with probability $1/t$ or $k/t$ to maintain a uniform sample.

39

New cards

CountMinSketch

A frequency estimation sketch with one-sided overestimation guarantees.

40

New cards

CountSketch

A streaming sketch using random signs ( $\pm 1$ ) to estimate frequencies, suitable for heavy hitters.

41

New cards

Frequent Elements

The problem of finding elements in a stream whose counts are above a certain threshold.

42

New cards

Approximate Nearest Neighbour (ANN)

Task to return a point close to the query mapping, allowing an approximation factor C.

43

New cards

Hamming space

Space of strings or vectors where distance counts the number of differing coordinates.

44

New cards

Hamming distance

The number of coordinates where two strings or vectors differ.

45

New cards

Euclidean space

Vector space $\mathbb{R}^d$ equipped with Euclidean distance.

46

New cards

Locality-Sensitive Hashing (LSH)

Hashing where close points have a higher probability of collision than far points.

47

New cards

(r, C, p, q)-LSH family

An LSH family where distances that are $\le r$ collide with probability $\ge p$ , and distances that are $\ge Cr$ collide with probability $\le q$ .

48

New cards

sensitivity parameter $\rho$

A parameter controlling LSH space/query complexity, often $\rho = \log(1/p)/\log(1/q)$ .

49

New cards

approximation factor C

In LSH, how much farther than the true near radius r the algorithm may accept.

50

New cards

collision probability

The probability that $h(x) = h(y)$ . LSH makes this larger for near pairs.

51

New cards

Gaussian projection

Projecting vectors onto a random Gaussian direction to reduce Euclidean distance to one-dimensional analysis.

52

New cards

random shift

Adding a random value $U \sim Unif(0,1)$ before taking the floor in an LSH hash.

53

New cards

bucket width w

The interval width in the Euclidean LSH hash: $h\_{g,u}(x) = \lfloor \langle x,g \rangle / w + u \rfloor$ .

54

New cards

Baby ANN

A simpler ANN data structure used as a building block for the final Adult ANN.

55

New cards

doubling search

Trying geometrically increasing guesses for an unknown distance scale.

56

New cards

monotonic function

A function that either only increases or only decreases.

57

New cards

Taylor expansion

Approximation of a function by a power series, used to justify asymptotic behavior of $\rho$ .

58

New cards

random cut

A cut produced by assigning vertices to sides randomly; expected to cut half the edges.

59

New cards

MaxCut

The problem of partitioning vertices to maximise the number of crossing edges.

60

New cards

Min-Cut

The task of finding a cut with the minimum number or weight of crossing edges.

61

New cards

complete graph $K\_n$

A graph on n vertices where every pair of vertices is connected by an edge.

62

New cards

Minimum Spanning Tree

A minimum-weight tree connecting all vertices of a weighted connected graph.

63

New cards

independent set

A set of vertices in which no two vertices are connected by an edge.

64

New cards

integer linear program (ILP)

Optimization problem with linear objective, linear constraints, and integer variables.

65

New cards

feasible solution

An assignment of variables that satisfies all constraints of an optimization problem.

66

New cards

objective function

The quantity (such as cost or size) being maximised or minimised.

67

New cards

constraint

A condition every feasible solution must satisfy.

68

New cards

randomized rounding

Converting fractional LP values into random integer decisions using those values as probabilities.

69

New cards

SetCover

The problem of choosing sets whose union covers all elements while minimising total cost.

70

New cards

cost(C)

Total cost of the selected sets or objects.

71

New cards

Knapsack

NP-Hard problem to choose items under a budget/weight limit to maximise value.

72

New cards

NP-Hard

A class of problems at least as hard as the hardest problems in NP.

73

New cards

approximation factor

A measure of how close an algorithm's solution value is to the optimum.

74

New cards

coverage function

A function, often denoted $Q(S)$ , counting how many elements are covered by a chosen set.

75

New cards

constant-factor approximation

An approximation by a fixed constant factor independent of input size n.

76

New cards

Approximate Nearest Neighbour (ANN)

Task to return a point close to the query mapping, allowing an approximation factor C.

77

New cards

random bits

The finite random string used by an algorithm; if r bits are used, there are only $2^r$ executions.

78

New cards

random subset

A subset drawn from all possible subsets, usually uniformly.

79

New cards

random permutation

A uniformly random ordering of n distinct items.

80

New cards

shuffle algorithm

An algorithm for producing a random permutation.

81

New cards

sampling with replacement

Every draw is independent and can repeat earlier objects.

82

New cards

indicator variable

A 0/1 variable for whether an event occurs, used to count covered elements, collisions, or successes.

83

New cards

empirical average

The average of observed samples.

84

New cards

failure event

The bad event whose probability must be controlled.

85

New cards

amplification

Repeating independent trials to reduce failure probability.

86

New cards

with high probability

Success probability close to 1, often $1 - 1/poly(n)$ or a stated constant such as 99%.

87

New cards

constant probability

A success probability bounded away from 0, such as 1/2, 2/3, or 9/10.

88

New cards

union bound

$Pr[\text{any bad event}] \le \text{sum of individual bad-event probabilities}$ .