1/27
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced | Call with Kai |
|---|
No analytics yet
Send a link to your students to track their progress
L7a Approximation algorithm
An algorithm that runs in polynomial (fast) time and is guaranteed to return a solution within a fixed factor of the best possible answer (OPT = the optimal value). Used for NP-hard problems, where finding the exact optimum quickly is believed impossible.
L7a Approximation factor (ratio)
How close the answer is to OPT (the optimum). MINIMISATION: factor α means answer ≤ α·OPT (α ≥ 1, smaller = better). MAXIMISATION: factor means answer ≥ (factor)·OPT (e.g. 0.75 = at least 75% of OPT). "Constant factor" = the factor is a fixed number; "super-constant" = it grows with input size (e.g. O(log n)).
L7a Vertex Cover — factor-2 approximation
A vertex cover is a set of vertices that includes at least one endpoint of EVERY edge. Algorithm: repeatedly take any uncovered edge, add BOTH its endpoints to the cover, delete the edges they cover; repeat. Result is at most 2× the smallest possible cover (constant factor), and 2 is believed best possible. To pick a valid cover in MCQs: choose the option whose vertices touch every edge.
L7a Set Cover — greedy O(log n) approximation
Goal: cover a universe U of elements using as few of the given sets as possible. Greedy rule: each step, pick the set covering the MOST currently-uncovered elements; repeat until all covered. Uses at most O(log n) times the minimum number of sets (n = number of elements); that factor is the best possible unless P=NP.
L7a Knapsack problem
Input: n items, each with integer weight w_i ≥ 0 and integer value v_i ≥ 0, and a knapsack of integer capacity C. Goal: choose items with total weight ≤ C that maximise total value. Assumption: every w_i ≤ C. Notation: v_max = the largest single value; total of all values ≤ n·v_max. Knapsack is NP-hard.
L7a DynamicKs (exact dynamic programming)
Define T(i, v) = the minimum knapsack capacity (weight) needed to reach total value at least v, using only items 1…i. Recurrence: T(i,v) = 0 if v ≤ 0; = ∞ (impossible) if i = 0 and v > 0; otherwise min{ T(i−1, v) [skip item i], w_i + T(i−1, v−v_i) [take item i] }. Optimal value = the largest v with T(n, v) ≤ C. Always finds the true optimum.
L7a DynamicKs runtime (pseudo-polynomial)
Runtime is O(n²·v_max). This is NOT polynomial in the input SIZE: a number as big as v_max needs only about ⌈log₂ v_max⌉ bits to write, so v_max is exponential in the input length. Fast only when values are "small" (polynomial in n). This is why we scale the values down.
L7a DynamicAppKs (approximation by scaling)
Idea: shrink the values so DynamicKs runs fast. Step 1: scale each value to v_i / θ (θ = scaling factor), then round UP to an integer (⌈x⌉ = round x up) to get the "scaled value". Step 2: run exact DynamicKs on these small scaled values and return the items it picks. Dividing every value by θ keeps the same best set; rounding up only adds a small, bounded error.
L7a θ (the scaling factor)
θ = ε·v_max / n, where ε ∈ (0,1) is the chosen precision, v_max = largest original value, n = number of items. Trade-off: larger θ → smaller values → faster but more rounding error; smaller θ → slower but more accurate. To scale one item: scaled value = ⌈ v_i / θ ⌉, i.e. divide the value by θ then ALWAYS round up.
L7a DynamicAppKs guarantee
The value returned is at least OPT·(1 − ε), so it is a (1−ε)-approximation — you can get as close to the optimum as you like by choosing a small ε. Runtime is O(n³/ε), which is polynomial for every fixed ε.
L7a PTAS (Polynomial Time Approximation Scheme)
An algorithm that, for ANY constant ε > 0, returns a solution within a factor (1 ± ε) of the optimum, in time polynomial in n for each fixed ε (the dependence on ε itself may be anything, e.g. O(n/ε)). Knapsack has a PTAS, so a (1−ε)-approx exists for every ε (ε=0.25 gives a 0.75-approx). But since Knapsack is NP-hard, no polynomial-time EXACT algorithm exists unless P=NP.
L7a Knapsack DynamicAppKs scaling: ε=0.01, item 1 value = 78000, item 2 value = 7982, n=2 items. What is the scaled value of item 2?
21 — v_max = 78000 (largest value). θ = ε·v_max/n = 0.01×78000/2 = 390. Scaled value of item 2 = ⌈7982/390⌉ = ⌈20.47⌉ = 21. You ALWAYS round up.
L7a Which statement about the Knapsack problem is TRUE?
A 0.75-approximation algorithm for Knapsack exists — Knapsack has a PTAS, so a (1−ε)-approx exists for any ε (ε=0.25 gives 0.75). An exact polynomial algorithm would mean P=NP, because Knapsack is NP-hard.
L7a Set cover: U={1,2,3,4,5}, S1={2,3,4}, S2={1,2}, S3={1,5}, S4={4,5}. Which sets does greedy pick in iterations 1 and 2?
S1 and S3 — Greedy picks the set covering the most UNcovered elements. 1st: S1 covers 3 new (2,3,4); uncovered now {1,5}. 2nd: S3 covers 2 new (1 and 5), beating S4 and S2 which add only 1 new each.
L7b Travelling Salesperson Problem (TSP)
Input: a complete weighted undirected graph G = (V, E, w) — V = vertices ("cities"), E = an edge between every pair, w(e) ≥ 0 = the weight/distance of edge e. Goal: find a cycle (tour) that visits every vertex exactly once and returns to the start, with minimum total edge weight (sum of the weights of the edges used).
L7b General TSP is inapproximable
With arbitrary weights, no polynomial-time algorithm can approximate TSP within ANY factor you can compute in polynomial time (even something huge like 2ⁿ), unless P=NP. A good approximation is only possible when the weights have extra structure.
L7b Triangle inequality
For any three vertices u, v, x: w(u,v) ≤ w(u,x) + w(x,v). In words: the direct edge between two vertices is never longer than going via a third vertex — there are no useful "shortcuts" through other vertices.
L7b Metric TSP
TSP where the edge weights satisfy the triangle inequality. Still NP-hard, but unlike general TSP it has a constant-factor approximation. Example: points in the plane with straight-line (Euclidean) distances automatically satisfy the triangle inequality, so they are metric TSP instances.
L7b Minimum Spanning Tree (MST)
A subset of the edges that connects all vertices, has no cycle, and has the smallest possible total edge weight. Can be computed in polynomial time. It is the starting backbone for the TSP approximation.
L7b Eulerian tour (cycle)
A closed walk that uses every edge of a (multi-)graph exactly once (you may pass through the same vertex more than once). It exists if and only if the graph is connected and every vertex has even degree (an even number of edges). Can be found in linear time.
L7b FindTour — Steps 1 and 2
Step 1: compute an MST T of the graph. Step 2: build a multigraph H by DUPLICATING every edge of T (each edge now appears twice). Doubling makes every vertex have even degree, which guarantees an Eulerian tour exists in H.
L7b FindTour — Steps 3 and 4
Step 3: find an Eulerian tour of H (a closed walk using each doubled edge once). Step 4: "shortcutting" — walk along the Eulerian tour but skip any vertex already visited (keep only the first appearance of each vertex). The result visits every vertex exactly once = a valid TSP tour.
L7b FindTour result
FindTour is a 2-approximation algorithm for Metric TSP: it runs in polynomial time and returns a tour of total weight at most 2 × OPT (OPT = weight of the best possible tour).
L7b Why weight(MST) ≤ OPT
Take the optimal tour and delete any one edge. What's left is a path through all vertices, which is a spanning tree, weighing ≤ OPT (it has one fewer positive-weight edge). The MST is the lightest spanning tree of all, so weight(MST) ≤ weight(that path) ≤ OPT.
L7b Why FindTour ≤ 2·OPT
The Eulerian tour uses each MST edge twice, so its weight = 2 × weight(MST) ≤ 2 × OPT. Shortcutting replaces a detour through an already-visited vertex with a direct edge, which by the triangle inequality never increases weight. So final tour ≤ Euler tour weight = 2 × MST ≤ 2 × OPT.
L7b TSP where vertices are points in the plane and each edge weight is the straight-line distance between them. Is there a constant-factor approximation algorithm?
Yes — straight-line (Euclidean) distances obey the triangle inequality, so this is metric TSP, which has a constant-factor (2-)approximation. (Only general, non-metric TSP is inapproximable.)
L7b Modified FindTour: graph H = ONE copy of the MST edges plus extra edges of total weight at most p×OPT (an Eulerian tour still exists). Which statement is true?
If p=0.5 it is a 1.5-approximation — weight(H) = MST (≤ OPT) + extra (≤ p·OPT) ≤ (1+p)·OPT, and shortcutting (which needs the triangle inequality) never adds weight, so it is a (1+p)-approximation. p=0.5 → 1.5.
L7b In the proof that FindTour returns a tour of weight ≤ 2×OPT, why is weight(MST) ≤ OPT?
Deleting one edge from the optimal tour gives a path through all vertices (a spanning tree) of weight ≤ OPT. The MST is the minimum-weight spanning tree, so MST ≤ that path ≤ OPT.