np-complete

NP-Completeness

Compiled by Alfan F. Wicaksono from multiple sources.

Credits

Introduction to Algorithms, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
Slide kuliah DAA Pak L. Y. Stefanus (Pak Stef)

Tractable vs Intractable

Polynomial-time algorithms: On inputs of size $n$ , their worst-case running time is $O(n^k)$ for some constant $k$ .
Not all problems can be solved in polynomial time.
Some problems cannot be solved by any computer (Turing's Halting Problem).
Some problems can be solved, but not in polynomial time.
Tractable (easy) problems: solvable by polynomial-time algorithms.
Intractable (hard) problems: require superpolynomial time to be solved.

An Unsolved Problem: Turing’s Halting Problem

Given a computer program and an input, will the program terminate or run forever?
In 1936, Alan Turing proved that no Turing machine can decide correctly for all possible program/input pairs.
This is an undecidable problem.

NP-Complete Problems (NPC)

An interesting class of problems whose status is unknown.
No polynomial-time algorithm has been discovered for an NP-complete problem, and no one has proven that such an algorithm cannot exist.
Examples: Integer Knapsack, Hamiltonian Cycle, Set Cover Problem.

Why Learn This Topic?

Essential for becoming a good algorithm designer.
Establishing a problem as NP-complete provides evidence of its intractability.
Engineers should focus on approximation algorithms or tractable special cases instead of searching for a fast, exact algorithm.
Many practical problems are NP-complete.

Decision Problem vs Optimization Problem

Optimization problems: Each feasible solution has a value, and the goal is to find a feasible solution with the best value.
- SHORTEST-PATH: Input is an undirected graph $G$ and vertices $u$ and $v$ , and the goal is to find a path from $u$ to $v$ that uses the fewest edges.
- KNAPSACK: Input is knapsack capacity $W$ , and a set of items with their weights and values, and the goal is find a subset of items that fits the knapsack and maximizes the total value.
NP-completeness applies directly to decision problems, in which the answer is simply “yes” or “no” (1 or 0).
A given optimization problem can be cast as a related decision problem by imposing a bound on the value to be optimized.
- Decision problem related to SHORTEST-PATH is PATH: given an undirected graph $G$ , vertices $u$ and $v$ , and an integer $k$ , does a path exist from $u$ to $v$ consisting of at most $k$ edges?

Decision Problem vs Optimization Problem

The decision problem is “easier” or “no harder” than the optimization counterpart.
Evidence that a decision problem is hard also provides evidence that its related optimization problem is hard.
PATH $i = (G, u, v, k)$
SHORTEST-PATH $i = (G, u, v)$
$k' = k$

Decision Problem vs Optimization Problem Exercise

Cast an Integer Knapsack problem (KNAPSACK) as a decision problem (KNAPSACK-DEC)!

P vs NP

The class P consists of decision problems that are solvable in $O(n^k)$ for some constant $k$ (Polynomial).
The class NP consists of decision problems that are “verifiable” in polynomial time (Nondeterministic Polynomial).
If given a “certificate” of a solution, you could verify that the certificate is correct in time polynomial in the size of the input to the problem.

P vs NP

HAMILTON: A hamiltonian cycle of a directed graph $G = (V,E)$ is a simple cycle that contains each vertex in $V$ . Given an input $G$ , determine whether a directed graph has a hamiltonian cycle!
HAMILTON is NP
Let a certificate be a sequence $v1, v2, …, v{|V|}$ of $|V|$ vertices. We can check in polynomial time that the sequence contains each of the $|V|$ vertices exactly one, that $vi, v{i+1} ∈ E$ for $i = 1,2,3, … , |V| − 1$ , and that $v{|V|}, v_1 ∈ E$ .

P vs NP

Can every problem whose solution can be quickly checked (NP) also be solved quickly (P)?
$P ⊆ NP$
Any problem in $P$ also belongs to $NP$ , since if a problem belongs to $P$ then it is solvable in polynomial time without even being supplied a certificate.
The open question is whether $P$ is a proper subset of $NP$ . P ≠ NP (improper subset) or P = NP (proper subset)?
Many scientists believe that $P ≠ NP$ . Despite much effort, no one has proven that $P = NP$ .

NP-Complete

Informally, a problem belongs to the class NPC if:
- It belongs to NP; and
- It is as “hard” as any problem in NP
If any NPC problem can be solved in polynomial time, then every problem in NP has a polynomial-time algorithm.
Nobody has proven whether NPC problems can be solved in polynomial time.
The Clay Mathematics Institute is offering a US\$1 million reward to anyone who has a formal proof that P=NP or that P≠NP.

NP-Complete

Demonstrating that a problem is NPC is a statement about how hard it is, rather than about how easy it is.
Proving a problem NPC suggests that searching for an efficient algorithm is likely fruitless.
How to show that your problem is NP-Complete?

Reductions

The input to a particular problem is an instance of that problem.
One way to solve a problem is to recast it as a different problem -> reducing one problem to another.
Problem $Q$ is reducible to problem $Q'$ if any instance of $Q$ can be recast as an instance of $Q'$ , and the solution to the instance of $Q'$ provides a solution to the instance of $Q$ .
$Q'$ reduces to $Q$ .

Reductions

LIN-EQ: Find a solution to the linear equation $ax + b = 0$ ?
QUAD-EQ: Find a solution to the quadratic equation $ax^2 + bx + c = 0$ ?
LIN-EQ reduction $a, b$ to QUAD-EQ $a, b, c=0$
$x = -b/a$
$x = (-b ± \sqrt{b^2 - 4ac}) / (2a)$
If reduction can be run in polynomial time, then LIN-EQ is no harder than QUAD-EQ, or $LIN-EQ ≤ QUAD-EQ$ .
If we know that QUAD-EQ can be solved in polynomial time, then LIN-EQ can also be solved in polynomial time.

Reductions Exercise

MAX: Given an array, find a maximum value!
SORT: Given an array, output a sorted version of the array in ascending fashion!
We know that SORT can be solved in polynomial-time. Can you prove that MAX also has a polynomial solution? How do you prove?

Polynomial-time Reductions

Consider a decision problem A, which you would like to solve in polynomial time.
Now suppose that you already know how to solve a different decision problem B in polynomial time.

Polynomial-time Reductions

Suppose that you have a procedure that transforms any instance $α$ of A into some instance $β$ of B with the following characteristics:
- The transformation takes polynomial time;
- The answers are the same. That is, the answer for $α$ is yes if and only if the answer for $β$ is yes.
We call such a procedure a polynomial-time reduction algorithm. it provides us a way to solve problem A in polynomial time.

Polynomial-time Reductions

As long as each of these steps takes polynomial time, all three together do also, and so you have a way to decide on $α$ in polynomial time. In other words, by “reducing” solving problem A to solving problem B , you use the “easiness” of B to prove the “easiness” of A.
B is easy --> A is easy

Polynomial-time Reductions

We can use the contraposition to prove that no polynomial solution can exists for B, when we know that A is “hard”.
A is hard --> B is hard.
Proof: Suppose B has a polynomial-time algorithm. Then, according to the below picture, we would have a way to solve problem A in polynomial time, which contradicts our assumption that there is no polynomial time solution for A.

Then, how to prove that your problem is NP-Complete?

Yes, the previous slide serves as a basis for us to prove that a particular problem is NPC.
Suppose that A is known to be NPC;
If A is reducible to B in polynomial time, then we have shown that B is also NPC.
But, what is the first problem proved to be NPC?

A First NP-Complete Problem

Because the technique of reduction relies on having a problem already known to be NPC in order to prove a different problem NP-complete, there must be some first NPC problem.
We’ll use the circuit-satisfiability problem (CIRCUIT-SAT), in which the input is a boolean combinational circuit composed of AND, OR, and NOT gates, and the question is whether there exists some set of boolean inputs to this circuit that causes its output to be 1.

What Wikipedia Says?

The concept of NP-completeness was introduced in 1971 (see Cook-Levin theorem), though the term NP-complete was introduced later. At the 1971 STOC conference, there was a fierce debate between the computer scientists about whether NP-complete problems could be solved in polynomial time on a deterministic Turing machine. John Hopcroft brought everyone at the conference to a consensus that the question of whether NP-complete problems are solvable in polynomial time should be put off to be solved at some later date, since nobody had any formal proofs for their claims one way or the other. This is known as \"the question of whether P=NP\".

Nobody has yet been able to determine conclusively whether NP-complete problems are in fact solvable in polynomial time, making this one of the great unsolved problems of mathematics. The Clay Mathematics Institute is offering a US\$1 million reward to anyone who has a formal proof that P=NP or that P#NP.[5]

The existence of NP-complete problems is not obvious. The Cook-Levin theorem states that the Boolean satisfiability problem is NP-complete, thus establishing that such problems do exist. In 1972, Richard Karp proved that several other problems were also NP-complete (see Karp's 21 NP-complete problems); thus, there is a class of NP-complete problems (besides the Boolean satisfiability problem). Since the original results, thousands of other problems have been shown to be NP-complete by reductions from other problems previously shown to be NP-complete; many of these problems are collected in Garey & Johnson (1979).

Formal Way to Understand What a “Problem” is

Yes, this is where you find the relation between DAA and TBA.

What a “problem” is?

We define an abstract problem $Q$ to be a binary relation on a set $I$ of problem instances and a set $S$ of problem solutions.
Example: PATH problem
- $i ∈ I$ , with $i = (G, u, v, k)$ , meaning that the input is a graph $G$ , a start node $u$ , an end node $v$ , and a value $k$ .
- $S = {0, 1}$
- $PATH(i) = 1$ if $G$ contains a path from $u$ to $v$ with at most $k$ edges;
- $PATH(i) = 0$ otherwise.

Encodings

In order for a computer program to solve an abstract problem, its problem instances must appear in a way that the program understands.
An encoding of a set $S$ of abstract objects is a mapping $e$ from $S$ to the set of binary strings.
Example, natural numbers ${0, 1, 2, …}$ are encoded as the strings ${0, 1, 10, 11, 100, …}$ .
Polygons, graphs, functions, ordered pairs, programs - all can be encoded as binary strings.

Encodings

We call a problem whose instance set is the set of binary strings a concrete problem.
We say that an algorithm solves a concrete problem in $O(T(n))$ time if, when it is provided a problem instance $i$ of size $n = |i|$ (the length of binary string), the algorithm can produce the solution in $O(T(n))$ .
Formally, A concrete problem is polynomial-time solvable, it there exists an algorithm to solve it in $O(n^k)$ time.

Encodings

Encodings map abstract problems to concrete problems.
Given an abstract decision problem $Q$ mapping an instance set $I$ to ${0,1}$ , an encoding $e : I → {0,1}*$ can induce a related concrete decision problem, which we denote by $e(Q)$ .
If the solution to an abstract-problem instance $i ∈ I$ is $Q(i) ∈ {0,1}$ , then solution to the concrete problem instance $e(i) ∈ {0,1}*$ is also $Q(i) ∈ {0,1}$ .

Formal-Language Framework

In the formal-language framework, the set of instances for any decision problem Q is simply the set Σ, where Σ = {0, 1}. Since Q is entirely characterized by those problem instances that produce a 1 (yes) answer, we can view Q as a language L over Σ = {0, 1}, where L = {x = Σ: Q(×) = 1}.

The formal-language framework allows us to express the relation between decision problems and algorithms that solve them concisely.

We say that an algorithm A accepts a string x = {0, 1}* if, given input x, the algorithm's output A(x) is 1.

➤ The language L accepted by an algorithm A is the set of strings that the algorithm accepts, that is, L = {x = {0, 1}*: A(×) = 1}.

➤ An algorithm A rejects a string × if A(x) = 0.

Formal-Language Framework Continued

Even if language L is accepted by an algorithm A, the algorithm will not necessarily reject a string × € L given as input to it. For example, the algorithm may loop forever.

A language L is decided by an algorithm A if every binary string in L is accepted by A and every binary string not in L is rejected by A.

Thus, to accept a language L, an algorithm need only worry about strings in L, but to decide a language, it must correctly accept or reject every string in {0, 1}*.

There are problems, such as Turing's Halting Problem, for which there exists an accepting algorithm, but no decision algorithm exists.

Using this formal-language framework, we can define the complexity class P as follows:

P = {L ≤ {0, 1}* : there exists an algorithm A that decides L in polynomial time}.

Hamiltonian Cycles

A hamiltonian cycle of an undirected graph G = (V, E) is a simple cycle that contains each vertex in V.

A graph that contains a hamiltonian cycle is said to be hamiltonian; otherwise, it is nonhamiltonian.

We can define the hamiltonian-cycle problem, "Does a graph G have a hamiltonian cycle?" as a formal language:

HAM-CYCLE = { : G is a hamiltonian graph}.

denotes the standard binary encoding of G.

In the following slide, the dodecahedron is hamiltonian, while the bipartite graph (with an odd number of vertices) is nonhamiltonian.

Hamiltonian Cycles Examples

Examples provided with images of hamiltonian and non-hamiltonian graphs

Verification Algorithms

Suppose that a friend tells you that a given graph G is hamiltonian, and then offers to prove it by giving you the vertices in order along the hamiltonian cycle.

➤ It would certainly be easy enough to verify the proof: simply verify that the provided cycle is hamiltonian by checking whether it is a permutation of the vertices of V and whether each of the consecutive edges along the cycle actually exists in the graph.

This verification algorithm can be implemented to run in O(n²) time, where n is the length of the encoding of G.

Thus, a proof that a hamiltonian cycle exists in a graph can be verified in polynomial time.

Verification Algorithms Continued

In general, a verification algorithm is defined as being a two-argument algorithm A, where one argument is an ordinary input string x and the other is a binary string y called a certificate.

A two-argument algorithm A verifies an input string x if there exists a certificate y such that A(x, y) = 1.

The language verified by a verification algorithm A is

L = {× = {0, 1}: there exists y = {0, 1} such that A(x, y) = 1}.

Intuitively, an algorithm A verifies a language L if for any string × = L, there is a certificate y that A can use to prove that x = L. Moreover, for any string x & L, there must be no certificate proving that x = L.

Certificate Size

$L = {x ∈ {0, 1}* : there exists a certificate y with |y| = O(|x|^c) such that A(x,y) = 1}$

The complexity class NP

HAM-CYCLE = NP.

If LE P, then L = NP.

Why? Because if there is a polynomial-time algorithm to decide
L, the algorithm can be easily converted to a two-argument
verification algorithm that simply ignores any certificate and accepts exactly those input strings it determines to be in L.

Thus, P ≤ NP.

It is unknown whether P = NP.

Most computer scientists believe that P # NP.

Intuitively, the class P consists of problems that can be solved quickly. The class NP consists of problems for which a solution can be verified quickly.

NP Completeness & Reducibility

➤ A language L₁ is polynomial-time reducible to a language L₂,
denoted as L₁ ≤p L₂, if there exists a polynomial-time computable
function f: {0, 1}* → {0, 1}* such that for all x = {0, 1}*,

× Є L₁ if and only if f(x) = L₂.

The function f is called the reduction function, and a polynomial-
time algorithm F that computes f is called a reduction algorithm.

Reduction Diagram

Shows a diagram with L1 reducing to L2 via function f.

NP Completeness & Reducibility

The reduction function f maps any instance x of the decision
problem represented by the language L₁ to an instance f(x) of
the decision problem represented by L₂.

Giving an answer to whether f(x) = L₂ directly provides the
answer to whether x = L₁.

Polynomial-time reductions provide a formal means for showing
that one problem is at least as hard as another, to within a
polynomial-time factor.

That is, if L₁ ≤p L₂, then L₁ is not more than a polynomial factor
harder than L₂.

Definition of NP-Completeness

Definition:
A language L {0, 1}* is NP-complete if

L = NP, and
L' ≤p L for every L' = NP.

This part is honestly difficult!

➤ If a language L satisfies property 2, but not necessarily
property 1, we say that L is NP-hard.

If any NP-complete problem is polynomial-time solvable, then
P = NP. Equivalently, if any problem in NP is not polynomial-time
solvable, then no NP-complete problem is polynomial-time
solvable.

Most computer scientists
believe that P # NP.

Good News!

Thanks to Prof. Cook and Prof. Karp who have shown our first NPC problem: Circuit Satisfiability Problem (CIRCUIT-SAT).

Both of you deserved Turing Award for this work!

Good News! (Circuit-SAT Reduction)

If we know CIRCUIT-SAT is NPC, then we can use it to prove “many” other problems to be NPC using polynomial-reduction.

If CIRCUIT-SAT is reducible to B, then B is NPC

Good News! Continued reductions

If B is reducible to C, then C is NPC.
If CIRCUIT-SAT is reducible to B, then B is NPC

Good News! Even More Reductions

If B is reducible to C, then C is NPC
If C is reducible to D, then D is NPC
And so on …

If CIRCUIT-SAT is reducible to B, then B is NPC

NP-Completeness Chart

Shows a chart of various NP-complete problems and their relationships.

Intro. To CIRCUIT-SAT

Our first problem proven to be NP-Complete!

Up to this point…

Up to this point, we have not actually proved any problem to be
NPC.

Once we prove that at least one problem is NPC, we can use
polynomial-time reducibility as a tool to prove the NP-
completeness of other problems.

Soon, we will demonstrate the existence of an NPC problem: the
circuit-satisfiability problem.

First let's review some related concepts of combinational
circuits.

A boolean combinational circuit consists of one or more logic
gates (NOT, AND, OR) interconnected by wires. Boolean
combinational circuits contain no cycles.

A truth assignment for a boolean combinational circuit is a set
of boolean input values. A one-output boolean combinational cir-
cuit is satisfiable if it has a satisfying assignment, that is, a
truth assignment that causes the output of the circuit to be 1.

Satisfiable vs Unsatisfiable Circuits

Provides examples of satisfiable and unsatisfiable circuits.

The circuit-satisfiability problem

➤ The circuit-satisfiability problem is
"Given a boolean combinational circuit composed of NOT, AND,
and OR gates, is it satisfiable?"

This problem arises in the area of computer-aided hardware
optimization. If a subcircuit always produces 0, that subcircuit
can be replaced by a simpler subcircuit that omits all logic gates
and provides the constant O value as its output.

The size of a boolean combinational circuit is the number of logic
gates plus the number of wires in the circuit.

➤ As a formal language, the circuit-satisfiability problem can be
defined as
CIRCUIT-SAT = { : C is a satisfiable boolean combinational
circuit}

where denotes the binary string encoding the circuit C.

Determining Satisfiability

Given a circuit C, we might attempt to determine whether it is
satisfiable by simply checking all possible assignments to the
inputs. Unfortunately, if there are k inputs, there are 2k
possible assignments.

We will prove that CIRCUIT-SAT is NP-complete.

The proof is divided into two parts, based on the two parts of
the definition of NP-completeness.

Lemma 1:
The circuit-satisfiability problem belongs to the class NP.

Lemma 2:
The circuit-satisfiability problem is NP-hard.

Theorem:
The circuit-satisfiability problem is NP-complete.
We provide the proof of this theorem in the last slides

CIRCUIT-SAT and Proving Other Problems NPC

How to use CIRCUIT-SAT to prove that other problems are NPC?

Proving Problems NP-Complete

➤ A language can be proved to be NP-complete without directly
reducing every language in NP to the given language.

The basis of the method is the following lemma.

Lemma:
If L is a language such that L' ≤p L for some L' = NPC, then L is
NP-hard. Moreover, if L = NP, then L = NPC.

Proof:
Since L' is NP-complete, for all L" = NP, we have L" ≤p L'. By
transitivity, since L' ≤p L, we have L" ≤p L, which shows that L is
NP-hard. If L = NP, we also have L = NPC.

Note that by reducing a known NP-complete language L' to L, we
implicitly reduce every language in NP to L.

Method for Proving NP-Completeness

The previous lemma gives us a method for proving that a
language L is NP-complete:

Prove L = NP.
Select a known NP-complete language L'.
Describe an algorithm that computes a function f mapping every
instance x = {0, 1}* of L' to an instance f(x) of L.
Prove that the function f satisfies × = L' if and only if f(x) = L
for all × = {0, 1}*.
Prove that the algorithm computing f runs in polynomial time.

Steps 2-5 show that L is NP-hard.

Knowing that CIRCUIT-SAT is NP-complete now allows us to
prove much more easily that other problems are NP-complete.

General Strategy for Proving NP-Completeness

Given a new problem X, here is the basic strategy for
proving that it is NP-complete.

Prove that X = NP.
Choose a problem y that is known to be NP-complete.
Prove that y ≤p X.

Example: Formula Satisfiability Problem (SAT)

We introduce a new problem, formula satisfiability problem (SAT). And we show that SAT is NPC using our “almighty” CIRCUIT-SAT

Formula Satisfiability Problem (SAT)

The (formula) satisfiability problem can be formulated as the
language SAT as follows.

➤ An instance of SAT is a boolean formula & composed of

n boolean variables: ×₁, X2, … ×ni
m boolean connectives: any boolean function with one or two
inputs and one output, such as ^ (AND), v (OR), ¬ (NOT),
→ (implication), ↔ (if and only if); and
parentheses.

The satisfiability problem asks whether a given boolean formula + is satisfiable.

SAT = {<Φ>: Φ is a satisfiable boolean formula}.

A boolean formula & can be encoded in a length that is poly- nomial in n + m.

Formula Satisfiability Problem (SAT) Example

➤ For example, the formula
¢ = ((×1 → ×2) v ¬ ((¬ ×₁ ↔ ×3) v ×4)) ^ ¬ ×2
has the satisfying assignment
(×₁ = 0, x₂ = 0, x3 = 1, x4 = 1).

Thus, this formula & belongs to SAT.

Theorem:
The problem of satisfiability of boolean formulas is
NP-complete.

Proving SAT is NPC

How to prove that SAT is NPC?

To prove that SAT belongs to NP, we show that a certificate
consisting of a satisfying assignment for an input formula & can be
verified in polynomial time.

The verifying algorithm simply replaces each variable in the
formula with its corresponding value and then evaluates the
expression. This task can be done in polynomial time. If the
expression evaluates to 1, the formula is satisfiable.

Proving SAT is NPC Continued

How to prove that SAT is NPC?

To prove that SAT is NP-hard, we show that CIRCUIT-SAT ≤p
SAT.

Induction can be used to express any boolean combinational circuit
as a boolean formula. We simply look at the gate that produces the
circuit output and inductively express each of the gate's inputs as
formulas. The formula for the circuit is then obtained by writing an
expression that applies the gate's function to its inputs' formulas.

For each wire x; in the circuit C, the formula & has a variable x;. The proper operation of a gate can be expressed as a formula involving the variables of its incident wires. For example,

Gate Operation as a Formula

the operation of the output AND gate is
×10 ↔ (×7 ^ ×8 ^ X9).

The formula & produced by the reduction algorithm is the AND
of the circuit-output variable with the conjunction of clauses
describing the operation of each gate. For example, the formula
for the circuit in the figure is

Satisfiability Equivalence

The circuit C is satisfiable exactly when the formula is satisfiable. Why? If C has a satisfying assignment, each wire of the circuit has a well- defined value and the output of the circuit is 1.

Formula Evaluation

Therefore, the assignment of wire values to variables in & makes each clause of ☀ evaluate to 1, and thus the conjunction of all evaluates to 1.

Conversely, if there is an assignment that causes & to evaluate to 1, the circuit C is satisfiable by an analogous argument. Thus, we have shown that CIRCUIT-SAT ≤p SAT, which completes the proof.

Various NP-complete Problems

List of NP-Complete Problems:

the 3-CNF satisfiability problem
the clique problem
the vertex-cover problem
The independent-set problem
the hamiltonian-cycle problem
the traveling-salesman problem
the subset-sum problem.
and many more…

NP-Completeness Chart Expanded

Shows an expanded chart of various NP-complete problems and their relationships.

Independent Set & Vertex Cover Problems

Independent Set

For a graph G = (V, E), we say a set of nodes S≤ V is independent if no two nodes in S are joined by an edge.

It is easy to find small independent sets in a graph (for example, a single node forms an independent set); the hard part is to find a largest independent set.

For example, in the following figure, the set of nodes {3, 4, 5} is an independent set of size 3, while the set of nodes {1, 4, 5, 6} is a largest independent set.

Independent Set Problem Definition

➤ As a decision problem, the independent set problem is formulated as follows: Given a graph G and a number k, does G contain an independent set of size at least k?

➤ As an optimization problem, the independent set problem is formulated as follows: Given a graph G, find an independent set of maximum size.

➤ From the point of view of polynomial-time solvability, the optimization version is equivalent to the decision version.

Optimization vs Decision Version

➤ Given a method to solve the optimization version, we automatically solve the decision version (for any k) as well.

But there is also a converse to this: If we can solve the decision version for every k, then we can also find a maximum independent set.

➤ For given a graph G of n nodes, we simply solve the decision version for each k; the largest k for which the answer is "yes" is the size of the largest independent set in G.

➤ By using binary search, we need only solve the decision version for O(log n) different values of k.

Vertex Cover

Given a graph G = (V, E), we say that a set of nodes S ≤ V is a vertex cover if every edge e = E has at least one end in S.

It is easy to find large vertex covers in a graph (for example, the full vertex set is one); the hard part is to find a smallest vertex cover. We formulate the Vertex Cover

For example, in the graph below, the set of nodes {1, 2, 6, 7} is a vertex cover of size 4, while the set {2, 3, 7} is a vertex cover of smallest size.

Vertex Cover Problem

➤ As a decision problem, the vertex cover problem is formulated as follows: Given a graph G and a number k, does G contain a vertex cover of size at most k?

➤ As an optimization problem, the vertex cover problem is formulated as follows: Given a graph G, find a vertex cover of minimum size.

Polynomial Reduction Example

➤ Nobody knows how to solve either Independent Set problem or Vertex Cover problem in polynomial time; but their relative difficulty (hardness) can be determined.

It can be shown that
Independent Set Sp Vertex Cover
and
Vertex Cover ≤p Independent Set
by using the following theorem.

Theorem: Independent Set and Vertex Cover

➤ Theorem: Let G = (V, E) be a graph. Then SV is an independent set if and only if its complement V - S is a vertex cover.

➤ Proof: First, suppose that S is an independent set. Consider an arbitrary edge e = (u,v). Since S is independent, it cannot be the case that both u and v are in S; so one of them must be in V - S. It follows that every edge has at least one end in V - S,