Study Notes on Greedy Algorithms and Minimum Spanning Trees

Analysis of Algorithms

Greedy Algorithms


Greedy Technique

  • Constructs a solution to an optimization problem piece by piece through a sequence of choices that are:

    • Feasible: Each choice made must satisfy the problem constraints.

    • Locally Optimal: Each choice must be the best among the available options at that point in time.

    • Irrevocable: Once a choice is made, it cannot be undone.

  • For some problems, the greedy algorithm yields an optimal solution for every instance.

  • For most problems, it does not yield an optimal solution but can provide fast approximations.


Applications of the Greedy Strategy

Optimal Solutions
  • Minimum Spanning Tree (MST): A spanning tree with the smallest possible total edge weight.

  • Single-Source Shortest Paths: Finding shortest paths from a single source to all other vertices.

  • Simple Scheduling Problems: Problems involving scheduling tasks with specific constraints.

  • Huffman Codes: Used in data compression algorithms.

Approximations
  • Traveling Salesman Problem (TSP): Finding a shortest route that visits each city exactly once and returns to the starting city.

  • Knapsack Problem: Selecting a set of items to maximize total value without exceeding the weight limit.

  • Other Combinatorial Optimization Problems: Various problems whose combinatorial structure makes computing an exact optimum impractical, so greedy choices are used as fast approximations.


Minimum Spanning Trees (MST)

Definitions
  • Spanning Tree: A connected, acyclic subgraph (a tree) that includes every vertex of the graph.

  • Minimum Spanning Tree: A spanning tree whose total edge weight is minimum over all spanning trees of the graph.

  • Spanning Forest: Consists of spanning trees for each connected component if the graph is not connected.

Practical Applications of MST
  • Finding the least expensive way to connect a set of cities, terminals, computers, etc.


Example Problem

  • Scenario: A town has a set of houses and a set of roads.

  • Road Connections: A road connects exactly two houses; the cost associated with repairing each road is denoted as w(u, v), where u and v are the connected houses.

  • Goal: Repair enough roads such that:

    1. All houses are connected (can reach one house from any other).

    2. The total cost of repairs is minimized.


Properties of Minimum Spanning Trees

  • A minimum spanning tree need not be unique; several spanning trees may share the same minimum total weight.

  • An MST contains no cycles: if it did, removing one edge of the cycle would keep the graph connected while reducing the total weight, contradicting minimality.

  • The number of edges in a minimum spanning tree is given by:
    |E| = |V| - 1
    where E here denotes the edge set of the tree (not of the whole graph) and V is the vertex set.


Growing a MST – Generic Approach

  1. Start with an empty set of edges, A.

  2. Incrementally add edges to A that belong to a minimum spanning tree.

  3. A new edge (u, v) is safe for A if A ∪ {(u, v)} is still a subset of some minimum spanning tree; that is, adding it keeps A extendable to an MST.


Generic MST Algorithm

  1. Initialize A as an empty set.

  2. While A is not a spanning tree, repeat:

    • Find a safe edge (u, v) for A.

    • Add the edge to A: A ← A ∪ {(u, v)}.

  3. Return A.
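
  • Below is a minimal sketch of this generic approach in Python (the name generic_mst, the edge-list representation (w, u, v), and the small example graph are illustrative assumptions, not part of the notes). The safe edge is found by brute force: among all edges whose endpoints currently lie in different components of (V, A), a minimum-weight edge is chosen; such an edge is a light edge for a cut that respects A, and hence safe by the theorem stated later.

    def generic_mst(n, edges):
        """Grow an MST of an undirected, connected graph with vertices 0..n-1,
        given as a list of weighted edges (w, u, v)."""
        A = []
        component = list(range(n))          # component label of each vertex
        while len(A) < n - 1:               # a spanning tree has |V| - 1 edges
            # Safe edge: a minimum-weight edge crossing between two components.
            w, u, v = min((w, u, v) for (w, u, v) in edges
                          if component[u] != component[v])
            A.append((u, v, w))
            # Merge the two components by relabelling one of them.
            old, new = component[v], component[u]
            component = [new if c == old else c for c in component]
        return A

    edges = [(4, 0, 1), (8, 0, 2), (11, 1, 2), (2, 1, 3), (7, 2, 3)]
    print(generic_mst(4, edges))    # [(1, 3, 2), (0, 1, 4), (2, 3, 7)]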


Finding Safe Edges

  • As an example, consider an edge (h, g) and ask whether it is safe for A (initially empty).

  • Define a set S that includes vertex h but not vertex g (hence g is in V - S).

  • Any spanning tree must contain at least one edge connecting S to V - S (otherwise it would not be connected), and it is safe to choose a minimum-weight edge crossing between these sets.


Definitions

  • Cut (S, V - S): A way of partitioning the vertices into disjoint sets S and V - S.

  • An edge crosses the cut if one endpoint is in S and the other in V - S.

Further Definitions
  • A cut respects a set A of edges if no edge in A crosses the cut.

  • An edge is considered a light edge crossing a cut if it has the minimum weight among all edges crossing that cut.
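
  • A small sketch of the light-edge idea in Python (the name light_edge, the edge representation (w, u, v), and the example below are illustrative assumptions): given a vertex subset S, a light edge crossing the cut (S, V - S) is a minimum-weight edge with exactly one endpoint in S.

    def light_edge(edges, S):
        # Edges crossing the cut have exactly one endpoint in S.
        crossing = [(w, u, v) for (w, u, v) in edges if (u in S) != (v in S)]
        return min(crossing) if crossing else None

    edges = [(4, 0, 1), (8, 0, 2), (11, 1, 2), (2, 1, 3), (7, 2, 3)]
    print(light_edge(edges, {0, 1}))   # (2, 1, 3)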


Theorem (Light Edge Condition)

  • Let A be a subset of some MST T, and let (S, V - S) be a cut respecting A. If (u, v) is a light edge crossing (S, V - S), then (u, v) is safe for A.

Proof Summary
  • Suppose T is an MST that includes the edges in A. If T also includes (u, v), then (u, v) is immediately safe. Otherwise, adding (u, v) to T creates a cycle, and this cycle must contain some other edge (x, y) that crosses the cut (S, V - S); since the cut respects A, (x, y) is not in A. Because (u, v) is a light edge, w(u, v) ≤ w(x, y), so T' = T - {(x, y)} ∪ {(u, v)} is a spanning tree of weight no greater than that of T. Hence T' is also an MST, it contains A ∪ {(u, v)}, and (u, v) is safe for A.


Prim’s Algorithm

  • Prim's algorithm is a greedy algorithm starting from an arbitrary vertex selected as the root.

  • The algorithm continually finds light edges crossing the boundary between the included vertices and those not yet included in the tree.

  • The process repeats until every vertex has been included in the tree.

Efficient Edge Selection
  • Use a priority queue Q to manage vertices that have not yet been included in the tree.

  • Assign a key value to each vertex representing the minimum edge weight connecting it to the included tree.


Example Walkthrough of Prim’s Algorithm

  1. Initialize a priority queue Q with all unvisited vertices.

  2. Extract the minimum-key vertex from Q and add it to the MST. Update the keys (priorities) of its adjacent vertices still in Q accordingly.

  3. Continue this process until Q is empty.
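
  • A minimal sketch of Prim's algorithm in Python using the standard heapq module as the priority queue (the name prim_mst, the adjacency-list representation, and the example graph are illustrative assumptions). Since heapq has no decrease-key operation, stale queue entries are simply skipped when popped; this is a common workaround and still gives an O(E log V) bound.

    import heapq

    def prim_mst(adj, root=0):
        """adj[u] is a list of (weight, v) pairs; vertices are 0..len(adj)-1."""
        n = len(adj)
        in_tree = [False] * n
        tree_edges = []
        # Queue entries are (key, vertex, parent); key is the lightest edge
        # weight currently known to connect this vertex to the growing tree.
        queue = [(0, root, -1)]
        while queue:
            key, u, parent = heapq.heappop(queue)
            if in_tree[u]:
                continue                     # stale entry; a lighter edge won earlier
            in_tree[u] = True
            if parent != -1:
                tree_edges.append((parent, u, key))
            for w, v in adj[u]:
                if not in_tree[v]:
                    heapq.heappush(queue, (w, v, u))
        return tree_edges

    # The same example graph used in the sketches above, in adjacency-list form.
    adj = [
        [(4, 1), (8, 2)],            # vertex 0
        [(4, 0), (11, 2), (2, 3)],   # vertex 1
        [(8, 0), (11, 1), (7, 3)],   # vertex 2
        [(2, 1), (7, 2)],            # vertex 3
    ]
    print(prim_mst(adj))   # [(0, 1, 4), (1, 3, 2), (3, 2, 7)]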


Prim's Algorithm Operations

  • Time Complexity:

    • If implemented using a binary heap, the time complexity is O(V log V + E log V) = O(E log V).

    • If implemented using a Fibonacci heap, the time complexity reduces to O(V log V + E).


Kruskal’s Algorithm

  • Contrasts with Prim's algorithm: instead of growing a single tree, it grows a forest of trees and merges them along light edges.

  • Start with each vertex as its own component and repeatedly merge components using the lightest edges until only one component remains (the MST).

    • Operations are facilitated by a disjoint-set (union-find) data structure.


Operations on Disjoint Data Sets

  1. MAKE-SET(u): Create a new set containing only u.

  2. FIND-SET(u): Return a representative element from the set containing u.

  3. UNION(u, v): Combine the sets containing u and v into one.

Performance
  • With path compression and union by rank, FIND-SET and UNION run in amortized time that is effectively constant.
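
  • A minimal sketch of these operations in Python with path compression and union by rank (the class name DisjointSet mirrors the operations MAKE-SET, FIND-SET and UNION, but the layout and the small demo are illustrative assumptions).

    class DisjointSet:
        def __init__(self):
            self.parent = {}
            self.rank = {}

        def make_set(self, u):
            self.parent[u] = u
            self.rank[u] = 0

        def find_set(self, u):
            # Path compression: point u (and its ancestors) straight at the root.
            if self.parent[u] != u:
                self.parent[u] = self.find_set(self.parent[u])
            return self.parent[u]

        def union(self, u, v):
            ru, rv = self.find_set(u), self.find_set(v)
            if ru == rv:
                return
            # Union by rank: attach the shallower tree under the deeper one.
            if self.rank[ru] < self.rank[rv]:
                ru, rv = rv, ru
            self.parent[rv] = ru
            if self.rank[ru] == self.rank[rv]:
                self.rank[ru] += 1

    ds = DisjointSet()
    for x in "abcd":
        ds.make_set(x)
    ds.union("a", "b")
    ds.union("c", "d")
    print(ds.find_set("a") == ds.find_set("b"))   # True
    print(ds.find_set("a") == ds.find_set("c"))   # False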


Implementation of Kruskal’s Algorithm

  1. Initialize an empty set A.

  2. For each vertex, create a set for it using MAKE-SET.

  3. Sort edges in non-decreasing order of weights.

  4. Iterate through the sorted edge list; if the vertices of an edge are in different sets, add it to A, then perform UNION.

  5. Return A after processing all edges.

    • Time complexity is O(E log E), where E is the number of edges.
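
  • A minimal sketch of Kruskal's algorithm in Python (the name kruskal_mst, the edge-list representation (w, u, v), and the example graph are illustrative assumptions). A compact union-find with path compression is inlined so the sketch is self-contained; the DisjointSet class above could be used instead.

    def kruskal_mst(n, edges):
        """MST of an undirected, connected graph with vertices 0..n-1,
        given as a list of weighted edges (w, u, v)."""
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]    # path compression (halving)
                x = parent[x]
            return x

        tree = []
        for w, u, v in sorted(edges):            # non-decreasing order of weight
            ru, rv = find(u), find(v)
            if ru != rv:                         # endpoints in different components
                parent[ru] = rv                  # UNION
                tree.append((u, v, w))
        return tree

    edges = [(4, 0, 1), (8, 0, 2), (11, 1, 2), (2, 1, 3), (7, 2, 3)]
    print(kruskal_mst(4, edges))   # [(1, 3, 2), (0, 1, 4), (2, 3, 7)]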


Comparison of Prim’s and Kruskal’s Algorithms

Sparse Graphs
  • For sparse graphs (where E = O(V)), both algorithms have similar time complexities, roughly O(V log V).

Dense Graphs
  • For dense graphs (where E = Θ(V^2)), Kruskal's complexity becomes O(V^2 log V), while Prim's with a binary heap is also around O(V^2 log V); with a Fibonacci heap it simplifies to O(V^2).


Handling Negative Weights

  • Both Prim's and Kruskal’s algorithms function correctly even with negative weights since the proofs of their correctness do not assume positive weights.


Finding Maximum Spanning Tree

  • To find a maximum spanning tree, choose the heaviest edge crossing each cut instead of the lightest. Equivalently, multiply all weights by -1 and apply either Prim's or Kruskal's algorithm without modification; the resulting tree, with the original weights restored, is a maximum spanning tree.
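
  • A minimal sketch of the negation trick in Python (the name maximum_spanning_tree and the example graph are illustrative assumptions): every weight is negated, an unmodified Kruskal-style MST routine is run on the negated graph, and the original weights are restored.

    def maximum_spanning_tree(n, edges):
        """Maximum spanning tree of an undirected, connected graph with
        vertices 0..n-1, given as a list of weighted edges (w, u, v)."""
        negated = sorted((-w, u, v) for (w, u, v) in edges)
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        tree = []
        for w, u, v in negated:                  # ordinary Kruskal on negated weights
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((u, v, -w))          # restore the original weight
        return tree

    edges = [(4, 0, 1), (8, 0, 2), (11, 1, 2), (2, 1, 3), (7, 2, 3)]
    print(maximum_spanning_tree(4, edges))   # [(1, 2, 11), (0, 2, 8), (2, 3, 7)]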