CoEng3122 - Algorithm Analysis and Design - Algorithm for Graph Problems

Algorithm Analysis and Design - Graph Problems

Chapter 5: Algorithms for Graph Problems

  • Depth-first search
  • Connected Components
  • Topological sort
  • Shortest path
  • Sets of strings
  • Search trees, string sorting, binary search
  • Exact string matching - Finding a pattern (string) in a text (string)
  • Approximate string matching - Finding in the text something that is similar to the pattern

Problem-Solving Agent

  • Goal formulation: The agent adopts the goal of reaching the destination, organizing behavior by limiting objectives and actions.
  • Problem formulation: The agent creates a description of the states and actions needed to reach the goal, forming an abstract model of the relevant world.
  • Search: The agent simulates sequences of actions in its model to find a sequence that reaches the goal.
  • Solution: A sequence of actions that reaches the goal.
  • Execution: The agent executes the actions in the solution one at a time.

Requirements of Search Algorithms/Techniques

  1. It should cause motion.
  2. It should be systematic.

Search Types

1. Uninformed Search (Blind Search)

  • Also called blind, exhaustive, or brute-force search.
  • Uses no information about the problem to guide the search.
  • May not be very efficient.

2. Informed Search (Heuristic Search)

  • Also called heuristic or intelligent search.
  • Uses information about the problem to guide the search.
  • Guesses the distance to a goal state.
  • More efficient but may not always be possible.

Search Techniques / Strategies

  • Uninformed Search:

    • Depth-First Search (DFS)
    • Breadth-First Search (BFS)
    • Cost-First Search
  • Informed Search:

    • Generate and Test
    • Hill Climbing
    • Best First Search
    • Problem Reduction
    • Constraint Satisfaction
    • Means-end Analysis
    • A* Search
    • A0* Search

Types of Graphs

  • Directed graph
  • Undirected graph
  • Weighted graph
  • Cyclic graph
  • Acyclic graph
  • Bipartite graph

Uninformed/Blind Search

  • Does not contain any domain knowledge (e.g., closeness, location of the goal).
  • Operates in a brute-force way.
  • Includes information about how to traverse the tree and identify leaf and goal nodes.
  • Searches without information about the search space (initial state operators and test for the goal).
  • Examines each node of the tree until the goal node is reached.

Informed Search

  • Uses domain knowledge to guide the search.
  • Can find a solution more efficiently than uninformed search.
  • Also called a Heuristic search.
  • A heuristic might not always guarantee the best solution but finds a good solution in a reasonable time.
  • Can solve complex problems that might not be solvable otherwise.

Graph Definitions

  • Two vertices are adjacent if they are joined by an edge.
  • Vertex w is adjacent to v iff (v,w) \in E.
    • In an undirected graph with edge (v, w), w is adjacent to v, and v is adjacent to w.
  • A path between two vertices is a sequence of edges that begins at one vertex and ends at another vertex.
    • w1, w2, …, wN is a path if (wi, w_{i+1}) \in E for 1 \leq i \leq N-1.
  • A simple path passes through a vertex only once.
  • A cycle is a path that begins and ends at the same vertex.
  • A simple cycle is a cycle that does not pass through other vertices more than once.

Graph Example

A graph G (undirected) with 5 vertices and 6 edges:
V = {1, 2, 3, 4, 5}
E = {(1,2), (1,3), (1,4), (2,5), (3,4), (4,5), (2,1), (3,1), (4,1), (5,2), (4,3), (5,4)}

  • Adjacent: 1 and 2 are adjacent (1 is adjacent to 2, and 2 is adjacent to 1).
  • Path: 1, 2, 5 (a simple path), 1, 3, 4, 1, 2, 5 (a path but not a simple path).
  • Cycle: 1, 3, 4, 1 (a simple cycle), 1, 3, 4, 1, 4, 1 (cycle, but not a simple cycle).

More Graph Definitions

  • A connected graph has a path between each pair of distinct vertices.
  • A complete graph has an edge between each pair of distinct vertices.
    • A complete graph is also a connected graph, but a connected graph may not be a complete graph.

Directed Graphs (Digraphs)

  • If the edge pair is ordered then the graph is called a directed graph
  • Each edge has a direction and is called a directed edge.
  • Definitions for undirected graphs apply to directed graphs with adjustments for direction.
  • Vertex w is adjacent to v iff (v,w) \in E (direct edge from v to w).
    • w is a successor of v.
    • v is a predecessor of w.
  • A directed path between two vertices is a sequence of directed edges that begins at one vertex and ends at another vertex.
    • w1, w2, …, wN is a path if (wi, w_{i+1}) \in E for 1 \leq i \leq N-1.

Directed Graphs - Cycles and Connectivity

  • A cycle in a directed graph is a path of length at least 1 such that w1 = wN. This cycle is simple if the path is simple. For undirected graphs, the edges must be distinct.
  • A directed acyclic graph (DAG) is a directed graph with no cycles.
  • An undirected graph is connected if there is a path from every vertex to every other vertex.
  • A directed graph with this property is called strongly connected.
  • If a directed graph is not strongly connected, but the underlying graph (without direction to arcs) is connected, then the graph is weakly connected.

Directed Graph Example

A graph G = (V, E) with 5 vertices and 6 edges:
V = {1, 2, 3, 4, 5}
E = {(1,2), (1,4), (2,5), (4,5), (3,1), (4,3)}

  • Adjacent: 2 is adjacent to 1, but 1 is NOT adjacent to 2.
  • Path: 1, 2, 5 (a directed path).
  • Cycle: 1, 4, 3, 1 (a directed cycle).

Weighted Graph

  • Edges are labeled with numeric values.

Graph Implementations

  • Adjacency Matrix: A two-dimensional array.
  • Adjacency List: For each vertex, a list of adjacent vertices.

Graph Terminology Summary

  • Graph: G = (V, E)
  • Vertex set: V
  • Edge set: E (pairs of adjacent vertices)

Directed and Undirected Graphs

  • G directed: Edges are ordered pairs (u, v).
  • G undirected: Edges are unordered pairs{u, v} .

Proper Graphs and Subgraphs

  • Proper graph: No loops, no multi-edges.
  • Subgraph G' of G: G' = (V', E'), where V' is a subset of V and E' is a subset of E between vertices of V'.

Paths in Graphs

  • Path p: A sequence of vertices v0, v1, …, vk, where for i = 1, …, k, v{i-1} is adjacent to v_i.
  • Equivalently, p is a sequence of edges e1, …, ek, where for i = 2, …, k, each consecutive pair of edges e{i-1}, ei share a vertex.

Simple Paths and Cycles

  • Simple path: No edge or vertex repeated, except possibly v0 = vk.
  • Cycle: A path p with v0 = vk, where k > 1.

Connected Undirected Graph

  • G is connected if there is a path between each pair of vertices.

Connected Components of an Undirected Graph

  • Else G has k > 1 maximal connected subgraphs = connected components

Biconnected Component

  • Sets of edges where:
    • Is a single edge.
    • There are at least two disjoint paths between each pair of vertices.

Size of a Graph

  • Graph G = (V, E)
  • n = |V| = # vertices
  • m = |E| = # edges
  • Size of G is n + m

Representing a Graph by an Adjacency Matrix

  • Adjacency Matrix A is size n \times n
  • A[i, j] = 1 if (i, j) \in E, 0 else.
  • Space cost O(n^2).

Adjacency List Representation of a Graph

  • Adjacency Lists: Adj(1), …, Adj(n)
  • Adj(v) = list of vertices adjacent to v
  • Space cost: O(n + m)

Definition of an Undirected Tree

  • Tree T: Graph with a unique simple path between every pair of vertices.
  • n = # vertices, n-1 = # edges
  • Forest: Set of trees.

Definition of a Directed Tree

  • Directed Tree T: Digraph with distinguished vertex root r such that each vertex is reachable from r by a unique path.
  • Family Relationships:
    • Ancestors
    • Descendants
    • Parent
    • Child
    • Siblings
  • Leaves have no proper descendants.

An Ordered Tree

  • Ordered Tree: Is a directed tree with siblings ordered

Preorder Tree Traversal

  • Preorder: A, B, C, D, E, F, H, I
    1. Root (order vertices as pushed on stack)
    2. Preorder left subtree
    3. Preorder right subtree

Postorder Tree Traversal

  • Postorder: B, E, D, H, I, F, C, A
    1. Postorder left subtree
    2. Postorder right subtree
    3. Root (order vertices as popped off stack)

Spanning Tree and Forest of a Graph

  • T is a spanning tree of graph G if
    1. T is a directed tree with the same vertex set as G.
    2. Each edge of T is a directed version of an edge of G.
  • Spanning Forest: Forest of spanning trees of connected components of G.

Example Spanning Tree of a Graph

  • Classifications of Edges of G with Spanning Tree T
  • An edge (u, v) of T is a tree edge.
  • An edge (u, v) of G - T is a back edge if u is a descendant or ancestor of v.
  • Else (u, v) is a cross edge (do not exist in undirected DFS).

Adjacency Matrix

  • An adjacency matrix for a graph with n vertices numbered 0, 1, …, n-1 is an n \times n array matrix such that matrix[i][j] is 1 (true) if there is an edge from vertex i to vertex j, and 0 (false) otherwise.
  • When the graph is weighted, we can let matrix[i][j] be the weight that labels the edge from vertex i to vertex j, instead of simply 1, and let matrix[i][j] equal to \infty instead of 0 when there is no edge from vertex i to vertex j.
  • Adjacency matrix for an undirected graph is symmetrical, i.e., matrix[i][j] is equal to matrix[j][i].
  • Space requirement O(|V|^2).
  • Acceptable if the graph is dense.

Adjacency Matrix – Examples

  • Directed Graph Adjacency Matrix
  • Undirected Weighted Graph Adjacency Matrix

Adjacency List

  • An adjacency list for a graph with n vertices numbered 0, 1, …, n-1 consists of n linked lists. The i^{th} linked list has a node for vertex j if and only if the graph contains an edge from vertex i to vertex j.
  • Adjacency list is a better solution if the graph is sparse.
  • Space requirement is O(|E| + |V|), which is linear in the size of the graph.
  • In an undirected graph each edge (v, w) appears in two lists, thus space requirement is doubled.

Adjacency List – Examples

  • Directed Graph Adjacency List
  • Undirected Weighted Graph Adjacency List

Adjacency Matrix vs Adjacency List

Two common graph operations:

  1. Determine whether there is an edge from vertex i to vertex j.
  2. Find all vertices adjacent to a given vertex i.
  • An adjacency matrix supports operation 1 more efficiently.
  • An adjacency list supports operation 2 more efficiently.
  • An adjacency list often requires less space than an adjacency matrix.
    • Adjacency Matrix: Space requirement is O(|V|^2)
    • Adjacency List: Space requirement is O(|E| + |V|), which is linear in the size of the graph.
    • Adjacency matrix is better if the graph is dense (too many edges).
    • Adjacency list is better if the graph is sparse (few edges).

Graph Traversals

  • A graph-traversal algorithm starts from a vertex v, visits all of the vertices that can be reachable from the vertex v.
  • A graph-traversal algorithm visits all vertices if and only if the graph is connected.
  • A connected component is the subset of vertices visited during a traversal algorithm that begins at a given vertex.
  • A graph-traversal algorithm must mark each vertex during a visit and must never visit a vertex more than once.
  • Thus, if a graph contains a cycle, the graph-traversal algorithm can avoid an infinite loop.
  • We look at two graph-traversal algorithms:
    Depth-First Search (aka Depth-First Traversal)
    Breadth-First Search (aka Breadth-First Traversal)

Depth-First Search

  • For a given vertex v, the depth-first search algorithm proceeds along a path from v as deeply into the graph as possible before backing up.
  • That is, after visiting a vertex v, the depth-first search algorithm visits (if possible) an unvisited adjacent vertex to vertex v.
  • The depth-first traversal algorithm does not completely specify the order in which it should visit the vertices adjacent to v.
  • We may visit the vertices adjacent to v in sorted order.

Depth-First Search – Example

  • A depth-first search of the graph starts from vertex v.
  • Visit a vertex, then visit a vertex adjacent to that vertex.
  • If there is no unvisited vertex adjacent to a visited vertex, back up to the previous step.

Depth-First Search Algorithm

  • Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
  • It is called depth-first search because it starts from the root node and follows each path to its greatest depth node before moving to the next path.
  • DFS uses a stack data structure for its implementation.
  • The process of the DFS algorithm is similar to the BFS algorithm.

Depth-First Search Algorithm Steps

  1. If the initial state is a goal state, quit and return success.
  2. Otherwise, do the following until success or failure is signaled:
    • Generate a successor, E, of the initial state. If there are no more successors, signal failure.
    • Call Depth-First Search with E as the initial state.
    • If success is returned, signal success. Otherwise, continue in this loop.

Depth-First Search Example Steps

  • Illustrative steps of the DFS algorithm with a sample graph, showing node expansion and the evolution of the NODE-LIST.

Depth-First Search Algorithm - Analysis

  • Completeness: Complete within a finite state space as it will expand every node within a limited search tree.
  • Time Complexity: Will be equivalent to the node traversed by the algorithm. T(n) = 1 + n^2 + n^3 + … + n^m = O(n^m), where m = maximum depth of any node and this can be much larger than d (Shallowest solution depth).
  • Space Complexity: Needs to store only a single path from the root node, hence the space complexity of DFS is equivalent to the size of the fringe set, which is O(b^m).
  • Optimal: Non-optimal, as it may generate a large number of steps or high cost to reach the goal node.

Depth-First Search Algorithm - Advantages and Disadvantages

Advantages:

  • Requires less memory since only the nodes on the current path are stored.
  • May find a solution without examining much of the search space at all.

Disadvantages:

  • May find a sub-optimal solution (one that is deeper or more costly than the best solution).
  • Incomplete: Without a depth bound, one may not find a solution even if one exists.

Tarjan’s Depth First Search Algorithm

  • We assume a Random Access Machine (RAM) computational model
  • Algorithm Depth First Search
    • Input graph G = (V,E) represented by adjacency lists Adj(v) for each v \in V
      1. N \leftarrow 0
      2. for all v \in V do (number(v) \leftarrow 0; children(v) \leftarrow () )
      3. for all v \in V do if number(v) = 0 then DFS(v)
      4. spanning forest defined by children if then output

Recursive DFS Procedure

  • The preorder numbers give the order the vertices are first visited in DFS.
  • DFS(v)
    1. N \leftarrow N + 1; number(v) \leftarrow N
    2. for each u \in Adj(v) do if number(u) = 0 then (add u to children(v); DFS(u)) recursive procedure

Time Cost of Depth-First Search (DFS) Algorithm on a RAM

  • Input size n = |V|, m = |E|
  • Theorem: Depth-First Search takes linear time cost O(n + m)
  • Proof: Can associate with each edge and vertex a constant number of operations.

Classification of Edges of G via DFS Spanning Tree T

Edge notation induced by DFS Tree T
  • u \rightarrow v iff (u, v) is a tree edge of T
  • u \rightarrow^* v iff u is an ancestor of v
  • u --- v iff (u, v) is a backedge if (u, v) \in G - T with either u \rightarrow^* v or v \rightarrow^* u

Search Strategies - Breadth-First Search Algorithm

  • The most common search strategy for traversing a tree or graph. This algorithm searches breadthwise in a tree or graph, so it is called breadth-first search.
  • BFS starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of next level.
  • An example of a general-graph search algorithm.
  • Implemented using FIFO queue data structure.

Breadth-First Search Algorithm Steps

Create a variable called NODE-LIST and set it to the initial state.
Until a goal state is found, or NODE-LIST is empty:

  • Remove the first element from NODE-LIST and call it E. If NODE-LIST was empty, then quit.

  • For element E do the following:

    • Apply the rule to generate a new state
    • If the new state is a goal state, quit and return this state
    • Otherwise, add the new state to the end of NODE-LIST

Breadth-First Search Algorithm Function

function BREADTH-FIRST-SEARCH(problem) returns a solution node or failure
 node = NODE(problem.INITIAL)
 if problem.IS-GOAL(node.STATE) then return node
 frontier = a FIFO queue, with node as an element
 reached = {problem.INITIAL}
 while not IS-EMPTY(frontier) do
 node = POP(frontier)
 for each child in EXPAND(problem, node) do
 s = child.STATE
 if problem.IS-GOAL(s) then return child
 if s is not in reached then
 add s to reached
 add child to frontier
 return failure

function UNIFORM-COST-SEARCH(problem) returns a solution node, or failure
 return BEST-FIRST-SEARCH(problem, PATH-COST)

Breadth-First Search Example Steps

  • Illustrative steps of the BFS algorithm with a sample graph, showing node expansion and the evolution of the NODE-LIST.

Breadth-First Search Algorithm - Advantages and Disadvantages

Advantages:

  • One of the simplest search strategies.
  • BFS is Complete. If there is a solution, BFS is guaranteed to find it.
  • If there are multiple solutions, then a minimal solution will be found.

Disadvantages:

  • The breadth-first search algorithm cannot be effectively used unless the search space is quite small.

Breadth-First Search Algorithm - Analysis

  • Time Complexity: Can be obtained by the number of nodes traversed in BFS until the shallowest Node. Where the d = depth of the shallowest solution and b is a node at every state. T(b) = 1 + b^2 + b^3 + … + b^d = O(b^d)
  • Space Complexity: Given by the Memory size of frontier which is O(b^d)
  • Completeness: Complete, which means if the shallowest goal node is at some finite depth, then BFS will find a solution.
  • Optimality: Optimal if path cost is a non-decreasing function of the depth of the node.