CoEng3122 - Algorithm Analysis and Design - Algorithm for Graph Problems
Algorithm Analysis and Design - Graph Problems
Chapter 5: Algorithms for Graph Problems
- Depth-first search
- Connected Components
- Topological sort
- Shortest path
- Sets of strings
- Search trees, string sorting, binary search
- Exact string matching - Finding a pattern (string) in a text (string)
- Approximate string matching - Finding in the text something that is similar to the pattern
Problem-Solving Agent
- Goal formulation: The agent adopts the goal of reaching the destination, organizing behavior by limiting objectives and actions.
- Problem formulation: The agent creates a description of the states and actions needed to reach the goal, forming an abstract model of the relevant world.
- Search: The agent simulates sequences of actions in its model to find a sequence that reaches the goal.
- Solution: A sequence of actions that reaches the goal.
- Execution: The agent executes the actions in the solution one at a time.
Requirements of Search Algorithms/Techniques
- It should cause motion.
- It should be systematic.
Search Types
1. Uninformed Search (Blind Search)
- Also called blind, exhaustive, or brute-force search.
- Uses no information about the problem to guide the search.
- May not be very efficient.
2. Informed Search (Heuristic Search)
- Also called heuristic or intelligent search.
- Uses information about the problem to guide the search.
- Guesses the distance to a goal state.
- More efficient but may not always be possible.
Search Techniques / Strategies
Uninformed Search:
- Depth-First Search (DFS)
- Breadth-First Search (BFS)
- Cost-First Search
Informed Search:
- Generate and Test
- Hill Climbing
- Best First Search
- Problem Reduction
- Constraint Satisfaction
- Means-end Analysis
- A* Search
- A0* Search
Types of Graphs
- Directed graph
- Undirected graph
- Weighted graph
- Cyclic graph
- Acyclic graph
- Bipartite graph
Uninformed/Blind Search
- Does not contain any domain knowledge (e.g., closeness, location of the goal).
- Operates in a brute-force way.
- Includes information about how to traverse the tree and identify leaf and goal nodes.
- Searches without information about the search space (initial state operators and test for the goal).
- Examines each node of the tree until the goal node is reached.
Informed Search
- Uses domain knowledge to guide the search.
- Can find a solution more efficiently than uninformed search.
- Also called a Heuristic search.
- A heuristic might not always guarantee the best solution but finds a good solution in a reasonable time.
- Can solve complex problems that might not be solvable otherwise.
Graph Definitions
- Two vertices are adjacent if they are joined by an edge.
- Vertex w is adjacent to v iff .
- In an undirected graph with edge , w is adjacent to v, and v is adjacent to w.
- A path between two vertices is a sequence of edges that begins at one vertex and ends at another vertex.
- is a path if for .
- A simple path passes through a vertex only once.
- A cycle is a path that begins and ends at the same vertex.
- A simple cycle is a cycle that does not pass through other vertices more than once.
Graph Example
A graph (undirected) with 5 vertices and 6 edges:
- Adjacent: 1 and 2 are adjacent (1 is adjacent to 2, and 2 is adjacent to 1).
- Path: 1, 2, 5 (a simple path), 1, 3, 4, 1, 2, 5 (a path but not a simple path).
- Cycle: 1, 3, 4, 1 (a simple cycle), 1, 3, 4, 1, 4, 1 (cycle, but not a simple cycle).
More Graph Definitions
- A connected graph has a path between each pair of distinct vertices.
- A complete graph has an edge between each pair of distinct vertices.
- A complete graph is also a connected graph, but a connected graph may not be a complete graph.
Directed Graphs (Digraphs)
- If the edge pair is ordered then the graph is called a directed graph
- Each edge has a direction and is called a directed edge.
- Definitions for undirected graphs apply to directed graphs with adjustments for direction.
- Vertex w is adjacent to v iff (direct edge from v to w).
- w is a successor of v.
- v is a predecessor of w.
- A directed path between two vertices is a sequence of directed edges that begins at one vertex and ends at another vertex.
- is a path if for .
Directed Graphs - Cycles and Connectivity
- A cycle in a directed graph is a path of length at least 1 such that . This cycle is simple if the path is simple. For undirected graphs, the edges must be distinct.
- A directed acyclic graph (DAG) is a directed graph with no cycles.
- An undirected graph is connected if there is a path from every vertex to every other vertex.
- A directed graph with this property is called strongly connected.
- If a directed graph is not strongly connected, but the underlying graph (without direction to arcs) is connected, then the graph is weakly connected.
Directed Graph Example
A graph with 5 vertices and 6 edges:
- Adjacent: 2 is adjacent to 1, but 1 is NOT adjacent to 2.
- Path: 1, 2, 5 (a directed path).
- Cycle: 1, 4, 3, 1 (a directed cycle).
Weighted Graph
- Edges are labeled with numeric values.
Graph Implementations
- Adjacency Matrix: A two-dimensional array.
- Adjacency List: For each vertex, a list of adjacent vertices.
Graph Terminology Summary
- Graph:
- Vertex set:
- Edge set: (pairs of adjacent vertices)
Directed and Undirected Graphs
- G directed: Edges are ordered pairs .
- G undirected: Edges are unordered pairs .
Proper Graphs and Subgraphs
- Proper graph: No loops, no multi-edges.
- Subgraph of : , where is a subset of and is a subset of between vertices of .
Paths in Graphs
- Path : A sequence of vertices , where for , is adjacent to .
- Equivalently, is a sequence of edges , where for , each consecutive pair of edges share a vertex.
Simple Paths and Cycles
- Simple path: No edge or vertex repeated, except possibly .
- Cycle: A path with , where k > 1.
Connected Undirected Graph
- is connected if there is a path between each pair of vertices.
Connected Components of an Undirected Graph
- Else has k > 1 maximal connected subgraphs = connected components
Biconnected Component
- Sets of edges where:
- Is a single edge.
- There are at least two disjoint paths between each pair of vertices.
Size of a Graph
- Graph
- = # vertices
- = # edges
- Size of is
Representing a Graph by an Adjacency Matrix
- Adjacency Matrix is size
- if , else.
- Space cost .
Adjacency List Representation of a Graph
- Adjacency Lists:
- = list of vertices adjacent to
- Space cost:
Definition of an Undirected Tree
- Tree : Graph with a unique simple path between every pair of vertices.
- = # vertices, = # edges
- Forest: Set of trees.
Definition of a Directed Tree
- Directed Tree : Digraph with distinguished vertex root such that each vertex is reachable from by a unique path.
- Family Relationships:
- Ancestors
- Descendants
- Parent
- Child
- Siblings
- Leaves have no proper descendants.
An Ordered Tree
- Ordered Tree: Is a directed tree with siblings ordered
Preorder Tree Traversal
- Preorder: A, B, C, D, E, F, H, I
- Root (order vertices as pushed on stack)
- Preorder left subtree
- Preorder right subtree
Postorder Tree Traversal
- Postorder: B, E, D, H, I, F, C, A
- Postorder left subtree
- Postorder right subtree
- Root (order vertices as popped off stack)
Spanning Tree and Forest of a Graph
- is a spanning tree of graph if
- is a directed tree with the same vertex set as .
- Each edge of is a directed version of an edge of .
- Spanning Forest: Forest of spanning trees of connected components of .
Example Spanning Tree of a Graph
- Classifications of Edges of with Spanning Tree
- An edge of is a tree edge.
- An edge of is a back edge if is a descendant or ancestor of .
- Else is a cross edge (do not exist in undirected DFS).
Adjacency Matrix
- An adjacency matrix for a graph with vertices numbered is an array matrix such that matrix is 1 (true) if there is an edge from vertex to vertex , and 0 (false) otherwise.
- When the graph is weighted, we can let matrix be the weight that labels the edge from vertex to vertex , instead of simply 1, and let matrix equal to instead of 0 when there is no edge from vertex to vertex .
- Adjacency matrix for an undirected graph is symmetrical, i.e., matrix is equal to matrix.
- Space requirement O().
- Acceptable if the graph is dense.
Adjacency Matrix – Examples
- Directed Graph Adjacency Matrix
- Undirected Weighted Graph Adjacency Matrix
Adjacency List
- An adjacency list for a graph with vertices numbered consists of linked lists. The linked list has a node for vertex if and only if the graph contains an edge from vertex to vertex .
- Adjacency list is a better solution if the graph is sparse.
- Space requirement is O(), which is linear in the size of the graph.
- In an undirected graph each edge appears in two lists, thus space requirement is doubled.
Adjacency List – Examples
- Directed Graph Adjacency List
- Undirected Weighted Graph Adjacency List
Adjacency Matrix vs Adjacency List
Two common graph operations:
- Determine whether there is an edge from vertex to vertex .
- Find all vertices adjacent to a given vertex .
- An adjacency matrix supports operation 1 more efficiently.
- An adjacency list supports operation 2 more efficiently.
- An adjacency list often requires less space than an adjacency matrix.
- Adjacency Matrix: Space requirement is O()
- Adjacency List: Space requirement is O(), which is linear in the size of the graph.
- Adjacency matrix is better if the graph is dense (too many edges).
- Adjacency list is better if the graph is sparse (few edges).
Graph Traversals
- A graph-traversal algorithm starts from a vertex , visits all of the vertices that can be reachable from the vertex .
- A graph-traversal algorithm visits all vertices if and only if the graph is connected.
- A connected component is the subset of vertices visited during a traversal algorithm that begins at a given vertex.
- A graph-traversal algorithm must mark each vertex during a visit and must never visit a vertex more than once.
- Thus, if a graph contains a cycle, the graph-traversal algorithm can avoid an infinite loop.
- We look at two graph-traversal algorithms:
Depth-First Search (aka Depth-First Traversal)
Breadth-First Search (aka Breadth-First Traversal)
Depth-First Search
- For a given vertex , the depth-first search algorithm proceeds along a path from as deeply into the graph as possible before backing up.
- That is, after visiting a vertex , the depth-first search algorithm visits (if possible) an unvisited adjacent vertex to vertex .
- The depth-first traversal algorithm does not completely specify the order in which it should visit the vertices adjacent to .
- We may visit the vertices adjacent to in sorted order.
Depth-First Search – Example
- A depth-first search of the graph starts from vertex .
- Visit a vertex, then visit a vertex adjacent to that vertex.
- If there is no unvisited vertex adjacent to a visited vertex, back up to the previous step.
Depth-First Search Algorithm
- Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
- It is called depth-first search because it starts from the root node and follows each path to its greatest depth node before moving to the next path.
- DFS uses a stack data structure for its implementation.
- The process of the DFS algorithm is similar to the BFS algorithm.
Depth-First Search Algorithm Steps
- If the initial state is a goal state, quit and return success.
- Otherwise, do the following until success or failure is signaled:
- Generate a successor, , of the initial state. If there are no more successors, signal failure.
- Call Depth-First Search with as the initial state.
- If success is returned, signal success. Otherwise, continue in this loop.
Depth-First Search Example Steps
- Illustrative steps of the DFS algorithm with a sample graph, showing node expansion and the evolution of the NODE-LIST.
Depth-First Search Algorithm - Analysis
- Completeness: Complete within a finite state space as it will expand every node within a limited search tree.
- Time Complexity: Will be equivalent to the node traversed by the algorithm. , where = maximum depth of any node and this can be much larger than (Shallowest solution depth).
- Space Complexity: Needs to store only a single path from the root node, hence the space complexity of DFS is equivalent to the size of the fringe set, which is O().
- Optimal: Non-optimal, as it may generate a large number of steps or high cost to reach the goal node.
Depth-First Search Algorithm - Advantages and Disadvantages
Advantages:
- Requires less memory since only the nodes on the current path are stored.
- May find a solution without examining much of the search space at all.
Disadvantages:
- May find a sub-optimal solution (one that is deeper or more costly than the best solution).
- Incomplete: Without a depth bound, one may not find a solution even if one exists.
Tarjan’s Depth First Search Algorithm
- We assume a Random Access Machine (RAM) computational model
- Algorithm Depth First Search
- Input graph represented by adjacency lists for each
- for all do (number; children)
- for all do if number then DFS
- spanning forest defined by children if then output
- Input graph represented by adjacency lists for each
Recursive DFS Procedure
- The preorder numbers give the order the vertices are first visited in DFS.
- DFS
- ; number
- for each do if number then (add to children; DFS) recursive procedure
Time Cost of Depth-First Search (DFS) Algorithm on a RAM
- Input size
- Theorem: Depth-First Search takes linear time cost O()
- Proof: Can associate with each edge and vertex a constant number of operations.
Classification of Edges of via DFS Spanning Tree
Edge notation induced by DFS Tree
- iff is a tree edge of
- iff is an ancestor of
- iff is a backedge if with either
Search Strategies - Breadth-First Search Algorithm
- The most common search strategy for traversing a tree or graph. This algorithm searches breadthwise in a tree or graph, so it is called breadth-first search.
- BFS starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of next level.
- An example of a general-graph search algorithm.
- Implemented using FIFO queue data structure.
Breadth-First Search Algorithm Steps
Create a variable called NODE-LIST and set it to the initial state.
Until a goal state is found, or NODE-LIST is empty:
Remove the first element from NODE-LIST and call it E. If NODE-LIST was empty, then quit.
For element E do the following:
- Apply the rule to generate a new state
- If the new state is a goal state, quit and return this state
- Otherwise, add the new state to the end of NODE-LIST
Breadth-First Search Algorithm Function
function BREADTH-FIRST-SEARCH(problem) returns a solution node or failure
node = NODE(problem.INITIAL)
if problem.IS-GOAL(node.STATE) then return node
frontier = a FIFO queue, with node as an element
reached = {problem.INITIAL}
while not IS-EMPTY(frontier) do
node = POP(frontier)
for each child in EXPAND(problem, node) do
s = child.STATE
if problem.IS-GOAL(s) then return child
if s is not in reached then
add s to reached
add child to frontier
return failure
function UNIFORM-COST-SEARCH(problem) returns a solution node, or failure
return BEST-FIRST-SEARCH(problem, PATH-COST)
Breadth-First Search Example Steps
- Illustrative steps of the BFS algorithm with a sample graph, showing node expansion and the evolution of the NODE-LIST.
Breadth-First Search Algorithm - Advantages and Disadvantages
Advantages:
- One of the simplest search strategies.
- BFS is Complete. If there is a solution, BFS is guaranteed to find it.
- If there are multiple solutions, then a minimal solution will be found.
Disadvantages:
- The breadth-first search algorithm cannot be effectively used unless the search space is quite small.
Breadth-First Search Algorithm - Analysis
- Time Complexity: Can be obtained by the number of nodes traversed in BFS until the shallowest Node. Where the d = depth of the shallowest solution and b is a node at every state. T(b) =
- Space Complexity: Given by the Memory size of frontier which is O()
- Completeness: Complete, which means if the shallowest goal node is at some finite depth, then BFS will find a solution.
- Optimality: Optimal if path cost is a non-decreasing function of the depth of the node.