Study Notes for Graphs and Trees; Web-page Ranking - CS 505 and CS 506
Graphs and Trees; Web-page ranking CS 505 and CS 506 - Spring 2025 Notes
Directed Graphs
A directed graph (digraph) is defined as a pair G = (V, E) where:
V: finite set of vertices (denoted by n = |V|)
E: binary relation on V (E ⊆ V × V), the set of directed edges.
Vertices and Edges:
Vertices are denoted by elements in V, typically written as lowercase letters (e.g., u, v).
An edge from u to v is denoted (u, v), indicating direction from u to v.
An edge is incident from u (outgoing) and incident to v (incoming).
3.1.1 Edges
An edge (u, v) leaves vertex u and enters vertex v.
Self-loops (u, u) are possible in digraphs but multiple edges between two vertices are not allowed.
3.1.3 Properties of Directed Graphs
In-degree (i(u)): Number of edges incident to vertex u.
Out-degree (o(u)): Number of edges incident from vertex u.
Degree (d(u)): The sum of both in-degree and out-degree.
Formula: ( \sum_{u \in V} d(u) = 2m ) (total edges)
Show: ( \sum_{u \in V} o(u) = m )
Undirected Graphs
An undirected graph G is defined as G = (V, E), where E contains unordered pairs of vertices.
3.2.1 Edges
An edge (u, v) means {u, v} in undirected graphs (order irrelevant).
Two vertices connected by an edge are adjacent.
3.2.3 Degree
Each vertex's degree (d(u)) is the number of edges incident on it.
Other Types of Graphs
Bipartite Graphs: A graph with vertices that can be divided into two sets with edges only between nodes of different sets.
Subgraphs: A graph G1 is a subgraph of G2 if V1 ⊆ V2 and E1 ⊆ E2.
Complete Graphs: Denoted as K_n, where every two vertices are adjacent.
Isomorphic Graphs: Two graphs that can be mapped to one another via a bijective function between their vertex sets.
Paths, Tours, and Cycles
3.4.1 Paths
A path from u to v (u❀v) is a sequence of vertices connected by edges.
Defined as ( ⟨u0, u1, \dots, uk⟩ ) with (ui, u_{i+1}) ∈ E.
3.4.4 Cycles
A cycle occurs if a path ends where it starts, defined as ( ⟨u0, u1, …, uk, u0⟩ ).
Self-loops are disallowed in undirected cycles.
Graph Connectivity
3.5.1 Connected Graphs
A graph is connected if there exists a path between every pair of vertices.
Connected Components: Subsets of vertices that are mutually reachable.
3.5.3 Strongly Connected Graphs
A directed graph is strongly connected if every vertex can be reached from every other vertex.
Trees and Forests
3.6 Definitions
A tree is a connected acyclic undirected graph.
A forest is a collection of trees.
3.6.4 Properties of Trees
G is a free tree.
Any two vertices connected by a unique single path.
G is connected, and any edge removed disconnects it.
G is acyclic with |E| = |V| - 1.
Rooted Trees
Rooted Tree: A tree with a distinguished root vertex.
Internal Nodes: Have children. Leaves: No children.
Depth and Height
Depth: Length from the root to a node.
Height: Largest depth across all nodes in the tree.
Graph Representations
Adjacency Matrix: n x n array representing connections, allowing O(1) time complexity for edge lookups. However, space complexity is O(n²).
Adjacency List: An array of lists where each list corresponds to a vertex's adjacent vertices, allowing O(d) time complexity for lookups, where d is the degree of the vertex, and space complexity is O(m + n) where m is the edges and n is the vertices.
Sparse vs Dense Graphs: Dense if m ≫ n, sparse otherwise.
Tree Traversals
Preorder, Postorder, and Inorder traversals in binary trees with a time complexity of O(n).
Breadth-first and Depth-first traversal techniques extended to graphs also with a time complexity of O(n + m).
Union-Find Data Structure
Maintains collections of disjoint sets.
Operations
MakeSet(e): Create a set with element e; time complexity is O(1).
Find(e): Find which set e belongs to; with path compression, it is nearly O(1).
Union(e1, e2): Merge sets containing e1 and e2; the time complexity is nearly O(1) with union by rank.
Performance
Efficient for connectivity queries in undirected graphs, operations are effectively constant time due to optimizations.
Document Processing and Web Search
Web-based ranking algorithms: Includes HITS and PageRank.
HITS: Assigns authority and hub scores based on link structures, iterating through AJ and AL matrices; complexity is O(n(m + n)).
PageRank: Evaluates importance based on the web graph's structure with teleportation for disconnected graphs; complexity varies but is generally O(kn), where k is the number of iterations.
Both algorithms aim to rank web pages for relevance in search queries.
Important Algorithms:
HITS: Computes scores iteratively based on AJ and AL matrices.
PageRank: Incorporates randomness through a teleportation mechanism to ensure all pages are reached at least some probability.
Problem Sets Overview
Problems focus on graph modeling and applying theoretical concepts to practical scenarios, including tree traversals, connectivity checks, and exploring directed/undirected graphs.