Distributed Computing Notes
Distributed Computing
Definition
A collection of autonomous computers linked via a computer network, equipped with distributed system software.
A method of making multiple computers work together to solve a common problem.
Distributed system software enables the computers to coordinate their activities and to share resources such as software, data, and hardware.
Working of Distributed Computing Components
Devices or Systems: Have their own processing capabilities and may store/manage data.
Network: Connects devices, allowing communication and data exchange.
Resource Management: Allocates and manages shared resources (computing power, storage, networking).
Characteristics
Multiple Devices or Systems: Processing and data storage distributed.
Peer-to-Peer Architecture: Devices act as both clients and servers.
Shared Resources: Computing power, storage, and networking are shared.
Advantages
Scalability: Easily add new devices to increase capacity.
Reliability: Continues to operate even if a device fails.
Flexibility: Easily configured to meet changing needs.
Disadvantages
Complexity: Involves coordinating and managing multiple devices.
Security: Challenging to secure due to the need for security measures on each device.
Performance: May not match centralized systems due to distributed processing.
Applications
Cloud Computing: Delivers resources over the Internet.
Peer-to-Peer Networks: Shares resources among users.
Distributed Architectures: Distributes processing and storage across devices (e.g., microservices).
Relation to Parallel Multiprocessor/Multicomputer Systems
Both multiprocessor and multicomputer systems are used for parallel processing but differ in communication and memory handling.
Multiprocessors share a single memory space.
Multicomputers have separate processors and memories, communicating through message passing.
Multiprocessors
Shared Memory: Processors share a single address space, enabling faster communication.
Tightly Coupled: Connected through a bus for direct memory access.
Examples: Symmetrical Multiprocessing (SMP).
Use Cases: Applications requiring frequent communication.
Multicomputer
Distributed Memory: Each processor has its own memory, requiring message passing.
Loosely Coupled: Processors connected through a network with potential latency.
Examples: Clusters of independent computers.
Use Cases: Applications divided into independent tasks.
Key Differences
| Feature | Multiprocessor | Multicomputer |
|---|---|---|
| Memory | Shared memory | Distributed memory |
| Communication | Direct memory access | Message passing |
| Coupling | Tightly coupled | Loosely coupled |
| Scalability | Limited | High |
| Cost | More expensive | Cost-effective |
Distributed Computing and Parallel Processing
Distributed computing: Multiple computers work together, often using message passing.
Parallel computing: Multiple processors execute tasks concurrently.
Relationship: Multiprocessors for parallel processing within a single computer, multicomputers for distributed parallel processing across multiple computers.
Message-Passing Systems versus Shared Memory Systems
Communication between tasks in multiprocessor systems occurs through:
Message Passing Systems:
Allow multiple processes to read from and write to a message queue without being directly connected to each other.
Messages are stored in the queue until the recipient retrieves them.
Message queues are widely used for inter-process communication and are supported by most operating systems.
Advantages:
Suitable for systems with physically distributed nodes.
Easier to design for fault tolerance (failures in one process don't affect memory of others).
Avoids issues like race conditions on shared data.
Disadvantages:
Requires more overhead to handle message formatting, transmission, and synchronization.
Slower due to network latency and protocol processing.
Shared Memory Systems:
Shared memory is memory that can be accessed simultaneously by multiple processes, allowing them to communicate with each other.
Communication among processors takes place through shared data variables, with control variables used for synchronization among the processors.
Semaphores and monitors are common synchronization mechanisms on shared memory systems.
When the shared memory model is implemented in a distributed environment, it is termed distributed shared memory (DSM).
Advantages:
Fast communication since it avoids network overhead.
Suitable for high-speed data sharing.
Disadvantages:
More complex to implement synchronization (e.g., using semaphores, mutexes).
Susceptible to issues like race conditions, deadlocks, and inconsistent state if not carefully managed.
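The synchronization hazards above can be seen in a few lines. This is a minimal sketch: two threads update a shared counter, and a `Lock` (a mutex, similar in spirit to a binary semaphore) guards the read-modify-write so no update is lost; without it, the result could be anything up to 200000.

```python
import threading

# Two threads share `counter`; the Lock makes each increment atomic.
counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:            # acquire before touching shared state
            counter += 1      # critical section

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000 — deterministic only because the lock serializes updates
```

Removing the `with lock:` line reintroduces the race condition described above.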
Inter-process communication models
Message Passing Model
Shared Memory Model
Differences between message passing and shared memory models
Message Passing | Distributed Shared Memory | |
|---|---|---|
Marshaling | Variables have to be marshalled from one process, transmitted and unmarshalled into other variables at the receiving process. | The processes share variables directly, so no marshalling and unmarshalling. Shared variables can be named, stored and accessed in DSM. |
Address Space | Processes can communicate with other processes. They can be protected from one another by having private address spaces. | Here, a process does not have private address space. So one process can alter the execution of the other. |
Heterogeneous | This technique can be used in heterogeneous computers. | This cannot be used to heterogeneous computers. |
Synchronization | Synchronization between processes is through message passing primitives. | Synchronization is through locks and semaphores. |
Execution Time | Processes communicating via message passing must execute at the same time. | Processes communicating through DSM may execute with non-overlapping lifetimes. |
Efficiency Awareness | All remote data accesses are explicit and therefore the programmer is always aware of whether a particular operation is in-process or involves the expense of communication. | Any particular read or update may or may not involve communication by the underlying runtime support. |
Primitives for Distributed Communication
Communication primitives are fundamental operations that processes or nodes use to exchange information.
1. Send and Receive
Basic primitives used in most low-level distributed systems.
send(destination, message): Sends a message to a specified destination.
receive(source, message): Receives a message from a specified source.
These primitives can be blocking (synchronous) or non-blocking (asynchronous):
Blocking send/receive: The process waits until the operation is complete.
Non-blocking send/receive: The process continues execution without waiting.
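The two receive styles can be sketched with an in-process queue, where `queue.Queue` stands in for the channel between distributed processes (an assumption for illustration; real systems would use sockets or middleware):

```python
import queue

mq = queue.Queue()

# Non-blocking receive on an empty queue: returns immediately with an error.
try:
    mq.get(block=False)
except queue.Empty:
    print("non-blocking receive: no message yet, continuing")

mq.put("hello")                      # send(destination, message)
msg = mq.get(block=True, timeout=1)  # blocking receive: waits until a message arrives
print("blocking receive got:", msg)
```

The non-blocking call lets the process keep working and poll again later; the blocking call suspends it until data is available.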
2. Remote Procedure Call (RPC)
Abstracts communication as if calling a local function.
Allows a process on one machine to call a procedure on another machine.
Handles message passing, marshalling, and unmarshalling.
Variants:
Synchronous RPC: Caller waits for the reply.
Asynchronous RPC: Caller continues and checks for response later.
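What the RPC stubs do under the hood can be sketched without a network: the client stub marshals the call into bytes, the "network" delivers them, and the server stub unmarshals, invokes the procedure, and marshals the reply. The procedure `add` and both stub functions are hypothetical names used only for illustration.

```python
import json

def add(a, b):
    return a + b

PROCEDURES = {"add": add}     # server-side dispatch table

def server_stub(request_bytes):
    call = json.loads(request_bytes.decode())       # unmarshal the request
    result = PROCEDURES[call["proc"]](*call["args"])
    return json.dumps({"result": result}).encode()  # marshal the reply

def client_stub(proc_name, *args):
    request = json.dumps({"proc": proc_name, "args": args}).encode()  # marshal
    reply = server_stub(request)   # stands in for transmission over the network
    return json.loads(reply.decode())["result"]                       # unmarshal

print(client_stub("add", 2, 3))    # looks like a local call, but went through marshalling
```

A synchronous RPC blocks in `client_stub` until the reply returns; an asynchronous variant would hand back a handle to poll later.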
3. Message Queues
Messages are sent to and retrieved from a queue.
Enables decoupling of sender and receiver in time and space.
Useful in fault-tolerant and scalable systems.
Examples: ZeroMQ, RabbitMQ, Apache Kafka.
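The decoupling in time can be shown in a small sketch: the producer thread finishes and exits before the consumer ever reads, yet no message is lost because the queue holds them (an in-process `queue.Queue` stands in for a broker such as those listed above).

```python
import queue
import threading

mq = queue.Queue()

def producer():
    for i in range(3):
        mq.put(f"event-{i}")   # send and move on; no consumer needs to be ready

p = threading.Thread(target=producer)
p.start()
p.join()                       # the producer is gone...

received = [mq.get() for _ in range(3)]   # ...but the consumer still sees everything
print(received)
```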
4. Multicast and Broadcast
Multicast: Send message to a group of recipients.
Broadcast: Send message to all nodes in the network.
Used in group communication, synchronization, and distributed algorithms.
5. Sockets
Endpoints for sending and receiving data across the network.
Types:
TCP sockets (connection-oriented)
UDP sockets (connectionless)
Used in many custom protocols and applications.
6. Shared Memory (in distributed systems via abstraction)
Not physically shared; shared memory is emulated over the network as a programming abstraction.
Used in Distributed Shared Memory (DSM) systems.
7. Publish/Subscribe (Pub/Sub)
Processes publish messages to topics.
Other processes subscribe to topics of interest.
Decouples producers and consumers.
Common in event-driven and real-time systems.
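A minimal in-process pub/sub sketch makes the decoupling concrete: publishers and subscribers only know topic names, never each other. The topic names and subscriber callbacks are invented for illustration.

```python
from collections import defaultdict

subscribers = defaultdict(list)   # topic -> list of subscriber callbacks

def subscribe(topic, callback):
    subscribers[topic].append(callback)

def publish(topic, message):
    for callback in subscribers[topic]:   # deliver to every subscriber of the topic
        callback(message)

log = []
subscribe("sensor/temp", lambda m: log.append(("alice", m)))
subscribe("sensor/temp", lambda m: log.append(("bob", m)))

publish("sensor/temp", 21.5)      # both subscribers receive it
publish("sensor/humidity", 60)    # no subscribers: silently dropped
print(log)
```

Real systems add persistence, delivery guarantees, and network transport, but the producer/consumer decoupling is the same.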
Synchronous and Asynchronous Execution
1. Synchronous Execution
What it means: The sender waits until the task is done before moving on.
How it works: One process blocks (stops) until it gets a reply from the other.
Example: A client sends a request to a server and waits for a response.
Pros
Easy to understand and use.
Easy to find and fix errors.
Cons:
Slower due to waiting time.
Hard to handle multiple tasks at once (less concurrency).
Can cause deadlocks if one side never replies.
2. Asynchronous Execution
What it means: The sender doesn’t wait. It continues working and checks for the result later.
How it works: One process sends a message and goes on. It may get the result through a callback or event later.
Example: A client sends a message to a server and keeps doing other work. It receives the reply later.
Pros:
Faster and can do more tasks at the same time.
Better use of system resources.
Works better in large or slow networks.
Cons:
Harder to write and understand.
Needs extra code to handle replies and errors.
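The two styles above can be contrasted in one short sketch using Python's standard `concurrent.futures` (`slow_task` is a made-up placeholder for any slow remote operation):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_task():
    time.sleep(0.1)
    return 42

# Synchronous style: call and block until the result is ready.
result_sync = slow_task()

# Asynchronous style: submit, keep working, collect the result later.
with ThreadPoolExecutor() as pool:
    future = pool.submit(slow_task)   # returns immediately, does not block
    other_work = sum(range(1000))     # the caller keeps executing meanwhile
    result_async = future.result()    # check for the response later

print(result_sync, result_async, other_work)
```

The extra bookkeeping (the `future` object, deciding when to call `result()`, handling its exceptions) is exactly the added complexity the cons list mentions.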
Comparison Table:
| Aspect | Synchronous | Asynchronous |
|---|---|---|
| Waits for result | Yes | No |
| Blocking | Yes | No |
| Task connection | Tightly connected | Loosely connected |
| Easy to code? | Yes | No (more complex) |
| Speed/Performance | Slower in high load | Faster and scalable |
| Used in | RPC, client-server | Event systems, message queues |
Multiprocessor Architecture:
Multiple processors that work together and are connected to each other.
They share a single memory: every processor can access the same memory, and changes made by one processor are visible to the others.
All processors are connected to the memory through an interconnection topology (e.g., a bus or crossbar).
Multicomputer Architecture:
Multiple processors are linked to work together, as in a multiprocessor.
Each node, however, has its own memory attached to it.
When multiple processors, each with its own memory, are connected by a network, the system is known as a multicomputer.
Shared memory is not required; the nodes communicate with each other over the network connection.
What is Distributed Graph Algorithm?
A graph algorithm where the nodes (vertices) of the graph are individual computers or processes, and they communicate with each other via message passing to solve a problem (e.g., building a spanning tree, finding shortest paths, etc.).
How distributed graph algorithms work?
Distributed graph algorithms operate by dividing a graph into smaller subgraphs and processing these subgraphs on multiple machines simultaneously.
Basic Program Structure of Distributed Graph Algorithms
Each node runs the same algorithm, but behaves differently based on:
Its unique ID
Its local knowledge
The messages it receives from neighbors
Main Components of the Program Structure
| Component | Description |
|---|---|
| Process/Node | A computer or program running at each graph vertex. |
| State Variables | Each process keeps local variables (e.g., parent, visited, level). |
| Initialization | At the start, each node sets up its state. Some nodes (e.g., initiators) start the algorithm. |
| Message Receiving | Nodes wait to receive messages from neighbors. |
| Message Processing | On receiving a message, nodes update state and send new messages. |
| Termination | The algorithm ends when a stopping condition is met (e.g., no more messages, tree built). |
Typical Structure (Pseudocode-Like)
Start:
    Initialize local variables
Loop:
    Wait for a message
    If a message is received:
        Process the message
        Update local state
        Possibly send new messages to neighbors
    If the termination condition is met:
        Halt
Key Features
Decentralized: No single node controls the entire graph.
Concurrent: Multiple nodes can act at the same time.
Event-driven: Actions depend on messages received.
Local decision-making: Each node decides based on its local state and received messages.
Synchronous/Asynchronous single initiator spanning tree algorithm using flooding
What is a Spanning Tree?
A spanning tree of a graph is a tree that:
Includes all the nodes of the graph.
Has no cycles.
Has exactly n − 1 edges if there are n nodes.
What is Single-Initiator Flooding?
Single-Initiator: Only one node starts the process (called the initiator).
Flooding: The initiator sends messages to its neighbors, and the messages propagate throughout the network to form a tree.
1. Synchronous Single-Initiator Flooding Algorithm
Meaning of Synchronous
All nodes work in synchronized rounds (like a clock tick).
In each round, nodes send, then receive, then process messages.
Steps:
Round 0:
Initiator sends Flood message to all neighbors.
Round 1:
Neighbors receive the message, mark the sender as parent, and forward Flood to their other neighbors (except parent).
Next Rounds:
This continues until all nodes have received the message and selected a parent.
Tree is formed: Each node knows its parent, forming a spanning tree.
Features:
Easy to analyze.
Requires global clock or synchronization.
Works in fixed rounds.
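The rounds above can be simulated in a few lines. This is a sketch on an assumed toy graph: one loop iteration is one synchronous round, and the `frontier` holds the nodes that send Flood in that round.

```python
# Adjacency lists of an assumed example graph; node 0 is the initiator.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 4], 3: [1], 4: [2]}
parent = {0: 0}            # the initiator is its own parent (root)
frontier = [0]             # nodes sending Flood this round
rounds = 0

while frontier:            # one iteration = one synchronous round
    next_frontier = []
    for u in frontier:
        for v in graph[u]:
            if v not in parent:       # first Flood received -> adopt sender as parent
                parent[v] = u
                next_frontier.append(v)
    frontier = next_frontier
    rounds += 1

print(parent)  # every node knows its parent: a spanning tree rooted at 0
```

Each node appears once in `parent`, and following parent pointers always leads back to the initiator, so the n − 1 recorded edges form a spanning tree.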
Asynchronous Single-Initiator Flooding Algorithm
Meaning of Asynchronous
No global clock.
Nodes act as soon as they receive a message (message passing is event-driven).
Steps:
Initiator sends Flood message to all neighbors.
When a node receives Flood for the first time:
It marks the sender as its parent.
Then sends Flood to its other neighbors.
If a node receives a Flood message again, it ignores it (since it already has a parent).
Eventually, all nodes are visited, and a spanning tree is formed.
Features:
No need for synchronized time.
May take varying time depending on message delays.
More realistic for distributed systems.
Asynchronous concurrent initiator spanning tree algorithm using flooding:
What is the Goal?
To build one or more spanning trees in a distributed network where multiple nodes (called concurrent initiators) may start the flooding at the same time, and the network operates asynchronously.
What is Asynchronous Concurrent-Initiator Flooding?
Asynchronous: No global clock; each node reacts immediately upon receiving a message.
Concurrent-Initiator: Multiple initiator nodes start the algorithm simultaneously.
Flooding: Each initiator tries to build its own tree by sending messages.
How the Algorithm Works Step-by-Step:
Multiple Initiators Start Flooding:
Each initiator sends a Flood message with its own ID (or unique identifier) to all neighbors.
Receiving a Flood Message:
When a node receives the first Flood(ID) message, it:
Joins that initiator's tree (adopts that initiator’s ID).
Sets the sender as its parent.
Forwards Flood(ID) to all other neighbors except parent.
Competing Initiators:
If a node later receives another Flood(ID') from a different initiator:
It compares IDs:
If new ID is lower, it switches to the new tree:
Changes its parent.
Changes its tree ID.
Resends Flood(new ID) to neighbors.
If new ID is higher, it ignores the message.
Tree Formation:
This process continues until every node belongs to the tree of the initiator with the smallest ID.
Result:
The network eventually forms a single spanning tree rooted at the initiator with the smallest ID.
Other initiators' trees are abandoned as nodes switch to the tree with the smallest ID.
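The switching rule can be simulated event by event. This sketch uses an assumed toy ring graph with two initiators; a queue of pending messages stands in for asynchronous delivery (any delivery order gives the same final result).

```python
from collections import deque

# Assumed example graph (a 5-node ring); nodes 3 and 5 initiate concurrently.
graph = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
initiators = [3, 5]
root = {}      # node -> ID of the tree it currently belongs to
parent = {}

msgs = deque()
for init in initiators:
    root[init] = init
    parent[init] = init
    for nbr in graph[init]:
        msgs.append((init, nbr, init))   # (sender, receiver, root_id)

while msgs:                              # arbitrary delivery order = asynchrony
    sender, node, rid = msgs.popleft()
    if node not in root or rid < root[node]:   # first message, or a smaller ID wins
        root[node] = rid
        parent[node] = sender
        for nbr in graph[node]:
            if nbr != sender:
                msgs.append((node, nbr, rid))  # re-flood with the adopted ID

print(root)    # every node ends up in the tree of the smallest initiator
```

Termination follows because a node only re-floods when it adopts a strictly smaller root ID, which can happen finitely many times.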
Asynchronous concurrent-initiator depth first search spanning tree algorithm
This algorithm builds a spanning tree using Depth-First Search (DFS) in a distributed system, where:
Multiple nodes may initiate the algorithm at the same time (concurrent initiators).
The system is asynchronous — nodes act whenever they receive messages (no global clock).
Each initiator tries to construct its own DFS tree, but eventually, only one spanning tree (rooted at the lowest-ID initiator) is accepted.
DFS in Distributed Computing
DFS means:
Explore deeply along one path before backtracking.
Each node visits an unvisited neighbor and recursively explores.
Algorithm Description
Message Types:
DFS_Explore(ID) → sent to explore a neighbor.
DFS_Ack(ID) → sent when all children explored and node is done.
DFS_Reject(ID) → sent when a node is already part of a different tree.
State at Each Node:
visited → whether it has been visited or not.
parent → from whom it first received the message.
initiator_ID → ID of the initiator it's currently part of.
Algorithm Steps
Step 1: Initiation (multiple initiators)
Each initiator sends DFS_Explore(its own ID) to one neighbor at a time (as in standard DFS).
Step 2: On Receiving DFS_Explore(ID)
If not yet visited:
Mark as visited.
Set sender as parent.
Adopt ID as tree's root.
Send DFS_Explore(ID) to one unvisited neighbor (recursive DFS style).
If already visited with smaller ID:
Send DFS_Reject(ID) — do not switch.
If already visited with larger ID:
Switch to new (smaller) tree:
Change parent and initiator ID.
Start DFS again from this node with new ID.
Step 3: When No More Unvisited Neighbors
Send DFS_Ack(ID) to parent to signal completion.
When the initiator gets back DFS_Ack from all paths, the DFS is complete.
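The token-like behavior of DFS_Explore/DFS_Ack can be sketched in a simplified single-initiator setting (the concurrent-initiator ID comparison and switching described above are omitted). Recursion here models the token visiting one neighbor at a time and the implicit DFS_Ack on return; the graph is an assumed toy example.

```python
# Assumed example graph; node 0 is the single initiator.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
visited = set()
parent = {}
order = []     # order in which the DFS token reaches nodes

def on_explore(node, sender):
    """Handle DFS_Explore: adopt the sender as parent, then explore
    unvisited neighbors one at a time (depth-first)."""
    visited.add(node)
    parent[node] = sender
    order.append(node)
    for nbr in graph[node]:
        if nbr not in visited:
            on_explore(nbr, node)   # send DFS_Explore to one neighbor and wait
    # all neighbors done -> the return models sending DFS_Ack back to the parent

on_explore(0, 0)    # the initiator is its own parent
print(order)
```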
Minimum-Weight Spanning Tree (MST) Algorithm in a Synchronous System
Goal
To compute a spanning tree with the minimum total edge weight in a distributed graph, where each node only has local knowledge (its neighbors and the weights of connecting edges).
Assumptions
The system is synchronous: communication and computation occur in discrete rounds.
Each node (process) has a unique ID.
Each node knows the weights of its adjacent edges.
Communication is by message-passing.
The graph is connected, undirected, and weighted.
Common Algorithm: Gallager-Humblet-Spira (GHS) (adapted for synchronous systems)
The GHS algorithm is the most well-known distributed MST algorithm, though it was originally designed for asynchronous systems. In synchronous systems, the operation is easier to coordinate due to known timing.
Basic Concepts of GHS (in synchronous model)
The network is divided into fragments, which are subtrees of the final MST.
Each fragment has a unique fragment ID and a level.
Nodes in a fragment cooperate to:
Find the Minimum Weight Outgoing Edge (MWOE) to another fragment.
Use that edge to merge with the neighboring fragment.
Repeat until all nodes are in a single fragment → MST complete.
High-Level Steps (Synchronous Rounds)
Step 1: Initialization
Each node starts as a separate fragment (singleton).
Each fragment selects its local MWOE.
Step 2: MWOE Finding (within each fragment)
Nodes broadcast to neighbors to discover the lightest edge connecting to another fragment.
In synchronous systems, this takes one round per hop.
Step 3: Fragment Merge
The lowest MWOE becomes part of the MST.
Connected fragments merge into a larger one.
Fragment ID and level are updated.
Step 4: Repeat
Continue MWOE search and merging in synchronous rounds.
Each iteration doubles the fragment size.
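The fragment-merge loop above is essentially Borůvka's algorithm. The following is a centralized sketch of that core idea, not actual message passing: each pass finds every fragment's MWOE and merges along those edges, halving the number of fragments. The graph is an assumed toy example with distinct edge weights (a GHS requirement for a unique MST).

```python
# Assumed example graph: (weight, u, v) with distinct weights.
edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
nodes = {0, 1, 2, 3}

fragment = {v: v for v in nodes}          # each node starts as its own fragment

def find(v):                              # fragment lookup (simple union-find)
    while fragment[v] != v:
        v = fragment[v]
    return v

mst = []
while len({find(v) for v in nodes}) > 1:  # repeat until one fragment remains
    mwoe = {}                             # fragment -> its lightest outgoing edge
    for w, u, v in edges:
        fu, fv = find(u), find(v)
        if fu != fv:                      # outgoing: endpoints in different fragments
            for f in (fu, fv):
                if f not in mwoe or w < mwoe[f][0]:
                    mwoe[f] = (w, u, v)
    for w, u, v in set(mwoe.values()):    # merge along every chosen MWOE
        if find(u) != find(v):
            mst.append((w, u, v))
            fragment[find(u)] = find(v)

print(sorted(mst))  # the MST edges, listed by weight
```

In the distributed version, the MWOE search and the merges are done by the fragments themselves via convergecast and broadcast, but the edges chosen are the same.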
Time Complexity (Synchronous)
Number of Phases: O(log N), where N is the number of nodes, since the smallest fragment at least doubles in size each phase.
Each Phase: Takes O(N) rounds for MWOE finding and merging.
Total Rounds: O(N log N).
Message Complexity
O(E + N log N), where E is the number of edges and N is the number of nodes.
Final Output
All nodes agree on the same MST, and every edge in the MST connects nodes from different fragments via MWOEs.