Lecture 7
Synchronization in Distributed Systems
Definition: A distributed system comprises a collection of computers connected via a high-speed communication network.
Communication: Hardware and software components in such systems communicate and coordinate their actions through message passing.
Resource Sharing: Each node can share its resources, necessitating proper resource allocation to maintain resource states and process coordination.
Synchronization: Essential for resolving conflicts and ensuring coordinated process execution in distributed systems.
Clock Synchronization
Purpose: Achieved via clocks to maintain time consistency across nodes.
UTC: Nodes set their local time based on UTC (Coordinated Universal Time).
Types of Clock Synchronization
External Clock Synchronization: Uses an external reference clock to adjust the time of nodes.
Internal Clock Synchronization: Nodes share their local time to adjust each other’s time.
Clock Synchronization Algorithms
Centralized Algorithms:
A single time server propagates time to nodes.
Examples: Berkeley Algorithm, Passive Time Server, Active Time Server.
Drawback: A single point of failure - if the server fails, synchronization fails.
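The averaging step of the Berkeley Algorithm can be sketched as follows. This is a minimal illustration, not a full implementation: the clock readings are hypothetical values already gathered by the time server, and real deployments must also compensate for message delay.

```python
# Sketch of the Berkeley Algorithm's averaging step (illustrative values).
def berkeley_adjustments(server_time, node_times):
    """Return the clock offset each participant should apply.

    The server averages all readings (including its own) and sends each
    node a relative adjustment rather than an absolute time, since an
    absolute broadcast would be skewed by network delay.
    """
    readings = [server_time] + list(node_times)
    average = sum(readings) / len(readings)
    return [average - t for t in readings]

# Server reads 180 s past the hour; nodes report 170 s and 205 s.
offsets = berkeley_adjustments(180, [170, 205])
# offsets[0] is the server's own correction: [5.0, 15.0, -20.0]
```

Note that the server corrects its own clock too; the algorithm does not assume the server's time is authoritative.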
Distributed Algorithms:
No centralized server; nodes average their local time.
Advantages: Overcomes scalability issues and single points of failure.
Examples: Global Averaging Algorithm, Localized Averaging Algorithm, NTP (Network Time Protocol).
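The core idea of the averaging approaches can be sketched in a few lines. The trimming threshold below is an illustrative choice: each node drops the most extreme readings before averaging, so a single faulty clock cannot skew the result.

```python
# Sketch of the global averaging idea: average peer clock readings
# after discarding outliers (the `discard` count is illustrative).
def averaged_clock(own_time, peer_times, discard=1):
    """Drop the `discard` highest and lowest readings, then average."""
    readings = sorted([own_time] + list(peer_times))
    trimmed = readings[discard:len(readings) - discard] if discard else readings
    return sum(trimmed) / len(trimmed)

# Four nodes; one reading (999) is wildly off and gets trimmed away.
t = averaged_clock(100, [102, 98, 999])
# t == 101.0: the faulty clock did not distort the average
```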
Synchronization Mechanism in Distributed Systems
Importance: Ensures coordinated access to shared resources and inter-process communication.
Categories of Synchronization Mechanisms
Shared Memory Synchronization
Ensures nodes access shared data without inconsistencies.
Methods:
Semaphore: A shared integer variable with operations for acquiring and releasing locks.
Monitor: High-level data type abstraction for managing shared data.
Serializer: Combines data abstraction with control structures, enabling concurrency while keeping operations atomic.
Mutual Exclusion (Mutexes): Allows exclusive resource access via locks.
Path Expression: Ensures execution order follows specified constraints.
Locks: Mechanisms for exclusive access to resources.
Types: Binary Lock, Read-Write Lock, Spinlock.
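The semaphore and mutex mechanisms above can be illustrated with a short sketch using Python's standard threading module. A semaphore initialized to 1 behaves as a binary lock: at most one thread holds it at a time, so updates to the shared counter are never interleaved.

```python
import threading

# Minimal sketch of mutual exclusion via a binary semaphore.
counter = 0
lock = threading.Semaphore(1)  # binary semaphore: at most one holder

def increment(n):
    global counter
    for _ in range(n):
        lock.acquire()       # enter the critical section
        counter += 1         # shared state updated exclusively
        lock.release()       # leave the critical section

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter == 40000: no increment was lost to interleaving
```

Without the acquire/release pair, concurrent increments could be lost; with it, the final count is deterministic.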
Message Passing Synchronization
Implements protocols ensuring correct message delivery order.
Methods:
Communicating Sequential Processes (CSP): Involves I/O interactions between sender and receiver processes.
Remote Procedure Call (RPC): Facilitates data exchange between caller and called procedures, blocking until completion.
Ada Rendezvous: Supports symmetric communication.
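The CSP style above can be sketched with Python threads communicating only through a channel. This is a rough analogue, not true CSP: a size-1 queue stands in for the channel, and the sender and receiver share no variables directly.

```python
import threading
import queue

# CSP-flavoured sketch: processes interact only via a channel.
channel = queue.Queue(maxsize=1)  # size-1 buffer approximates a rendezvous
results = []

def sender():
    for msg in ["ping", "pong", None]:   # None signals completion
        channel.put(msg)                 # blocks until the receiver takes it

def receiver():
    while True:
        msg = channel.get()
        if msg is None:
            break
        results.append(msg)

t1 = threading.Thread(target=sender)
t2 = threading.Thread(target=receiver)
t1.start(); t2.start()
t1.join(); t2.join()
# results == ["ping", "pong"], delivered in send order
```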
Distributed Consensus in Distributed Systems
Definition: A procedure for reaching agreement in decentralized multi-agent platforms.
Importance: Critical for reliability and fault tolerance in message-passing systems.
Features
Ensures reliability and fault tolerance even in the presence of faulty processes.
Example Applications: Database transactions, state machine replication, clock synchronization.
Conditions for Achieving Consensus
Termination: Every non-faulty process must eventually decide.
Agreement: Non-faulty processes must have identical final decisions.
Validity: If all non-faulty processes propose the same initial value, they must decide on that value.
Integrity: Each correct process decides at most once, and only on a value that was actually proposed.
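A toy, fault-free example makes the four conditions concrete. Here every process proposes a value and all decide the majority proposal; real protocols (such as Paxos or Raft) must achieve the same guarantees despite crashes and message loss, which this sketch deliberately ignores.

```python
from collections import Counter

# Toy illustration of the consensus conditions (no faults modelled).
def decide(proposals):
    """All processes decide the most common proposal.

    Termination: the function always returns. Agreement: every
    process gets the same value. Validity/Integrity: the decided
    value is one that some process actually proposed.
    """
    value, _ = Counter(sorted(proposals)).most_common(1)[0]
    return [value] * len(proposals)  # one identical decision per process

decisions = decide(["commit", "abort", "commit"])
# All three processes decide "commit", which was genuinely proposed.
```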
Correctness of Distributed Consensus Protocols
Safety Property: Correct processes never decide on an incorrect value.
Liveness Property: Guarantees that correct processes eventually make progress and accept a proposed value.
Termination Property: Ensures all correct processes eventually decide.
Agreement Property: Guarantees all nodes converge to a single consensus value.
Fault Tolerance: Protocols handle failures, ensuring continued correctness.
Byzantine Fault Tolerance: Ability to tolerate malicious nodes.
Scalability: Ability to handle large networks without loss of properties.
Applications of Distributed Consensus
Leader election in fault-tolerant environments.
Consistency maintenance in distributed networks.
Blockchain technology for shared databases.
Load balancing across system nodes.
Remote Procedure Call (RPC) Mechanism
Definition: Communication technology enabling requests for services from another program over a network.
Client-Server Model: The client requests services while the server provides them.
Execution: Synchronous by default, blocking the client until a result is returned.
IDL (Interface Definition Language): Used to define communication between programs on different platforms.
Workflow of RPC
Parameters are prepared by the caller.
Control is passed to the remote procedure in a new execution environment.
Results are returned to the caller after execution.
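The three workflow steps above can be demonstrated with Python's standard-library XML-RPC modules, one of many RPC implementations. The `add` procedure name is an arbitrary example; the client call marshals the parameters, blocks while the server executes, and returns the unmarshalled result, matching the synchronous semantics described earlier.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: register a procedure and serve in the background.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]  # port 0 lets the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the call looks local but executes on the server.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)  # blocks until the remote result returns
server.shutdown()
# result == 5
```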
Types of RPC
Callback RPC: Enables a peer-to-peer paradigm in which the server can call back into the client.
Broadcast RPC: The client’s request is broadcasted and handled by multiple servers.
Batch-mode RPC: Allows multiple requests to be sent in one batch.
Advantages and Disadvantages of RPC
Advantages:
High-level language support for communication.
Simplifies the client/server programming model.
Supports thread-oriented models.
Disadvantages:
Parameter passing is limited to values.
Prone to failures due to network reliance.
There is no single standard for RPC, so implementations vary and interoperability can suffer.
Call Semantics in RPC
Types:
Perhaps (Maybe) Call: The caller waits for a timeout and then proceeds; the procedure may execute zero times or once, with no retransmission.
Last-one Call: The call is retransmitted after each timeout, and the result of the last execution is used.
Exactly-once Call: Guarantees the procedure executes exactly once, even when requests are retransmitted.
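One common way a server approximates exactly-once semantics is to deduplicate requests by identifier and replay the cached reply, as sketched below. The request-id scheme and the `x * 2` procedure body are illustrative assumptions, not part of any particular RPC system.

```python
# Sketch of exactly-once call semantics via request-id deduplication.
executed = {}   # request id -> cached result
calls = 0       # counts actual procedure executions

def handle(request_id, x):
    global calls
    if request_id in executed:          # duplicate: replay the cached reply
        return executed[request_id]
    calls += 1                          # genuinely new request
    result = x * 2                      # the remote procedure body (example)
    executed[request_id] = result
    return result

r1 = handle("req-1", 21)   # first delivery executes the procedure
r2 = handle("req-1", 21)   # retransmission returns the cached result
# r1 == r2 == 42, yet the procedure ran only once (calls == 1)
```

Retransmissions thus become harmless: the client may resend after a timeout without risking a double execution.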
Differences Between RPC and RMI
RPC supports procedural programming; RMI supports object-oriented programming.
RMI is Java-specific, whereas RPC is not.
RMI allows for object passing, while RPC deals with basic data types.
Differences Between RMI and DCOM
RMI leverages Java technologies, while DCOM is Microsoft-centric.
DCOM enables language independence and various network protocols.
Differences Between RMI and CORBA
RMI is Java-specific; CORBA encompasses multiple languages.
RMI automatically manages garbage collection; CORBA does not.
RMI's objects can download new classes; CORBA lacks this capability.