Lecture 7

Synchronization in Distributed Systems

  • Definition: A distributed system comprises a collection of independent computers connected via a communication network, cooperating so that they appear to users as a single coherent system.

  • Communication: Hardware and software components in such systems communicate and coordinate their actions through message passing.

  • Resource Sharing: Each node can share its resources, necessitating proper resource allocation to maintain resource states and process coordination.

  • Synchronization: Essential for resolving conflicts and ensuring coordinated process execution in distributed systems.

Clock Synchronization

  • Purpose: Keeps the clocks of all nodes consistent so that events across the system can be ordered and timed correctly.

  • UTC: Nodes set their local time based on UTC (Coordinated Universal Time).

Types of Clock Synchronization

  1. External Clock Synchronization: Uses an external reference clock to adjust the time of nodes.

  2. Internal Clock Synchronization: Nodes share their local time to adjust each other’s time.

Clock Synchronization Algorithms

  1. Centralized Algorithms:

    • A single time server propagates time to nodes.

    • Examples: Berkeley Algorithm, Passive Time Server, Active Time Server.

    • Drawback: A single point of failure; if the time server fails, synchronization stops.

  2. Distributed Algorithms:

    • No centralized server; nodes average their local time.

    • Advantages: Overcomes scalability issues and single points of failure.

    • Examples: Global Averaging Algorithm, Localized Averaging Algorithm, NTP (Network Time Protocol).
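The centralized averaging idea behind the Berkeley algorithm can be sketched as follows. This is a minimal illustration, assuming the coordinator has already polled every node's clock at roughly the same instant; node names and readings are made up.

```python
# Sketch of the Berkeley algorithm (centralized averaging).
# clocks: node name -> local clock reading in seconds (includes the coordinator).
def berkeley_adjustments(clocks: dict) -> dict:
    """Return the offset each node should apply to its local clock."""
    # The coordinator averages all readings (including its own) ...
    average = sum(clocks.values()) / len(clocks)
    # ... and sends each node the delta it must add, rather than the
    # absolute time, so one-way network delay matters less.
    return {node: average - local for node, local in clocks.items()}

offsets = berkeley_adjustments({"master": 3.00, "node-a": 3.25, "node-b": 2.75})
# Each node adds its offset, so all clocks converge on the average (3.00 here).
```

Note that nodes apply a relative offset rather than setting an absolute time; in practice a clock is slowed or sped up gradually so that time never jumps backward.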

Synchronization Mechanism in Distributed Systems

  • Importance: Ensures coordinated access to shared resources and inter-process communication.

Categories of Synchronization Mechanisms

  1. Shared Memory Synchronization

    • Ensures nodes access shared data without inconsistencies.

    • Methods:

      • Semaphore: A shared integer variable with operations for acquiring and releasing locks.

      • Monitor: High-level data type abstraction for managing shared data.

      • Serializer: Combines data abstraction with control structures, allowing concurrent access while keeping each operation atomic.

      • Mutual Exclusion (Mutexes): Allows exclusive resource access via locks.

      • Path Expression: Ensures execution order follows specified constraints.

      • Locks: Mechanisms for exclusive access to resources.

        • Types: Binary Lock, Read-Write Lock, Spinlock.
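A minimal sketch of shared-memory synchronization using a lock (mutex) from Python's standard library; a `threading.Semaphore(1)` would behave equivalently as a binary semaphore. The thread count and iteration count are arbitrary.

```python
# Four threads increment a shared counter; the lock makes the
# read-modify-write of the critical section atomic.
import threading

counter = 0
lock = threading.Lock()  # binary lock: at most one holder at a time

def increment(n: int) -> None:
    global counter
    for _ in range(n):
        with lock:       # acquire before, release after the critical section
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock held around each update, the final count is exactly 4 * 10_000.
```

Without the lock, concurrent `counter += 1` operations could interleave and lose updates, which is precisely the inconsistency shared-memory synchronization prevents.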

  2. Message Passing Synchronization

    • Implements protocols ensuring correct message delivery order.

    • Methods:

      • Communicating Sequential Processes (CSP): Involves I/O interactions between sender and receiver processes.

      • Remote Procedure Call (RPC): Facilitates data exchange between caller and called procedures, blocking until completion.

      • Ada Rendezvous: Supports symmetric communication.
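The message-passing style above can be sketched with a thread-safe queue standing in for a CSP channel: the sender and receiver coordinate only through messages, never through shared variables. The sentinel value and message contents are illustrative.

```python
# CSP-style message passing: producer sends, consumer blocks on receive.
import queue
import threading

channel = queue.Queue()  # FIFO channel between the two processes
received = []

def producer() -> None:
    for value in range(5):
        channel.put(value)   # send a message
    channel.put(None)        # sentinel: no more messages

def consumer() -> None:
    while True:
        value = channel.get()  # blocks until a message arrives
        if value is None:
            break
        received.append(value)

sender = threading.Thread(target=producer)
receiver = threading.Thread(target=consumer)
sender.start(); receiver.start()
sender.join(); receiver.join()
# The FIFO channel preserves send order: received == [0, 1, 2, 3, 4]
```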

Distributed Consensus in Distributed Systems

  • Definition: A procedure by which multiple processes in a decentralized system agree on a single value or decision.

  • Importance: Critical for reliability and fault tolerance in message-passing systems.

Features

  • Ensures reliability and fault tolerance even when some processes are faulty.

  • Example Applications: Database transactions, state machine replication, clock synchronization.

Conditions for Achieving Consensus

  • Termination: Every non-faulty process must eventually decide.

  • Agreement: Non-faulty processes must have identical final decisions.

  • Validity: If all non-faulty processes propose the same value, then every non-faulty process must decide on that value.

  • Integrity: Each correct process decides at most once, and only on a value that was actually proposed.
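A toy illustration of the agreement and validity conditions: every non-faulty process applies the same deterministic rule (majority vote) to the same set of proposals, so all reach an identical decision. This sketch assumes reliable message delivery and crash (not Byzantine) faults, and the proposal values are made up.

```python
# Toy consensus by deterministic majority vote.
from collections import Counter

def decide(proposals: list) -> str:
    """Same rule, same inputs => same decision at every non-faulty process."""
    counts = Counter(proposals)
    # Iterate over sorted keys so ties break deterministically everywhere.
    return max(sorted(counts), key=lambda v: counts[v])

proposals = ["commit", "commit", "abort"]
decision = decide(proposals)
assert decision in proposals  # validity: decided value was actually proposed
```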

Correctness of Distributed Consensus Protocols

  • Safety Property: Correct processes never decide on an incorrect or conflicting value; nothing "bad" ever happens.

  • Liveness Property: Every correct process eventually makes progress toward a decision; the protocol cannot stall forever.

  • Termination Property: Ensures all correct processes eventually decide.

  • Agreement Property: Guarantees all nodes converge to a single consensus value.

  • Fault Tolerance: Protocols handle failures, ensuring continued correctness.

  • Byzantine Fault Tolerance: Ability to tolerate malicious nodes.

  • Scalability: Ability to handle large networks without loss of properties.

Applications of Distributed Consensus

  • Leader election in fault-tolerant environments.

  • Consistency maintenance in distributed networks.

  • Blockchain technology for shared databases.

  • Load balancing across system nodes.

Remote Procedure Call (RPC) Mechanism

  • Definition: Communication technology enabling requests for services from another program over a network.

  • Client-Server Model: The client requests services while the server provides them.

  • Execution: Synchronous by default, blocking the client until a result is returned.

  • IDL (Interface Definition Language): Used to define communication between programs on different platforms.

Workflow of RPC

  1. Parameters are prepared by the caller.

  2. Control is passed to the remote procedure in a new execution environment.

  3. Results are returned to the caller after execution.
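The three workflow steps can be sketched with a client stub and a server stub; the "network hop" here is just an in-process JSON message for illustration, and the procedure registry and names are made up.

```python
# Sketch of the RPC workflow: marshal -> execute remotely -> unmarshal result.
import json

def add(a, b):                    # the "remote" procedure
    return a + b

PROCEDURES = {"add": add}         # server-side registry of callable procedures

def server_stub(request: bytes) -> bytes:
    msg = json.loads(request)                         # step 2: unmarshal
    result = PROCEDURES[msg["proc"]](*msg["args"])    # step 2: execute
    return json.dumps({"result": result}).encode()    # step 3: marshal result

def client_stub(proc, *args):
    request = json.dumps({"proc": proc, "args": args}).encode()  # step 1
    response = server_stub(request)   # stand-in for the network transfer
    return json.loads(response)["result"]             # step 3: unmarshal

value = client_stub("add", 2, 3)  # looks exactly like a local call
```

The key point is that marshalling hides the network: the caller writes `client_stub("add", 2, 3)` as if it were a local call, which is why RPC is synchronous and blocks until the result comes back.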

Types of RPC

  • Callback RPC: The server can invoke procedures back on the client, enabling a peer-to-peer style of interaction.

  • Broadcast RPC: The client’s request is broadcasted and handled by multiple servers.

  • Batch-mode RPC: Allows multiple requests to be sent in one batch.

Advantages and Disadvantages of RPC

  • Advantages:

    • High-level language support for communication.

    • Simplifies the client/server programming model.

    • Supports thread-oriented models.

  • Disadvantages:

    • Parameter passing is limited to values; pass-by-reference is not possible across address spaces.

    • Prone to failures due to network reliance.

    • No single standard implementation, so behavior varies between systems and portability suffers.

Call Semantics in RPC

  • Types:

    • Perhaps (Possibly-once) Call: The caller waits only until a timeout and does not retransmit, so the procedure may execute once or not at all.

    • Last-one Call: The caller retransmits after each timeout and uses the result of the last execution; the procedure may run more than once.

    • Exactly-once Call: Duplicate requests are filtered so the procedure executes exactly once, even when requests are retransmitted.
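One common way a server supports exactly-once semantics is to cache replies by request ID, so a retransmitted request replays the cached reply instead of re-executing the procedure. This is a hedged sketch; the request-ID scheme, the `debit` procedure, and all names are illustrative.

```python
# Duplicate filtering for exactly-once call semantics.
executions = 0
reply_cache = {}  # request ID -> cached reply

def debit(amount: int) -> int:
    global executions
    executions += 1      # side effect that must not be repeated
    return amount

def handle(request_id: str, amount: int) -> int:
    if request_id in reply_cache:      # duplicate: replay the cached reply
        return reply_cache[request_id]
    result = debit(amount)             # first arrival: execute once
    reply_cache[request_id] = result
    return result

first = handle("req-42", 100)
retry = handle("req-42", 100)  # client retransmission after a timeout
# The procedure ran once; both calls received the same reply.
```

Without the cache this collapses to last-one semantics, where each retransmission re-executes the procedure and the caller keeps only the final result.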

Differences Between RPC and RMI

  • RPC supports procedural programming; RMI supports object-oriented programming.

  • RMI is Java-specific, whereas RPC is not.

  • RMI allows for object passing, while RPC deals with basic data types.

Differences Between RMI and DCOM

  • RMI leverages Java technologies, while DCOM is Microsoft-centric.

  • DCOM enables language independence and various network protocols.

Differences Between RMI and CORBA

  • RMI is Java-specific; CORBA encompasses multiple languages.

  • RMI automatically manages garbage collection; CORBA does not.

  • RMI's objects can download new classes; CORBA lacks this capability.