Lecture 7
Synchronization in Distributed Systems
Definition: A distributed system comprises a collection of computers connected via a high-speed communication network.
Communication: Hardware and software components in such systems communicate and coordinate their actions through message passing.
Resource Sharing: Each node can share its resources, necessitating proper resource allocation to maintain resource states and process coordination.
Synchronization: Essential for resolving conflicts and ensuring coordinated process execution in distributed systems.
Clock Synchronization
Purpose: Achieved via clocks to maintain time consistency across nodes.
UTC: Nodes set their local time based on UTC (Coordinated Universal Time).
Types of Clock Synchronization
External Clock Synchronization: Uses an external reference clock to adjust the time of nodes.
Internal Clock Synchronization: Nodes share their local time to adjust each other’s time.
Clock Synchronization Algorithms
Centralized Algorithms:
A single time server propagates time to nodes.
Examples: Berkeley Algorithm, Passive Time Server, Active Time Server.
Drawback: A single point of failure - if the server fails, synchronization fails.
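The averaging step of the Berkeley Algorithm can be sketched as follows. This is a minimal illustration, not a full implementation: the clock readings are hypothetical values already gathered by the time server, and real deployments must also compensate for message delay.

```python
# Sketch of the Berkeley Algorithm's averaging step (illustrative values).
def berkeley_adjustments(server_time, node_times):
    """Return the clock offset each participant should apply.

    The server averages all readings (including its own) and sends each
    node a relative adjustment rather than an absolute time, since an
    absolute broadcast would be skewed by network delay.
    """
    readings = [server_time] + list(node_times)
    average = sum(readings) / len(readings)
    return [average - t for t in readings]

# Server reads 180 s past the hour; nodes report 170 s and 205 s.
offsets = berkeley_adjustments(180, [170, 205])
# offsets[0] is the server's own correction: [5.0, 15.0, -20.0]
```

Note that the server corrects its own clock too; the algorithm does not assume the server's time is authoritative.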
Distributed Algorithms:
No centralized server; nodes average their local time.
Advantages: Overcomes scalability issues and single points of failure.
Examples: Global Averaging Algorithm, Localized Averaging Algorithm, NTP (Network Time Protocol).
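The core idea of the averaging approaches can be sketched in a few lines. The trimming threshold below is an illustrative choice: each node drops the most extreme readings before averaging, so a single faulty clock cannot skew the result.

```python
# Sketch of the global averaging idea: average peer clock readings
# after discarding outliers (the `discard` count is illustrative).
def averaged_clock(own_time, peer_times, discard=1):
    """Drop the `discard` highest and lowest readings, then average."""
    readings = sorted([own_time] + list(peer_times))
    trimmed = readings[discard:len(readings) - discard] if discard else readings
    return sum(trimmed) / len(trimmed)

# Four nodes; one reading (999) is wildly off and gets trimmed away.
t = averaged_clock(100, [102, 98, 999])
# t == 101.0: the faulty clock did not distort the average
```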
Synchronization Mechanism in Distributed Systems
Importance: Ensures coordinated access to shared resources and inter-process communication.
Categories of Synchronization Mechanisms
Shared Memory Synchronization
Ensures nodes access shared data without inconsistencies.
Methods:
Semaphore: A shared integer variable with operations for acquiring and releasing locks.
Monitor: High-level data type abstraction for managing shared data.
Serializer: Combines data abstraction with control structures, enabling concurrency while keeping operations atomic.
Mutual Exclusion (Mutexes): Allows exclusive resource access via locks.
Path Expression: Ensures execution order follows specified constraints.
Locks: Mechanisms for exclusive access to resources.
Types: Binary Lock, Read-Write Lock, Spinlock.
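The semaphore and mutex mechanisms above can be illustrated with a short sketch using Python's standard threading module. A semaphore initialized to 1 behaves as a binary lock: at most one thread holds it at a time, so updates to the shared counter are never interleaved.

```python
import threading

# Minimal sketch of mutual exclusion via a binary semaphore.
counter = 0
lock = threading.Semaphore(1)  # binary semaphore: at most one holder

def increment(n):
    global counter
    for _ in range(n):
        lock.acquire()       # enter the critical section
        counter += 1         # shared state updated exclusively
        lock.release()       # leave the critical section

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# counter == 40000: no increment was lost to interleaving
```

Without the acquire/release pair, concurrent increments could be lost; with it, the final count is deterministic.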
Message Passing Synchronization
Implements protocols ensuring correct message delivery order.
Methods:
Communicating Sequential Processes (CSP): Involves I/O interactions between sender and receiver processes.
Remote Procedure Call (RPC): Facilitates data exchange between caller and called procedures, blocking until completion.
Ada Rendezvous: Supports symmetric communication.
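The CSP style above can be sketched with Python threads communicating only through a channel. This is a rough analogue, not true CSP: a size-1 queue stands in for the channel, and the sender and receiver share no variables directly.

```python
import threading
import queue

# CSP-flavoured sketch: processes interact only via a channel.
channel = queue.Queue(maxsize=1)  # size-1 buffer approximates a rendezvous
results = []

def sender():
    for msg in ["ping", "pong", None]:   # None signals completion
        channel.put(msg)                 # blocks until the receiver takes it

def receiver():
    while True:
        msg = channel.get()
        if msg is None:
            break
        results.append(msg)

t1 = threading.Thread(target=sender)
t2 = threading.Thread(target=receiver)
t1.start(); t2.start()
t1.join(); t2.join()
# results == ["ping", "pong"], delivered in send order
```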
Distributed Consensus in Distributed Systems
Definition: A procedure for reaching agreement in decentralized multi-agent platforms.
Importance: Critical for reliability and fault tolerance in message-passing systems.
Features
Ensures reliability and fault tolerance even in the presence of faulty processes.
Example Applications: Database transactions, state machine replication, clock synchronization.
Conditions for Achieving Consensus
Termination: Every non-faulty process must eventually decide.
Agreement: Non-faulty processes must have identical final decisions.
Validity: If all non-faulty processes propose the same initial value, they must decide on that value.
Integrity: Each correct process decides at most once, and only on a value that was actually proposed.
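A toy, fault-free example makes the four conditions concrete. Here every process proposes a value and all decide the majority proposal; real protocols (such as Paxos or Raft) must achieve the same guarantees despite crashes and message loss, which this sketch deliberately ignores.

```python
from collections import Counter

# Toy illustration of the consensus conditions (no faults modelled).
def decide(proposals):
    """All processes decide the most common proposal.

    Termination: the function always returns. Agreement: every
    process gets the same value. Validity/Integrity: the decided
    value is one that some process actually proposed.
    """
    value, _ = Counter(sorted(proposals)).most_common(1)[0]
    return [value] * len(proposals)  # one identical decision per process

decisions = decide(["commit", "abort", "commit"])
# All three processes decide "commit", which was genuinely proposed.
```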
Correctness of Distributed Consensus Protocols
Safety Property: Correct processes never decide on an incorrect value.
Liveness Property: Guarantees that correct processes eventually make progress and accept a proposed value.
Termination Property: Ensures all correct processes eventually decide.
Agreement Property: Guarantees all nodes converge to a single consensus value.
Fault Tolerance: Protocols handle failures, ensuring continued correctness.
Byzantine Fault Tolerance: Ability to tolerate malicious nodes.
Scalability: Ability to handle large networks without loss of properties.
Applications of Distributed Consensus
Leader election in fault-tolerant environments.
Consistency maintenance in distributed networks.
Blockchain technology for shared databases.
Load balancing across system nodes.
Remote Procedure Call (RPC) Mechanism
Definition: Communication technology enabling requests for services from another program over a network.
Client-Server Model: The client requests services while the server provides them.
Execution: Synchronous by default, blocking the client until a result is returned.
IDL (Interface Definition Language): Used to define communication between programs on different platforms.
Workflow of RPC
Parameters are prepared by the caller.
Control is passed to the remote procedure in a new execution environment.
Results are returned to the caller after execution.
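The three workflow steps above can be demonstrated with Python's standard-library XML-RPC modules, one of many RPC implementations. The `add` procedure name is an arbitrary example; the client call marshals the parameters, blocks while the server executes, and returns the unmarshalled result, matching the synchronous semantics described earlier.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: register a procedure and serve in the background.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
port = server.server_address[1]  # port 0 lets the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the call looks local but executes on the server.
client = ServerProxy(f"http://127.0.0.1:{port}")
result = client.add(2, 3)  # blocks until the remote result returns
server.shutdown()
# result == 5
```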
Types of RPC
Callback RPC: Enables a peer-to-peer paradigm in which the server can call back into the client.
Broadcast RPC: The client’s request is broadcasted and handled by multiple servers.
Batch-mode RPC: Allows multiple requests to be sent in one batch.
Advantages and Disadvantages of RPC
Advantages:
High-level language support for communication.
Simplifies the client/server programming model.
Supports thread-oriented models.
Disadvantages:
Parameter passing is limited to values.
Prone to failures due to network reliance.
There is no single standard for RPC, so implementations vary and interoperability can suffer.
Call Semantics in RPC
Types:
Perhaps (Maybe) Call: The caller waits for a timeout and then proceeds; the procedure may execute zero times or once, with no retransmission.
Last-one Call: The call is retransmitted after each timeout, and the result of the last execution is used.
Exactly-once Call: Guarantees the procedure executes exactly once, even when requests are retransmitted.
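One common way a server approximates exactly-once semantics is to deduplicate requests by identifier and replay the cached reply, as sketched below. The request-id scheme and the `x * 2` procedure body are illustrative assumptions, not part of any particular RPC system.

```python
# Sketch of exactly-once call semantics via request-id deduplication.
executed = {}   # request id -> cached result
calls = 0       # counts actual procedure executions

def handle(request_id, x):
    global calls
    if request_id in executed:          # duplicate: replay the cached reply
        return executed[request_id]
    calls += 1                          # genuinely new request
    result = x * 2                      # the remote procedure body (example)
    executed[request_id] = result
    return result

r1 = handle("req-1", 21)   # first delivery executes the procedure
r2 = handle("req-1", 21)   # retransmission returns the cached result
# r1 == r2 == 42, yet the procedure ran only once (calls == 1)
```

Retransmissions thus become harmless: the client may resend after a timeout without risking a double execution.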
Differences Between RPC and RMI
RPC supports procedural programming; RMI supports object-oriented programming.
RMI is Java-specific, whereas RPC is not.
RMI allows for object passing, while RPC deals with basic data types.
Differences Between RMI and DCOM
RMI leverages Java technologies, while DCOM is Microsoft-centric.
DCOM enables language independence and various network protocols.
Differences Between RMI and CORBA
RMI is Java-specific; CORBA encompasses multiple languages.
RMI automatically manages garbage collection; CORBA does not.
RMI's objects can download new classes; CORBA lacks this capability.