Synchronization Part B

Looping Topics to be Covered

  • Clock Synchronization

  • Logical Clocks

  • Mutual Exclusion

  • Election Algorithms

  • Distributed Transactions

Synchronization

  • Definition: Coordination with respect to time, involving the ordering of events and execution of instructions.

  • Context: In distributed systems (DSs), entities often need to cooperate and synchronize to solve problems correctly.

    • Examples:

      • In a distributed file system, processes must synchronize to avoid concurrent writes to the same file.

      • Coordinating access to shared resources, e.g., printers.

      • Agreeing on events' order, such as when message m1 from process P was sent relative to message m2 from process Q.

Need for Synchronization – Examples

  • Vehicle tracking in a city surveillance system: Utilizes a distributed sensor network of cameras.

  • Writing a file in a Distributed File System:

    • Scenario:

      • Client A writes "data1" to abc.txt at offset 0.

      • Client B attempts to write "data2" at offset 1.

      • Client C writes "data3" at the same offset.

    • Issue: Without synchronization, data corruption can occur.

Clock Synchronization

  • Definition: Mechanism to sync time across computers in a DS, crucial for knowing event occurrences and order.

  • Importance: Time management becomes complex in distributed systems where each computer has its own clock.

    • Clocks drift due to imperfect synchronization, causing out-of-sync issues.

Types of Clock Synchronization

  • External Clock Synchronization: Uses an external reference clock for all nodes to adjust their times.

  • Internal Clock Synchronization: Each node shares time with others, adjusting accordingly.

Coordinated Universal Time (UTC)

  • UTC Definition: Standard time to which all computers are generally synchronized.

    • Uses: Popular synchronization protocols in distributed systems reference UTC.

    • Implementation: UTC receivers enable computer systems to sync via satellite signals.

Tracking Time on a Computer

  • Mechanism: Each computer has a hardware timer.

    • Issues: Hardware timers can be imprecise (e.g., counting time faster or slower), leading to clock skew.

Clock Skew

  • Definition: Difference between two clocks at a given time.

  • Types of Clocks:

    1. Perfect Clock: Tick rate is normal (dC/dt = 1).

    2. Fast Clock: Tick rate is above normal (dC/dt > 1).

    3. Slow Clock: Tick rate is below normal (dC/dt < 1).

Clock Skew Frequency

  • Calculation: Frequency defined as the ratio of net seconds counted by the software clock vs. actual UTC seconds.

  • Notation: For a clock at time t, where UTC is referenced.

Drifting of Clocks

  • Correction Techniques:

    1. External Synchronization: Aligns with real-time clocks.

    2. Mutual (Internal) Synchronization: Achieves consistent time visually across all nodes.

Dealing with Drift

  • Strategies:

    • Avoiding backward adjustments to prevent confusion in message ordering.

    • Gradually correcting clock speed if it is fast or slow using OS instructions for interrupts.

Physical and Logical Clocks

  • Physical Clocks: Needed for timestamping events and adjusting time across nodes.

  • Logical Clocks: Prioritize the order of events over actual timestamps.

    • Focus on messages over time.

    • Types:

      1. Lamport’s Logical Clock

      2. Vector Clock

Clock Synchronization Algorithms

  • Centralized: Relies on a single time server (e.g., Berkeley Algorithm, Passive and Active Time Servers).

  • Distributed: No central server; uses local times averaged across nodes (e.g., NTP, Global and Local Averaging Algorithms).

Cristian's Algorithm

  • Overview: Synchronizes networked computers with a centralized time server.

    • Mechanism: Clients request synchronization, compensating for network delays.

Berkeley Algorithm Basics

  • Purpose: Synchronizes distributed nodes without a reliable time source.

    • Involves a master node that collects time from slave nodes and computes an average.

  • Implementation:

    • Pass 1: Time daemon asks for timestamps.

    • Pass 2: Slave nodes respond with timestamps.

    • Pass 3: Master computes an average.

    • Pass 4: This average is broadcast to nodes for adjustment.

Logical Clocks

  • Definition: Maintain an order of events without precise timing.

    • Most effective in scenarios without interaction between systems.

    • Two major types: Lamport's Clock and Vector Clock.

Properties of Logical Clocks

  • Guarantees that if one event precedes another (in a single system), it’s clock value reflects that order.

  • Ensures no clock value falls backwards in time.

Synchronizing Logical Clocks

  • Challenge: Discrepancies may arise from communication between processes running at different rates.

Lamport’s Clock Algorithm

  • Processes: Each process maintains a local counter, incrementing upon events and assigning timestamps to messages sent and received.

Vector Clocks

  • Structure: Maintains a vector of logical clocks for all processes, allowing for causal order verification.

  • Updating Methodology: Each event updates the local vector clock and message passing includes this vector for tracking.

Enforcing Causal Communication

  • In multicast settings, messages are held until conditions for causality are satisfied.

Mutual Exclusion in Distributed Systems

  • Definition: Ensures that only one process can access shared resources at a time.

  • Implementation: Primarily through message passing, as traditional semaphores are not feasible.

Critical Section and Requirements

  • Critical Sections: Parts of programs needing exclusive access.

    • Requirements:

      • Mutual exclusion to prevent concurrent access.

      • No starvation; every request must eventually be satisfied.

Distributed Mutual Exclusion Types

  1. Permission-based Approaches: Request permission from coordinators.

  2. Token-based Approaches: A token circulates among processes, which can only access the resource with it.

Centralized Mutual Exclusion Algorithm

  • Process: A single coordinator controls access to resources based on requests.

    • Policy: First-come-first-served.

    • Consideration: Vulnerability to single points of failure.

Election Algorithms in Distributed Systems

  • Purpose: To choose a coordinator among processes, which may become necessary if the current coordinator fails.

  • Mechanism: Any process can initiate an election when necessary.

Election Algorithms Types

  1. Bully Algorithm: Higher ID process becomes the new coordinator.

  2. Ring Algorithm: Processes arranged in a logical ring to communicate and elect a new coordinator.

Global State

  • Definition: The set of local states across all processes and communication channels.

  • Use: Vital for identifying deadlocks and ensuring proper termination of distributed computations.

Transactions Overview

  • Definition: A series of operations on data treated as a single atomic operation.

  • Significance: Derived from the database world to maintain data consistency through ACID properties.

ACID Properties**

  1. Atomicity: Transaction is all-or-nothing.

  2. Consistency: Database maintains validity before and after a transaction.

  3. Isolation: Concurrent transactions do not interfere.

  4. Durability: Successful transactions persist despite failures.

Concurrency Control**

  • Objective: Allow simultaneous transaction execution without interference through mechanisms like locking and two-phase locking (2PL).

Distributed Transactions**

  • Definition: Involve multiple servers; ensures a transaction's atomicity across systems.

Distributed Commit Protocols**

  1. Voting Phase: Workers vote on commit.

  2. Completion Phase: Workers either commit or abort based on the coordinator's decision.

Types of Distributed Transactions**

  1. Flat Transaction: A single operation across all servers.

  2. Nested Transaction: Hierarchical transactions that can contain subtransactions.