Logical Time in Distributed Computing

Origins and Significance of Logical Time

  • Foundational Research: The concept of logical time originated from the seminal 1978 paper by Leslie Lamport titled ‐‐Time, Clocks, and the Ordering of Events in a Distributed System,‑‑ published in Communications of the ACM.

  • Continued Relevance: The topic remains a central interest in distributed computing. A notable recent contribution is the paper ‐‐Capturing Causality in Distributed Systems‑‑ by Raynal and Singhal.

  • Lamport's Recognition: Leslie Lamport's original paper received the PODC Influential Paper Award in 2000.

Applications of Logical Time

  • Parallel Computations: Used in visualizations produced by parallel computations to maintain event order.

  • Banker System Algorithm: Logical time is essential for the implementation of the banker system algorithm in distributed environments.

  • Replicated Data Consistency: Used in efficient solutions for the Replicated Log and Replicated Dictionary problems, specifically the work of Wuu and Bernstein.

Background and Distributed System Model

  • System Composition: A distributed computation consists of a set of processes that cooperate and compete to achieve a common goal.

  • Communication Constraints: These processes do not share common global memory and communicate solely by passing messages over a communication network.

  • Event Classification: Actions within a process are modeled as three types of events:     * Internal Events: These occur within a single process and affect only that process. Logic dictates that events at a process are linearly ordered by their occurrence.     * Message Send Events: Signify the flow of information leaving a process.     * Message Receive Events: Signify the intake of information from another process.

  • Causal Dependency: Send and receive events establish a causal dependency from the sender process to the receiver process.

Causality and the Happen‐Before Relation

  • Causal Precedence Relation: This is a formal concept used for reasoning, analyzing, and drawing inferences about distributed computations. It helps programmers and the system itself solve problems by knowing the relation between processes.

  • Absence of Physical Time: Distributed systems have no built‐in physical time and can only approximate it. Because interactions occur in sporadic "spurts," logical clocks are used to accurately capture causality.

  • Asynchronous Processes: Programs are composed of nn independent, asynchronous processes: p1,p2,,pi,,pnp_1, p_2, \dots, p_i, \dots, p_n. These processes do not share a global clock.

  • Spontaneity: Each process can execute an event spontaneously. When sending a message, a process does not wait for delivery to complete.

  • Internal Sequencing: Each process pip_i produces a sequence of events ei0,ei1,,eix,ei,x+1,e_{i0}, e_{i1}, \dots, e_{ix}, e_{i,x+1}, \dots. These are totally ordered by sequencing:     * eixei,x+1e_{ix} \rightarrow e_{i,x+1} (Read: "eixe_{ix} happens before ei,x+1e_{i,x+1}").     * The relation is transitive: eiieije_{ii} \rightarrow e_{ij} for all i < j.

  • Inter‐Process Causality: For every message mm exchanged between two processes PiP_i and PjP_j:     * If eix=extsend(m)e_{ix} = ext{send}(m) and ejy=extreceive(m)e_{jy} = ext{receive}(m), then eixejye_{ix} \rightarrow e_{jy}.

  • Event Ordering Summary:     * Local events are totally ordered.     * Causality‐linked events are totally ordered.     * Unrelated events are unordered or concurrent (e1e2e_1 || e_2).     * For any two events e1e_1 and e2e_2, one of three conditions must hold: (i) e1e2e_1 \rightarrow e_2, (ii) e2e1e_2 \rightarrow e_1, or (iii) e1e2e_1 || e_2.

Logical Clock Conditions

  • Clock Assignment: Every event is assigned a timestamp according to a specific protocol.

  • Clock Consistency Condition: A system satisfies this if:     * If e1e2e_1 \rightarrow e_2, then C(e_1) < C(e_2), where C(ei)C(e_i) is the timestamp of event eie_i.

  • Strong Clock Consistency Condition: A clock is strongly consistent if it also satisfies the converse:     * If C(e_1) < C(e_2), then e1e2e_1 \rightarrow e_2.

Scalar Logical Clocks (Lamport Clocks)

  • Protocol Implementation:     * Rule 1 (R1): Before executing any event (send, receive, or internal), process pip_i increments its clock: Ci=Ci+dC_i = C_i + d (where d > 0, usually d=1d = 1).     * Rule 2 (R2): Every message carries the clock value of the sender at the time of sending (CmsgC_{msg}). Upon receiving a message, process pip_i performs:         * Ci=max(Ci,Cmsg)\text{Ci} = \max(C_i, C_{msg}).         * Execute Rule 1 (increment).         * Deliver the message.

  • Properties:     * The logical clock is monotonically increasing.     * It satisfies the clock consistency condition.     * It does not necessarily satisfy the strong clock consistency condition (e.g., C(e_1) < C(e_2) does not prove e1e2e_1 \rightarrow e_2).

Vector Logical Clocks

  • Structure: Each process PiP_i out of nn processes maintains an integer vector vti[1n]vti[1 \dots n].     * vti[i]vti[i] is the local logical clock of pip_i.     * vti[j]vti[j] is pip_i's latest knowledge of PjP_j's local time.

  • Protocol Rules:     * Local Update: Before an event, vti[i]=vti[i]+dvti[i] = vti[i] + d.     * Message Piggybacking: The sender attaches its current vector clock vtvt to the message mm.     * Reception Update: Process PiP_i updates its vector:         * For 1kn:vti[k]=max(vti[k],vt[k])1 \le k \le n: vti[k] = \max(vti[k], vt[k]).         * vti[i]=vti[i]+dvti[i] = vti[i] + d.

  • Strong Consistency: Vector clocks are strongly consistent.     * e_1 \rightarrow e_2 \iff vt(e_1) < vt(e_2).

  • Vector Comparison Logic:     * V_1 < V_2 if for all ii, V1[i]V2[i]V_1[i] \le V_2[i] AND there exists at least one kk such that V_1[k] < V_2[k].     * Example 1: V1=1,2,3,4V_1 = {1, 2, 3, 4}, V_2 = {2, 3, 4, 5} \implies V_1 < V_2.     * Example 2: V1=1,2,3,4V_1 = {1, 2, 3, 4}, V_2 = {2, 2, 4, 4} \implies V_1 < V_2.     * Example 3: V1=1,2,3,4V_1 = {1, 2, 3, 4}, V2=2,3,4,1    V_2 = {2, 3, 4, 1} \implies Unordered/Concurrent.

Matrix Logical Time

  • Development: Proposed by Michael and Fischer in 1982.

  • Structure: Process PiP_i maintains an n×nn \times n matrix mti[1n,1n]mti[1 \dots n, 1 \dots n].     * mti[i,i]mti[i, i] is the local logical clock of PiP_i.     * Row ii corresponds to the vector clock of PiP_i.     * mti[i,j]mti[i, j] is the latest knowledge PiP_i has regarding PjP_j's local clock (mtj[j,j]mtj[j, j]).     * mti[j,k]mti[j, k] represents what PiP_i knows about PjP_j's knowledge of PkP_k's local clock.

  • Protocol Rules:     * Local Event: mti[i,i]=mti[i,i]+dmti[i, i] = mti[i, i] + d.     * Reception from PjP_j:         1. Update PiP_i's knowledge of others via message matrix mtmt: For 1kn:mti[i,k]=max(mti[i,k],mt[j,k])1 \le k \le n: mti[i, k] = \max(mti[i, k], mt[j, k]).         2. Update transitive knowledge for all processes: For 1kn1 \le k \le n, for 1qn:mti[k,q]=max(mti[k,q],mt[k,q])1 \le q \le n: mti[k, q] = \max(mti[k, q], mt[k, q]).         3. mti[i,i]=mti[i,i]+dmti[i, i] = mti[i, i] + d.

  • Comparison Logic:     * M_1 < M_2 if for all i,ji, j, M1[i,j]M2[i,j]M_1[i, j] \le M_2[i, j] AND there exists some k,pk, p such that M_1[k, p] < M_2[k, p].     * Provides strong consistency to determine causal relations.

Replicated Dictionary Problem (Wuu and Bernstein)

  • The Problem: A dictionary is replicated across multiple nodes over an unreliable network. Nodes must maintain independent views that are eventually consistent.

  • Approach: Unlike standard databases using serializability/locking, Wuu and Bernstein propose a logic‐based algorithm.

  • Operations:     * insert(x)\text{insert}(x): Can be issued by only one node.     * delete(x)\text{delete}(x): Can be issued by multiple nodes. Invoked at NiN_i only if xx is currently in internal view ViV_i.

  • Dictionaries and History:     * exe_x: The unique insertion event.     * xdeletex‐delete event: An event deleting xx.     * Logic: xx is in view V(e)V(e) iff exee_x \rightarrow e and there exists no xdeletex‐delete event gg such that geg \rightarrow e.

  • The Log: Each node maintains a log LL of events. Each log entry eRe_R contains: (operation, time, nodeID). Example: (add a, 3, 2).

Wuu and Bernstein Solutions

  • Trivial Fault-Tolerant Solution:     * Every event adds a record to local log LiL_i.     * Every message includes the entire log LiL_i.     * Reception involves applying all events in the log to the dictionary view VjV_j.     * Drawbacks: Logs grow unboundedly; excessive communication, computation, and storage costs.

  • Improved Solution using Matrix Time:     * Uses matrix clocks TiT_i to purge records seen by all participants.     * The hasrec Predicate: boolean hasrec(Ti, eR, k) returns true if Ti[k, eR.node] > eR.time. This determines if process kk has already seen record eRe_R.     * Node Initialization: Dictionary view Vi=V_i = {}, Partial Log Pli=Pl_i = {}, all matrix values = 0.

  • Operational Rules for Improved Solution:     * On locally issuing insert/delete: Update matrix clock, add event to PliPl_i, update ViV_i.     * On sending to node kk: Create subset NPNP of PliPl_i containing only entries where hasrec(Ti, eR, k) is false. Send NPNP and matrix clock TiT_i to node kk.     * On receiving from node kk:         1. Extract subset NENE from received log where hasrec(Ti, eR, i) is false (new events for node ii).         2. Update ViV_i based on NENE.         3. Update matrix clock TiT_i.         4. Add to PliPl_i those records from the received log where hasrec(Ti, eR, j) is false for at least one participant jj. This ensures the record is kept until everyone has seen it.

  • Result: Efficiently minimizes log size in messages and memory while remaining fault‐tolerant and ensuring eventual consistency.