25CSCI071 Distributed Systems: Complete Lecture Notes and Study Guide

Module Overview and Logistics

Course Code and Title: 25CSCI071 Distributed Systems.
Instructor: Prof. Gerard McKee.
Module Composition:
- Lectures: Activities to ensure understanding of key concepts; accompanied by slides prepared by the Module Leader.
- Laboratories: Focus on practical experience with techniques and tools including client-server programming, RMI (Remote Method Invocation), and general distributed computing (e.g., master-slave systems).
- Assignments: Two assignments, each worth 20%, totaling 40% of the module mark.
- Examination: A 2-hour unseen examination paper worth 60%.
Primary Teaching Material:
- Book: Distributed Systems, 4th Edition, 2025, Version 4.03 (January 2025) by Maarten van Steen and Andrew S. Tanenbaum.
Expectations:
- Laptops are required with NetBeans installed to support Java programming.
- Software engineering skills are to be improved through design and programming practice.
Lab Reorganization (Weeks 3-5):
- Designed for Assignment One preparation.
- ChatGPT usage is permitted for design and coding help in some sessions but explicitly prohibited during the in-lab invigilated submission sessions for both Assignment One and Assignment Two.
- Peer discussion is allowed for proposed designs and code.

Introduction to Distributed Systems

Common Definition: A collection of autonomous computing elements that appears to users to be a single coherent system.
Computing Elements (Nodes):
- Can be high-performance computers, desktops, laptops, tablets, mobile phones, or sensors.
- Nodes act independently but aim to achieve common goals by exchanging messages.
Interpretation Views:
- Single System View: The end user should not know processes, data, and control are distributed. This is noted as difficult to achieve.
- Single Coherent View: More realistic; the system appears coherent if it operates the same regardless of where, when, or how the user interacts with it.
- User Transparency: Users should not discern where a process executes, if tasks are split, where data is stored, or if data is replicated.
Integrative vs. Expansive Views:
- Integrative: Need to connect existing networked systems together (e.g., grid computing platform).
- Expansive: Need to expand an existing system to include resources closer to users for performance (e.g., edge computing).
Decentralized vs. Distributed Systems:
- Decentralized: A system where processes and resources are necessarily spread across multiple computers (e.g., Federated learning, Distributed Ledger, Sensor networks).
- Distributed: A system where processes and resources are sufficiently spread (e.g., Google Mail, Content Delivery Networks (CDN), Network Attached Storage).

Perspectives for Organizing Distributed Systems

Architectural: Understanding how components interact and common system styles.
Process: Different forms of processes (threads, virtualization, clients, servers) forming the software backbone.
Communication: Facilities for data exchange between processes (mimicking procedure calls, high-level message passing).
Coordination: Fundamental tasks happening "under the hood" to manage joint efforts.
Naming: Resolving an entity name to an access point (location).
Consistency and Replication: Trade-offs between efficiency and dependability; updates to replicated resources must be managed.
Fault Tolerance: Masking and recovering from failures. Completely masking failures is noted as provably impossible.
Security: Ensuring authorized access, trust, and authentication.

Design Goals and Pitfalls

Six Primary Design Goals:
1. Resource Sharing.
2. Distribution Transparency (Single coherent view).
3. Openness.
4. Dependability (Failure, consistency, and replication).
5. Security.
6. Scalability (Performance).
The Eight Fallacies (Design Pitfalls):
1. The network is reliable (must consider hardware/software redundancy).
2. The network is secure (must build security in from Day 1).
3. The network is homogeneous (networks vary across environments).
4. The topology does not change (must provide location transparency).
5. Latency is zero (time to move data, especially across a Wide Area Network (WAN)).
6. Bandwidth is infinite (packet loss in WANs can lower effective bandwidth).
7. Transport cost is zero (marshalling data adds latency and uses resources).
8. There is one administrator (distributed services often consume external, uncontrolled services).

Classification According to System Use

High Performance Distributed Computing:
- Cluster Computing: Homogeneous; similar compute nodes connected via high-speed network, usually running the same OS.
- Grid Computing: Heterogeneous; decentralized federation across different administrative domains and organizations.
- Grid Layers: Fabric (local interfaces), Connectivity (communication/authentication), Resource (single resource management), Collective (discovery/scheduling), and Application.
Distributed Information Systems:
- Distributed Transaction Processing: Uses ACID properties: Atomic (indivisible), Consistent (no invariant violation), Isolated (no mutual interference), Durable (permanent commits).
- Transaction Primitives: BEGIN_TRANSACTION, END_TRANSACTION, ABORT_TRANSACTION, READ, WRITE.
- Nested Transactions: Multi-level transactions where sub-transactions can run in parallel. A TP (Transaction Processing) Monitor coordinates commitment using a distributed commit protocol.
- Enterprise Application Integration (EAI): Middleware facilitating direct communication between diverse applications independently of their databases via RPC or Messaging.
Distributed Pervasive Systems:
- Ubiquitous Computing: Continuous user-system interaction (features: networked devices, unobtrusive interaction, context awareness, autonomy, intelligence).
- Mobile Computing: Emphasis on mobility and discovery of local services.
- Sensor Networks: Geographically distributed collaboration of simple, often battery-powered nodes.

Seven Types of Distribution Transparency

Access: Hides differences in data representation and access methods.
Location: Hides the physical location of an object (e.g., using URLs like www.bue.edu.eg).
Relocation: Hides the movement of an object while it is in use.
Migration: Hides that an object may move during its lifecycle to a different location.
Replication: Hides that multiple copies of a resource exist for performance or availability.
Concurrency: Hides that a resource is shared by multiple independent users (requires locking to maintain consistency).
Failure: Masks the failure and recovery of objects. The greatest challenge is distinguishing between a dead process and a slow one.

Openness and Dependability

Open Distributed Systems: Offer components easily integrated into other systems.
- Interoperability: Different manufacturers' systems working together.
- Composability: Applications running on System B as they did on System A without modification.
- Extensibility: Ability to replace or add components without affecting the whole.
- IDL (Interface Definition Language): Used to specify service syntax precisely.
Separation of Policy and Mechanism: Mechanism is the facility for storing data (e.g., web caching); policy is the rule for what data is stored and for how long.
Dependability Metrics:
- Availability: Probability system is correct at a given instant.
- Reliability: Probability system works without failure for a long period.
- Safety: Absence of catastrophic events during temporary failure.
- Maintainability: Ease of repair.
- Standard Measures: Mean Time To Failure ( $MTTF$ ), Mean Time To Repair ( $MTTR$ ), and Mean Time Between Failures ( $MTBF = MTTF + MTTR$ ).
Faults: Can be Transient (once), Intermittent (reappears), or Permanent.

Scalability and Scaling Techniques

Types of Scalability: Size (users/resources), Geographic (communication delays), and Administrative (multiple domains).
Scaling Up: Vertical scaling; increasing CPU, network, or storage of a single machine.
Scaling Out: Horizontal scaling; hiding latencies, partitioning, distribution, and replication.
- Hiding Latencies: Using asynchronous communication or client-side processing (e.g., checking forms on the client).
- Partitioning (Distribution): Splitting a component; example: Domain Name System (DNS) is partitioned into zones ( $Z1, Z2, Z3$ ) to avoid bottlenecks.
- Replication/Caching: Placing copies close to users. Caching is a client-side decision; replication is a system-side decision. Caching can lead to consistency issues when data is modified.

Architectural Styles

Layered Architecture: Components organized in layers where Layer $N$ can call down to Layer $N-1$ .
- Upcalls: Lower layers inform higher layers of events via registered handlers.
- Protocol Stacks: Multi-layered, each layer conforming to a protocol (e.g., TCP/IP using sockets at the interface).
- Application Layering: Application-interface level, Processing level (core functionality), and Data level.
Service-Oriented Architecture (SOA): Independent entities executing as separate threads/processes.
- Object-based: Encapsulated data/state; connectors are RPCs.
- Microservices: Many small, independent programs composed to form larger ones; requires orchestration.
- Resource-based (REST): View of system as a collection of resources identified by URIs. All services use a standard interface: PUT (create), GET (retrieve), DELETE (delete), POST (modify). RESTful execution is stateless.
Publish-Subscribe Architecture: Processes publish information; others subscribe via middleware.
- Decoupling: Can be referentially decoupled (don't know name of other process) or temporally decoupled (don't need to be running at the same time).
- Event-based: Temporally coupled/Referentially decoupled.
- Shared Data Space: Temporally decoupled/Referentially decoupled using tuples (structured data records).
- Subscription Types: Topic-based (logical channels) or Content-based (attributes/ranges).

System Architectures

Centralized (Vertical Distribution): Multi-tiered (Client, Application, Database).
- Configurations: Thin client (UI software runs on client), or fat client (application and data storage on client).
Decentralized (Horizontal Distribution/Peer-to-Peer): All processes have the same functions; interaction is symmetric.
- Structured P2P: Overlay network follows deterministic topology (Ring, Binary Tree, Grid). Uses a Distributed Hash Table (DHT); $Node = lookup(key)$ .
- Unstructured P2P: Ad hoc list of neighbors; overlay resembles a random graph.
- Unstructured Search: Flooding (send to all, use Time To Live/TTL), Random Walks (random selection), or Policy-based (record of peer reliability).
Hierarchical P2P: Uses Super Peers (stronger nodes) to maintain indexes of data items for "weak peers."
BitTorrent Case Study:
- Goal: Peer collaboration (tit-for-tat bartering).
- Roles: Tracker (maintains active node record), Seeders (have complete file), Leechers (downloading), Swarm (peers with pieces).
- Content injection via .torrent metadata file.
Hybrid Architectures:
- Cloud Computing: Virtualized resources; four levels (Hardware, Infrastructure, Platform, Application). Service models include IaaS, PaaS, SaaS, and FaaS (Function-as-a-Service).
- Edge-Cloud: Placement of services at the network edge to reduce latency and bandwidth issues; requires orchestration (resource allocation, service placement, and selection).
- Blockchain (Distributed Ledgers): Unforgeable, append-only chain of blocks. Hash of Block $B_i$ is placed in Block $B_{i+1}$ . Appending requires consensus: Centralized (single entity), Permissioned (group consensus), or Permissionless (leader election).

Processes and Virtualization

Virtualization: Making one resource appear as another (e.g., single CPU as multi-processor).
Interfaces: API (Library calls), System Calls (OS), and Instruction Set Architecture (privileged/general instructions).
VMM Types:
- Process VM: For a single process (e.g., Java Runtime Environment).
- Native VMM: Directly on hardware; runs multiple guest OSes.
- Hosted VMM: Runs on top of a host OS (popular in modern DS).
Threads: Runs within a process. Multithreading allows a client to remain responsive while parts of the program (e.g., downloading images in a browser) block.
Server Models:
- Single-threaded: Processes one request to completion; idle during disk I/O.
- Multi-threaded (Dispatcher/Worker): Dispatcher reads requests and hands them to idle workers who can block without stopping the server.

Communication: RPC and MOM

Communication Types: Persistent (stored by middleware, e.g., email) vs. Transient (dropped if not delivered; sockets); Synchronous (client blocks) vs. Asynchronous (client continues).
Remote Procedure Call (RPC):
- Client calls a Client Stub (local call). Stub marshals parameters into a message.
- Server receives message, Server Stub unmarshals and calls the server procedure.
- Parameter Passing: Neutral formats are required to handle Endianness (Little Endian vs. Big Endian). Big Endian is standard for network transmission.
- Reference Parameters: Replaced by "copy-by-value and restore."
Message Oriented Middleware (MOM): Persistent communication; message queuing systems.
- Primitives: put (append/non-blocking), get (remove/blocks if empty), poll (remove/never blocks), notify (handler installation).
- Architectural Features: Queue Managers, Routers (overlay network relays), and Message Brokers (converters for heterogeneous formats used in EAI).

Coordination and Election Algorithms

Mutual Exclusion:
- Centralized Problem: Coordinator grants permission. Beneficial for fairness but a single point of failure.
- Distributed Algorithm: Process multicasts a request with a timestamp. Requires $2(N-1)$ messages to enter a critical region.
- Token-Ring: Logical ring; process must hold circulating token to access resource. Prevents starvation but needs token regeneration if lost.
Election Algorithms:
- Bully Algorithm: Process initiates election by messaging higher-ID processes. Highest ID currently running wins ( $ID_{Pk} = k$ ).
- Ring Algorithm: ELECTION message circulates ring; highest ID in the list is elected.
- Wireless Elections: Source-initiated build-tree phase; capacity/resources are reported back to select the best leader.
- Large-Scale Elections: N-tokens represent super peers. Repulsive forces Move tokens away from each other to ensure even distribution across an $m$ -dimensional geometric space.

Naming Systems and Resolution

Name, Address, Identifier:
- Address: Access point; an entity can have multiple.
- Identifier: Refers to at most one entity; never reused.
Flat Naming: Unstructured random strings (e.g., UUIDs).
- Resolution: Broadcasting (LANs), Multicasting (location groups), or Forwarding pointers.
- Hierarchical Location Service: Network split into domains (Leaf to Root). Directory nodes store pointers to sub-domains.
- Home-based (Mobile IP): Entity has a fixed home IP. Requests are tunneled to a "care-of" IP if the host moves.
Structured Naming: Labelled directed graphs.
- Resolution Mechanisms: Iterative (resolver contacts each name server) vs. Recursive (name server resolves on behalf of the client).
- DNS Layers: Global (organizations), Administrational (single organization), and Managerial (individual hosts/files).
DHTs and Chord:
- Identifiers and keys mapped onto a circle (modulo $2^m$ ).
- Successor Rule: Key $k$ is assigned to $successor(k)$ , the first node with $ID ≥ k$ .
- Finger Tables: Routing tables for fast lookup. Entry $i$ at node $n$ points to $successor(n + 2^{i-1}) \text{ mod } 2^m$ .

Federated Learning

Context: Developed to address privacy concerns in Machine Learning by not concentrating large datasets in central data centers.
Process:
1. Clients download global model.
2. Clients train local model on local data.
3. Clients upload model weights/parameters.
4. Server aggregates weights to produce a new global model.
Modes:
- Cross-device: Large number of mobile/IoT devices.
- Cross-silo: Small number of large organizations.
- Horizontal FL: Data has same features, different samples.
- Vertical FL: Data has same samples, different features.
Aggregation Types: Synchronous, Asynchronous (addresses device heterogeneity), Hierarchical (edge layer), and Robust (differential privacy/encryption).
Algorithm: FedAvg is the original model where client parameters are weighted by their data volume ratio compared to the total volume.
Challenges: Statistical heterogeneity (Non-IID data), communication bottlenecks, and security against data smuggling attacks during aggregation.

Technical Formulas and Data Summary

DHT Finger Table Calculation: $n + 2^{i-1} \text{ mod } 2^m$ .
Availability Probability: Instantaneous measurement.
Reliability Probability: Interval measurement.
ACID Properties: Atomic, Consistent, Isolated, Durable.
Blockchain Hash Linking: $Hash(Block_i) \to Block_{i+1}$ .
Distributed Mutual Exclusion Overhead: $2(N-1)$ messages.
Symmetric Cryptosystem: $DK = EK$ where $Data = DK(EK(Data))$ .
Asymmetric Cryptosystem: Public Key ( $PK$ ) and Private/Secret Key ( $SK$ ).
Secure Hash Function: $H(data)$ returning a fixed-length string where finding $data$ from $H(data)$ is computationally impossible.