Chapter 1 Study Notes – Distributed System Models & Enabling Technologies

Scalable Computing over the Internet

  • Computing has shifted from single-site, single-machine processing (1950s) to massively parallel, distributed, and cloud-based execution (2020s).
  • Modern systems are data-intensive and network-centric; billions of people now access services that rely on large data centers and supercomputers.
  • Key concerns: scalability, performance, availability, security, and energy efficiency.

Platform Evolution & Computing Paradigms

  • Five overlapping generations of computer platforms (≈20-year spans):
    • 1950-1970 Mainframes (IBM 360, CDC 6400)
    • 1960-1980 Minicomputers (DEC PDP-11, VAX)
    • 1970-1990 Personal computers (VLSI microprocessors)
    • 1980-2000 Portable & pervasive devices (wired/wireless)
    • 1990-present HPC & HTC clusters, grids, clouds, IoT
  • Trend: leverage shared web resources and massive data over the Internet.

High-Performance vs High-Throughput Computing

  • HPC pursues raw speed (Gflops→Pflops) for science/engineering simulations.
  • HTC pursues high task‐throughput for web search, social networks, e-commerce; emphasises cost, energy, reliability, and security.
  • Resulting infrastructure upgrades: fast servers, SSD/HDD arrays, high-bandwidth networks.

Emerging Paradigms: SOA, Virtualization, Cloud & IoT

  • SOA enables Web 2.0 and mash-up services.
  • Virtualization underpins elastic resource provisioning in clouds.
  • Cloud computing provides utility-style access to compute/storage/network via data centers.
  • RFID, GPS, and sensor miniaturisation lead to the Internet of Things (IoT) and cyber-physical systems (CPS).

Computing Paradigm Distinctions

  • Centralised computing: all resources in one tightly-coupled system.
  • Parallel computing: multiple processors, either shared or distributed memory, cooperating via shared memory or messages.
  • Distributed computing: autonomous nodes with private memories communicating through a network; information exchange by message passing.
  • Cloud computing: virtualised, elastic pools of resources; can be centralised or distributed, parallel or not, often priced as a utility.
  • Concurrent, ubiquitous, and Internet computing are umbrella terms covering mixtures of the above.

Distributed System Families & Design Objectives

  • Clusters, grids, peer-to-peer (P2P) networks, and Internet clouds distinguish themselves by hardware, OS, algorithms, protocols, and service models.
  • Future HPC & HTC systems must meet four design goals:
    1. Efficiency (speed, programming ease, throughput/watt\text{throughput}/\text{watt})
    2. Dependability (reliability, self-management, QoS under failures)
    3. Adaptation (billions of job requests, virtualised resources, variable workloads)
    4. Flexibility (run both scientific & business workloads effectively)

Technology Trends & Degrees of Parallelism

  • Moore’s law: transistor count/processor speed ≈ doubling every 18 months (historically true 30 yrs).
  • Gilder’s law: network bandwidth ≈ doubles yearly.
  • Degrees of parallelism (DoP):
    • Bit-level (BLP) → word processing
    • Instruction-level (ILP) via pipelining, superscalar, VLIW, multithreading
    • Data-level (DLP) via SIMD/vector/GPU instructions
    • Task-level (TLP) via multicore, CMP
    • Job-level (JLP) in distributed systems/grids/clouds
  • Fine-grain (BLP/ILP/DLP) underpins coarse-grain (TLP/JLP) scalability.

Innovative Applications & Utility Computing

  • Domains driving HPC/HTC: scientific simulations, genomics, weather, e-commerce, banking, telemedicine, traffic control, social networking, cyber-security, e-government, military C2.
  • Economic and practical push toward utility computing – IT resources supplied on-demand, metered, with SLA/QoS guarantees.

Hype Cycle of Emerging Technologies (2010 snapshot)

  • Five stages: Technology Trigger → Peak of Inflated Expectations → Trough of Disillusionment → Slope of Enlightenment → Plateau of Productivity.
  • Cloud computing (2010) just past the peak; predicted 2–5 yrs to productivity.
  • Examples: Consumer-generated media in disillusionment (

Internet of Things (IoT) & Cyber-Physical Systems (CPS)

  • IoT: networked interconnection of everyday objects via RFID, sensors, GPS, IPv6 ( 21282^{128} addresses ); projections: 100 trillion trackable objects, 1,000–5,000 per person.
  • Communication patterns: H2H, H2T, T2T; connections at any time/place.
  • CPS: tight integration of computation, communication & control with the physical world; "3C" feedback loops enabling smart cities, energy grids, etc.

Hardware Technologies for Network-Based Systems

  • CPU evolution: VAX 11/780 (1 MIPS, 10 MHz) → Intel Core i7-990X (≈159 kMIPS, 3.46 GHz).
  • Multicore architectures: private L1 caches, shared L2/L3, many-thread support (e.g., Sun Niagara II: 8 cores × 8 threads = 64 threads).
  • Five micro-architectural execution styles: 4-issue superscalar, fine-grain MT, coarse-grain MT, chip-multiprocessor (CMP), simultaneous MT (SMT).

GPU Computing: Architecture, Programming & Power Efficiency

  • GPUs offer hundreds–thousands of simpler cores; optimised for throughput vs CPU latency.
  • NVIDIA Fermi (2011): 16 SMs × 32 CUDA cores = 512 cores; up to 515 Gflops DP per chip, 82.4 Tflops per board.
  • CUDA programming model: CPU orchestrates, GPU executes massive data-parallel kernels.
  • Power per instruction: CPU ≈ 2 nJ vs GPU ≈ 200 pJ (×10 more efficient); goals for exascale (~101810^{18} flops) require ≈60 Gflops/W per core.

Memory, Storage & Networking Advances

  • DRAM density: 16 Kbit (1976) → 64 Gbit (2011); capacity 4× every ≈3 yrs.
  • Disk drives: 260 MB (1981) → 3 TB (2011); ≈10× every 8 yrs; SSDs add high IOPS, moderate write-cycle limits.
  • Ethernet bandwidth: 10 Mbps (1979) → 1 Gbps (1999) → 40/100 GbE (2011) → ≥1 Tbps predicted.

Virtual Machines & Virtual Infrastructure

  • VM configurations:
    1. Native/bare-metal (hypervisor in privileged mode)
    2. Hosted (VMM in user mode atop host OS)
    3. Dual-mode hybrids
  • VM primitive operations: multiplexing, suspend → storage, resume/provision, live migration.
  • Virtual infrastructure = dynamic mapping of physical compute/storage/network to VM-based applications; enables server consolidation (utilisation 5–15 % → 60–80 %).

Data Center Virtualization & Cloud Design

  • ~43 million servers worldwide (2010); cost composition over 3 yrs: ≈30 % hardware, 60 % utilities & management, 10 % misc.
  • Key design principles:
    • Commodity x86 servers, cheap multi-TB disks, GbE switching.
    • Software handles load-balancing, fault-tolerance, expansion.
    • Four converging enablers: (1) hardware virtualisation & multicore, (2) utility/grid foundations, (3) SOA/Web 2.0, (4) autonomic management.

System Models: Clusters, Grids, P2P Networks & Clouds

  • Clusters: 100s–10,000s homogeneous nodes on SAN/LAN; dominate TOP500 (417/500 systems in 2009).
  • Grids: federated, often heterogeneous clusters/sites; emphasise resource sharing across organisations (TeraGrid, EGEE, ChinaGrid).
  • P2P: millions of autonomous clients; no central control; logical overlay networks (structured/unstructured).
  • Clouds: virtualised clusters in data centres, offering elastic resources; distinctions with grids blurring (elastic vs static allocation).

Cluster Architecture & Design Issues

  • Nodes interconnected by low-latency/high-BW Myrinet, InfiniBand, or multi-level GbE; exposed as a single system via SSI middleware.
  • Critical features: high availability, hardware fault-tolerance, fast comms (MPI/PVM), cluster-wide job management, dynamic load-balancing, scalability.

Grid Computing Infrastructure

  • Grids = “utility power grid” analogy; couple computers, middleware, instruments, people across LAN/WAN/Internet.
  • Two families:
    • Computational/data grids (TeraGrid, ChinaGrid) – authenticated, centrally managed.
    • P2P grids – open, self-organising, unreliable resources.
  • OGSA & Globus Toolkit = de-facto standards; provide resource discovery, security (PKI, GSI), job submission.

Peer-to-Peer Networks: Structure & Applications

  • Physical layer: ad-hoc connections over IP; logical layer: overlay network of peer IDs.
  • Overlay types: unstructured (random graphs, flooding) vs structured (DHTs, deterministic routing).
  • Four application families:
    1. Distributed file sharing (Gnutella, BitTorrent)
    2. Collaborative platforms/IM (Skype, MSN)
    3. Volunteer computing (SETI@home ≈25 Tflops)
    4. P2P platforms (JXTA, .NET)
  • Challenges: heterogeneity, scalability, routing efficiency, reliability, trust, security, copyright.

Cloud Computing Landscape & Service Models

  • IBM definition: "A cloud is a pool of virtualised compute resources hosting varied workloads."
  • Service models:
    • IaaS (servers, storage, networking, VMs) – e.g., AWS EC2.
    • PaaS (middleware, runtime, DB, dev tools) – e.g., Google App Engine, Microsoft Azure.
    • SaaS (browser-delivered applications; ERP, CRM, email) – e.g., Salesforce, Google Workspace.
  • Deployment modes: private, public, managed (community), hybrid – each with distinct security/SLA splits.
  • Eight major adoption drivers: location flexibility, peak-load sharing, separation of infra mgmt, cost reduction, simpler programming, discovery + distribution, security/privacy frameworks, new biz/pricing models.

Software Environments & Service-Oriented Architecture

  • Layered stack (entity → communication → services) parallels OSI layers; technologies:
    • WSDL/SOAP/UDDI for web services
    • REST/HTTP/XML for lightweight alternatives
    • Message-oriented middleware (JMS, WebSphere MQ) underneath.
  • SOA evolution: sensors (SS) produce raw data → filter services (fs) → compute/storage/discovery clouds → portals/HUBs provide user access; goal: transform Data → Information → Knowledge → Wisdom → Decisions.

Distributed Operating Systems & Transparency

  • Three design approaches (Tanenbaum): Network OS, middleware extension (e.g., MOSIX), true distributed OS (e.g., Amoeba).
  • MOSIX2 overlays Linux, giving process migration, load balancing, partial SSI across clusters & clouds.
  • Transparent computing model separates user data, apps, OS, hardware; enables SaaS independence and VM portability.

Programming Models and Middleware

  • MPI: message-passing subroutines (C/FORTRAN); point-to-point + collective ops.
  • MapReduce: Mapkey,value\text{Map} \rightarrow \langle\text{key}, \text{value}\rangle generation, Reduce\text{Reduce} merges values per key; highly scalable for TB–PB data.
  • Hadoop: open-source MapReduce + HDFS storage; economic, efficient, reliable via replication.
  • OGSA: standardises grid services; Globus Toolkit implements security (GSI), resource allocation, job mgmt.

Performance Metrics, Scalability & Availability

  • Throughput (MIPS, Tflops, TPS), response time, latency, QoS, energy Gflops/W\text{Gflops}/\text{W}.
  • Scalability dimensions: size, software, application, technology.
  • OS-image vs scalability: SMP (1 image, ≤100s cores) < NUMA < Cluster/Cloud < Grid < P2P (millions of images).
  • Amdahl’s Law (fixed workload): S=1α+1αnS=\frac{1}{\alpha + \frac{1-\alpha}{n}}; efficiency E=1αn+(1α)E=\frac{1}{\alpha n + (1-\alpha)}.
  • Gustafson’s Law (scaled workload): S=α+(1α)nS' = \alpha + (1-\alpha)n; efficiency E=αn+(1α)E' = \frac{\alpha}{n} + (1-\alpha).
  • Availability: Availability=MTTFMTTF+MTTR\text{Availability}=\frac{\text{MTTF}}{\text{MTTF}+\text{MTTR}}; avoid single points of failure via redundancy, failover, clustering, VM migration.