Distributed Systems Course Notes
Course Information
Course Title: Distributed System
Course Code: PCC-ADS-308G
Type: Professional Core Course
Credits: 3
Semester: 6th
Evaluation
Class Work: 25 Marks
Exam: 75 Marks
Total: 100 Marks
Exam Duration: 3 Hours
Exam Structure
Total Questions: 9
Question 1: Compulsory, 8 short-answer questions, 20 marks.
Units I-IV: Two 20-mark questions from each unit.
Attempt: Question 1 + one question from each unit (total 5 questions).
Course Outcomes
Analyze popular distributed systems (e.g., P2P).
Understand Shared Memory Techniques.
Knowledge of file access.
Knowledge of Synchronization and Deadlock.
UNIT-I: Introduction to Distributed Systems
Introduction
A distributed system comprises independent computers that appear as a single coherent system to its users. The nodes collaborate, communicate via a network, and coordinate activities to achieve a common goal by sharing resources, data, and tasks.
Centralized vs. Distributed Systems
Feature | Centralized System | Distributed System |
|---|---|---|
Data & Resources | Single central place (server) | Dispersed over several servers/locations |
Maintenance | Easy | More challenging due to numerous interaction points |
Scalability | Limited; potential bottleneck if the server malfunctions | Better scalability; system can function even with component failure |
Reliability | Single point of failure | Higher reliability due to redundancy |
Security | Easier to secure | More difficult to secure |
Distributed Computing Definition
A system where software components are spread across different computers but operate as a single entity. It can involve various configurations like mainframes, computers, workstations, and minicomputers.
Cloud Computing Principles
Sharing of resources (hardware, software, data) with varying levels of openness and concurrency. It facilitates simultaneous data processing through multiple processors and offers improved fault tolerance, enabling quicker recovery from system failures.
Advantages of Distributed Systems
Organizations use distributed computing to manage data generation and meet increasing application performance demands, providing easier scalability.
Data Volume Growth: Simpler to add hardware to a distributed system compared to upgrading/replacing a centralized system.
System Types:
Cohesive system: Each machine is owned by the customer; results are routed from one source.
System with end-users: Each node has an end-user; the distributed system helps in sharing resources/communication.
Benefits of a Multi-Computer Model
Improved Scalability: Scale-out architecture for easy hardware addition as load increases.
Enhanced Performance: Parallelism for the divide-and-conquer approach.
Cost-Effectiveness: Minimizes latency, enhances response time and throughput, and uses low-cost commodity hardware for ensure zero data loss.
Architecture of Distributed Systems
Cloud-based software, forms the backbone, accessible through the internet. Components and connectors are arranged for easy communication. Components are replaceable/reusable modules with well-defined interfaces. Connectors are communication links mediating coordination between components.
Software Architecture
Logical organization of software components and their interactions.
Layered Architecture: Modular approach with components separated into units, improving efficiency. (e.g., OSI model).
Object-Based Architecture: Loosely coupled objects interacting through interfaces (connectors). Interactions occur through direct method calls, such as Remote Procedure Calls (RPC) using Java RMI, Web Services, and REST API Calls.
Data-Centered Architecture: Employs a central data repository (active or passive). Producers (businesses) provide data to the common data store, from which consumers (individuals) request data; utilizes persistent storage such as SQL databases.
Event-Based Architecture: Communication is managed entirely through events. Components are loosely coupled, which allows for easy modification and uses publisher-subscriber systems and enterprise service buses.
System Architecture
System-level architecture focuses on the entire system and placement of components across multiple machines. Key architectures are:
Client-Server Architecture: A server handles all work processes, while clients interact with services and other resources on the remote server. Centralized security is typical, with usernames/passwords stored in a secure database, ensuring stability.
Peer-to-Peer (P2P) Architecture: Operates without central control; nodes function as both clients and servers. Nodes can register with a centralized lookup server or broadcast service requests to other nodes. P2P networks comprise structured (predefined data structure), unstructured (random neighbor selection), and hybrid P2P (orderly assignment of unique functions) sections.
Key Components of a Distributed System
Primary System Controller: Manages server requests and installs executive and mailbox services.
Secondary Controller: Regulates server processing requests and translation load.
User-Interface Client: Offers system information (not in clustered environments).
System Datastore: Sole data storage for shared data, typically on a disk vault.
Database: A relational database storing all data, shared among multiple users.
Examples of Distributed Systems
Networks: Ethernets and LANs, peer-to-peer networks, e-mail, and the internet.
Telecommunication Networks: Telephone and cellular networks and VoIP systems.
Real-Time Systems: Used across airline, ride-sharing, logistics, financial trading, MMOGs, and e-commerce industries emphasizing prompt data conveyance.
Parallel Processors: Splits tasks among multiple processors.
Distributed Database Systems: Databases spread across multiple servers/regions.
Distributed Artificial Intelligence: Approaches include complex learning algorithms and decision-making, requiring large computational data sets.
Real-World Examples
Video-rendering systems
Scientific computing
Airline and hotel reservation
Cryptocurrency processors
P2P file-sharing
Multiplayer video games
E-learning applications
Distributed supply chains like Amazon
Characteristics of Distributed Systems
Resource Sharing: Ability to use hardware, software, or data anywhere in the system.
Openness: Extensibility and improvements in the system.
Concurrency: Naturally present in distributed systems, managed by separate users in remote locations.
Scalability: Ability to increase the system's scale to accommodate more users and improve responsiveness.
Fault Tolerance: Reliability even with hardware/software failures.
Transparency: Hiding complexity from users and applications.
Advantages of Distributed Systems
Scalability: Easy growth via adding more nodes.
Reliability & Fault Tolerance: Service continuity in case of failure.
Performance: Workloads spread across multiple nodes.
Resource Sharing: Increased efficiency and reduced costs.
Geographical Distribution: Global service delivery.
Disadvantages of Distributed Systems
Lack of relevant software.
Security vulnerabilities due to easy data access.
Network saturation may cause data transfer issues.
Complex database management.
Potential network overload.
Goals of Distributed Systems
Scalability: Efficient scaling as demand increases (horizontal or vertical).
Fault Tolerance: Continuous functioning even with component failures (replication, redundancy, checkpointing).
Availability: High uptime and accessibility, handled through replication.
Consistency: Ensuring agreement on system state across all nodes.
Concurrency: Allowing multiple tasks to be processed concurrently.
Transparency: Hiding complexity from users (location, access, replication transparency).
Security: Protecting against unauthorized access and maintaining data integrity.
Efficiency: Efficient resource management, minimal delays.
Hardware and Software Concepts
Hardware Concepts
Nodes: Individual computing machines.
Communication Network: LAN, WAN, Internet.
Storage Systems: NAS, DFS, cloud storage systems.
Processing Units: CPUs, GPUs, accelerators.
Synchronization Hardware: Clock synchronization mechanisms.
Fault Detection and Recovery Hardware: Redundant components and monitoring.
Software Concepts
Distributed Operating System (DOS): Manages resources across machines.
Middleware: Software between OS and application layers for communication, data management, and security.
Communication Protocols: HTTP/HTTPS, RPC, Message Queueing (AMQP), RESTful APIs, gRPC.
Distributed File Systems: HDFS, GFS, Ceph.
Concurrency and Synchronization: Locks, semaphores, distributed algorithms.
Distributed Algorithms: Leader election, consistency protocols.
Data Replication and Consistency: Handling replication, conflict resolution, and consistency models.
Security Mechanisms: Encryption, authentication, authorization, secure communication protocols.
Virtualization and Containerization: Docker, Kubernetes, VMware.
Design Issues in Distributed Systems
Scalability, Reliability, Availability, Consistency, Latency, Load Balancing, Security
Architectural Design Patterns, Communication Issues, Data Management
Detailed Breakdown of Key Design Issues
Scalability
Handling increased load without performance degradation.
Ensuring performance across geographically dispersed locations.
Strategies: Horizontal scaling, vertical scaling, sharding.
Reliability
Fault tolerance using duplicate components for failover.
Automated switching to standby systems (Failover Mechanisms).
Data Replication: Storing data copies on multiple nodes.
Consensus Algorithms: Ensuring consistency among replicated data (e.g., Paxos, Raft).
Availability
Minimize downtime via High Availability Architectures.
Monitoring and alerting.
Load Balancers: Distributing requests across multiple servers.
Geographic Redundancy: Deploying servers in different geographic locations.
Consistency
Different Data Consistency Models, strong and eventual consistency.
Understanding trade-off between consistency, availability, and partition tolerance.
Latency
Sources: Network & Processing Delays.
Minimization Techniques: Caching, data compression.
Load Balancing
Load Distribution Methods: Round Robin, Least Connections.
Dynamic vs. Static Approaches.
Security
Authentication: Identity Verification and Access control.
Encryption: Data encryption and HTTPS, SSL/TLS for secure communications.
Architectural Design Patterns
Client-Server: Centralized Servers.
Peer-to-Peer: Decentralized Network.
Microservices: Breaking complex applications to simpler, independent services.
Communication Issues
TCP/IP versus UDP,
Message Passing vs. Shared Memory
Synchronous vs. Asynchronous Communication
Data Management
Horizontal and Vertical Partitioning of Data
Master-Slave, Multi-Master Replication
Handling Distributed Transactions, like Two-Phase Commit Blockchain.
Communication in Distributed Systems
Communication in distributed systems is based on message passing due to the absence of shared memory. Processes, such as process A communicating with process B, exchange messages, which requires standardization and protocol to prevent errors.
ISO OSI Model
To simplify the complexity of communication, the International Standards Organization (ISO) developed the Open Systems Interconnection Reference Model (ISO OSI), which identifies various communication levels and assigns standard names.
Layers and Protocols
Layered protocols manage communication issues; connection-oriented protocols establish a connection before data exchange, while connectionless protocols transmit messages without prior setup. The OSI model divides communication into seven layers:
Physical: Transmits bits.
Data Link: Detects and corrects transmission errors.
Network: Routes messages.
Transport: Provides reliable communication.
Session: Manages dialogues and provides synchronization.
Presentation: Interprets data.
Application: Offers miscellaneous protocols for common activities.
How Nodes Communicate in Distributed Systems
Nodes communicate through message passing, remote procedure calls, shared memory, or sockets—enabling data exchange and coordinated action, thus fostering efficient system collaboration.
Communication Models
Model | Description | Use Cases |
|---|---|---|
Message Passing | Nodes communicate by sending messages over a channel, which can direct (point-to-point) or indirect (via brokers). Messages synchronous or asynchronous. | Loosely coupled systems over networks. |
Remote Procedure Call | Allows one program to execute code remotely as if it's local, providing familiar programming interface. | Client-server architectures, but faces latency and reliability issues. |
Publish-Subscribe | Decouples message publishers from subscriber allowing asynchronous, event-driven communication suitable for scaling distributed systems | Messaging systems, IoT platforms, event-driven architectures. |
Socket Programming | A low-level interface facilitating bidirectional communications between processes running on different hosts over the network. | Networked applications and distributed systems, for building networked apps. |
Shared Memory | Multiple processes or threads share an address space, enabling communication by reading/writing to shared memory but requires synchronization | Tightly coupled systems on multicore processors where issues such as data races and consistency are common. |
Communication Protocols
TCP: Transmission Control Protocol is a reliable, connection-oriented protocol for data integrity and sequencing.
UDP: User Datagram Protocol is a lightweight, connectionless protocol for low-latency data delivery.
HTTP: Hypertext Transfer Protocol is used for transferring hypertext documents on the web.
SMTP: Simple Mail Transfer Protocol sends and receives email messages used for secure email communication.
FTP: File Transfer Protocol manages file transfers.
RPC: Remote Procedure Call allows programs to execute procedures or functions on remote servers transparently.
Distributed System Architecture in ATMs
Adopting distributed systems in Automated Teller Machines (ATMs) provides reliability, scalability, and efficiency. It helps achieve:
Fault Tolerance and Redundancy
Improved Scalability
Enhanced Availability
Decentralized Processing
Adaptability to Network Variability
Consistency and Data Integrity
Enhanced Security Measures
Group Communication in Distributed Systems
Efficient group communication is crucial for coordinating activities among multiple nodes or entities. This involves reliable communication such as exchanging data to ensure all participants are informed.
Communication Types in Distributed Systems
Unicast: One sender to a specific recipient
Multicast: One sender to multiple receivers
Broadcast: One sender to all nodes in the network
Reliable Multicast Protocols for Group Communication
FIFO Ordering: Messages delivered in the order they were sent.
Causal Ordering: Preserves causal relationships between messages.
Total Order and Atomicity: Ensures all members receive messages in the same order.
Scalability for Group Communication
Scalability in group communication manages increasing nodes, messages, and participants efficiently using -
Partitioning and Sharding, Load Balancing, Replication and Caching.
Key Challenges of Group Communication in Distributed Systems
Maintaining system reliability, scalability, concurrency, consistency and fault tolerance.
Remote Procedure Call (RPC) in Distributed Systems
RPC is a protocol used to execute a procedure on a remote server as if it were a local call. RPC abstracts network communication and thus enables interactions between remote nodes. Key concepts include:
Simplified Communication via abstraction
Enhanced Modularity and Service Reusability
Facilitates Distributed Computing, using Inters Process Communication and Resource Sharing
RPC Architecture Overview
The RPC is built with clients on the server, client and server stubs, the process marshalling and unmarshalling data across communication layer and an RPC protocol
Types of Remote Procedure (RPC) in Distributed Systems
Synchronous: Client waits for server response.
Asynchronous: Client doesn't wait; handles responses later.
One-Way RPC: Client sends request without expecting a reply. Used when return Value or acknowledgements from service do not matter at the same stage.
Callback RPC: Server calls back the client with the result.
Batch RPC: Client sends multiple requests in a batch.
Middleware in Distributed Systems
Middleware is software that facilitates communication, data management and integrates diverse services. it facilitates-
Communication, Seamless Interactions.
supports Integrations , Service and Application
Enhances Scalability through Load Balancing.
Ensures Data Consistency for integrity
Major Types of Middleware
Communication Middleware: Provides features for processes communications.
Database Middleware: provides connectivity layers that manages transaction and connections.
Transaction Middleware: Manages distributed transactions. And guarantees Atomicity.
Application Middleware: Implements ESB that integrates varied applications and services
Benefits of Middleware
Enhanced Communication and Integration
Improved Scalability and Flexibility
Transaction Management and increased Atomicity
Challenges of Middleware
Overheads from Process and Communication with Additional complexity,
Difficult Integrated and potential Vulnerability to Security issues
Distributed Operating System (DOS)
It is a type of the OS that uses several center processors to serve for multiple- time real operations and users. It’s made of joined nodes with a LAN or WAN lines. Provides virtual machine Abstraction and share I/O devices with computing resources.
Types of DOS
Client-Server, Peer- To- Peer where duties divided in evenly manner, Middleware for Interoperability among different OS and multi- tiered Systems
features of DOS
Include qualities such as openness, scalability, distribution resource, sharing, flexibility, transparency
heterogeneity, fault tolerance.
Application of DOS
Are varied from network programs, parallel computation, Telecommunication to real- time- process and sensor network
Distributed Operating Systems: Advantages and Disadvantages
Advantages
Resource redistribution, data availability, system operations independent,
increased data exchanges
Disadvantage
limitations with schedulers or data access,
hard implemented securities with higher expectancies
UNIT-II: Synchronization in Distributed Systems
Clock Synchronization
Clock synchronization is process of keeping the distributed computer clock same . Important for Data Integraty, sequencing, fault detection, security, asynchronization . UTC standard with physical clocks along with logical clocks used
Types of Clock Synchronisation
Physical clock
logical clock
mutual exclusion coordination across nodes.
Techniques of Clock Synchronization
include Network Time Protocol (NTP) protocol) ,Precision Time Protocol (PTP) and Berkley with Decentralized algorithm for managing drift
Practical uses in Real life involve financial transaction. Cloud computing databases where coordinated operations ensure integrity.
Challenges
Include Dispersion, local- decision, Preventing failure while working with the time discrepancies and uncertainty
Multilateral Exclusive - Mutex
A concurrence restrainer used to stop race limitations which states that a process cannot enter its essential area the same time with another process
Algorithm requirements include non- starvation . freedom Fairness and fault tolerance . Message passing implements it with token non token and Quorum Based ways
Election Algorithms
It are protocols to elect 1 process among a gaggle of processors to act as a Coordinator. Algorithm Bully Algorithm : The Bully computation applies to system while calculation of algorithm on ringed configuration
Atomic Transactions
They is a process to ensure change across a variety of sides with coordination, properties to keep transaction constant like Atomicity to ensure, Consistency, which is run concurrently , Isolation which Serialize access together with maintaining with durability properties. There is coordination in a distributed transaction with manager that commits Rollback needs
Phases of atomic commitment
Consist of getting ready, to inform or cancer actions with one segment dedication. Two segment Comitted with all and any section . the result and making secure integrity is kept and acknowledged from employees to the head
Unit Deadlocks in Distributed Systems
With a set of processes locked due to each process securing an investment even although awaiting for sources. It makes aid sharing harder and comes with aid and conversation deadlock varieties
Detection processes in techniques
Include centralized and distributed graph wait strategies . with the banker or aid of resource allocation images. Its all about taking steps to lessen the Circular wait