Concise Notes on Parallel and Distributed Computing

  • Parallel Computing: Simultaneous use of multiple computational resources to solve a problem by breaking it into discrete parts.

    • Purpose: Save time and money, solve larger or more complex problems, provide concurrency, and make better use of parallel hardware.
  • Distributed Computing: Extends parallel computing across multiple independent computers that coordinate over a network, typically via message passing.

    • Architecture: Client-server (centralized request-response) vs. Peer-to-peer (decentralized, direct sharing).
  • Types of Parallelism:

    • Data Parallelism: Same operation on multiple data pieces concurrently (e.g., image processing).
    • Task Parallelism: Different tasks executed simultaneously, can be dependent or independent.
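The two kinds of parallelism can be sketched in Python with a thread pool; the function names (`brighten`, `load_config`, `warm_cache`) are illustrative stand-ins, not from any real library:

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(pixel: int) -> int:
    """Same operation applied to every data element (data parallelism)."""
    return min(pixel + 50, 255)

def load_config() -> str:   # hypothetical task A
    return "config loaded"

def warm_cache() -> str:    # hypothetical task B
    return "cache warmed"

with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: one operation mapped over many data pieces.
    pixels = list(pool.map(brighten, [10, 120, 250]))

    # Task parallelism: two different, independent tasks run concurrently.
    a = pool.submit(load_config)
    b = pool.submit(warm_cache)
    results = (a.result(), b.result())

print(pixels)   # [60, 170, 255]
print(results)  # ('config loaded', 'cache warmed')
```

The same pool serves both styles; the difference lies in whether the workers run the same function over different data or different functions altogether.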
  • Flynn's Taxonomy:

    • SISD: Single instruction, single data.
    • SIMD: Single instruction, multiple data.
    • MISD: Multiple instruction, single data.
    • MIMD: Multiple instruction, multiple data.
  • Amdahl’s Law: Bounds the achievable speedup by the fraction of a program that must run serially; even unlimited processors cannot speed up the serial part.
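Amdahl's formula, speedup = 1 / ((1 − p) + p/n) for parallel fraction p on n processors, can be checked with a few lines of arithmetic:

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup with parallel fraction p on n processors: 1 / ((1 - p) + p / n)."""
    return 1.0 / ((1.0 - p) + p / n)

# A program that is 95% parallelizable, run on 8 processors:
print(round(amdahl_speedup(0.95, 8), 2))      # 5.93, not 8

# Even with effectively unlimited processors, the 5% serial part caps speedup near 20x:
print(round(amdahl_speedup(0.95, 10**9), 2))  # 20.0
```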

  • Gustafson’s Law: Counters Amdahl's pessimism by scaling the problem size with the processor count; larger problems keep more processors usefully busy, so speedup grows with the number of processors.
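Gustafson's scaled speedup, S = (1 − p) + p · n for parallel fraction p of the scaled workload on n processors, in the same style:

```python
def gustafson_speedup(p: float, n: int) -> float:
    """Scaled speedup: S = (1 - p) + p * n, where p is the parallel fraction."""
    return (1.0 - p) + p * n

# With a 95% parallel fraction on 8 processors, scaled speedup is near-linear:
print(round(gustafson_speedup(0.95, 8), 2))  # 7.65
```

Compare this with Amdahl's bound of about 5.93x for the same p and n: Gustafson assumes the problem grows with the machine, so the serial fraction shrinks relative to total work.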

  • Memory Architectures:

    • Shared Memory (SMP, NUMA): All processors access a single address space; access time is uniform in SMP but varies with memory location in NUMA.
    • Distributed Memory: Each processor has independent memory requiring communication (message passing).
  • Parallel Programming Models:

    • OpenMP: Compiler-directive-based shared-memory parallelism (C/C++/Fortran); easy to integrate into existing code.
    • CUDA: NVIDIA's platform for massively parallel processing on GPUs.
    • MPI: Message Passing Interface, the de facto standard for communication between processes on distributed-memory systems.
  • Fault Tolerance: The ability of a system to continue operating correctly despite hardware or software failures.

    • Strategies: Data replication, component redundancy, failover mechanisms.
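A failover mechanism, the last strategy above, can be sketched as "try the primary, fall back to a replica". The `primary`/`backup` functions are hypothetical placeholders for real remote calls:

```python
def failover_call(replicas, request):
    """Try each replica in order; fail over to the next one on error (redundancy)."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as err:
            last_error = err        # this replica failed; try the next one
    raise RuntimeError("all replicas failed") from last_error

# Hypothetical replicas: the primary is down, the backup answers.
def primary(req):
    raise ConnectionError("primary unreachable")

def backup(req):
    return f"handled: {req}"

print(failover_call([primary, backup], "read x"))  # handled: read x
```

Real systems combine this with health checks and replication so the backup actually holds current data.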
  • Synchronization Techniques:

    • Time Synchronization: Protocols such as NTP and PTP keep node clocks aligned.
    • Data Synchronization: Replication and consensus algorithms keep copies of data consistent across nodes.
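The core idea behind consensus-based consistency can be illustrated with a toy majority vote over replica replies; this is only a sketch of quorum reads, not a full consensus algorithm:

```python
from collections import Counter

def quorum_read(replies):
    """Return the value reported by a strict majority of replicas, if one exists."""
    value, votes = Counter(replies).most_common(1)[0]
    if votes > len(replies) // 2:
        return value
    raise ValueError("no majority among replicas; data is inconsistent")

# Three replicas: one holds a stale value, but the majority agrees on 42.
print(quorum_read([42, 42, 17]))  # 42
```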
  • Load Balancing: Distributing workloads evenly across resources to avoid bottlenecks:

    • Dynamic techniques such as work sharing, work stealing, and runtime scheduling adapt task assignment as the load changes.
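Dynamic work sharing can be sketched with worker threads pulling from a shared queue, so faster workers naturally take on more tasks; the squaring step is a stand-in for real work:

```python
import queue
import threading

def dynamic_work_sharing(tasks, n_workers=3):
    """Workers pull tasks from a shared queue, balancing load at runtime."""
    q = queue.Queue()
    for t in tasks:
        q.put(t)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                t = q.get_nowait()  # grab the next available unit of work
            except queue.Empty:
                return              # queue drained: this worker is done
            r = t * t               # stand-in for real computation
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sorted(results)

print(dynamic_work_sharing([1, 2, 3, 4]))  # [1, 4, 9, 16]
```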
  • Programming Complexity: Parallel systems must be evaluated for ease of programming, debugging difficulty (e.g., race conditions, deadlocks), and communication overhead.