Lecture_18-CSCI_U511_01-Jahangir_Majumder-Spring_2025

Review and Learning Outcomes

  • Focus on parallel and distributed systems
  • Key topics:
    • Parallel and Distributed Systems
    • Motivation for using these systems
    • Understanding Multiprogramming and Parallel Computing
    • Design considerations for Parallel Algorithms
    • Performance evaluation techniques
  • Important Dates:
    • Quiz 5 grades available on Blackboard
    • Homework 6 due April 1
    • Quiz 6 on April 3
    • Exam 2 on April 8
    • SE reflection report due March 28
    • Project updates on March 27

Algorithm Basics

  • Algorithm: A sequence of steps to complete a task
  • Parallel Algorithm: Designed to execute efficiently on multiple processors, often developed from a serial algorithm.

Parallel Algorithm Design

Basic Steps

  1. Partitioning: Dividing the computation and its data into small tasks
  2. Communication: Determining how tasks exchange data and coordinate
  3. Agglomeration: Combining small tasks into larger ones to reduce overhead
  4. Mapping: Assigning tasks to processors

Types of Partitioning

  • Computation: Functional decomposition - breaking tasks into functions.
  • Data: Domain decomposition - dividing data into manageable segments.
  • Aim for:
    • Smaller data segments for flexibility
    • Avoiding unnecessary data replication

Domain Decomposition

  • Focus on slicing major data structures:
    • Multi-dimensional arrays
    • Trees or graphs
  • Aim to cut data into smaller pieces for efficient processing.
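A minimal sketch of domain decomposition on a 1-D array: `chunkify` (a made-up helper name, not from the lecture) splits the data into near-equal contiguous segments, one per processor.

```python
# Hypothetical sketch: domain decomposition of a 1-D array.
# Splits `data` into `p` contiguous chunks whose sizes differ by at most 1,
# so no processor gets noticeably more work than another.

def chunkify(data, p):
    n = len(data)
    base, extra = divmod(n, p)  # the first `extra` chunks get one extra element
    chunks, start = [], 0
    for i in range(p):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks

print(chunkify(list(range(10)), 3))  # → [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

The same idea extends to rows of a matrix or subtrees of a tree: each piece becomes one task.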

Functional Decomposition

  • Ideal for complex applications:
    • Modular code design that supports simultaneous tasks (e.g., rocket simulation tasks)
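A hedged illustration of functional decomposition: instead of splitting data, different *functions* of an application (the rocket-simulation idea above) run concurrently. The module names and string results below are invented stand-ins for real work.

```python
# Illustrative sketch: functional decomposition runs distinct functions
# of a simulation concurrently rather than splitting one data structure.
import threading

results = {}

def atmosphere_model():
    results["atmosphere"] = "density field updated"  # stand-in for real work

def engine_model():
    results["engine"] = "thrust computed"            # stand-in for real work

threads = [threading.Thread(target=f) for f in (atmosphere_model, engine_model)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for both functional units to finish
print(results)
```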

Partitioning Design Considerations

  • Is the number of partitions adequate?
  • Is work distributed evenly among processors?
  • Are there redundancies in computation or storage?
  • Does the partitioning scale with problem size?

Communication in Parallel Systems

  • Tasks often depend on one another and must exchange data:
    • Manage alternating compute and communication phases.
    • Minimize communication costs.

Asynchronous Communication

  • Communicates without strict synchronization, beneficial for:
    • Real-time applications, web services, and databases.
  • Careful design needed to manage data locality and handle requests efficiently.
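A minimal `asyncio` sketch of asynchronous communication: two simulated requests proceed without waiting on each other, so total time is close to the longest delay rather than the sum. Request names and delays are invented.

```python
# Asynchronous communication sketch: requests overlap instead of serializing.
import asyncio

async def handle_request(name, delay):
    await asyncio.sleep(delay)   # stands in for network or disk latency
    return f"{name} done"

async def main():
    # Both requests are in flight at once; elapsed time ≈ max(delay), not sum.
    return await asyncio.gather(
        handle_request("query-A", 0.02),
        handle_request("query-B", 0.01),
    )

print(asyncio.run(main()))  # → ['query-A done', 'query-B done']
```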

Agglomeration

  • Combine fine-grained tasks into larger tasks to reduce overhead.
  • Move from the abstract design to a concrete one, factoring in the target architecture and available resources.

Increasing Granularity

  • Main objective: Improve performance by reducing communication overhead.
  • Combine partitions strategically to yield better efficiency.
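The idea above can be sketched in a few lines (helper name is made up): merging many one-element tasks into a few coarse blocks cuts the number of messages exchanged from thousands to a handful.

```python
# Agglomeration sketch: 1000 fine-grained tasks become 4 coarse tasks,
# so only 4 messages (one per block) are needed instead of 1000.

def agglomerate(tasks, blocks):
    per_block = -(-len(tasks) // blocks)  # ceiling division
    return [tasks[i:i + per_block] for i in range(0, len(tasks), per_block)]

fine = list(range(1000))          # 1000 tiny tasks → 1000 potential messages
coarse = agglomerate(fine, 4)     # 4 coarse tasks → only 4 messages
print(len(coarse), [len(b) for b in coarse])  # → 4 [250, 250, 250, 250]
```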

Mapping in Parallel Systems

  • Mapping: Process of placing tasks on processors.
  • Considerations include:
    • Concurrency, Locality, and Communication Cost

Load Balancing

  • Overall completion time is set by the slowest (most heavily loaded) processor.
  • Methods to achieve load balance:
    • Static (assignment fixed before execution) and dynamic (assignment adjusted at runtime, e.g., probabilistic or work-queue schemes).
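A sketch contrasting the two approaches on invented task costs: a static round-robin assignment can leave one processor overloaded, while a greedy work-queue (dynamic) assignment evens the loads out.

```python
# Static vs. dynamic load balancing on invented task costs.
import heapq

tasks = [5, 1, 1, 1, 5, 1, 1, 1]   # uneven task costs
P = 2                               # two processors

# Static: round-robin assignment decided before execution.
static = [sum(tasks[i::P]) for i in range(P)]

# Dynamic (greedy work queue): each task goes to the least-loaded processor.
heap = [(0, i) for i in range(P)]   # (current load, processor id)
heapq.heapify(heap)
for cost in sorted(tasks, reverse=True):
    load, i = heapq.heappop(heap)
    heapq.heappush(heap, (load + cost, i))
dynamic = sorted(load for load, _ in heap)

print("static loads:", static)    # → [12, 4]  (imbalanced: slowest sets pace)
print("dynamic loads:", dynamic)  # → [8, 8]   (balanced)
```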

Performance Categories

Flynn's Model of Computation

  • Computers categorized by instruction and data streams into four classes:
    • Single Instruction Single Data (SISD)
    • Single Instruction Multiple Data (SIMD)
    • Multiple Instruction Single Data (MISD)
    • Multiple Instruction Multiple Data (MIMD)

Importance of Performance Study

  • Motivation for parallelization: reducing execution time while achieving speedup and scalability as processors are added.

Amdahl's Law

  • If a fraction f of a program is inherently serial, the speedup on p processors is S(p) = 1 / (f + (1 − f)/p).
  • In practice, speedup does not follow the ideal linear curve: the serial component caps it at 1/f no matter how many processors are added.
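The limit is easy to see numerically. A short sketch (serial fraction chosen for illustration):

```python
# Amdahl's Law: with serial fraction f, speedup on p processors is
#   S(p) = 1 / (f + (1 - f) / p)

def amdahl_speedup(f, p):
    return 1.0 / (f + (1.0 - f) / p)

f = 0.10  # assume 10% of the program is inherently serial
for p in (2, 8, 64, 1_000_000):
    print(f"p={p:>7}: speedup = {amdahl_speedup(f, p):.2f}")
# Even with a million processors, speedup approaches 1/f = 10, never p.
```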

CPU Performance Metrics

  • CPU performance impacted by:
    • Total clock cycles
    • Clock rate
    • Instructions per cycle
  • Key metrics:
    • Execution time, Speedup, Efficiency, Cost, Scalability.
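These three factors combine in the classic CPU performance equation, CPU time = (instruction count × cycles per instruction) / clock rate. A quick sketch with invented numbers:

```python
# Classic CPU performance equation:
#   CPU time = (instruction count * CPI) / clock rate

def cpu_time(instructions, cpi, clock_hz):
    return instructions * cpi / clock_hz

# Illustrative values: 2 billion instructions, CPI of 1.5, 3 GHz clock.
t = cpu_time(instructions=2_000_000_000, cpi=1.5, clock_hz=3_000_000_000)
print(f"{t:.2f} s")  # 3e9 cycles / 3 GHz = 1.00 s
```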

Scalability and Efficiency

  • Scalability: Ability to improve performance as the number of processors and problem sizes grow.
  • Efficiency: Ratio of achieved speedup to the number of processors (actual vs. ideal performance).

Common Performance Metrics for Parallel Systems

  • Metrics include:
    • Execution Time: Measure time from start to finish.
    • Speedup: Ratio of serial execution time to parallel execution time.
    • Efficiency: Measure effective utilization of processing units.
    • Scalability: Measure the ability to increase speedup with more processors.
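The metrics above can be computed directly from measured times. A short sketch (timings invented for illustration):

```python
# Speedup and efficiency from measured execution times.

def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    return speedup(t_serial, t_parallel) / p

t1, t8 = 100.0, 16.0   # serial vs. 8-processor execution time (seconds)
print(f"speedup    = {speedup(t1, t8):.2f}")        # 100 / 16 = 6.25
print(f"efficiency = {efficiency(t1, t8, 8):.2%}")  # 6.25 / 8 ≈ 78%
```

Efficiency below 100% reflects communication and load-imbalance overheads; scalability asks whether speedup keeps growing as p and the problem size increase.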

Summary

  • Continued focus on Parallel and Distributed Computing.
  • Next topics to cover: Memory Management.