Lecture_18-CSCI_U511_01-Jahangir_Majumder-Spring_2025

Review and Learning Outcomes

  • Focus on parallel and distributed systems
  • Key topics:
    • Parallel and Distributed Systems
    • Motivation for using these systems
    • Understanding Multiprogramming and Parallel Computing
    • Design considerations for Parallel Algorithms
    • Performance evaluation techniques
  • Important Dates:
    • Quiz 5 grades available on Blackboard
    • Homework 6 due April 1
    • Quiz 6 on April 3
    • Exam 2 on April 8
    • SE reflection report due March 28
    • Project updates on March 27

Algorithm Basics

  • Algorithm: A sequence of steps to complete a task
  • Parallel Algorithm: Designed to execute efficiently on multiple processors, often developed from a serial algorithm.

Parallel Algorithm Design

Basic Steps

  1. Partitioning: Dividing the computation and its data into small tasks
  2. Communication: Determining how tasks exchange data and coordinate
  3. Agglomeration: Combining small tasks into larger ones to reduce overhead
  4. Mapping: Assigning tasks to processors

Types of Partitioning

  • Computation: Functional decomposition - breaking tasks into functions.
  • Data: Domain decomposition - dividing data into manageable segments.
  • Aim for:
    • Smaller data segments for flexibility
    • Avoiding unnecessary data replication

Domain Decomposition

  • Focus on slicing major data structures:
    • Multi-dimensional arrays
    • Trees or graphs
  • Aim to cut data into smaller pieces for efficient processing.
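A minimal sketch of domain decomposition on a 1-D array: `chunkify` (a made-up helper name, not from the lecture) splits the data into near-equal contiguous segments, one per processor.

```python
# Hypothetical sketch: domain decomposition of a 1-D array.
# Splits `data` into `p` contiguous chunks whose sizes differ by at most 1,
# so no processor gets noticeably more work than another.

def chunkify(data, p):
    n = len(data)
    base, extra = divmod(n, p)  # the first `extra` chunks get one extra element
    chunks, start = [], 0
    for i in range(p):
        size = base + (1 if i < extra else 0)
        chunks.append(data[start:start + size])
        start += size
    return chunks

print(chunkify(list(range(10)), 3))  # → [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

The same idea extends to rows of a matrix or subtrees of a tree: each piece becomes one task.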

Functional Decomposition

  • Ideal for complex applications:
    • Modular code design that supports simultaneous tasks (e.g., rocket simulation tasks)
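A hedged illustration of functional decomposition: instead of splitting data, different *functions* of an application (the rocket-simulation idea above) run concurrently. The module names and string results below are invented stand-ins for real work.

```python
# Illustrative sketch: functional decomposition runs distinct functions
# of a simulation concurrently rather than splitting one data structure.
import threading

results = {}

def atmosphere_model():
    results["atmosphere"] = "density field updated"  # stand-in for real work

def engine_model():
    results["engine"] = "thrust computed"            # stand-in for real work

threads = [threading.Thread(target=f) for f in (atmosphere_model, engine_model)]
for t in threads:
    t.start()
for t in threads:
    t.join()   # wait for both functional units to finish
print(results)
```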

Partitioning Design Considerations

  • Is the number of partitions adequate?
  • Is work distributed evenly among processors?
  • Are there redundancies in computation or storage?
  • Does the partitioning scale with problem size?

Communication in Parallel Systems

  • Tasks often depend on one another and must exchange data:
    • Manage alternating compute and communication phases.
    • Minimize communication costs.

Asynchronous Communication

  • Communicates without strict synchronization, beneficial for:
    • Real-time applications, web services, and databases.
  • Careful design needed to manage data locality and handle requests efficiently.
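A minimal `asyncio` sketch of asynchronous communication: two simulated requests proceed without waiting on each other, so total time is close to the longest delay rather than the sum. Request names and delays are invented.

```python
# Asynchronous communication sketch: requests overlap instead of serializing.
import asyncio

async def handle_request(name, delay):
    await asyncio.sleep(delay)   # stands in for network or disk latency
    return f"{name} done"

async def main():
    # Both requests are in flight at once; elapsed time ≈ max(delay), not sum.
    return await asyncio.gather(
        handle_request("query-A", 0.02),
        handle_request("query-B", 0.01),
    )

print(asyncio.run(main()))  # → ['query-A done', 'query-B done']
```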

Agglomeration

  • Combine fine-grained tasks into larger tasks to reduce overhead.
  • Move from the abstract design to a concrete one, factoring in the target architecture and available resources.

Increasing Granularity

  • Main objective: Improve performance by reducing communication overhead.
  • Combine partitions strategically to yield better efficiency.
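The idea above can be sketched in a few lines (helper name is made up): merging many one-element tasks into a few coarse blocks cuts the number of messages exchanged from thousands to a handful.

```python
# Agglomeration sketch: 1000 fine-grained tasks become 4 coarse tasks,
# so only 4 messages (one per block) are needed instead of 1000.

def agglomerate(tasks, blocks):
    per_block = -(-len(tasks) // blocks)  # ceiling division
    return [tasks[i:i + per_block] for i in range(0, len(tasks), per_block)]

fine = list(range(1000))          # 1000 tiny tasks → 1000 potential messages
coarse = agglomerate(fine, 4)     # 4 coarse tasks → only 4 messages
print(len(coarse), [len(b) for b in coarse])  # → 4 [250, 250, 250, 250]
```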

Mapping in Parallel Systems

  • Mapping: Process of placing tasks on processors.
  • Considerations include:
    • Concurrency, Locality, and Communication Cost

Load Balancing

  • Overall completion time is set by the slowest (most heavily loaded) processor.
  • Methods to achieve load balance:
    • Static (assignment fixed before execution) and dynamic (assignment adjusted at runtime, e.g., probabilistic or work-queue schemes).
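A sketch contrasting the two approaches on invented task costs: a static round-robin assignment can leave one processor overloaded, while a greedy work-queue (dynamic) assignment evens the loads out.

```python
# Static vs. dynamic load balancing on invented task costs.
import heapq

tasks = [5, 1, 1, 1, 5, 1, 1, 1]   # uneven task costs
P = 2                               # two processors

# Static: round-robin assignment decided before execution.
static = [sum(tasks[i::P]) for i in range(P)]

# Dynamic (greedy work queue): each task goes to the least-loaded processor.
heap = [(0, i) for i in range(P)]   # (current load, processor id)
heapq.heapify(heap)
for cost in sorted(tasks, reverse=True):
    load, i = heapq.heappop(heap)
    heapq.heappush(heap, (load + cost, i))
dynamic = sorted(load for load, _ in heap)

print("static loads:", static)    # → [12, 4]  (imbalanced: slowest sets pace)
print("dynamic loads:", dynamic)  # → [8, 8]   (balanced)
```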

Performance Categories

Flynn's Model of Computation

  • Computers categorized by instruction and data streams into four classes:
    • Single Instruction Single Data (SISD)
    • Single Instruction Multiple Data (SIMD)
    • Multiple Instruction Single Data (MISD)
    • Multiple Instruction Multiple Data (MIMD)

Importance of Performance Study

  • Motivation for parallelization: reducing execution time while achieving speedup and scalability as processors are added.

Amdahl's Law

  • If a fraction f of a program is inherently serial, the speedup on p processors is S(p) = 1 / (f + (1 − f)/p).
  • In practice, speedup does not follow the ideal linear curve: the serial component caps it at 1/f no matter how many processors are added.
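The limit is easy to see numerically. A short sketch (serial fraction chosen for illustration):

```python
# Amdahl's Law: with serial fraction f, speedup on p processors is
#   S(p) = 1 / (f + (1 - f) / p)

def amdahl_speedup(f, p):
    return 1.0 / (f + (1.0 - f) / p)

f = 0.10  # assume 10% of the program is inherently serial
for p in (2, 8, 64, 1_000_000):
    print(f"p={p:>7}: speedup = {amdahl_speedup(f, p):.2f}")
# Even with a million processors, speedup approaches 1/f = 10, never p.
```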

CPU Performance Metrics

  • CPU performance impacted by:
    • Total clock cycles
    • Clock rate
    • Instructions per cycle
  • Key metrics:
    • Execution time, Speedup, Efficiency, Cost, Scalability.
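These three factors combine in the classic CPU performance equation, CPU time = (instruction count × cycles per instruction) / clock rate. A quick sketch with invented numbers:

```python
# Classic CPU performance equation:
#   CPU time = (instruction count * CPI) / clock rate

def cpu_time(instructions, cpi, clock_hz):
    return instructions * cpi / clock_hz

# Illustrative values: 2 billion instructions, CPI of 1.5, 3 GHz clock.
t = cpu_time(instructions=2_000_000_000, cpi=1.5, clock_hz=3_000_000_000)
print(f"{t:.2f} s")  # 3e9 cycles / 3 GHz = 1.00 s
```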

Scalability and Efficiency

  • Scalability: Ability to improve performance as the number of processors and problem sizes grow.
  • Efficiency: Ratio of achieved speedup to the number of processors (actual vs. ideal performance).

Common Performance Metrics for Parallel Systems

  • Metrics include:
    • Execution Time: Measure time from start to finish.
    • Speedup: Ratio of serial execution time to parallel execution time.
    • Efficiency: Measure effective utilization of processing units.
    • Scalability: Measure the ability to increase speedup with more processors.
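The metrics above can be computed directly from measured times. A short sketch (timings invented for illustration):

```python
# Speedup and efficiency from measured execution times.

def speedup(t_serial, t_parallel):
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    return speedup(t_serial, t_parallel) / p

t1, t8 = 100.0, 16.0   # serial vs. 8-processor execution time (seconds)
print(f"speedup    = {speedup(t1, t8):.2f}")        # 100 / 16 = 6.25
print(f"efficiency = {efficiency(t1, t8, 8):.2%}")  # 6.25 / 8 ≈ 78%
```

Efficiency below 100% reflects communication and load-imbalance overheads; scalability asks whether speedup keeps growing as p and the problem size increase.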

Summary

  • Continued focus on Parallel and Distributed Computing.
  • Next topics to cover: Memory Management.