Operating Systems Lecture on Parallel and Distributed Systems
CSCI U511 - Operating Systems Lecture Notes
Review and Learning Outcomes
- Continued discussion on parallel and distributed systems
- Key topics:
- Parallel and Distributed Systems
- Importance of Parallel and Distributed Systems
- Multiprogramming and Parallel Computing
- Parallel Algorithm Design
- Performance Metrics
- Important dates
- Homework 6: Due April 1 on Blackboard
- Quiz 6: April 3
- Exam 2: April 8 (Sample to be posted soon)
- SE Reflection Report: Due March 28 at midnight
- Project updates will be discussed today.
Performance Metrics
- Measuring and predicting performance in advance is important before committing to a parallel design.
- Common metrics for parallel systems:
- Execution time
- Speedup
- Efficiency
- Cost
- Scalability
Metrics: Execution Time
- Defined as:
- Serial execution time: TS
- Parallel execution time: TP
- Formula:
- TP = Tcomp + Tmem + Tcomm + Tidle + Textra
- The terms are the time spent on computation, memory access, communication, idling, and extra parallel overhead, respectively.
- Elapsed time is measured from the start of the computation to the finish of the last processing element.
- Total parallel overhead, where p is the number of processing elements:
- Total Parallel Overhead = pTP - TS
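A minimal sketch, in Python with made-up timing numbers, of how these quantities combine; all values here are hypothetical:

```python
# Hypothetical timings (seconds), for illustration only.
t_serial = 100.0   # TS: best sequential execution time
t_parallel = 30.0  # TP: elapsed parallel time (start to last PE finishing)
p = 4              # number of processing elements

# Total parallel overhead: work done beyond an ideal p-way split of TS.
overhead = p * t_parallel - t_serial
print(f"Total parallel overhead: {overhead:.1f} s")  # 4 * 30 - 100 = 20.0 s
```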
Communication Cost
- Calculated as:
- Tcomm = α + β · size
- α: per-message (latency) cost
- β: per-byte (bandwidth) cost
- Important to optimize communication:
- Strategies include sending fewer, larger messages; combining small messages amortizes the per-message cost α.
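To make the "combine messages" advice concrete, here is a small Python sketch of the α–β model; the constants are illustrative rather than measured:

```python
# Illustrative alpha-beta communication model (constants are made up).
alpha = 1e-5  # per-message (latency) cost, in seconds
beta = 1e-9   # per-byte (bandwidth) cost, in seconds per byte

def t_comm(size_bytes: int) -> float:
    """Estimated time to send one message of the given size."""
    return alpha + beta * size_bytes

# 1000 separate 1 KB messages pay the per-message cost 1000 times;
# one combined 1 MB message pays it once.
many_small = 1000 * t_comm(1_000)
one_large = t_comm(1_000_000)
print(f"1000 small messages: {many_small * 1e6:.0f} us")  # ~11000 us
print(f"one combined message: {one_large * 1e6:.0f} us")  # ~1010 us
```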
Metrics: Speedup
- Defined as:
- Speedup = TS / TP
- Note: TS is the execution time of the best sequential algorithm.
- Super-linear speedup (speedup > p) can occur when partitioning lets each processor's data fit in faster storage, e.g., cache rather than main memory.
Metrics: Efficiency
- Represents the fraction of time for which a processing element is effectively utilized.
- Calculated as:
- E = Speedup / p
- E indicates the ratio between actual and ideal performance:
- E = Actual Performance / Ideal Performance
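Speedup and efficiency are one line of arithmetic each; a minimal Python sketch with hypothetical measurements:

```python
# Hypothetical measurements, for illustration only.
t_serial = 100.0   # TS: best sequential time (seconds)
t_parallel = 20.0  # TP: parallel time (seconds)
p = 8              # number of processing elements

speedup = t_serial / t_parallel  # Speedup = TS / TP
efficiency = speedup / p         # E = Speedup / p

print(f"speedup = {speedup:.2f}")        # 5.00
print(f"efficiency = {efficiency:.2%}")  # 62.50%
if speedup > p:
    print("super-linear speedup (often a cache/memory effect)")
```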
Scalability
- Importance of adapting parallel programs to growth in:
- Problem size
- Number of processing elements
- Scalability: The ability to continue delivering proportionate speedup (i.e., to maintain efficiency) as the number of processing elements increases.
More Definitions
- Problem Size:
- Refers to the amount of computation necessary to solve the problem and is related to serial execution time, not merely data size.
- Scalable Parallel Systems:
- Capable of maintaining efficiency when increasing both problem size and the number of processing elements.
- Summary of common metrics for parallel systems:
- Execution time
- Speedup
- Efficiency
- Cost
- Scalability
Parallel Matrix Multiplication
- The task is to compute C = A · B, where each matrix is SZ by SZ (i.e., n by n).
- The basic algorithm requires SZ^3 (n^3) operations; a minimal version appears after the list below.
- Key variables influencing performance:
- Data layout
- Topology of machines
- Scheduling communications
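A minimal Python version of the basic triple-loop algorithm; each entry of C takes n multiplications, giving n^3 in total:

```python
def matmul(A, B):
    """Basic O(n^3) matrix multiply: C = A * B, each matrix n x n."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):          # row of C
        for j in range(n):      # column of C
            for k in range(n):  # inner product: n multiplications
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```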
SUMMA Algorithm
- SUMMA: Scalable Universal Matrix Multiply Algorithm
- Designed to compute C = A · B with each matrix being n by n.
- Sequential calculations require n^3 multiplications.
- Algorithm on a q × q grid (q^2 processors), where each processor owns an m by m block with m = n/q, running q iterations:
- Each iteration performs a group of parallel broadcasts, sending a block of A along processor rows and a block of B along processor columns, followed by an m by m block multiply-accumulate (see the serial sketch below).
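A serial Python sketch of SUMMA's iteration structure (numpy assumed; the function name and test sizes are illustrative). Real broadcasts are replaced by plain loops over the grid, so this shows the data-movement pattern rather than a distributed implementation:

```python
import numpy as np

def summa_simulated(A, B, q):
    """Serially simulate SUMMA on a q x q grid; n must be divisible by q."""
    n = A.shape[0]
    m = n // q  # each processor owns one m x m block of A, B, and C
    C = np.zeros((n, n))
    block = lambda M, i, j: M[i*m:(i+1)*m, j*m:(j+1)*m]
    for k in range(q):          # q iterations, one broadcast round each
        for i in range(q):
            for j in range(q):
                # Processor (i, j) receives A's (i, k) block via a row
                # broadcast and B's (k, j) block via a column broadcast,
                # then does one m x m multiply-accumulate.
                C[i*m:(i+1)*m, j*m:(j+1)*m] += block(A, i, k) @ block(B, k, j)
    return C

rng = np.random.default_rng(0)
A, B = rng.random((4, 4)), rng.random((4, 4))
print(np.allclose(summa_simulated(A, B, q=2), A @ B))  # True
```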
- Sequential execution time (Ts):
- Comprises only computation time, ignoring other factors.
- Ts = n^3, if each multiplication takes one unit of time.
- Parallel execution time (Tp):
- The work is spread across the q^2 processors over q iterations and includes communication cost.
- Ignoring memory time, assuming a balanced workload, and taking a broadcast's cost as proportional to the m by m block size:
- TP = n^3 / q^2 + q · 2 · (n/q)^2 = n^3 / q^2 + 2n^2 / q
- Speedup and efficiency then show the algorithm's scalability with respect to n and q (derivation below).
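Under the same assumptions (unit-time multiplications, total communication cost 2n^2/q), speedup and efficiency reduce to closed forms:

```latex
S = \frac{T_S}{T_P}
  = \frac{n^3}{\,n^3/q^2 + 2n^2/q\,}
  = \frac{q^2}{1 + 2q/n},
\qquad
E = \frac{S}{q^2} = \frac{1}{1 + 2q/n}
```

Holding the block size n/q fixed as q grows keeps E constant, which is the sense in which SUMMA is a scalable parallel system.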
Summary
- Finished coverage of Parallel and Distributed Computing.
- Next topic: Memory Management.