
Operating Systems Lecture on Parallel and Distributed Systems

CSCI U511 - Operating Systems Lecture Notes

Review and Learning Outcomes

  • Continued discussion on parallel and distributed systems
  • Key topics:
    • Parallel and Distributed Systems
    • Importance of Parallel and Distributed Systems
    • Multiprogramming and Parallel Computing
    • Parallel Algorithm Design
    • Performance Metrics
  • Important dates
    • Homework 6: Due April 1 on Blackboard
    • Quiz 6: April 3
    • Exam 2: April 8 (Sample to be posted soon)
    • SE Reflection Report: Due March 28 at midnight
  • Project updates will be discussed today.

Performance Metrics

  • Importance of measuring and modeling performance in advance, before committing to a parallel design
  • Common metrics for parallel systems:
    • Execution time
    • Speedup
    • Efficiency
    • Cost
    • Scalability

Metrics: Execution Time

  • Defined as:
    • Serial execution time: TS
    • Parallel execution time: TP
    • Formula:
      • TP = Tcomp + Tmem + Tcomm + Tidle + Textra
      • (computation, memory-access, communication, idle, and extra overhead time, respectively)
  • Elapsed time is measured from the start of the computation until the last processing element finishes.
  • Total parallel overhead calculated as:
    • Total Parallel Overhead = p · TP - TS, where p is the number of processing elements
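
As a concrete illustration, here is a minimal sketch of the overhead calculation in Python, using hypothetical timing numbers:

```python
# Minimal sketch: total parallel overhead from measured times.
# All numbers are hypothetical, for illustration only.
T_S = 100.0   # serial execution time (arbitrary units)
T_P = 30.0    # parallel execution time on p processing elements
p = 4         # number of processing elements

# Total Parallel Overhead = p * T_P - T_S
total_overhead = p * T_P - T_S
print(f"Total parallel overhead: {total_overhead}")  # 20.0 time units
```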

Communication Cost

  • Calculated as:
    • Tcomm = α + β · size
      • α: per-message (startup) cost
      • β: per-byte (bandwidth) cost, with size measured in bytes
  • Important to optimize communication:
    • Strategies include sending fewer messages and combining messages to reduce overhead.
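
A small sketch (with hypothetical α and β values) of why combining messages reduces overhead: the per-message cost α is paid once instead of k times.

```python
# Communication cost model: T_comm = alpha + beta * size.
# The alpha and beta values below are hypothetical, for illustration only.
alpha = 100e-6   # per-message (startup) cost, seconds
beta = 1e-9      # per-byte cost, seconds

def t_comm(size_bytes):
    """Cost of sending one message of the given size."""
    return alpha + beta * size_bytes

k, size = 1000, 64                   # 1000 small 64-byte messages
separate = k * t_comm(size)          # k separate sends: alpha paid k times
combined = t_comm(k * size)          # one combined send: alpha paid once
print(f"separate: {separate:.6f}s, combined: {combined:.6f}s")
```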

Metrics: Speedup

  • Defined as:
    • Speedup = TS / TP
      • Note: TS is the execution time of the best sequential algorithm.
  • Super-linear speedup occurs when speedup > p; it can arise when the partitioned problem fits into faster storage, e.g., each processing element's cache.

Metrics: Efficiency

  • Represents the fraction of time for which a processing element is effectively utilized.
  • Calculated as:
    • E = Speedup / p
      • E indicates the ratio between actual and ideal performance:
      • E = Actual Performance / Ideal Performance
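
To make the two metrics concrete, a minimal sketch with hypothetical measurements:

```python
# Speedup and efficiency from measured times (hypothetical numbers).
T_S = 120.0   # execution time of the best sequential algorithm
T_P = 20.0    # parallel execution time
p = 8         # number of processing elements

speedup = T_S / T_P        # Speedup = T_S / T_P
efficiency = speedup / p   # E = Speedup / p (actual vs. ideal performance)
print(f"speedup = {speedup}, efficiency = {efficiency}")  # 6.0 and 0.75
```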

Scalability

  • Importance of adapting parallel programs to growth in:
    • Problem size
    • Number of processing elements
  • Scalability: The ability to maintain speedup as we increase the number of processing elements.

More Definitions

  • Problem Size:
    • Refers to the amount of computation necessary to solve the problem and is related to serial execution time, not merely data size.
  • Scalable Parallel Systems:
    • Capable of maintaining efficiency when increasing both problem size and the number of processing elements.

Review: Performance Metrics

  • Summary of common metrics for parallel systems:
    • Execution time
    • Speedup
    • Efficiency
    • Cost
    • Scalability

Parallel Matrix Multiplication

  • The task is to compute C = A · B, where each matrix is n by n (also written SZ by SZ).
  • The basic algorithm requires n^3 (SZ^3) operations.
  • Key variables influencing performance:
    • Data layout
    • Topology of machines
    • Scheduling communications
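
For reference, the basic sequential algorithm is the familiar triple loop, which performs n^3 multiply-add operations:

```python
# Basic sequential matrix multiplication: n^3 multiply-adds.
def matmul(A, B):
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):   # innermost body runs n^3 times in total
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```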

SUMMA Algorithm

  • SUMMA: Scalable Universal Matrix Multiply Algorithm
  • Designed to compute C = A · B with each matrix being n by n.
  • Sequential calculations require n^3 multiplications.
  • Algorithm uses q^2 processors arranged in a q by q grid, where each processor owns and computes a block of size m by m (m = n/q) over q iterations:
    • Each iteration performs q groups of broadcasts in parallel for blocks of matrix A and matrix B, followed by a local m by m block multiplication, as sketched below.
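
Below is a single-process sketch of SUMMA's blocked structure. The q by q processor grid is simulated by loops, and the row/column broadcasts are simulated by plain block reads; a real implementation would distribute the blocks across processors and use, e.g., MPI broadcasts.

```python
import numpy as np

def summa_simulated(A, B, q):
    """Simulate SUMMA on a q x q grid of 'processors' in one process.

    Grid position (i, j) owns the m x m blocks of A, B, and C, with
    m = n / q. In iteration k, block A[i][k] is broadcast along row i
    and block B[k][j] along column j; each position then performs one
    local m x m block multiplication and accumulates into its C block.
    """
    n = A.shape[0]
    m = n // q   # block size; assumes q divides n evenly
    C = np.zeros_like(A, dtype=float)
    def blk(M, i, j):
        return M[i*m:(i+1)*m, j*m:(j+1)*m]
    for k in range(q):            # q iterations
        for i in range(q):        # these two loops run in parallel
            for j in range(q):    # across the q x q grid in real SUMMA
                blk(C, i, j)[:] += blk(A, i, k) @ blk(B, k, j)
    return C

rng = np.random.default_rng(0)
A, B = rng.random((4, 4)), rng.random((4, 4))
print(np.allclose(summa_simulated(A, B, q=2), A @ B))  # True
```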

Performance of SUMMA

  • Sequential execution time (TS):
    • Comprises only computation time, ignoring other factors.
    • TS = n^3 if a multiplication takes one unit of time.
  • Parallel execution time (TP):
    • The work is spread across q iterations, each of which adds communication cost.
    • Ignoring memory time and assuming a balanced workload:
    • TP = n^3 / q^2 + q · 2 · m^2 = n^3 / q^2 + 2 · n^2 / q, with m = n/q (per-processor computation plus q rounds of two m by m block broadcasts)
  • Analysis of Speedup and Efficiency demonstrates scalability with respect to n and q:
    • Speedup = TS / TP = q^2 / (1 + 2·q/n)
    • Efficiency = Speedup / q^2 = 1 / (1 + 2·q/n), which approaches 1 as n grows relative to q; see the numeric check below.
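
A quick numeric check of this cost model (unit-time operations, as above) shows efficiency approaching 1 as n grows with q fixed:

```python
# Evaluate the SUMMA cost model: TS = n^3, TP = n^3/q^2 + 2*n^2/q.
def summa_metrics(n, q):
    T_S = n**3
    T_P = n**3 / q**2 + 2 * n**2 / q
    speedup = T_S / T_P
    return speedup, speedup / q**2   # (speedup, efficiency), p = q^2

for n in (64, 256, 1024):            # fixed q = 8, growing problem size
    s, e = summa_metrics(n, q=8)
    print(f"n={n:5d}: speedup={s:7.1f}, efficiency={e:.3f}")
# Prints efficiencies 0.800, 0.941, 0.985: E = 1 / (1 + 2q/n) -> 1.
```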

Summary

  • Finished coverage of Parallel and Distributed Computing.
  • Next topic: Memory Management.