3.2.4 OpenMP - History and Parallel Loops
History of OpenMP
Development: Automatic parallelization research began at the University of Illinois under David Kuck, producing insights but facing implementation challenges.
Early 1980s: Introduction of shared memory multiprocessors.
Parallel Computing Forum: Formed by Kuck to develop Fortran extensions for parallel programming constructs.
1997: Establishment of the OpenMP standard, consolidating roughly 15 years of shared-address-space programming practice.
OpenMP Philosophy
Utilization of Pragmas: OpenMP uses pragmas as compiler hints.
Example syntax:
#pragma omp <directive>
Ignoring the pragmas still results in correct sequential execution: they guide the compiler without strict enforcement.
Emphasizes programmer trust while the compiler automates execution.
OpenMP Directives and Examples
Requirements: Include #include <omp.h> and compile with OpenMP options (-fopenmp for GCC, -openmp for Intel compilers).
Proper vs. Improper Parallel Loops:
Valid:
#pragma omp parallel for
for (i = 0; i < N; i++) A[i] = i * i * 0.23;
Invalid (due to data dependencies: iteration i reads A[i-1], which iteration i-1 writes):
#pragma omp parallel for
for (i = 1; i < N; i++) A[i] = A[i-1] + A[i+1];
Criteria for Loop Parallelization
A loop can be parallelized if its parallel execution yields the same result as sequential execution (small floating point discrepancies are acceptable).
Floating Point Representation and Issues
Floating point handles wide dynamic ranges using a fixed number of bits, expressed as
Value = 1.mantissa x 2^(exponent - 127)
Issues: Precision loss when adding numbers in different orders; non-associativity complicates equality comparisons.
Use thresholds for proximity comparisons, e.g.,
if (fabs(x - y) < THRESHOLD)
Simple Rules for Parallelization
A loop can be parallelized if:
It reads from certain arrays while writing to others without overlap.
Each iteration modifies a distinct array segment.
Examples:
#pragma omp parallel for
for (i = 1; i < N; i++) B[i] = A[i-1] + A[i+1];

#pragma omp parallel for
for (i = 1; i < N; i++) B[i] = x * A[i-1];
Identifying Non-parallelizable Loops
A loop is non-parallelizable if one iteration writes to a location that a different iteration reads, in either order, creating a loop-carried dependence.
Example:
#pragma omp parallel for
for (i = 1; i < N; i++) B[i-1] = x * A[i] * B[i];
Analysis of Loop Parallelization
Some loops can be parallelized while others cannot due to dependencies; careful restructuring is needed to optimize performance and enable parallel execution.