3.2.6
Determining Parallelism in Loops
Basic Principles
Basic Rule: A loop can be executed in parallel if the result is the same as that of sequential execution.
Importance of Consistency: Must ensure that no execution with any number of threads produces incorrect results.
Avoiding Errors: A program that runs correctly 99.9% of the time can still fail occasionally, complicating debugging.
Extended Considerations
To address the concept of "always," envision executing with an infinite number of threads that can modify their speed or state at will.
Rules of Thumb for Determining Parallel Loops:
Certain patterns can help in quickly deciding whether the loop is parallelizable.
Upcoming Content:
Introduction to OpenMP constructs that enable parallelization of loops not deemed parallel initially.
Techniques for refactoring code.
Simple Rules for Parallelization
Conditions for Safe Parallelization: A loop can be parallelized if:
(A) The loop reads from some arrays and writes to others without any shared array.
(B) Each iteration targets a unique portion of an output array.
Example Code Snippets:
#pragma omp parallel for for (i = 1; i < N; i++) B[i] = A[i-1] + A[i+1]; #pragma omp parallel for for (i = 1; i < N; i++) B[i] = x * A[i-1];Note: If the criteria do not apply, parallel execution might still be feasible.
When a Loop is Not Parallelizable
Non-Parallel Conditions: A loop is not suitable for parallel execution if:
One iteration writes to a variable that the next iteration reads.
A subsequent iteration reads a location that an earlier iteration has written to.
A later iteration writes to a variable while an earlier iteration reads it.
Example Non-Parallel Code Snippet:
#pragma omp parallel for for (i = 1; i < N; i++) B[i-1] = x * A[i] * B[i];
Application of Simple Rules
Parallelization Decisions
Assessing parallelizability:
Yes (for conditions A and B)
No (for variable writes followed by activity in later iterations)
Examples & Decisions:
Loop Code
Parallelizable
for (i = 1; i < N; i++) B[i] += A[i];Yes
for (i = 1; i < N; i++) B[0] += A[i-1];No
for (i = 1; i < N; i++) A[N-i] += A[i-1];No
for (i = 0; i < N; i++) A[i] += A[(i-N/2)%N];No
Conclusion
The examination of how loops can be parallelized involves an understanding of reading and writing dependencies, ensuring safety in execution, and recognizing patterns for optimization.