3.2.6 Determining Parallelism in Loops

Basic Principles

  • Basic Rule: A loop can be executed in parallel if the result is the same as that of sequential execution.

  • Importance of Consistency: The program must produce the correct result for every execution, with any number of threads and any scheduling.

    • Avoiding Errors: A program that runs correctly 99.9% of the time can still fail occasionally, complicating debugging.

Extended Considerations

  • To make "always" concrete, imagine executing the loop with an arbitrary number of threads, each of which may run at any speed or pause at any moment.

  • Rules of Thumb for Determining Parallel Loops:

    • Certain patterns can help in quickly deciding whether the loop is parallelizable.

  • Upcoming Content:

    • Introduction to OpenMP constructs that enable parallelization of loops not deemed parallel initially.

    • Techniques for refactoring code.

Simple Rules for Parallelization

  • Conditions for Safe Parallelization: A loop can be parallelized if:

    • (A) The loop reads from some arrays and writes to others, and no array is both read and written.

    • (B) Each iteration writes to a unique location of the output array (for example, iteration i writes only B[i]).

  • Example Code Snippets:

    #pragma omp parallel for
    for (i = 1; i < N-1; i++)   /* upper bound N-1: iteration i reads A[i+1] */
        B[i] = A[i-1] + A[i+1];

    #pragma omp parallel for
    for (i = 1; i < N; i++)
        B[i] = x * A[i-1];
  • Note: If neither criterion applies, the loop may still be parallelizable; these conditions are sufficient, not necessary.

When a Loop is Not Parallelizable

  • Non-Parallel Conditions: A loop is not suitable for parallel execution if any dependence exists between iterations:

    • A later iteration reads a location that an earlier iteration writes (flow dependence).

    • A later iteration writes a location that an earlier iteration reads (anti dependence).

    • Two iterations write to the same location (output dependence).

  • Example Non-Parallel Code Snippet:

    /* WRONG: iteration i reads B[i], which iteration i+1 overwrites
       (an anti dependence), so this pragma is incorrect. */
    #pragma omp parallel for
    for (i = 1; i < N; i++)
        B[i-1] = x * A[i] * B[i];
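One refactoring of the kind previewed earlier removes this dependence: take a snapshot of B before the loop and read only from the snapshot, so no iteration reads a location another iteration writes. A sketch under that assumption; the copy-based approach and function names are mine, not from the text:

```c
#include <stdlib.h>
#include <string.h>

/* Original, sequential semantics: iteration i reads B[i], which
   iteration i+1 overwrites, so the loop cannot be parallelized as is. */
void update_seq(double *B, const double *A, double x, int n) {
    for (int i = 1; i < n; i++)
        B[i-1] = x * A[i] * B[i];
}

/* Refactored: all reads go to a snapshot Bold taken before the loop.
   The loop now reads only Bold and A and writes only B[i-1], so rules
   (A) and (B) hold and the pragma is safe. */
void update_par(double *B, const double *A, double x, int n) {
    double *Bold = malloc(n * sizeof *Bold);
    memcpy(Bold, B, n * sizeof *Bold);
    #pragma omp parallel for
    for (int i = 1; i < n; i++)
        B[i-1] = x * A[i] * Bold[i];
    free(Bold);
}
```

The extra copy trades memory traffic for parallelism; whether that pays off depends on n and the number of threads.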

Application of Simple Rules

Parallelization Decisions

  • Assessing parallelizability:

    • Yes (for conditions A and B)

    • No (when any iteration reads or writes a location that another iteration writes; that is, when an inter-iteration dependence exists)

  • Examples & Decisions:

    Loop Code                                        Parallelizable?
    for (i = 1; i < N; i++) B[i] += A[i];            Yes
    for (i = 1; i < N; i++) B[0] += A[i-1];          No
    for (i = 1; i < N; i++) A[N-i] += A[i-1];        No
    for (i = 0; i < N; i++) A[i] += A[(i-N/2)%N];    No
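Some of the "No" loops can still be run in parallel with the OpenMP constructs previewed above. The second table entry, for instance, accumulates a sum into a single location; OpenMP's reduction clause gives each thread a private partial sum and combines them afterward. A sketch; the function name and test values are illustrative:

```c
/* B[0] += A[i-1] fails rule (B): every iteration writes the same
   location (an output dependence).  Declaring a reduction makes the
   accumulation safe: each thread sums into a private copy of b0, and
   the partial sums are combined at the end of the loop. */
double sum_into_b0(double b0, const double *A, int n) {
    #pragma omp parallel for reduction(+:b0)
    for (int i = 1; i < n; i++)
        b0 += A[i-1];
    return b0;
}
```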

Conclusion

  • Deciding whether a loop can run in parallel comes down to analyzing its read and write dependences: execution is safe when no iteration touches a location that another iteration writes, and the simple rules above make that check quick for common patterns.