Interactions determined by data flow between concurrent activities.
Data flow is not necessarily one-way or linear.
Note
The three ways of organizing the parallel algorithm are alternatives.
Unlike in the finding-concurrency design space, the programmer chooses one of the three alternatives and exploits one of the parallel design patterns in that group.
Supporting Structures
Investigates structures/patterns to support the implementation of algorithms planned in the algorithm structure design space.
Program Structures
Single Program, Multiple Data (SPMD)
All processing elements (PEs) run the same program in parallel, each with its own data subset.
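As a minimal sketch of the SPMD idea (names and the cyclic data distribution are illustrative; real SPMD programs typically use MPI ranks rather than threads):

```python
from concurrent.futures import ThreadPoolExecutor

# SPMD sketch: every PE runs the same program (pe_main), differing only
# in its rank, which selects its own subset of the data.
DATA = list(range(100))
NUM_PES = 4

def pe_main(rank):
    chunk = DATA[rank::NUM_PES]        # cyclic distribution by rank
    return sum(x * x for x in chunk)   # local computation on own subset

def spmd_run():
    with ThreadPoolExecutor(max_workers=NUM_PES) as ex:
        partials = list(ex.map(pe_main, range(NUM_PES)))
    return sum(partials)               # combine per-PE results
```

Each PE executes identical code; only its rank (and therefore its data subset) differs.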
Master/Worker
A Master task sets up concurrent Worker threads/processes and a bag of tasks.
Loop Parallelism
Executes compute-intensive loop iterations in parallel.
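A loop-parallelism sketch: the iterations below are independent, so they can be handed to a pool (the body function is a stand-in for real compute-intensive work):

```python
from concurrent.futures import ThreadPoolExecutor
import math

# Loop-parallelism sketch: independent iterations of a loop are
# distributed across a thread pool (illustrative names).
def body(i):
    return math.sqrt(i) * math.sqrt(i)   # stand-in for expensive work

def parallel_loop(n, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(body, range(n)))  # preserves iteration order
```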
Fork/Join
A thread/process forks sub-processes that execute in parallel.
Data Structures
Shared Data
Manages data shared among concurrent activities.
Shared Queue
Creates queue data types accessible concurrently.
Supports interaction of concurrent activities in different contexts.
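A shared-queue sketch using a thread-safe queue between one producer and several consumers (item counts and the doubling "work" are illustrative):

```python
import queue
import threading

# Shared-queue sketch: a thread-safe queue mediates between a producer
# and several consumer threads; None is a sentinel meaning "no more work".
def run_demo(n_items=20, n_consumers=3):
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def consumer():
        while True:
            item = q.get()
            if item is None:          # sentinel: this consumer retires
                return
            with lock:
                results.append(item * 2)

    threads = [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for t in threads:
        t.start()
    for i in range(n_items):          # producer fills the queue
        q.put(i)
    for _ in threads:                 # one sentinel per consumer
        q.put(None)
    for t in threads:
        t.join()
    return sorted(results)
```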
Distributed Array
Manages arrays partitioned and distributed among concurrent activities.
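A distributed-array sketch: block-partitioning a 1-D array across PEs and mapping a global index to its owner (the helper names are illustrative):

```python
# Distributed-array sketch: a 1-D array of n elements is split into
# contiguous blocks, one per PE; a global index maps to (owner, local index).
def partition(n, num_pes):
    base, extra = divmod(n, num_pes)
    sizes = [base + (1 if r < extra else 0) for r in range(num_pes)]
    starts = [sum(sizes[:r]) for r in range(num_pes)]
    return sizes, starts

def owner_of(i, sizes, starts):
    for r, (start, size) in enumerate(zip(starts, sizes)):
        if start <= i < start + size:
            return r, i - start       # (owning PE, local index)
```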
Implementation Mechanisms
Meta-patterns representing base mechanisms to support parallel computing abstractions.
UE management
Synchronization
Communication
UE Management
Manages units of execution, including creation, destruction, and management of concurrent activities.
Synchronization
Handles ordering of events/computations in the UEs used to execute the parallel application.
Communication
Manages communication between UEs, including data exchange among concurrent activities.
List of Parallel Patterns
Embarrassingly Parallel
A class of problems where work division into independent tasks is obvious and simple.
Examples: Trapezoid rule program, loop accumulation with multiplication or addition.
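The trapezoid rule example mentioned above can be sketched as follows: the interval is split into independent sub-ranges, each integrated separately, and the partial results summed (names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

# Embarrassingly-parallel sketch: trapezoid rule on [a, b], with the
# interval split into one independent sub-range per worker.
def trapezoid(f, a, b, n):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

def parallel_trapezoid(f, a, b, n, workers=4):
    h = (b - a) / workers
    with ThreadPoolExecutor(max_workers=workers) as ex:
        parts = ex.map(lambda k: trapezoid(f, a + k * h, a + (k + 1) * h,
                                           n // workers),
                       range(workers))
        return sum(parts)             # combine the independent pieces
```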
Master/Worker
A Master task sets up concurrent Worker threads/processes and a bag of tasks.
Workers take tasks from the bag and execute them in parallel until the bag is empty or another ending condition occurs.
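A master/worker sketch: the master fills the bag before starting the workers, and each worker pulls tasks until the bag is empty (the squaring "work" is a stand-in):

```python
import queue
import threading

# Master/worker sketch: the master fills a bag of tasks; workers take
# tasks from the bag until it is empty, then retire.
def master_worker(tasks, n_workers=3):
    bag = queue.Queue()
    out = queue.Queue()

    def worker():
        while True:
            try:
                t = bag.get_nowait()  # take a task from the bag
            except queue.Empty:
                return                # bag empty: ending condition
            out.put(t * t)            # stand-in for real work

    for t in tasks:                   # master sets up the bag first
        bag.put(t)
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    results = []
    while not out.empty():
        results.append(out.get())
    return sorted(results)
```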
Map and Reduce
One of the simplest parallel patterns.
Map
Applies a function to every data element in parallel.
The mapped function must have no side effects, so each application is independent and the applications can run in any order.
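A minimal map sketch: a pure function applied to every element, with order-preserving results (names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

# Map sketch: a side-effect-free function applied to every data element;
# because applications are independent, they can execute concurrently.
def square(x):
    return x * x                      # pure: touches no shared state

def parallel_map(fn, data, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(fn, data)) # results keep input order
```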
MapReduce
Solves problems involving input, processing, and generation of large datasets, scalable across many processors.
Mapping
Divides a large dataset into N discrete subsets, each processed on a single processor; the output is typically a map (key/value pair) data structure.
Shuffle
Groups pairs that share the same key and assigns each group to a processor where the reduce operation will happen.
Reduce
Combines elements in an input dataset into a single result.
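The three MapReduce phases above can be sketched as a single-process word-count simulation (phase names mirror the notes; the example data is illustrative):

```python
from collections import defaultdict

# MapReduce sketch (sequential simulation): map emits (key, value) pairs,
# shuffle groups pairs by key, reduce combines each group into one result.
def map_phase(doc):
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)     # pairs with the same key end up together
    return groups

def reduce_phase(key, values):
    return key, sum(values)           # combine a group into a single result

def word_count(docs):
    pairs = [p for d in docs for p in map_phase(d)]
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```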
Divide & Conquer
The problem is broken down into smaller sub-problems, each solved independently.
Partial solutions are then combined into a solution for the larger problem, so the larger solution depends on the smaller ones.
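A divide-and-conquer sketch: merge sort, where the two halves are solved independently (in parallel up to a depth limit) and then combined (depth limit and names are illustrative choices):

```python
import threading

# Divide-and-conquer sketch: merge sort; the halves are independent
# sub-problems, solved in parallel up to a small depth limit, then merged.
def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def msort(xs, depth=2):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    if depth <= 0:                    # below the limit: solve serially
        return merge(msort(xs[:mid], 0), msort(xs[mid:], 0))
    res = [None]
    t = threading.Thread(
        target=lambda: res.__setitem__(0, msort(xs[:mid], depth - 1)))
    t.start()                          # left half in a new thread
    right = msort(xs[mid:], depth - 1) # right half in this thread
    t.join()                           # wait for the partial solution
    return merge(res[0], right)        # combine into the larger solution
```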
Fork/Join
The number of parallel threads varies as the program executes.
Forks off a different number of threads at different times and waits for them to finish before proceeding.
The originating process waits until the child processes all join before resuming its own execution.
Allows nested parallel execution: forked tasks may themselves fork.
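A fork/join sketch showing the varying thread count: each phase forks a different number of workers and joins them all before proceeding (phase sizes and the i + 1 "work" are illustrative):

```python
import threading

# Fork/join sketch: at each phase the parent forks a different number of
# worker threads, then joins them all before starting the next phase.
def run_phases(phases):
    results = []
    for n_workers in phases:          # e.g. fork 2, then 4, then 1 threads
        phase_out = [0] * n_workers

        def work(i):
            phase_out[i] = i + 1      # stand-in for real work

        threads = [threading.Thread(target=work, args=(i,))
                   for i in range(n_workers)]
        for t in threads:
            t.start()                 # fork
        for t in threads:
            t.join()                  # join: all children finish first
        results.append(sum(phase_out))
    return results
```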
A Last Word on Parallel Design Patterns
Parallel design patterns can look very similar because they all follow the same process of deriving a parallel algorithm/program from a corresponding serial one.
The process is to:
Identify concurrency.
Split the program into concurrent pieces.
Split the data if the updated algorithm calls for it.
Execute the concurrent pieces in parallel.
Put all the answers back together to make a final answer.
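The five steps above can be sketched end-to-end for a simple summation (the chunking scheme and names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor

# The five-step process, sketched for summing a list of numbers.
def parallel_sum(data, workers=4):
    # 1-2. Identify concurrency and split the program:
    #      partial sums are independent pieces of work.
    # 3.   Split the data: one cyclic chunk per worker.
    chunks = [data[i::workers] for i in range(workers)]
    # 4.   Execute the concurrent pieces in parallel.
    with ThreadPoolExecutor(max_workers=workers) as ex:
        partials = ex.map(sum, chunks)
    # 5.   Put the answers back together into a final answer.
    return sum(partials)
```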