Microprocessor shift to multicore systems
Continuing to increase clock speeds caused excessive power consumption and heat dissipation, making ever-faster single-core processors impractical, so manufacturers shifted to putting multiple cores on a single chip.
Serial programs and multicore systems
Serial programs are designed for single-core systems and lack the ability to divide tasks among multiple cores, requiring rewriting or conversion into parallel programs.
Task-parallelism
Different tasks are assigned to different cores.
Data-parallelism
Data is divided among cores, with each performing the same operations on its subset.
Shared-memory systems
All cores share access to a common memory.
Distributed-memory systems
Each core has its own private memory, and cores communicate by passing messages.
Load balancing in parallel computing
Unequal task distribution leads to some cores finishing early while others remain active, wasting computational power. Load balancing ensures an even workload across all cores.
MPI
MPI (Message Passing Interface) is a library for enabling communication between processes in a distributed-memory system.
Key steps in writing an MPI program
Initialize the MPI environment using MPI_Init, perform communication using MPI functions like MPI_Send, MPI_Recv, MPI_Bcast, and others, and finalize the MPI environment with MPI_Finalize.
MPI_Comm_size
It determines the total number of processes in a communicator, typically MPI_COMM_WORLD.
Process identification in MPI
Processes are identified by their unique rank, which is an integer obtained using MPI_Comm_rank.
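A minimal sketch of this basic structure (assuming a C MPI implementation, compiled with mpicc and launched with mpiexec):

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int comm_sz;   /* total number of processes in MPI_COMM_WORLD */
    int my_rank;   /* this process's rank */

    MPI_Init(NULL, NULL);                      /* start up MPI */
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);   /* how many processes are there? */
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);   /* which one am I? */

    printf("Process %d of %d reporting in\n", my_rank, comm_sz);

    MPI_Finalize();                            /* shut down MPI */
    return 0;
}
```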
MPI_Reduce
MPI_Reduce combines data from all processes and returns the result to a single process.
MPI_Allreduce
MPI_Allreduce combines data and distributes the result to all processes.
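A small sketch contrasting the two calls; each process contributes one integer (its rank + 1, an arbitrary choice for this example) and the values are summed:

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank, comm_sz;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    int local_val = my_rank + 1;             /* each process's contribution */
    int sum_on_root = 0, sum_everywhere = 0;

    /* MPI_Reduce: only the root (rank 0) gets the combined result */
    MPI_Reduce(&local_val, &sum_on_root, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    /* MPI_Allreduce: every process gets the combined result */
    MPI_Allreduce(&local_val, &sum_everywhere, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    if (my_rank == 0)
        printf("Reduce result on rank 0: %d\n", sum_on_root);
    printf("Rank %d sees Allreduce result: %d\n", my_rank, sum_everywhere);

    MPI_Finalize();
    return 0;
}
```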
Communicator in MPI
A communicator is a group of processes that can communicate with each other. The default communicator is MPI_COMM_WORLD.
MPI_Comm_rank in 'Hello, World' example
It retrieves the rank of the process, allowing the program to differentiate tasks based on the rank.
What happens if you forget to call MPI_Finalize at the end of an MPI program?
Resources allocated by MPI may not be released, processes may hang instead of exiting cleanly, and the program does not comply with the MPI standard, so its behavior after that point is undefined.
MPI_Send
Sends a message (greeting) from one process to another.
MPI_Recv
Receives a message sent by another process.
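A sketch of the greeting pattern these two cards describe, assuming the usual setup where every non-zero rank sends a string to rank 0:

```c
#include <stdio.h>
#include <string.h>
#include <mpi.h>

#define MAX_STRING 100

int main(void) {
    char greeting[MAX_STRING];
    int comm_sz, my_rank;

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank != 0) {
        /* every non-zero rank sends its greeting to rank 0 (tag 0) */
        sprintf(greeting, "Greetings from process %d of %d!", my_rank, comm_sz);
        MPI_Send(greeting, strlen(greeting) + 1, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
    } else {
        /* rank 0 prints its own greeting, then receives one from each other rank */
        printf("Greetings from process %d of %d!\n", my_rank, comm_sz);
        for (int q = 1; q < comm_sz; q++) {
            MPI_Recv(greeting, MAX_STRING, MPI_CHAR, q, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("%s\n", greeting);
        }
    }

    MPI_Finalize();
    return 0;
}
```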
Trapezoidal rule program contribution
Each process computes a local integral over its subinterval, and the results are combined using MPI_Reduce.
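A sketch of that decomposition, assuming a hypothetical integrand f(x) = x*x and hard-coded values of a, b, and n (a real program would read them in or broadcast them):

```c
#include <stdio.h>
#include <mpi.h>

/* Hypothetical integrand used for this sketch */
double f(double x) { return x * x; }

/* Serial trapezoidal rule over [left, right] with count trapezoids of width base_len */
double Trap(double left, double right, int count, double base_len) {
    double estimate = (f(left) + f(right)) / 2.0;
    for (int i = 1; i <= count - 1; i++)
        estimate += f(left + i * base_len);
    return estimate * base_len;
}

int main(void) {
    int my_rank, comm_sz;
    double a = 0.0, b = 1.0;   /* interval endpoints (assumed hard-coded here) */
    int n = 1024;              /* total number of trapezoids */

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    double h = (b - a) / n;          /* width of each trapezoid */
    int local_n = n / comm_sz;       /* trapezoids per process (assumes comm_sz divides n) */
    double local_a = a + my_rank * local_n * h;
    double local_b = local_a + local_n * h;

    double local_int = Trap(local_a, local_b, local_n, h);
    double total_int = 0.0;

    /* combine the local integrals on rank 0 */
    MPI_Reduce(&local_int, &total_int, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (my_rank == 0)
        printf("With n = %d trapezoids, integral from %f to %f = %.15f\n",
               n, a, b, total_int);

    MPI_Finalize();
    return 0;
}
```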
MPI_Scatter
Splits an array held by one (root) process into equal chunks and sends one chunk (e.g., a block of subintervals) to each process in the communicator.
MPI_Gather
Collects a chunk of data from every process and assembles the chunks into a single array on one (root) process.
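A sketch of the scatter/gather round trip, with an assumed chunk size of 2 elements per process: rank 0 owns the full array, each process doubles its chunk, and rank 0 gathers the results back.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define CHUNK 2   /* elements per process (an assumption for this sketch) */

int main(void) {
    int my_rank, comm_sz;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    int n = CHUNK * comm_sz;
    double *data = NULL;

    if (my_rank == 0) {
        /* only the root owns the full input array */
        data = malloc(n * sizeof(double));
        for (int i = 0; i < n; i++) data[i] = i;
    }

    double local[CHUNK];
    /* scatter: each process receives its own CHUNK-sized piece */
    MPI_Scatter(data, CHUNK, MPI_DOUBLE, local, CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    for (int i = 0; i < CHUNK; i++) local[i] *= 2.0;   /* some local work */

    /* gather: the root collects the processed pieces back into one array */
    MPI_Gather(local, CHUNK, MPI_DOUBLE, data, CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (my_rank == 0) {
        for (int i = 0; i < n; i++) printf("%.1f ", data[i]);
        printf("\n");
        free(data);
    }

    MPI_Finalize();
    return 0;
}
```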
Blocking behavior of MPI_Recv
The program will hang because MPI_Recv blocks until a matching message is received.
MPI_Bcast
It broadcasts input values (e.g., a, b, n) from one process (typically rank 0) to all other processes, eliminating the need for multiple sends.
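A sketch of broadcasting the trapezoidal-rule inputs; the values are hard-coded on rank 0 here instead of being read from stdin:

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank, n = 0;
    double a = 0.0, b = 0.0;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank == 0) {
        /* rank 0 "reads" the input (hard-coded here for brevity) */
        a = 0.0;
        b = 3.0;
        n = 1024;
    }

    /* one broadcast per value replaces a loop of individual MPI_Send calls */
    MPI_Bcast(&a, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(&b, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("Rank %d now has a = %f, b = %f, n = %d\n", my_rank, a, b, n);

    MPI_Finalize();
    return 0;
}
```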
Non-blocking functions
Functions such as MPI_Isend and MPI_Irecv return immediately, allowing computation to proceed while communication is in progress; completion is checked later with MPI_Wait or MPI_Test.
Blocking functions
Functions such as MPI_Send and MPI_Recv do not return until the operation is locally complete: the send buffer is safe to reuse, or the expected message has actually arrived.
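A sketch using the non-blocking pair MPI_Irecv/MPI_Isend to pass ranks around a ring; both calls return immediately, and MPI_Waitall blocks only when the data is actually needed:

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank, comm_sz;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    /* each process sends its rank to the next process in a ring */
    int dest = (my_rank + 1) % comm_sz;
    int source = (my_rank + comm_sz - 1) % comm_sz;
    int send_val = my_rank, recv_val = -1;
    MPI_Request reqs[2];

    /* both calls return immediately; the transfers complete in the background */
    MPI_Irecv(&recv_val, 1, MPI_INT, source, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&send_val, 1, MPI_INT, dest, 0, MPI_COMM_WORLD, &reqs[1]);

    /* ... useful computation could overlap with communication here ... */

    /* block only when the results are actually needed */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    printf("Rank %d received %d from rank %d\n", my_rank, recv_val, source);

    MPI_Finalize();
    return 0;
}
```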
MPI_Barrier
Used when you need all processes to synchronize and reach the same point in the program before continuing.
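A minimal sketch of a barrier separating two phases:

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    /* phase 1: every process does some setup work */
    printf("Rank %d finished its setup\n", my_rank);

    /* no process continues past this call until all processes have reached it */
    MPI_Barrier(MPI_COMM_WORLD);

    if (my_rank == 0)
        printf("All processes are past setup; starting the next phase\n");

    MPI_Finalize();
    return 0;
}
```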
Derived datatypes
Created using MPI functions like MPI_Type_contiguous or MPI_Type_create_struct to send and receive complex structures in a single call.
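A sketch that packs the a, b, n inputs into one derived datatype with MPI_Type_create_struct, so a single MPI_Bcast moves all three values (this mirrors the broadcast card above):

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank, n = 0;
    double a = 0.0, b = 0.0;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    if (my_rank == 0) { a = 0.0; b = 3.0; n = 1024; }   /* rank 0 "reads" the input */

    /* describe the layout: two doubles followed by an int, relative to &a */
    int block_lengths[3] = {1, 1, 1};
    MPI_Datatype types[3] = {MPI_DOUBLE, MPI_DOUBLE, MPI_INT};
    MPI_Aint displacements[3], a_addr, b_addr, n_addr;

    MPI_Get_address(&a, &a_addr);
    MPI_Get_address(&b, &b_addr);
    MPI_Get_address(&n, &n_addr);
    displacements[0] = 0;
    displacements[1] = b_addr - a_addr;
    displacements[2] = n_addr - a_addr;

    MPI_Datatype input_mpi_t;
    MPI_Type_create_struct(3, block_lengths, displacements, types, &input_mpi_t);
    MPI_Type_commit(&input_mpi_t);

    /* one broadcast moves all three values at once */
    MPI_Bcast(&a, 1, input_mpi_t, 0, MPI_COMM_WORLD);

    printf("Rank %d: a = %f, b = %f, n = %d\n", my_rank, a, b, n);

    MPI_Type_free(&input_mpi_t);
    MPI_Finalize();
    return 0;
}
```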
MPI_Wtime
Measures elapsed wall-clock time between points in the program, helping evaluate the efficiency of parallel execution.
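A sketch of the usual timing idiom: synchronize with a barrier, read MPI_Wtime before and after the work, and report the maximum elapsed time, since the slowest process determines the run time. The loop is just stand-in work:

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    MPI_Barrier(MPI_COMM_WORLD);          /* start the clocks together */
    double local_start = MPI_Wtime();

    /* stand-in for the parallel work being timed */
    double x = 0.0;
    for (long i = 0; i < 10000000L; i++) x += 1.0 / (i + 1.0);

    double local_elapsed = MPI_Wtime() - local_start;

    /* report the slowest process's time as the overall elapsed time */
    double elapsed;
    MPI_Reduce(&local_elapsed, &elapsed, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);

    if (my_rank == 0)
        printf("Elapsed time = %e seconds (x = %f)\n", elapsed, x);

    MPI_Finalize();
    return 0;
}
```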
Tag mismatch in MPI
The receive operation may not match the intended message, causing unexpected behavior or a hang.
Load balancing in parallel programs
Imbalanced workloads result in idle processes waiting for others to finish, wasting computational resources.
Risk of overusing MPI_Barrier
Excessive synchronization can lead to performance degradation by forcing processes to wait unnecessarily.
Avoiding deadlock in point-to-point communication
Order sends and receives so that every MPI_Send has a matching MPI_Recv, or use MPI_Sendrecv or non-blocking communication to break circular wait dependencies.
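One way to sketch a deadlock-free exchange is MPI_Sendrecv, which pairs the send and receive internally (the non-blocking ring above is another option):

```c
#include <stdio.h>
#include <mpi.h>

int main(void) {
    int my_rank, comm_sz;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);

    /* exchange values around a ring; if every process called MPI_Send before
       MPI_Recv, large messages could leave everyone blocked in Send (deadlock) */
    int dest = (my_rank + 1) % comm_sz;
    int source = (my_rank + comm_sz - 1) % comm_sz;
    int send_val = my_rank, recv_val = -1;

    /* MPI_Sendrecv combines the send and receive, so there is no circular wait */
    MPI_Sendrecv(&send_val, 1, MPI_INT, dest, 0,
                 &recv_val, 1, MPI_INT, source, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("Rank %d sent %d to %d and received %d from %d\n",
           my_rank, send_val, dest, recv_val, source);

    MPI_Finalize();
    return 0;
}
```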