Designing MPI Applications
Involves creating applications that use the Message Passing Interface (MPI) for communication between processes in a parallel computing environment, optimizing data exchange and synchronization.
When is an MPI program safe?
An MPI program is safe when there are no deadlocks or race conditions. It should also produce the same result regardless of the timing of execution, ensuring consistent behavior across different runs.
When do unsafe MPI programs arise?
When any two processes exchange information/data with each other.
Consider this example:
Say you have processes A and B, and each one calls Send and then Recv:
A: MPI_Send(B)
A: MPI_Recv(B)
B: MPI_Send(A)
B: MPI_Recv(A)
You will experience deadlock. Why, and what can you do to fix it?
Deadlock occurs because, depending on the size of the buffer, the data may have to be sent in multiple parts, filling the "mailboxes" (system buffers) until neither process can receive messages. To fix this, you can either swap the call order (have A Send then Recv while B Recvs then Sends) or use the combined MPI_Sendrecv operation, which pairs the send and receive in one call, ensuring that both processes can communicate without waiting on each other.
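A minimal sketch of the MPI_Sendrecv fix, assuming two ranks exchanging a single integer (the buffer names and tag are illustrative):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, partner, sendval, recvval;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    partner = 1 - rank;   /* rank 0 pairs with rank 1 */
    sendval = rank;

    /* The combined call pairs the send and the receive internally,
       so neither process blocks waiting for the other to post first. */
    MPI_Sendrecv(&sendval, 1, MPI_INT, partner, 0,
                 &recvval, 1, MPI_INT, partner, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d received %d\n", rank, recvval);
    MPI_Finalize();
    return 0;
}
```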
List the options to avoid deadlock with MPI
Have even-rank processes send while odd-rank processes receive, then reverse the roles
Use the asynchronous (non-blocking) functions that MPI has, MPI_Isend and MPI_Irecv (see the sketch after this list)
Use the combined, safer MPI_Sendrecv function
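A minimal sketch of the non-blocking option, assuming the same two-rank exchange as above:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, partner, sendval, recvval;
    MPI_Request reqs[2];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    partner = 1 - rank;
    sendval = rank;

    /* Both calls return immediately and the library makes progress in
       the background, so posting both sends first cannot deadlock. */
    MPI_Isend(&sendval, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Irecv(&recvval, 1, MPI_INT, partner, 0, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    printf("rank %d received %d\n", rank, recvval);
    MPI_Finalize();
    return 0;
}
```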
What is a common way to debug and find deadlock issues with MPI?
Using MPI_Ssend (synchronous send) will allow you to figure out where deadlock occurs. A synchronous send must wait until the receiving machine has started receiving before it completes, so buffering can no longer hide an unsafe pattern: if your code stops executing or loops forever, you know there is a case where two sends are being fired and neither can be matched.
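For example, a drop-in debugging wrapper (the helper name debug_send is illustrative; MPI_Ssend has the same signature as MPI_Send):

```c
#include <mpi.h>

/* Used while debugging in place of MPI_Send: identical signature,
   but it blocks until the matching receive has started, so unsafe
   orderings hang immediately instead of "sometimes working". */
static int debug_send(const void *buf, int count, MPI_Datatype type,
                      int dest, int tag, MPI_Comm comm) {
    return MPI_Ssend(buf, count, type, dest, tag, comm);
}
```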
Remote Memory Access
A programming model that allows processes to directly access the memory of a remote machine, facilitating data sharing and communication in parallel computing.
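In MPI this is the one-sided communication API. A minimal sketch, assuming two ranks where rank 0 writes directly into rank 1's exposed memory window:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, local = 0;
    MPI_Win win;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Expose each rank's `local` variable for remote access. */
    MPI_Win_create(&local, sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0) {
        int val = 42;
        /* Write into rank 1's memory; rank 1 never calls a receive. */
        MPI_Put(&val, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
    }
    MPI_Win_fence(0, win);   /* completes all pending RMA operations */

    if (rank == 1) printf("rank 1 now holds %d\n", local);
    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}
```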
Hybrid Parallelism
A combination of both shared and distributed memory programming models, allowing parallelism to exploit local and remote resources efficiently. For example, utilizing OpenMP for threading within a node and MPI for communication between nodes.
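A minimal sketch of the hybrid pattern, assuming an MPI build with thread support and OpenMP enabled (compile with something like mpicc -fopenmp). The MPI ranks are the coarse-grained units and the OpenMP loop iterations the fine-grained ones:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, provided;
    /* Coarse-grained: typically one MPI process per node. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    long local = 0, global = 0;
    /* Fine-grained: OpenMP threads share this node's memory. */
    #pragma omp parallel for reduction(+:local)
    for (long i = 0; i < 1000000; i++)
        local += 1;

    /* Combine per-node results across the cluster with MPI. */
    MPI_Reduce(&local, &global, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("global = %ld\n", global);
    MPI_Finalize();
    return 0;
}
```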
what is fine-grained parallelism
a model where tasks are divided into very small units that can be executed concurrently, maximizing resource utilization and reducing idle time. (OpenMP)
what is coarse-grained parallelism
a model where tasks are divided into larger units of work, allowing better communication and synchronization while reducing overhead for task management compared to fine-grained parallelism. (MPI)
Why use collective operations in parallel computing
They are safer and more efficient, and they make the code more readable than equivalent hand-written point-to-point code.
How does the MPI_Bcast function work under the hood
The Bcast function gives every process in a communicator a copy of the data from one specified process, typically the root process, utilizing a tree or ring algorithm to manage data transfer efficiently.
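A minimal usage sketch, assuming rank 0 is the root and one integer is broadcast:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) value = 7;   /* only the root has the data initially */

    /* Every rank calls Bcast; afterwards every rank holds the root's
       value. Internally the library fans the data out, e.g. along a tree. */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("rank %d sees %d\n", rank, value);
    MPI_Finalize();
    return 0;
}
```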
How does the MPI_Scatter function work
The Scatter function assigns every process in a communicator a distinct portion of the data from one specified process, typically the root process, allowing for the parallel processing of different parts of the data.
If you use the Scatter method, what are the two common methods to join the data back on the main process?
Gather and Reduce
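A minimal sketch of the round trip, assuming exactly four ranks and one integer per rank (MPI_Reduce could replace the Gather if only a combined value is needed):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, mine;
    int full[4] = {10, 20, 30, 40};   /* root's data, one item per rank */
    int back[4];
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size != 4) MPI_Abort(MPI_COMM_WORLD, 1);   /* sketch assumes 4 ranks */

    /* Root hands each rank one distinct element. */
    MPI_Scatter(full, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);

    mine *= 2;   /* each rank processes its own portion */

    /* Gather reassembles the processed pieces on the root. */
    MPI_Gather(&mine, 1, MPI_INT, back, 1, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0)
        for (int i = 0; i < size; i++) printf("back[%d] = %d\n", i, back[i]);
    MPI_Finalize();
    return 0;
}
```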
What is a Cluster
A group of machines that are tightly coupled, usually connected by a LAN (Local Area Network).
How are HPC clusters oriented?
HPC clusters are typically batch-oriented and designed for high-throughput computing, allowing multiple jobs to run concurrently while efficiently managing resources.
What does a basic HPC cluster consist of?
A head node for job submission and cluster management, plus compute nodes that run the jobs.
What is the typical workflow of an HPC cluster?
A user logs in → the user submits a job to the head node → the job is put in the queue → it gets scheduled by the head node onto compute nodes.
What corporations use HPC clusters
Google Cloud, Amazon Web Services, Microsoft Azure.
What is SLURM
Simple Linux Utility for Resource Management, a job scheduler commonly used on HPC clusters.
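A minimal batch-script sketch (the job name, resource counts, and the binary ./my_mpi_app are placeholder assumptions; partitions and limits are site-specific):

```bash
#!/bin/bash
#SBATCH --job-name=demo        # name shown in the queue
#SBATCH --nodes=2              # number of machines requested
#SBATCH --ntasks=8             # total MPI processes across the nodes
#SBATCH --time=00:10:00        # wall-clock limit

srun ./my_mpi_app              # launch the tasks under SLURM
```

Submit with `sbatch job.sh` and watch the queue with `squeue -u $USER`.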
What is Job Scheduling?
A job is the unit of work in an HPC cluster.
A job is a set of tasks.
A task is an executable program.
A program yields one or more processes, the atoms of scheduling.
What is Map-Reduce?
It is an approach for analyzing Big Data
What are the pros of Map-Reduce
massively parallel
fault tolerant
easy to program
What is the condition to use Map-Reduce
it must be perfectly parallel
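A minimal sketch of the map-then-reduce pattern expressed in MPI (not Hadoop; the work split and the odd-number count are illustrative). Each rank maps over its own slice with no communication, which is what "perfectly parallel" buys you, then a single reduce combines the partial results:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    long local = 0, total = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Map: each rank independently processes its own slice of 0..999999.
       Perfectly parallel: no rank needs data from any other rank. */
    for (long i = rank; i < 1000000; i += size)
        local += i % 2;          /* e.g., count the odd numbers */

    /* Reduce: combine the per-rank partial counts into one result. */
    MPI_Reduce(&local, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("odd numbers found: %ld\n", total);
    MPI_Finalize();
    return 0;
}
```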
Real world examples of Map-Reduce
Search Engines
Social Media
Financial
Health Industry
Insurance
Credit Card Companies
Typical Usages for Map-Reduce
Working with unstructured data not stored in a database
When you want to extract one feature
MaPle CMU Functional Programming
There has been research into improving parallel computing using functional languages, since they get rid of the pointers that cause issues for parallel computing.
What does the Restrict Keyword do
It supports automatic parallelization: marking a pointer restrict promises the compiler that the buffer it points to does not overlap (alias) with any other pointer's, so loops that use it can be parallelized or vectorized safely.
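A minimal sketch; declaring the pointers restrict lets the compiler assume the three arrays never overlap, so it may auto-vectorize or auto-parallelize the loop:

```c
/* Without restrict the compiler must assume `out` might alias `a` or `b`,
   forcing reloads from memory and serializing the loop. */
void vec_add(int n, double *restrict out,
             const double *restrict a, const double *restrict b) {
    for (int i = 0; i < n; i++)
        out[i] = a[i] + b[i];   /* safe to vectorize: no overlap */
}
```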