A program has 80% of its code parallelizable and 20% strictly serial. What is the maximum speedup possible with infinite processors according to Amdahl's Law?
5x maximum speedup, because S = 1 / (0.2 + 0.8/∞) = 1 / 0.2 = 5.
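The arithmetic above can be checked with a small sketch of Amdahl's Law (the function name is ours, chosen for illustration):

```python
def amdahl_speedup(serial_fraction, processors):
    """Amdahl's Law: S(p) = 1 / (s + (1 - s)/p), where s is the serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

# 20% serial, 80% parallelizable
print(amdahl_speedup(0.2, 4))    # finite processor count: 2.5
print(1.0 / 0.2)                 # limit as p -> infinity: 5.0
```

Note that even with infinitely many processors the speedup never exceeds 1/s, which is why the answer is 5x.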
A program runs in 100 seconds serially. With 4 processors it runs in 30 seconds. What is the speedup and efficiency?
Speedup = 100/30 ≈ 3.33. Efficiency = 3.33/4 ≈ 0.83 (83%).
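A minimal sketch of these two definitions (function names are ours):

```python
def speedup(t_serial, t_parallel):
    """Speedup = serial time / parallel time."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, processors):
    """Efficiency = speedup / processor count."""
    return speedup(t_serial, t_parallel) / processors

s = speedup(100, 30)        # ~3.33
e = efficiency(100, 30, 4)  # ~0.83
print(round(s, 2), round(e, 2))
```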
A parallel program shows very little improvement when adding more processors. What is the most likely theoretical explanation?
Amdahl's Law: the serial portion of the program limits total possible speedup.
You distribute a large array across 8 processors and each processor calculates the sum of its section before combining results. What type of parallelism is this?
Data parallelism.
Each processor performs a different stage of a computation pipeline (input, processing, output). What type of parallelism is this?
Task parallelism.
Why can a shared-memory system suffer from contention when many processors access the same variable?
Multiple processors attempt to access the same memory location simultaneously, causing delays due to memory synchronization.
Why are distributed-memory systems often more scalable than shared-memory systems?
Because each processor has its own memory, reducing contention and allowing more processors to be added without a shared bottleneck.
A program uses MPI_Reduce to sum values from all processors. Which processor receives the final result?
The root process.
When would MPI_Allreduce be preferred over MPI_Reduce?
When all processors need the final reduced value rather than just the root.
What problem occurs if two threads update the same variable simultaneously without synchronization?
Race condition.
Why must critical sections be protected by synchronization mechanisms?
To ensure only one thread accesses shared resources at a time and prevent inconsistent results.
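A sketch of a protected critical section using Python's `threading.Lock` (thread count and iteration count are arbitrary): with the lock, the shared counter always ends at the expected total; removing it would reintroduce the race condition described above.

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        with lock:       # critical section: only one thread increments at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 400000 — always correct with the lock held
```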
What is the difference between a mutex and a semaphore?
A mutex allows only one thread to access a resource, while a semaphore allows a limited number of threads based on a counter.
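The distinction can be demonstrated with Python's `threading` primitives. This sketch (counter names are ours) tracks the peak number of threads inside a semaphore-guarded region; the semaphore's count of 3 caps it, whereas a mutex would cap it at 1.

```python
import threading
import time

pool = threading.Semaphore(3)   # counting semaphore: at most three holders
guard = threading.Lock()        # mutex protecting the bookkeeping counters

peak = 0     # highest number of threads seen inside the region at once
active = 0

def use_resource():
    global peak, active
    with pool:                  # up to 3 threads may be here concurrently
        with guard:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)        # hold the resource briefly
        with guard:
            active -= 1

threads = [threading.Thread(target=use_resource) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrency:", peak)  # never exceeds the semaphore's count of 3
```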
What situation can lead to deadlock in parallel programs?
When multiple processes hold resources while waiting for other resources held by each other in a circular dependency.
Why can MPI_Send followed by MPI_Recv between two processes sometimes cause deadlock?
If both processes call blocking MPI_Send first, each waits for the other to post its MPI_Recv, so neither makes progress.
How does MPI_Sendrecv help prevent deadlock?
It performs the send and receive simultaneously within a single call.
Why are atomic operations important in multithreaded programs?
They ensure operations complete fully without interruption, preventing race conditions.
Why might adding more processors decrease efficiency in a parallel program?
Increased communication overhead, synchronization costs, and idle processors reduce efficiency.
Why is latency more important than bandwidth for small messages in parallel systems?
Because the startup delay dominates when little data is transmitted.
Why is bandwidth more important than latency for large data transfers?
Because the transfer rate determines how quickly large amounts of data move.
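Both answers follow from the simple linear cost model t = latency + size / bandwidth. A sketch with a hypothetical link (1 µs latency, 10 GB/s bandwidth; these numbers are illustrative, not from the source):

```python
def transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Linear message-cost model: t = latency + size / bandwidth."""
    return latency_s + size_bytes / bandwidth_bytes_per_s

LATENCY = 1e-6       # 1 microsecond startup delay
BANDWIDTH = 10e9     # 10 GB/s

small = transfer_time(8, LATENCY, BANDWIDTH)     # 8-byte message
large = transfer_time(1e9, LATENCY, BANDWIDTH)   # 1 GB message

print(LATENCY / small)              # latency dominates the small transfer
print((large - LATENCY) / large)    # bandwidth term dominates the large one
```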
A GPU applies the same operation to millions of pixels simultaneously. What parallel model does this represent?
SIMD.
Multiple CPU cores independently execute different programs or tasks simultaneously. What model does this represent?
MIMD.
Why is the Von Neumann bottleneck still relevant in modern computing?
The CPU can process instructions faster than memory can supply data, limiting performance.
Why is load balancing important in parallel programs?
If work is unevenly distributed, some processors finish early and sit idle while others continue working.
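A common way to avoid this is a block partition in which chunk sizes differ by at most one. A sketch (the function name is ours):

```python
def block_partition(n_items, n_procs):
    """Split n_items across n_procs so chunk sizes differ by at most 1."""
    base, extra = divmod(n_items, n_procs)
    return [base + (1 if i < extra else 0) for i in range(n_procs)]

print(block_partition(10, 4))  # [3, 3, 2, 2] — no processor is overloaded
```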
A parallel system shows speedup of 7 when using 8 processors. What does this indicate about efficiency?
Efficiency = 7/8 = 87.5%, which is relatively high.
If communication time becomes larger than computation time in a parallel program, what happens to scalability?
Scalability decreases because communication overhead dominates.
Why might distributed-memory systems require explicit communication (MPI)?
Because each processor has private memory and cannot directly access another processor's memory.
Why do threads in shared-memory systems communicate faster than MPI processes?
Threads share the same memory space and do not require message passing.
A program frequently locks and unlocks a mutex around a shared variable. What performance issue might occur?
Contention and reduced parallel performance due to excessive locking.
Why are embarrassingly parallel problems ideal for parallel computing?
They require little or no communication between processors.
A program divides work among processors but one processor receives much more work than others. What problem is this?
Load imbalance.
A reduction operation is performed repeatedly in a parallel program. What optimization could improve performance?
Tree-based reduction to reduce communication time.
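A sequential sketch of the idea: combining values pairwise takes O(log p) rounds instead of the O(p) steps of accumulating into one processor (the function name is ours; a real MPI implementation would do each round in parallel).

```python
def tree_reduce(values):
    """Pairwise (tree) reduction; returns (result, number of combining rounds)."""
    rounds = 0
    while len(values) > 1:
        values = [values[i] + values[i + 1] if i + 1 < len(values) else values[i]
                  for i in range(0, len(values), 2)]
        rounds += 1
    return values[0], rounds

total, rounds = tree_reduce(list(range(8)))
print(total, rounds)  # sum of 0..7 is 28, reached in 3 rounds (log2 of 8)
```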