This set covers fundamental terminology and theoretical laws of parallel computing, including architectures, performance metrics, cache coherence, and network topologies.
Shared Memory
A parallel architecture where all processors have access to a common global address space, allowing communication through reading and writing to shared variables.
Message Passing
A parallel programming model where processors have their own private memory and communicate by explicitly sending and receiving data packets over an interconnection network.
UMA (Uniform Memory Access)
A shared memory architecture where the time to access any memory location is the same for all processors, typically implemented using a centralized memory controller.
NUMA (Non-Uniform Memory Access)
An architecture where memory is physically distributed but logically shared; the time taken to access memory depends on whether the memory is local to the processor or remote.
SIMD (Single Instruction Multiple Data)
A type of parallel computing where a single control unit broadcasts the same instruction to multiple processing elements, each operating on different data points.
Instruction Level Parallelism (ILP)
A measure of how many of the instructions in a computer program can be executed simultaneously using techniques like pipelining or superscalar execution.
Multicore Processor
A single computing component (chip) containing two or more independent processing units, called cores, each of which reads and executes program instructions.
Speedup (S)
The ratio of the sequential execution time ($T_1$) to the parallel execution time using $N$ processors ($T_N$), expressed as $S = T_1 / T_N$.
Efficiency (E)
A measure of how well processors are utilized, defined as the ratio of speedup ($S$) to the number of processors ($N$), expressed as $E = S / N$.
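The two ratios above can be computed directly from measured run times. This is a minimal sketch; the function names and the sample timings are illustrative assumptions, not part of the original set.

```python
def speedup(t1: float, tn: float) -> float:
    """Speedup S = T1 / TN (sequential time over parallel time)."""
    return t1 / tn


def efficiency(t1: float, tn: float, n: int) -> float:
    """Efficiency E = S / N (speedup per processor)."""
    return speedup(t1, tn) / n


# Hypothetical timings: 100 s sequentially, 25 s on 8 processors.
s = speedup(100.0, 25.0)        # 4.0
e = efficiency(100.0, 25.0, 8)  # 0.5
```

An efficiency of 0.5 here means that, on average, each of the 8 processors does useful work only half the time.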
Amdahl's Law
A formula used to predict the theoretical maximum speedup for a program with a fixed workload, expressed as $S = \frac{1}{(1 - P) + \frac{P}{N}}$, where $P$ is the parallelizable fraction of the code and $N$ is the number of processors.
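A short sketch of Amdahl's Law as a function may help show how the serial fraction caps speedup. The function name and the example values of P and N are assumptions chosen for illustration.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Theoretical speedup for parallel fraction p on n processors.

    Implements S = 1 / ((1 - p) + p / n) for a fixed workload.
    """
    return 1.0 / ((1.0 - p) + p / n)


# With 90% of the work parallelizable, speedup approaches but
# never reaches 1 / (1 - 0.9) = 10, however many processors are added.
for n in (2, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 3))
```

Under these assumed values, going from 100 to 1000 processors adds very little, because the 10% serial part dominates the run time.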
Gustafson's Law
A principle stating that as the number of processors increases, the problem size can also increase, allowing for better scalability than predicted by Amdahl's Law.
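The card above states the principle in words only. One common way to quantify it is the scaled speedup $S = (1 - P) + P \cdot N$, where $P$ is taken as the fraction of the parallel run time spent in parallel work. That formula is not given in this set, so treat the sketch below, including the function name, as an assumption.

```python
def gustafson_speedup(p: float, n: int) -> float:
    """Scaled speedup S = (1 - p) + p * n.

    Here p is the parallel fraction measured on the n-processor run,
    and the problem size is assumed to grow with n.
    """
    return (1.0 - p) + p * n


# With p = 0.9, scaled speedup grows roughly linearly with n,
# unlike the fixed-workload bound of 10 from Amdahl's Law.
for n in (2, 10, 100):
    print(n, round(gustafson_speedup(0.9, n), 2))
```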
Cache Coherence
The mechanism that ensures that data stored in multiple local caches remains consistent when different processors attempt to update the same memory location.
Snooping Protocol
A cache coherence method where all cache controllers monitor (or 'snoop') the shared bus to identify if they hold a copy of the data being requested by another processor.
Directory-Based Protocol
A cache coherence method where a directory (centralized or distributed across nodes) tracks the state and location of all cached memory blocks, avoiding the scalability limits of bus-based snooping.
MESI Protocol
A specific cache coherence protocol that labels each cache line with one of four states: Modified, Exclusive, Shared, or Invalid.
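The four states can be pictured as a small state machine for a single cache line. The sketch below is a simplified textbook model, not a description of any particular processor: the event names (`local_read`, `local_write`, `bus_read`, `bus_write`) and the `others_have_copy` flag are assumptions made for illustration, and real implementations add transient states and write-back traffic that are omitted here.

```python
# Next state for one cache line, keyed by (current state, event).
# "local_*" events come from this cache's own processor;
# "bus_*" events are requests observed from other caches.
TRANSITIONS = {
    ("M", "local_read"): "M", ("M", "local_write"): "M",
    ("M", "bus_read"): "S",   ("M", "bus_write"): "I",
    ("E", "local_read"): "E", ("E", "local_write"): "M",
    ("E", "bus_read"): "S",   ("E", "bus_write"): "I",
    ("S", "local_read"): "S", ("S", "local_write"): "M",
    ("S", "bus_read"): "S",   ("S", "bus_write"): "I",
    ("I", "local_write"): "M",
    ("I", "bus_read"): "I",   ("I", "bus_write"): "I",
}


def next_state(state: str, event: str, others_have_copy: bool = False) -> str:
    """Return the next MESI state of a line in this simplified model."""
    if state == "I" and event == "local_read":
        # A read miss loads the line as Shared if another cache holds it,
        # otherwise as Exclusive.
        return "S" if others_have_copy else "E"
    return TRANSITIONS[(state, event)]
```

For example, a line read with no other sharers becomes Exclusive, and a later local write moves it to Modified without any bus traffic in this model.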
Latency
The total time required to transmit a message from a source node to a destination node, including overhead, routing, and transmission time.
Bandwidth
The maximum rate at which data can be transferred over a communication channel or network link, usually measured in bits or bytes per second.
Bisection Bandwidth
The bandwidth available between two equal halves of a network when they are partitioned by a minimum cut; it is a measure of the network's global communication capacity.
Interconnection Topology
The physical or logical arrangement of nodes and links in a network, such as bus, ring, mesh, hypercube, or crossbar.
Hypercube Network
A network topology with $2^d$ nodes where each node is connected to exactly $d$ neighbors, represented as vertices of a $d$-dimensional cube.
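If nodes are labeled with $d$-bit integers, a common convention is that two nodes are neighbors exactly when their labels differ in one bit. The sketch below assumes that labeling; the function names are illustrative.

```python
def hypercube_neighbors(node: int, d: int) -> list[int]:
    """Return the d neighbors of a node in a d-dimensional hypercube.

    Nodes are assumed to be labeled 0 .. 2**d - 1; flipping bit k of
    the label gives the neighbor along dimension k.
    """
    if not 0 <= node < 2 ** d:
        raise ValueError("node label out of range for this dimension")
    return [node ^ (1 << k) for k in range(d)]


def hop_distance(a: int, b: int) -> int:
    """Minimum hops between two nodes: the number of differing bits."""
    return bin(a ^ b).count("1")


print(hypercube_neighbors(0b101, 3))  # [4, 7, 1]
print(hop_distance(0b000, 0b111))     # 3
```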
Wormhole Routing
A flow control technique that breaks a packet into smaller units called flits (flow control units); the header flit determines the route, and subsequent flits follow in a pipelined fashion.
Race Condition
A flaw in a parallel program where the output is dependent on the sequence or timing of uncontrollable events, such as the order in which threads are scheduled.
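A typical example is several threads incrementing a shared counter: the read-modify-write can interleave so that updates are lost. The sketch below shows one common remedy, guarding the update with a lock; the function name and thread counts are assumptions for illustration.

```python
import threading


def run_counter(num_threads: int, increments: int) -> int:
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # Without the lock, the read-modify-write below could
            # interleave between threads and lose updates.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


total = run_counter(4, 1000)
```

With the lock in place the result no longer depends on how the threads happen to be scheduled.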
Deadlock
A situation in parallel computing where two or more threads are permanently blocked because each is waiting for a resource held by another.
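The classic case is two threads that take the same two locks in opposite order. One widely used way to avoid it is to acquire locks in a single global order. The sketch below orders locks by `id()`, which is just one possible choice; the helper names are hypothetical, and the two locks are assumed to be distinct objects.

```python
import threading


def with_both(lock_a: threading.Lock, lock_b: threading.Lock, action) -> None:
    """Run action while holding both locks, always acquired in the same order."""
    # Sorting by id() gives every thread the same acquisition order,
    # so no thread can hold one lock while waiting for the other
    # in the opposite order.
    first, second = sorted((lock_a, lock_b), key=id)
    with first, second:
        action()


def demo() -> list[str]:
    a, b = threading.Lock(), threading.Lock()
    done: list[str] = []
    # The two threads name the locks in opposite order, which could
    # deadlock if each simply acquired its first argument first.
    t1 = threading.Thread(target=with_both, args=(a, b, lambda: done.append("t1")))
    t2 = threading.Thread(target=with_both, args=(b, a, lambda: done.append("t2")))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return sorted(done)
```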
Barrier
A synchronization point in a parallel program where all participating threads or processes must arrive before any are permitted to proceed.
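Python's standard library provides `threading.Barrier`, which can illustrate the idea. In this sketch (the function name and event labels are assumptions), no thread records "after" until every thread has recorded "before".

```python
import threading


def run_phases(num_threads: int) -> list[str]:
    """Run a two-phase computation separated by a barrier."""
    events: list[str] = []
    barrier = threading.Barrier(num_threads)

    def worker() -> None:
        events.append("before")  # phase 1 work
        barrier.wait()           # block until all threads arrive
        events.append("after")   # phase 2 work

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


log = run_phases(4)
```

Whatever the scheduling, the first four entries of `log` are "before" and the last four are "after".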
MPI (Message Passing Interface)
A standardized and portable message-passing system designed to function on a wide variety of parallel computing architectures.
OpenMP
A programming interface that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, primarily using compiler directives.