Comprehensive vocabulary flashcards covering key terms and concepts from ECE 7400 lectures on concurrency, parallelism, synchronization primitives, design patterns, and OpenMP.
Concurrent Computing
A computing model where multiple computations are executed during overlapping time periods, emphasizing correct handling of simultaneous execution flows.
Parallel Computing
A computing approach in which many calculations or processes are carried out simultaneously to achieve faster run-times.
Distributed Computing
Computation performed on components located on different networked computers that communicate to achieve a common goal.
Concurrency (in practice)
The situation of multiple execution flows (e.g., threads) accessing shared resources at the same time, not necessarily for speed.
Parallelism (in practice)
Using multiple processing resources (CPUs/cores) simultaneously to solve a problem faster.
Shared Resource
A data structure or device accessed by more than one thread or process concurrently.
Responsiveness
Ability of an application to remain reactive to user or system events by delegating time-consuming tasks to separate threads.
Failure Isolation
Design principle where an exception in one concurrent task does not stop other tasks from running.
Moore’s Law
Observation that transistor counts on integrated circuits double approximately every two years.
Multicore Machine
A single computer containing two or more independent processing cores on one chip.
GPU (Graphics Processing Unit)
Hardware accelerator originally for graphics; now widely used for massively parallel computations with many simple ALUs.
Flynn’s Taxonomy
Classification of computer architectures into SISD, SIMD, MISD, and MIMD based on number of instruction and data streams.
SISD
Single Instruction, Single Data – traditional sequential machine executing one instruction on one piece of data at a time.
SIMD
Single Instruction, Multiple Data – architecture where one instruction operates on multiple data items simultaneously.
MISD
Multiple Instructions, Single Data – rare architecture used mainly for fault-tolerant systems.
MIMD
Multiple Instructions, Multiple Data – architecture with many independent processors; includes most multicore CPUs and GPUs.
Shared-Memory Machine
Parallel system where multiple CPUs access a single, common memory address space.
Distributed-Memory Machine
Parallel system composed of separate computers that communicate via message passing; each has its own local memory.
Master-Worker Model
Shared-memory configuration where some processors have dedicated roles (e.g., I/O or graphics).
Symmetric Multiprocessor (SMP)
Shared-memory system in which all CPUs are identical and have equal access to memory.
Speedup
Ratio t_seq / t_par measuring the improvement of a parallel program over its sequential counterpart.
Efficiency (Parallel)
Speedup divided by number of processing units; indicates average utilization of each unit.
Amdahl’s Law
Upper bound on speedup, 1 / ((1 − α) + α/N), where α is the parallelizable fraction and N the number of processors; gains stay limited whenever the sequential fraction (1 − α) is non-zero.
Gustafson-Barsis Law
Alternative speedup model assuming problem size scales with processors; predicts higher potential gains than Amdahl.
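A small worked example ties the speedup cards together; the numbers below (parallel fraction α = 0.9, N = 8 processors) are illustrative, not taken from the lectures.

```latex
% Speedup and efficiency, then Amdahl vs. Gustafson-Barsis for alpha = 0.9, N = 8.
\[
S = \frac{t_{\text{seq}}}{t_{\text{par}}}, \qquad E = \frac{S}{N}
\]
\[
S_{\text{Amdahl}} = \frac{1}{(1-\alpha) + \alpha/N} = \frac{1}{0.1 + 0.9/8} \approx 4.7,
\qquad
S_{\text{Gustafson}} = (1-\alpha) + \alpha N = 0.1 + 0.9 \cdot 8 = 7.3
\]
```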
Linear Speedup
Ideal case where speedup equals number of processors (efficiency = 100%).
Super-Linear Speedup
Rare situation where speedup exceeds number of processors, often due to cache effects.
Profiling
Measuring where a program spends time to guide optimization; done via instrumentation or sampling.
Instrumentation
Profiler technique that inserts extra code to gather execution data, requiring recompilation.
Sampling (Profiling)
Profiler method that periodically interrupts execution to record current function without modifying code.
Scalability
Ability of a parallel program to maintain efficiency as problem size or processor count increases.
Process
An executing instance of a program, containing one or more threads and its own resources.
Thread
Smallest unit of execution scheduled by the OS; shares process resources but has its own stack.
Fork (Process)
System call that duplicates a process, creating a child with its own copy of code and data.
Run-Time Stack
Per-thread memory region holding local variables, return addresses, and function call information.
Thread Spawning
Creating a new thread within a process to execute concurrently with the parent thread.
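A minimal sketch of thread spawning with C++11 std::thread; the function and thread names are illustrative.

```cpp
#include <iostream>
#include <thread>

void worker(int id) {
    std::cout << "worker " << id << " running\n";  // executes concurrently with main
}

int main() {
    std::thread t(worker, 1);         // spawn a new thread inside this process
    std::cout << "main continues\n";  // parent thread keeps running
    t.join();                         // wait for the spawned thread to finish
}
```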
Race Condition
Program anomaly where result depends on relative timing of events due to unsynchronized access to shared data.
Data Race
Specific race condition where two or more threads access the same memory location concurrently and at least one access is a write.
Atomic Operation
Uninterruptible action that either completes entirely or not at all, preventing intermediate inconsistent states.
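As a hedged illustration, C++ std::atomic provides such uninterruptible updates; the counter below always reaches 200000, whereas a plain long could lose increments.

```cpp
#include <atomic>
#include <iostream>
#include <thread>

int main() {
    std::atomic<long> counter{0};
    auto bump = [&counter] {
        for (int i = 0; i < 100000; ++i)
            counter.fetch_add(1);          // atomic read-modify-write, no torn updates
    };
    std::thread t1(bump), t2(bump);
    t1.join();
    t2.join();
    std::cout << counter.load() << "\n";   // 200000
}
```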
Critical Section
Portion of code that must be executed by only one thread at a time to avoid races.
Mutex (Mutual Exclusion)
Synchronization object allowing one thread at a time to enter a critical section.
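A minimal sketch of a mutex guarding a critical section via std::lock_guard; the shared vector is just an example resource.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex m;
std::vector<int> shared_log;              // shared resource

void append(int v) {
    std::lock_guard<std::mutex> lock(m);  // only one thread at a time past this point
    shared_log.push_back(v);
}                                         // mutex released when lock leaves scope

int main() {
    std::thread a(append, 1), b(append, 2);
    a.join();
    b.join();
}
```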
Semaphore
Synchronization primitive with an integer counter and atomic acquire/release operations, used for locking, resource counting, or signaling.
Binary Semaphore
Semaphore initialized to 1, functioning similarly to a mutex.
Counting Semaphore
Semaphore initialized to value >1, representing available instances of a resource.
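A sketch using C++20 std::counting_semaphore, assuming a resource with three available instances (the slot count is illustrative).

```cpp
#include <semaphore>
#include <thread>
#include <vector>

std::counting_semaphore<3> slots(3);   // three resource instances available

void use_resource() {
    slots.acquire();                   // P: decrement, block if no instance is free
    // ... use one instance of the resource ...
    slots.release();                   // V: increment, wake a waiter if any
}

int main() {
    std::vector<std::thread> ts;
    for (int i = 0; i < 8; ++i) ts.emplace_back(use_resource);
    for (auto& t : ts) t.join();
}
```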
Deadlock
Situation where a group of threads cannot proceed because each is waiting for resources held by others.
Starvation
Condition where a thread is indefinitely delayed from making progress, often due to scheduling or resource contention.
Acquire (P)
Semaphore operation that decrements the counter and blocks the thread if the result is negative.
Release (V)
Semaphore operation that increments the counter and wakes a waiting thread if any exist.
Producer-Consumer Problem
Classical synchronization scenario where producers generate data placed into a buffer and consumers remove it.
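One common solution sketch uses two counting semaphores (free slots and filled slots) plus a mutex around the buffer; the capacity and item count below are arbitrary.

```cpp
#include <mutex>
#include <queue>
#include <semaphore>
#include <thread>

constexpr int CAP = 4;
std::counting_semaphore<CAP> free_slots(CAP);  // empty slots available
std::counting_semaphore<CAP> filled_slots(0);  // items available
std::mutex buf_mutex;
std::queue<int> buffer;

void producer() {
    for (int i = 0; i < 10; ++i) {
        free_slots.acquire();                              // wait for space
        { std::scoped_lock lk(buf_mutex); buffer.push(i); }
        filled_slots.release();                            // signal an item is ready
    }
}

void consumer() {
    for (int i = 0; i < 10; ++i) {
        filled_slots.acquire();                            // wait for an item
        { std::scoped_lock lk(buf_mutex); buffer.pop(); }
        free_slots.release();                              // signal a slot is free
    }
}

int main() {
    std::thread p(producer), c(consumer);
    p.join();
    c.join();
}
```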
Blocking
State in which a thread is suspended, waiting for a condition or resource before continuing.
RAII
‘Resource Acquisition Is Initialization’ – C++ idiom tying resource lifetime to object lifetime via constructors/destructors.
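A minimal RAII sketch: the destructor releases the resource even on early return or exception (the File wrapper is illustrative, not a standard class).

```cpp
#include <cstdio>

class File {
    std::FILE* f_;
public:
    explicit File(const char* path) : f_(std::fopen(path, "w")) {}  // acquire in constructor
    ~File() { if (f_) std::fclose(f_); }                            // release tied to lifetime
    std::FILE* get() const { return f_; }
};

int main() {
    File log("out.txt");
    if (log.get()) std::fprintf(log.get(), "hello\n");
}                                   // fclose runs automatically here
```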
Smart Pointer
C++ template object that manages dynamic memory automatically using RAII semantics.
unique_ptr
Smart pointer type with sole ownership of a dynamically allocated object; non-copyable, move-only.
shared_ptr
Smart pointer allowing multiple owners of an object, with reference-counted lifetime management.
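A short sketch contrasting the two smart pointer types.

```cpp
#include <iostream>
#include <memory>

int main() {
    auto u  = std::make_unique<int>(42);  // sole owner
    auto u2 = std::move(u);               // ownership moves; copying would not compile

    auto s1 = std::make_shared<int>(7);   // reference count = 1
    auto s2 = s1;                         // reference count = 2
    std::cout << *u2 << " " << *s1 << " count=" << s1.use_count() << "\n";
}                                         // objects freed when the last owner is destroyed
```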
Lambda Function
Inline, unnamed function object capable of capturing variables from surrounding scope.
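A minimal capture example (the values are arbitrary).

```cpp
#include <iostream>

int main() {
    int base = 10, total = 0;
    auto add = [base, &total](int x) { total += base + x; };  // base by value, total by reference
    add(1);
    add(2);
    std::cout << total << "\n";   // prints 23
}
```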
scoped_lock
RAII wrapper that acquires one or more mutexes on construction (using a deadlock-avoidance algorithm when locking several) and releases them on destruction.
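A sketch of std::scoped_lock (C++17) taking two mutexes at once; both are released at the end of the scope.

```cpp
#include <mutex>

std::mutex m_a, m_b;
int a = 0, b = 0;

void transfer() {
    std::scoped_lock lock(m_a, m_b);  // locks both without risking lock-order deadlock
    --a;
    ++b;
}                                     // both mutexes released here

int main() {
    transfer();
}
```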
Condition Variable
Synchronization primitive that blocks threads until notified, always used with a mutex.
Spurious Wakeup
Phenomenon where a thread waiting on a condition variable wakes without a corresponding notification; requires re-checking condition.
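A sketch of the usual wait-with-predicate idiom, which re-checks the condition and therefore tolerates spurious wakeups.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

std::mutex m;
std::condition_variable cv;
std::queue<int> q;

void consumer() {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [] { return !q.empty(); });   // loops internally until the predicate holds
    q.pop();
}

void producer() {
    { std::lock_guard<std::mutex> lk(m); q.push(1); }
    cv.notify_one();                          // wake a waiting consumer
}

int main() {
    std::thread c(consumer), p(producer);
    p.join();
    c.join();
}
```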
Monitor Pattern
Design encapsulating shared data with the synchronization needed to access it within a single class/module.
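A small monitor sketch: the data and its mutex live inside one class, so callers can only reach the state through synchronized methods (the counter is illustrative).

```cpp
#include <mutex>

class Counter {                   // monitor: state + lock encapsulated together
    int value_ = 0;
    mutable std::mutex m_;
public:
    void increment()  { std::lock_guard<std::mutex> lk(m_); ++value_; }
    int  get() const  { std::lock_guard<std::mutex> lk(m_); return value_; }
};

int main() {
    Counter c;
    c.increment();
    return c.get() == 1 ? 0 : 1;
}
```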
PCAM Methodology
Parallel design process of Partitioning, Communication, Agglomeration, and Mapping proposed by Ian Foster.
Partitioning (PCAM)
Breaking computation or data into discrete tasks that could execute in parallel.
Communication (PCAM)
Identification of data exchanges required between tasks created during partitioning.
Agglomeration (PCAM)
Combining tasks to reduce communication and overhead.
Mapping (PCAM)
Assigning tasks or task groups to processors with load balancing and locality considerations.
Decomposition Pattern
Reusable strategy for breaking a problem into parallel tasks (e.g., geometric, divide-and-conquer, pipeline).
Geometric Decomposition
Splitting data structures like arrays or grids along dimensions to create independent sub-problems.
Divide-and-Conquer
Algorithmic technique that recursively splits a problem, solves sub-problems, and merges results; parallelizable via tasks.
Task Parallelism
Parallel model where different tasks (functions or stages) run concurrently rather than splitting data.
Pipeline Pattern
Parallel structure where data items pass through a sequence of stages, each handled by a separate task.
Globally Parallel, Locally Sequential (GPLS)
Program structure where multiple tasks execute concurrently but each task runs sequential code.
Globally Sequential, Locally Parallel (GSLP)
Program structure that runs sequentially overall but executes certain regions in parallel when advantageous.
OpenMP
API of compiler directives, runtime functions, and environment variables for shared-memory parallel programming in C/C++ and Fortran.
Parallel Region (OpenMP)
Code block executed simultaneously by a team of threads created with #pragma omp parallel.
Fork-Join Model
Execution paradigm where the master thread forks a team for parallel work and joins back after completion.
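A minimal OpenMP parallel region, compiled with an OpenMP flag such as -fopenmp: the master forks a team, each thread runs the block, and all join at the implicit barrier.

```cpp
#include <cstdio>
#include <omp.h>

int main() {
    #pragma omp parallel                       // fork a team of threads
    {
        printf("hello from thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }                                          // join: implicit barrier here
    return 0;
}
```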
Structured Block
Single-entry, single-exit code block associated with an OpenMP directive.
Thread Team (OpenMP)
Set of threads that execute a particular parallel region.
Shared Variable (OpenMP)
Data visible to all threads in a parallel region; default for variables defined outside the region.
Private Variable (OpenMP)
Data for which each thread gets its own instance inside a parallel region.
Reduction Variable (OpenMP)
Variable for which each thread keeps a private copy whose values are combined at region end using a specified operation.
firstprivate
OpenMP clause giving each thread a private copy initialized with the master’s value.
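A sketch combining the data-sharing clauses on one loop; the loop body is arbitrary.

```cpp
#include <cstdio>
#include <omp.h>

int main() {
    const int n = 1000;   // declared outside the region, so shared by default
    int offset = 5;
    double sum = 0.0;

    #pragma omp parallel for firstprivate(offset) reduction(+:sum)
    for (int i = 0; i < n; ++i) {   // the loop index is private to each thread
        sum += i + offset;          // each thread accumulates its private copy of sum;
    }                               // the copies are combined with + at the end

    printf("sum = %f\n", sum);
    return 0;
}
```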
Barrier (OpenMP)
Implicit or explicit synchronization point where threads wait until all have arrived.
Critical Region (OpenMP)
OpenMP construct ensuring that a block of code is executed by only one thread at a time.
atomic Directive (OpenMP)
Directive specifying that a single memory update is to be executed atomically without full critical-section overhead.
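A sketch contrasting the two constructs; the counters are illustrative.

```cpp
#include <cstdio>
#include <omp.h>

int main() {
    long updates = 0, hits = 0;

    #pragma omp parallel for
    for (int i = 0; i < 100000; ++i) {
        #pragma omp atomic            // single memory update, lightweight
        updates++;

        #pragma omp critical          // arbitrary block, one thread at a time
        {
            hits += i % 2;
        }
    }
    printf("%ld %ld\n", updates, hits);
    return 0;
}
```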
static Schedule
OpenMP loop schedule assigning fixed iteration chunks to threads in a round-robin manner.
dynamic Schedule
OpenMP schedule where threads request new iteration chunks at run-time, aiding load balance.
guided Schedule
Dynamic schedule where chunk size starts large and decreases over time for reduced overhead.
collapse Clause
OpenMP clause that merges perfectly nested loops to increase parallel work granularity.
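A sketch of schedule and collapse on perfectly nested loops; the grid size and chunk size are arbitrary.

```cpp
#include <omp.h>

int main() {
    const int n = 512;
    static double grid[512][512];

    // collapse(2) merges the two nested loops into one iteration space;
    // dynamic scheduling hands out chunks of 16 iterations as threads become free.
    #pragma omp parallel for collapse(2) schedule(dynamic, 16)
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            grid[i][j] = 0.5 * i + j;

    return 0;
}
```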
Loop-Carried Dependency
Data dependence where one iteration of a loop relies on results from another, hindering parallelization.
task Directive (OpenMP)
Construct that packages a code block and data environment as a unit to be executed later by any thread.
depend Clause (OpenMP)
Task clause declaring data dependencies (in, out) to enforce execution ordering between tasks.
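A sketch of two tasks ordered by an out/in dependence on the same variable.

```cpp
#include <cstdio>
#include <omp.h>

int main() {
    int x = 0;

    #pragma omp parallel
    #pragma omp single            // one thread creates the tasks; any thread may run them
    {
        #pragma omp task depend(out: x)
        x = 42;                   // producer task

        #pragma omp task depend(in: x)
        printf("x = %d\n", x);    // consumer task: runs only after the producer finishes
    }
    return 0;
}
```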
False Sharing
Performance degradation when threads repeatedly write to distinct variables that reside on the same cache line.
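A common mitigation sketch: pad or align per-thread data so each element occupies its own cache line (the 64-byte line size is an assumption).

```cpp
#include <cstdio>
#include <omp.h>

struct alignas(64) PaddedCounter { long value = 0; };  // one counter per cache line

int main() {
    PaddedCounter counts[8];

    #pragma omp parallel num_threads(8)
    {
        int id = omp_get_thread_num();
        for (int i = 0; i < 1000000; ++i)
            counts[id].value++;   // threads write distinct cache lines, avoiding false sharing
    }

    long total = 0;
    for (auto& c : counts) total += c.value;
    printf("%ld\n", total);
    return 0;
}
```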