CS3401 Computer Architecture and Organization - Multiprocessing

Description and Tags

Flashcards reviewing lecture notes on multiprocessing, parallelism, and GPU architecture.

31 Terms

1

Multiprocessor

A computer with two or more processors. Powerful systems are built by connecting many existing smaller computers, and modern microprocessors contain multiple cores; software must work with a variable number of processors, and energy is a key design issue.

2

Task-level parallelism

Utilizing multiple processors by running independent programs simultaneously; also known as process-level parallelism.

3

Parallel processing program

A single program that runs on multiple processors simultaneously.

4

Cluster

A set of computers connected over a network that function as a single large multiprocessor.

5

Multicore Microprocessor

A microprocessor containing multiple processors (“cores”) in a single integrated circuit.

6

Shared Memory Processors (SMPs)

Multicores that share a single physical address space.

7

Why must programmers today care about parallel programming?

Sequential code now means slow code; to achieve performance, programs must be parallel.

8

Sequential vs. Concurrent Software

A compiler is an example of sequential software (parsing, then code generation), while an operating system is concurrent software (cooperating processes handling I/O events).

9

Intel Pentium 4 vs. Intel Core i7

Pentium 4 was a uniprocessor (single core), while the Core i7 is a multicore processor.

10

Why is it difficult to write parallel processing programs?

The parallel program must deliver better performance or energy efficiency than its sequential counterpart to be worth the effort, whereas improvements in uniprocessor design used to speed up sequential programs with no programmer involvement at all.

11

Explain Amdahl’s Law

Describes the maximum speedup achievable from parallelizing a task, limited by the sequential portion of the task.
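
In formula form (the standard statement of the law, given here for reference rather than quoted from the lecture): if a fraction f of the work can be parallelized across p processors, then

\[
\text{Speedup} = \frac{1}{(1 - f) + \dfrac{f}{p}} \;\le\; \frac{1}{1 - f}.
\]

For example, with f = 0.95 and p = 100 the speedup is 1 / (0.05 + 0.0095) ≈ 16.8, and no number of processors can ever push it past 1 / 0.05 = 20.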

12

Strong Scaling

Speed-up achieved on a multiprocessor without increasing the size of the problem.

13

Weak Scaling

Speed-up achieved on a multiprocessor while increasing the size of the problem proportionally to the increase in the number of processors.
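
A quick contrast of the two scaling notions above, using illustrative numbers rather than anything from the lecture: suppose one processor handles a problem of size N in time T1 proportional to N.

\[
\text{Strong scaling (N fixed):}\quad T_p = \frac{N}{p}, \qquad \text{Speedup} = \frac{T_1}{T_p} = p \ \text{in the ideal case.}
\]
\[
\text{Weak scaling (problem grown to } pN\text{):}\quad T_p = \frac{pN}{p} = N = T_1, \quad \text{so constant execution time is the ideal outcome.}
\]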

14

Flynn’s Taxonomy

Categorization of parallel hardware based on instruction streams and data streams: SISD, SIMD, MISD, MIMD.

15

SISD

Single Instruction, Single Data: Conventional uniprocessor.

16

SIMD

Single Instruction, Multiple Data: Operates on vectors of data.

17

MISD

Multiple Instruction, Single Data: Stream processor performing computations on a single data stream in a pipelined fashion.

18

MIMD

Multiple Instruction, Multiple Data: Separate programs that run on different processors; often programmed using SPMD (Single Program Multiple Data).

19

Vector Architecture

SIMD interpretation where data elements are collected from memory, operated on sequentially in registers using pipelined execution units, and results written back to memory.

20

Vector Registers

A key feature of vector architectures in which a set of registers each contain multiple data elements, enabling pipelined processing.
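
As a concrete loop for the SIMD and vector-architecture cards above, DAXPY (double-precision a times X plus Y) is the textbook example. The sketch below is ordinary scalar C, shown only for illustration: on a vector architecture the loop becomes a few vector instructions (load X and Y into vector registers, multiply-add, store the result), and a vectorizing compiler for SIMD hardware performs a similar transformation automatically.

```
// DAXPY: Y = a*X + Y, the classic loop used to illustrate vector/SIMD execution.
// Scalar form: one element per iteration. A vector machine instead loads many
// elements of x and y into vector registers, applies the multiply-add to all of
// them through pipelined execution units, and writes the results back to memory.
void daxpy(int n, double a, const double *x, double *y) {
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```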

21

Hardware Multithreading

Increases processor utilization by switching to another thread when one thread is stalled.

22

Fine-Grained Multithreading

Switches between threads on each instruction; can hide throughput losses from both short and long stalls but slows down individual threads.

23

Coarse-Grained Multithreading

Switches threads only on expensive stalls; less likely to slow down individual threads but limited in ability to overcome throughput losses from shorter stalls.

24

Simultaneous Multithreading (SMT)

Uses the resources of a multiple-issue, dynamically scheduled pipelined processor to exploit thread-level parallelism and instruction-level parallelism.

25

Graphics Processing Unit (GPU)

Specialized processing hardware dedicated to problems common to computer graphics, containing hundreds of parallel floating-point units.

26

Compute Unified Device Architecture (CUDA)

Enables developers to write C-programs to execute on GPUs, making GPU capabilities accessible for non-graphical applications.
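
A minimal sketch of what "C programs executing on the GPU" looks like in practice: a standard CUDA vector-add kernel (illustrative code, not taken from the lecture). Each CUDA thread (next card) computes one output element, and the launch groups threads into blocks.

```
#include <cuda_runtime.h>

// Kernel: ordinary C code marked __global__ so that it runs on the GPU.
// Each CUDA thread computes a single element of the result vector.
__global__ void vecAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // this thread's element index
    if (i < n)                                       // guard the final partial block
        c[i] = a[i] + b[i];
}

// Host-side launch: threads are grouped into blocks of 256, and enough blocks
// are created to cover all n elements. a, b and c must point to GPU memory.
void launchVecAdd(const float *a, const float *b, float *c, int n) {
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();                         // wait for the GPU to finish
}
```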

27

CUDA Thread

The programming primitive through which the compiler and hardware can group thousands of threads together to exploit the parallelism within a GPU.

28

Multithreaded SIMD Processor

The building block of GPU architecture: a GPU consists of a collection of multithreaded SIMD processors, so the GPU as a whole is a MIMD machine composed of SIMD processors.

29

SIMD Thread

The machine object that the hardware creates, manages, schedules, and executes: a thread of SIMD instructions.

30

GPU local memory

On-chip memory that is local to each multithreaded SIMD processor and shared by the SIMD lanes within that processor, but not between processors.
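
In CUDA this per-processor local memory is what the __shared__ qualifier allocates (CUDA's own name for it is "shared memory"). A minimal sketch, assuming 256 threads per block: each thread block stages its chunk of the input in the fast on-chip buffer and then reads a neighbouring lane's element from it, never from another block's buffer.

```
#define TILE 256   // must match the number of threads launched per block

// Reverses each block-sized chunk of 'in'. The tile lives in on-chip memory
// local to one multithreaded SIMD processor and is shared only by the SIMD
// lanes (threads) of this block, never by other blocks or processors.
__global__ void reverseWithinBlocks(const float *in, float *out, int n) {
    __shared__ float tile[TILE];                     // per-block on-chip buffer

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        tile[threadIdx.x] = in[i];                   // stage this lane's element
    __syncthreads();                                 // wait until the tile is filled

    int j   = blockDim.x - 1 - threadIdx.x;          // mirrored lane within the block
    int src = blockIdx.x * blockDim.x + j;
    if (i < n && src < n)                            // simplified boundary handling
        out[i] = tile[j];                            // read another lane's staged element
}
```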

31

GPU Memory or global memory

Off-chip DRAM shared by the whole GPU and all thread blocks.
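
Tying this back to the vector-add sketch under the CUDA card: buffers created with cudaMalloc live in this off-chip global memory, visible to every thread block, and the host copies data in and out of it explicitly. Illustrative host code, assuming the vecAdd kernel defined earlier.

```
#include <cuda_runtime.h>

__global__ void vecAdd(const float *a, const float *b, float *c, int n);  // sketched earlier

// a, b, c are arrays of n floats in ordinary CPU memory.
void vecAddOnGpu(const float *a, const float *b, float *c, int n) {
    float *d_a, *d_b, *d_c;
    size_t bytes = n * sizeof(float);

    cudaMalloc((void **)&d_a, bytes);                       // allocate in GPU global memory
    cudaMalloc((void **)&d_b, bytes);
    cudaMalloc((void **)&d_c, bytes);
    cudaMemcpy(d_a, a, bytes, cudaMemcpyHostToDevice);      // host -> global memory
    cudaMemcpy(d_b, b, bytes, cudaMemcpyHostToDevice);

    int threads = 256, blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_a, d_b, d_c, n);          // every block sees global memory

    cudaMemcpy(c, d_c, bytes, cudaMemcpyDeviceToHost);      // global memory -> host
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_c);
}
```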