Parallel Computing Week 5


62 Terms

1

List 4 examples of causes behind slow performance:

  1. Overhead

  2. Non-parallelizable computational tasks

  3. Idle processor

  4. Contention for Resources

2

What are 4 things you need to think of when considering overhead?

  1. Communication

  2. Synchronization

  3. Computation

  4. Memory

3

True or False, sequential programs communicate between threads

False, sequential programs have no inter-thread communication; any such data transfers are overhead unique to parallel code

4

True or false, caching between threads and shared counters are good examples of communication?

True

5

the impact/cost of communication overhead comes down to the BLANK

comes down to the details of the hardware

6

Synchronization overhead arises when?

Synchronization overhead arises when a thread or process is waiting for another thread to complete

7

most BLANK algorithms will perform more computations than a BLANK version of the program

most parallel algorithms will perform more computations than a sequential version of the program

8

True or False, most of the time, memory overhead DOES hurt performance

False, most of the time it DOES NOT hurt performance, but memory architecture and hardware CAN play a factor

9

what is contention, in terms of a parallel-implemented system?

the degradation of system performance due to competition for shared resources

10

True or False, I/O devices create contention for resources, and can degrade performance

True

11

List 3 reasons why a process or thread might not be able to proceed with a task (reasons behind idle time):

  1. Lack of work to do

  2. Waiting for an external resource

  3. Waiting for another thread to complete

12

Idle time is often a consequence of BLANK and BLANK overheads

it is often a consequence of synchronization and communication overheads

13

What are the two classifications of idle time (taxonomic names)?

  1. Load Imbalance

  2. Memory-Bound Computations

14

Load imbalance refers to?

Uneven distribution of work to the processors available

15

What is an extreme example of a load imbalance, and why?

Sequential processing, because multiple processors sit idle while one handles all of the necessary calculations

16

True or False, tree based parallel summation, where one branch/thread waits on many child branches to finish before summing the total, is NOT considered a Load Imbalance.

False, this is most definitely an imbalance because one thread must wait for others to complete, and sits idle while doing so.

17

What does the classification of memory bound computations define?

Threads/processes can stall while waiting for memory operations, especially when trying to read or write data guarded by a locked mutex

18

True or False, Memory hardware architecture does not matter at all when considering memory bound computations.

False, bus architecture of the chip can affect bandwidth, and read/write speeds can affect latency

19

Weak scaling defines increasing the BLANK of a problem to tackle in a BLANK

Weak scaling defines increasing the size of the problem to tackle in a specific amount of time

20

The weak scaling calculation is BLANK processors with a BLANK amount of data/tasks per processor

The weak scaling calculation is N processors with a fixed amount of data/tasks per processor

21

Weak scaling accomplishes BLANK work in BLANK amount of time

Weak scaling accomplishes more work in the same amount of time

22

what is an example of weak scaling (arbitrary question more to refresh memory)?

Imagine it takes a single processor 20 minutes to make and decorate 6 cookies; if you add a second processor, you can make double the cookies (12 cookies) in the same 20 minutes

23

What is Strong Scaling?

Accomplishing a given task faster

24

Strong Scaling involves BLANK processors with a BLANK size

Strong Scaling involves N processors with a fixed total problem size

25

What is an example of a Strong Scaling solution (think of cookie decoration)

If one processor takes 20 minutes to make and decorate 6 cookies, imagine parallelizing in such a way that 2 processors make the same 6 cookies in 10 minutes (though not always exactly half the time)

26

True or False, Weak vs Strong scaling states that Weak Scaling will do a set amount of work in less time, whereas Strong Scaling will do more work in a set amount of time

False,

  • Weak Scaling will do more work in a set amount of time

  • Strong Scaling will do a set amount of work in less time

27

Throughput is the measurement of BLANK per BLANK

Throughput is the measurement of actions per unit time

28

Latency is measured in BLANK and involves the BLANK to perform the task once

Latency is measured in units of time and involves the time it takes to perform the task once

29

The calculation of latency is?

Latency = time/task
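
Worked example (using the deck's cookie numbers): if one processor makes 6 cookies in 20 minutes, throughput = 6 cookies / 20 min = 0.3 cookies per minute, while the latency of a full batch = 20 minutes per batch.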

30

Speedup is a metric used to measure the BLANK of a program

Speedup is a metric to measure the effectiveness of a program

31

The calculation for program speedup is?

Speedup = sequential time/parallel time
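
Worked example: if the sequential version runs in 60 seconds and the parallel version runs in 15 seconds, speedup = 60/15 = 4x.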

32

What is Amdahl’s law?

Amdahl’s law is the mathematical equation that can be used to estimate the potential speedup for processing

33

(fun tidbit) when was Amdahl's law developed?

1967

34

Amdahl’s law’s equation is?

S_latency(s) = 1 / ((1 - P) + P/s)

35

In Amdahl’s law, the S stands for BLANK, and the P stands for BLANK

The S stands for Speedup of the parallelized portion, while the P stands for the portion of a program that is parallelizable
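
Quick worked example (illustrative numbers, not from the slides): with P = 0.9 and S = 4, S_latency = 1 / ((1 - 0.9) + 0.9/4) = 1 / 0.325 ≈ 3.08, noticeably less than 4. A minimal C sketch that just evaluates the formula:

```c
#include <stdio.h>

/* Amdahl's law: overall speedup given the parallelizable portion p
 * and the speedup s of the parallelized portion. */
double amdahl(double p, double s) {
    return 1.0 / ((1.0 - p) + p / s);
}

int main(void) {
    /* Illustrative values: 90% of the program parallelizes, s = 4. */
    printf("Speedup: %.2f\n", amdahl(0.9, 4.0));   /* prints ~3.08 */
    return 0;
}
```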

36

True or False, having access to the equation for speedup means we CAN run parallel code on a single processor

False, this does not mean we can run the parallel code on a single processor (if confused, see slide 35 of week 5). There is overhead, contention, and idle time to consider

37

Efficiency represents what?

Efficiency represents how well the resources are being utilized in a system

38

The calculation for efficiency is?

Efficiency = Speedup/(# of processors used)
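
Worked example, continuing the Amdahl numbers above: a speedup of 3.08 on 4 processors gives efficiency = 3.08/4 ≈ 0.77, i.e., the processors are roughly 77% utilized.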

39

True or False, having an increase in processors used in a program ALWAYS increases efficiency

False; the number of processors used is the denominator of the efficiency calculation, so adding processors only helps efficiency if speedup grows proportionally

40

Fill in the blanks for performance trade off examples:

  • Communication vs BLANK

  • BLANK vs Parallelism

  • Overhead vs BLANK

  • Communication vs Computation

  • Memory vs Parallelism

  • Overhead vs Parallelism

41

Communication costs can be reduced by using which 2 methods?

  1. Overlapping communication and computation

  2. Redundant computation 

42

True or False, these communication-reducing approaches add computation that costs less than the communication it replaces

True

43

Overlapping communication and computations involves what?

  • looking for computation that is independent of comms

  • executing those independent computations concurrently with the communication (see the sketch below)
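
A minimal sketch of the overlap idea, using MPI non-blocking receives (MPI is my choice of illustration, not necessarily what the course uses): rank 1 starts a receive, does computation that does not depend on the incoming data, and only blocks when the data is actually needed. Run with, e.g., mpirun -np 2.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double buf[4] = {0};
    if (rank == 0) {
        double data[4] = {1, 2, 3, 4};
        MPI_Send(data, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Request req;
        /* Start the receive, then do independent work while data is in flight. */
        MPI_Irecv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &req);
        double local = 0;
        for (int i = 0; i < 1000000; i++)   /* stand-in for independent computation */
            local += i * 0.5;
        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* block only when the data is needed */
        printf("independent result %.1f, received %.1f\n", local, buf[0]);
    }
    MPI_Finalize();
    return 0;
}
```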

44

Redundant computation involves what?

  • evaluating data/values being transmitted between threads or processes

  • evaluating the cost of these computations

  • if there is a low cost of computation, threads should recalculate values on their own, rather than communicate with one another
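
A minimal pthreads sketch of this trade (my own illustration; build_table is a hypothetical stand-in for any cheap shared value): each thread rebuilds a small table privately rather than having one thread compute it and send it to the rest.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

/* Cheap to compute, so each thread builds a private copy instead of
 * receiving one from another thread (redundant computation). */
static void build_table(double table[256]) {
    for (int i = 0; i < 256; i++)
        table[i] = i * i * 0.001;
}

static void *worker(void *arg) {
    long id = (long)arg;
    double table[256];   /* private copy: no communication required */
    build_table(table);
    printf("thread %ld: table[10] = %.3f\n", id, table[10]);
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```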

45

True or False, parallel processing can often be increased when memory usage is decreased

False, parallel processing can often be increased when memory usage is increased

46

What are the two concepts of Memory vs Parallel solutions?

  1. Privatization (using additional mem to break false dependencies)

  2. Padding (allocating extra mem to force variables to reside in their own cache lines) 
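
A minimal pthreads sketch of both ideas (my own illustration; the 64-byte cache line is an assumption, since line size is hardware dependent): each thread gets a private counter slot (privatization), and each slot is padded onto its own cache line so the threads do not falsely share a line.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS   4
#define CACHE_LINE 64   /* assumed cache-line size */

/* Padding: each counter occupies a full cache line, so one thread's
 * increments do not invalidate another thread's line (false sharing). */
struct padded_counter {
    long value;
    char pad[CACHE_LINE - sizeof(long)];
};

static struct padded_counter counts[NTHREADS];

static void *worker(void *arg) {
    long id = (long)arg;
    for (int i = 0; i < 1000000; i++)
        counts[id].value++;   /* touches only this thread's cache line */
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    long total = 0;
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(t[i], NULL);
        total += counts[i].value;   /* combine the private counters once */
    }
    printf("total = %ld\n", total);
    return 0;
}
```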

47

True or False, a processor can access its L1 cache much faster than system memory accessed over a bus

True, think about the different chip architectures

48

In terms of Overhead vs Parallelism, as the number of threads increases, so too does the BLANK and the BLANK

so too does the overhead and lock contention

49

What are 3 things to be considered, when thinking about Overhead vs Parallelism?

  1. Parallelize Overhead

  2. Load Balance vs. Overhead

  3. Granularity

50

The Parallelizing Overhead approach looks at your BLANK?

The Parallelizing Overhead approach looks at your critical sections of code

51

What is a question to ask when considering Parallelizing Overhead?

  • Is there a single shared memory location being used for final results?

52

A solution to a single shared memory would be the implementation of a BLANK style approach?

A Tree style approach, where each branch has its own shared variable that gets passed up to the parent branch (the tree trunk, if you will). This enables summation of branch calculations at the highest level while lower-level threads continue to crunch numbers (i.e., no waiting at any level)
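
A minimal pthreads sketch of such a tree-style summation (my own illustration, not the course's reference code): each node hands half of its range to a child thread, sums the other half itself, and combines the two partial sums on the way back up, so no single shared variable is ever contended.

```c
#include <pthread.h>
#include <stdio.h>

/* Each node splits its range: a child thread sums one half while the
 * parent sums the other; partial sums combine up the tree. */
struct range { const int *a; int lo, hi; long sum; };

static void *tree_sum(void *arg) {
    struct range *r = arg;
    if (r->hi - r->lo <= 1024) {               /* leaf: sum directly */
        r->sum = 0;
        for (int i = r->lo; i < r->hi; i++)
            r->sum += r->a[i];
        return NULL;
    }
    int mid = r->lo + (r->hi - r->lo) / 2;
    struct range left  = { r->a, r->lo, mid,   0 };
    struct range right = { r->a, mid,   r->hi, 0 };
    pthread_t child;
    pthread_create(&child, NULL, tree_sum, &left);  /* one branch to a child */
    tree_sum(&right);                               /* other branch in place */
    pthread_join(child, NULL);
    r->sum = left.sum + right.sum;                  /* combine at this node */
    return NULL;
}

int main(void) {
    static int a[1 << 16];
    for (int i = 0; i < (1 << 16); i++) a[i] = 1;
    struct range all = { a, 0, 1 << 16, 0 };
    tree_sum(&all);
    printf("sum = %ld (expect %d)\n", all.sum, 1 << 16);
    return 0;
}
```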

53

Load balancing is when we ask what?

are we using all of the processors/cores to the maximum efficiency, or is stuff idle?

54

True or False, it is easier to distribute smaller numbers of coarse-grained code sections than it is to distribute work for a large number of fine-grained code sections?

False,

  • It is easy to distribute a large number of fine-grained units evenly

  • It is harder to distribute a smaller number of coarser units evenly

55

Load Balance vs Overhead involves evaluating what?

Evaluating how you segment your processing tasks/data

56

Contention between Load Balance and Overhead happens when work is BLANK or BLANK

Happens when the amount of work is Irregular or Dynamically variable

57

Many trade offs discussed in week 5 are related to the BLANK of parallelism

Many of these trade offs are related to the granularity of parallelism

58

True or False, one way to increase the number of dependencies is to increase the granularity

False, one way to decrease (not increase) the number of dependencies is to increase the granularity

59

What is Batching, when discussing granularity?

Batching is a process in which work can be performed as a group, such as threads grouping their operations on shared resources.

60

Batching increases the BLANK of interactions, but reduces the BLANK of interactions, reducing BLANK overhead

Batching increases the granularity of interactions, but reduces the frequency of interactions, reducing Communications Overhead
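
A minimal pthreads sketch of batching (my own illustration; the batch size of 1000 is arbitrary, and the iteration count divides evenly by it to keep the sketch short): each thread accumulates privately and takes the shared lock once per batch, so one coarse interaction replaces a thousand fine-grained ones.

```c
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define ITERS    1000000
#define BATCH    1000   /* one lock acquisition per 1000 increments */

static long shared_total = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    long local = 0;
    for (int i = 1; i <= ITERS; i++) {
        local++;                    /* cheap private work */
        if (i % BATCH == 0) {       /* flush the whole batch at once */
            pthread_mutex_lock(&lock);
            shared_total += local;  /* one interaction covers BATCH updates */
            pthread_mutex_unlock(&lock);
            local = 0;
        }
    }
    return NULL;
}

int main(void) {
    pthread_t t[NTHREADS];
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    printf("total = %ld\n", shared_total);   /* expect NTHREADS * ITERS */
    return 0;
}
```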

61

True or False, there is NO limit to granularity

False

62

Granularity limits depend on which 2 things?

Algorithmic Characteristics and Hardware Characteristics.