List 4 examples of causes behind slow performance:
Overhead
Non-parallelizable computational tasks
Idle processor
Contention for Resources
What are 4 things you need to think of when considering overhead
Communication
Synchronization
Computation
Memory
True or False, sequential programs communicate between threads
False, sequential programs have no communication; any data transmission between threads is overhead introduced by parallel code
True or false, caching between threads and shared counters are good examples of communication?
True
the impact/cost of communication overhead comes down to the BLANK
comes down to the details of the hardware
Synchronization overhead arises when?
Synchronization overhead arises when a thread or process is waiting for another thread to complete
most BLANK algorithms will perform more computations than a BLANK version of the program
most parallel algorithms will perform more computations than a sequential version of the program
True or False, most of the time, memory overhead DOES hurt performance
False, most of the time it DOES NOT hurt performance, but memory architecture and hardware CAN play a factor
what is contention, in terms of a parallel implemented system?
the degradation of system performance due to competition for shared resources
True or False, I/O devices create contention for resources, and can degrade performance
True
list 3 examples of why a process or thread might not be able to proceed with a task? (reasons behind idle time)
Lack of work to do
Waiting for an external resource
Waiting for another thread to complete
Idle time is often a consequence of BLANK and BLANK overheads
it is often a consequence of synchronization and communication overheads
What are the two classifications of idle times (taxonomic names)
Load Imbalance
Memory-Bound Computations
Load imbalance refers to?
Uneven distribution of work to the processors available
What is an extreme example of a load imbalance, and why?
Sequential processing, because multiple processors are idle, while one is handling all of the necessary calculations
True or False, tree based parallel summation, where one branch/thread waits on many children branches to finish before summing the total, is NOT considered a Load Imbalance.
False, this is most definitely an imbalance, because one thread must wait for others to complete, and sits idle while doing so.
What does the classification of memory bound computations define?
Threads/processes can stall while waiting for memory operations, especially when trying to read or write data guarded by a locked mutex
True or False, Memory hardware architecture does not matter at all when considering memory bound computations.
False, bus architecture of the chip can affect bandwidth, and read/write speeds can affect latency
Weak scaling defines increasing the BLANK of a problem to tackle in a BLANK
Weak scaling defines increasing the size of the problem to tackle in a specific amount of time
The weak scaling calculation is BLANK processors with a BLANK amount of data/tasks per processor
The weak scaling calculation is N processors with a fixed amount of data/tasks per processor
Weak scaling accomplishes BLANK work in BLANK amount of time
Weak scaling accomplishes more work in the same amount of time
what is an example of weak scaling (arbitrary question more to refresh memory)?
Imagine it takes a single processor 20 minutes to make and decorate 6 cookies; if you add a second processor, you can make double the cookies (12 cookies) in the same 20 minutes
What is Strong Scaling?
Accomplishing a given task faster
Strong Scaling involves BLANK processors with a BLANK size
Strong Scaling involves N processors with a fixed total problem size
What is an example of a Strong Scaling solution (think of cookie decoration)
If one processor takes 20 minutes to make and decorate 6 cookies, imagine implementing parallelism in such a way that 2 processors make 6 cookies in 10 minutes (not always half the time)
True or False, Weak vs Strong scaling states that Weak Scaling will do a set amount of work in less time, whereas Strong Scaling will do more work in a set amount of time
False,
Weak Scaling will do more work in a set amount of time
Strong Scaling will do a set amount of work in less time
Throughput is the measurement of BLANK per BLANK
Throughput is the measurement of actions per unit time
Latency is measured in BLANK and involves the BLANK to perform the task once
Latency is measured in units of time and involves the time it takes to perform the task once
The calculation of latency is?
Latency = time/task
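The two metrics above come from the same measurements; a minimal sketch using hypothetical numbers (6 tasks completed in 20 seconds):

```python
# Hypothetical measurement for illustration: a worker finishes 6 tasks in 20 s.
tasks_completed = 6
elapsed_seconds = 20.0

throughput = tasks_completed / elapsed_seconds  # actions per unit time
latency = elapsed_seconds / tasks_completed     # time to perform the task once

print(throughput)  # 0.3 tasks per second
print(latency)     # about 3.33 seconds per task
```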
Speedup is a metric used to measure the BLANK of a program
Speedup is a metric to measure the effectiveness of a program
The calculation for program speedup is?
Speedup = sequential time/parallel time
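As a quick sketch of the formula, with made-up timings (120 s sequential, 40 s parallel):

```python
# Hypothetical timings for illustration only.
sequential_time = 120.0  # seconds for the sequential version
parallel_time = 40.0     # seconds for the parallel version

speedup = sequential_time / parallel_time
print(speedup)  # 3.0
```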
What is Amdahl’s law?
Amdahl’s law is the mathematical equation that can be used to estimate the potential speedup for processing
(fun tidbit) when was Amdahl's law developed?
1967
Amdahl’s law’s equation is?
S_latency(s) = 1 / ((1 - P) + (P/S))
In Amdahl’s law, the S stands for BLANK, and the P stands for BLANK
The S stands for Speedup of the parallelized portion, while the P stands for the portion of a program that is parallelizable
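Putting the equation into code, a minimal sketch (the function name and sample values are invented for illustration):

```python
def amdahl_speedup(p, s):
    """Amdahl's law: estimated overall speedup when a fraction p of the
    program is parallelizable and that portion is sped up by a factor s."""
    return 1.0 / ((1.0 - p) + p / s)

# e.g. 90% of the program is parallelizable and that portion runs 8x faster:
print(amdahl_speedup(0.9, 8))  # about 4.71

# Even with unlimited speedup of the parallel portion, the serial 10%
# caps the overall speedup at 1 / (1 - 0.9) = 10x.
print(amdahl_speedup(0.9, 1_000_000))  # just under 10
```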
True or False, having access to the equation for speedup means we CAN run parallel code on a single processor
False, this does not mean we can run the parallel code on a single processor. (if confused, slide 35 of week 5). There is overhead, contention, and idle time to consider
Efficiency represents what?
Efficiency represents how well the resources are being utilized in a system
The calculation for efficiency is?
Efficiency = Speedup/(# of processors used)
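A small sketch with hypothetical timings, which also shows why adding processors does not always raise efficiency:

```python
sequential = 120.0  # hypothetical sequential baseline in seconds

# Hypothetical parallel timings: 4 processors finish in 40 s, but 8
# processors only get it down to 30 s (overhead and contention eat the gains).
rows = []
for procs, parallel in [(4, 40.0), (8, 30.0)]:
    speedup = sequential / parallel
    rows.append((procs, speedup, speedup / procs))

print(rows)  # [(4, 3.0, 0.75), (8, 4.0, 0.5)]: more processors, lower efficiency
```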
True or False, having an increase in processors used in a program ALWAYS increases efficiency
False; the efficiency calculation divides speedup by the number of processors, so adding processors only helps efficiency if speedup grows at least proportionally
Fill in the blanks for performance trade off examples:
Communication vs BLANK
BLANK vs Parallelism
Overhead vs BLANK
Communication vs Computation
Memory vs Parallelism
Overhead vs Parallelism
Communication costs can be reduced by using which 2 methods?
Overlapping communication and computation
Redundant computation
True or False, these communication approaches add computation that costs less than the communication it replaces in a program
True
Overlapping communication and computations involves what?
looking for computation that is independent of comms
executing those independent computations while the communication is in progress
Redundant computation involves what?
evaluating data/values being transmitted between threads or processes
evaluating the cost of these computations
if the cost of the computation is low, threads should recalculate values on their own, rather than communicate with one another
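A minimal sketch of the idea (the function names and data are invented): each worker recomputes a cheap value locally instead of receiving it from another thread:

```python
import math

def normalization_constant(n):
    # Cheap to compute, so recomputing it beats sending it between threads.
    return 1.0 / math.sqrt(n)

def worker(data):
    # Redundant computation: every worker derives the constant itself,
    # avoiding a round of communication.
    c = normalization_constant(len(data))
    return [x * c for x in data]

print(worker([1.0, 2.0, 3.0, 4.0]))  # [0.5, 1.0, 1.5, 2.0]
```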
True or False, parallel processing can often be increased when memory usage is decreased
False, parallel processing can often be increased when memory usage is increased
What are the two concepts of Memory vs Parallel solutions?
Privatization (using additional mem to break false dependencies)
Padding (allocating extra mem to force variables to reside in their own cache lines)
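Privatization can be sketched in Python (padding is a cache-line concern that Python lists don't expose, so this shows only the privatization half): each thread accumulates into its own slot, and the totals are merged once at the end:

```python
import threading

NUM_THREADS = 4
ITERATIONS = 10_000
partials = [0] * NUM_THREADS  # one private slot per thread: no shared counter

def count(tid):
    for _ in range(ITERATIONS):
        partials[tid] += 1    # each thread owns its slot, so no lock is needed

threads = [threading.Thread(target=count, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(partials)  # combine the private results once at the end
print(total)  # 40000
```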
True or False, A processor can access its L1 cache much faster than system memory reached via a bus
True, think about the different chip architectures
In terms of Overhead vs Parallelism, as the number of threads increases, so too does the BLANK and the BLANK
so too does the overhead and lock contention
What are 3 things to be considered, when thinking about Overhead vs Parallelism?
Parallelize Overhead
Load Balance vs. Overhead
Granularity
The Parallelizing Overhead approach looks at your BLANK?
The Parallelizing Overhead approach looks at your critical sections of code
What is a question to ask when considering Parallelizing Overhead?
Is there a single shared memory location being used for final results?
A solution to a single shared memory would be the implementation of a BLANK style approach?
A Tree-style approach, where each branch has its own shared variable that gets passed up to the parent branch (the tree trunk, if you will), enabling summation of branch calculations at the highest level while lower-level threads can still crunch numbers (i.e. no waiting at any level)
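The pairwise combining can be sketched sequentially (a real version would run each level's pair-sums in separate threads or tasks):

```python
def tree_reduce(values):
    # Tree-style summation: combine values pairwise, level by level,
    # so no single thread serializes all of the additions.
    values = list(values)
    while len(values) > 1:
        level = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            level.append(values[-1])  # odd leftover is carried to the next level
        values = level
    return values[0]

print(tree_reduce(range(8)))  # 28, reached in log2(8) = 3 combining levels
```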
Load balancing is when we ask what?
are we using all of the processors/cores to the maximum efficiency, or is stuff idle?
True or False, it is easier to distribute a small number of coarse-grained code sections than it is to distribute work for a large number of fine-grained code sections?
False,
It is easy to distribute a large number of fine-grained units evenly
It is harder to distribute a smaller number of coarser units evenly
Load Balance vs Overhead involves evaluating what?
Evaluating how you segment your processing tasks/data
Contention between Load Balance and Overhead happens when work is BLANK or BLANK
Happens when the amount of work is Irregular or Dynamically variable
Many trade offs discussed in week 5 are related to the BLANK of parallelism
Many of these trade offs are related to the granularity of parallelism
True or False, one way to increase the number of dependencies is to increase the granularity
False, one way to decrease (not increase) the number of dependencies is to increase the granularity
What is Batching, when discussing granularity?
Batching is a process in which work can be performed as a group, such as a thread operating on a batch of shared resources at once.
Batching increases the BLANK of interactions, but reduces the BLANK of interactions, reducing BLANK overhead
Batching increases the granularity of interactions, but reduces the frequency of interactions, reducing Communications Overhead
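A sketch of batching with a shared work queue (the batch size and workload are made up): each worker takes a whole batch per lock acquisition instead of locking once per item:

```python
import threading

work = list(range(100))   # hypothetical task queue
results = []
lock = threading.Lock()
BATCH = 10                # coarser granularity: 10 items per lock acquisition

def worker():
    while True:
        with lock:                 # one acquisition fetches a whole batch
            batch = work[:BATCH]
            del work[:BATCH]
        if not batch:
            return
        processed = [x * x for x in batch]  # work done outside the lock
        with lock:
            results.extend(processed)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), sum(results))  # 100 328350
```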
True or False, there is NO limit to granularity
False
Granularity limits depend on which 2 things?
Algorithmic Characteristics, and Hardware Characteristics.