What are the prefixes of kilo, mega, and giga? What are their sizes (base10)?
Kilo: k, 10³
Mega: M, 10⁶
Giga: G, 10⁹
What are the prefixes of milli, micro, nano? What are their sizes (base10)?
Milli: m, 10⁻³
Micro: μ, 10⁻⁶
Nano: n, 10⁻⁹
What is throughput?
How many tasks can be done per second
What is latency?
How long one task takes
What is energy efficiency?
How much useful work is done per unit of energy
Compare intrinsic and technology-dependent issues
Intrinsic: algorithmic, design-wise issues (e.g. branch prediction)
Technology-dependent: limited by current technology/hardware (e.g. battery life)
What is capacity?
Amount of a resource that is available
What is utilization?
Percentage of capacity being used
What is overhead?
Resource that is wasted
What is useful work?
Resource spent on actual work for a given workload
Pipelining vs Concurrency in terms of latency and throughput
Pipelining
Latency: Sum of all modules’ latencies
Throughput: Limited by the slowest module, i.e. 1 / (max module latency)
Concurrency
Latency: Weighted average of the individual modules’ latencies
Throughput: N × a single module’s throughput (using N identical modules)
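The pipelining vs. concurrency rules above can be sketched in Python (function names are mine, not from the cards):

```python
def pipeline_perf(stage_latencies):
    """Pipeline: stages in series. A task passes through every stage,
    but a new task can enter as soon as the slowest stage is free."""
    latency = sum(stage_latencies)        # sum of all modules' latencies
    throughput = 1 / max(stage_latencies) # limited by the slowest stage
    return latency, throughput

def concurrency_perf(module_latency, n):
    """Concurrency: N identical modules side by side. Per-task latency
    is unchanged; aggregate throughput scales with N."""
    return module_latency, n * (1 / module_latency)
```

For example, stages with latencies 2, 3, and 5 give a pipeline latency of 10 but a throughput of 1/5, since the 5-unit stage is the bottleneck.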
When can input throughput and output throughput differ?
When using compression or decompression
Can physical data transfer be faster than over the network?
Yes, for large datasets
In networking, what is one example of improving performance?
HTTP pipelining
Pipelining raises throughput from p/RTT (one payload per round trip) to cwnd/RTT (a full congestion window per round trip)
In memory, what is one consideration for performance?
Latency-throughput curve: increasing throughput improves performance up to a saturation threshold; beyond it, latency increases significantly and performance degrades
In a hard disk example, what are three ways to improve performance?
Batching: handle requests as a group to amortize fixed overhead
Dallying: delay a request (e.g. delay disk write because user may save again soon)
Speculation: read next likely data early
What is burst?
A brief increase in the request rate above average
Why does overload happen?
When incoming request rate > average throughput for one stage. Buffers will fill up, and requests wait in the buffer → increased latency
What is Amdahl’s Law (Law of Diminishing Returns)?
Total speedup = 1 / ( (1-P) + P/S ) where S = speedup for the portion, P = % of the portion
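The formula above translates directly into a one-line helper (a sketch; the function name is mine):

```python
def amdahl_speedup(p, s):
    """Amdahl's Law: p = fraction of execution that benefits,
    s = speedup of that fraction. Returns the total speedup."""
    return 1 / ((1 - p) + p / s)
```

Note the diminishing returns: even with s → ∞, speeding up only half the program (p = 0.5) can never do better than 2×.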
What are 4 steps of designing for performance?
Measure the system to determine whether it needs to be faster
Measure again to find the bottleneck
Predict the impact of removing the bottleneck
a. Can it be removed? How effective?
b. If not, can we redesign it?
Implement new solution, repeat
What is workload?
Tasks processed by a system
We can measure performance on… (3)
Real workload
Simulated system
Benchmarks + real traces
Caching vs memoization
Caching has a bounded size, whereas memoization does not
What is locality of reference?
Clustering of memory references in both time and address space
What are the two types of locality of reference?
Temporal: recent items likely to be referenced again soon
Spatial: near items likely to be referenced
What is a working set?
Set of items used during an interval
What happens when working set >> primary storage?
Thrashing— frequent overwriting of primary storage
Write-through vs Write-back
Write-through: writing to cache and then to secondary storage immediately
Write-back: writing to cache and to secondary storage later when evicted
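The two write policies can be sketched as minimal Python classes (class and attribute names are illustrative, not from the cards):

```python
class WriteThroughCache:
    """Write-through: every write goes to the cache AND the backing store immediately."""
    def __init__(self):
        self.cache, self.backing = {}, {}

    def write(self, key, value):
        self.cache[key] = value
        self.backing[key] = value     # backing store is always up to date

class WriteBackCache:
    """Write-back: writes stay in the cache; the backing store is updated on eviction."""
    def __init__(self):
        self.cache, self.backing = {}, {}
        self.dirty = set()            # keys modified since they were cached

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)           # defer the slow backing-store write

    def evict(self, key):
        if key in self.dirty:         # flush only if modified
            self.backing[key] = self.cache[key]
            self.dirty.discard(key)
        del self.cache[key]
```

Write-back trades fewer slow writes for the risk that the backing store is temporarily stale, which is exactly why cache coherency (next card) matters.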
What is cache coherency?
Consistency of data across multiple caches in a system
What is associativity? What are the three types?
Which data can be stored in each cache slot
Fully associative: data can be anywhere in the cache
Direct-mapped: each data is mapped to exactly one cache line
N-way associative: cache divided into sets, each set contains N cache lines
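The set-mapping arithmetic behind direct-mapped and N-way caches can be sketched as follows (the constants are illustrative, not from the cards):

```python
BLOCK_SIZE = 64   # bytes per cache line (illustrative)
NUM_SETS = 128    # number of sets in the cache (illustrative)
WAYS = 4          # N-way: each set holds N lines; WAYS == 1 is direct-mapped

def cache_set(addr):
    """Map a byte address to the one set index where its block may live."""
    block = addr // BLOCK_SIZE    # which memory block the address falls in
    return block % NUM_SETS       # the block can only live in this set
```

With a fully associative cache there is no such restriction: any block may occupy any line, at the cost of searching every line on a lookup.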
What is fetch policy?
When/how to fetch data (on-demand or by prediction)
What are the five types of removal policies?
Least recently used (LRU): every time data is used, place/move it to the head and push the rest down
FIFO: place it at head if not in queue, remove the tail
Clock: second chance page replacement with a circular pointer (only evict if reference bit is 0, put at head if 1)
Not recently used (NRU): remove elements in the following order
Not referenced, not dirty
Not referenced, dirty
Referenced, not dirty
Referenced, dirty
OPT— optimal algorithm (evict the item whose next use is farthest in the future); not implementable in practice, used as a baseline for comparison
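The LRU policy from the list above can be sketched with Python's `collections.OrderedDict` (a sketch; the class and method names are my own):

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least recently used entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # insertion order tracks recency

    def get(self, key):
        if key not in self.data:
            return None             # cache miss
        self.data.move_to_end(key)  # used: move to the "head"
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used
```

For example, with capacity 2, inserting 1 and 2, touching 1, then inserting 3 evicts 2 (the least recently used), not 1.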
What is Belady’s anomaly?
For some removal policies, more cache ≠ higher hit ratio!
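Belady's anomaly can be demonstrated with FIFO replacement on the classic reference string 1 2 3 4 1 2 5 1 2 3 4 5, where 4 frames produce more misses than 3 (the simulation below is a sketch; the function name is mine):

```python
from collections import deque

def fifo_misses(refs, frames):
    """Count misses for FIFO replacement with the given number of frames."""
    queue, resident, misses = deque(), set(), 0
    for r in refs:
        if r not in resident:
            misses += 1
            if len(resident) == frames:           # cache full: evict oldest
                resident.discard(queue.popleft())
            queue.append(r)
            resident.add(r)
    return misses

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# 3 frames -> 9 misses, but 4 frames -> 10 misses: more cache, lower hit ratio
```

Stack-based policies such as LRU do not suffer from the anomaly; FIFO does because the set of resident pages with k frames is not always a subset of the set with k+1 frames.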
What are the three types of a cache miss?
Compulsory miss— first reference (cannot be prevented)
Capacity miss— can be prevented by infinite cache size
Conflict miss— can be prevented by a fully-associative cache