Flashcards generated from lecture notes for exam review.
Given a program with 60% serial execution, what is the maximum speedup achievable with an infinite number of cores, according to Amdahl's Law?
Sp = 1 / s = 1 / 0.6 = 1.67. Therefore, the maximum speedup is 1.67.
A program has an instruction mix where ALU operations are 50% of the instructions, load/store operations are 30%, and branch operations are 20%. If ALU operations take 1 cycle, load/store take 5 cycles, and branch take 2 cycles, what is the CPI?
CPI = 0.5 * 1 + 0.3 * 5 + 0.2 * 2 = 0.5 + 1.5 + 0.4 = 2.4. Therefore, the CPI is 2.4.
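The weighted-average CPI above can be checked with a short sketch. The function name `weighted_cpi` and the list-of-pairs input are illustrative assumptions, not part of the lecture notes.

```python
def weighted_cpi(mix):
    """Average CPI for a list of (instruction_fraction, cycles) pairs."""
    total_fraction = sum(fraction for fraction, _ in mix)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    # Each instruction class contributes its share of the mix times its cycle cost.
    return sum(fraction * cycles for fraction, cycles in mix)


# ALU 50% at 1 cycle, load/store 30% at 5 cycles, branch 20% at 2 cycles.
cpi = weighted_cpi([(0.5, 1), (0.3, 5), (0.2, 2)])
print(round(cpi, 2))  # 2.4
```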
Calculate the execution time for n instructions for a k-stage pipeline, given n = 1000, k = 5, and T = 1 \text{ns}.
Using the formula T_{k,n} = (k + n - 1) \times T, we get T_{5,1000} = (5 + 1000 - 1) \times 1 \text{ns} = 1004 \text{ns}. Thus, the execution time is 1004 ns.
Determine the speedup S_k of a 5-stage pipeline vs. a single-stage pipeline, where n = 1000 and k = 5.
Using the formula S_k = \frac{n \times k}{n + k - 1}, we get S_5 = \frac{1000 \times 5}{1000 + 5 - 1} = \frac{5000}{1004} = 4.98. Therefore, the speedup is approximately 4.98.
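As a rough check of the two pipeline formulas, the sketch below evaluates both for k = 5 and n = 1000. The function names are hypothetical, and the model assumes an ideal pipeline with no stalls.

```python
def pipeline_time(k, n, cycle_time):
    """Time for n instructions on a k-stage pipeline: (k + n - 1) * T."""
    return (k + n - 1) * cycle_time


def pipeline_speedup(k, n):
    """Speedup of a k-stage pipeline over an unpipelined design: n*k / (n + k - 1)."""
    return (n * k) / (n + k - 1)


print(pipeline_time(5, 1000, 1))            # 1004 (ns)
print(round(pipeline_speedup(5, 1000), 2))  # 4.98
```

As n grows, the speedup approaches k, which is why 1000 instructions on 5 stages already come close to 5.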
A cache has a hit time of 1ns, a miss rate of 5%, and a miss penalty of 100ns. What is the average access time?
Average access time = Hit time + (Miss rate * Miss penalty) = 1ns + (0.05 * 100ns) = 1ns + 5ns = 6ns. Therefore, average access time is 6ns.
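The same average-access-time formula can be written as a small helper. This is a minimal sketch; the name `average_access_time` is an assumption made for illustration.

```python
def average_access_time(hit_time, miss_rate, miss_penalty):
    """Average access time = hit time + miss rate * miss penalty (same time unit throughout)."""
    return hit_time + miss_rate * miss_penalty


# 1 ns hit time, 5% miss rate, 100 ns miss penalty.
print(round(average_access_time(1, 0.05, 100), 2))  # 6.0
```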
What is the formula for calculating expected access time with 3 cache levels T1 = .5ns, T2 = 5ns, T3 = 10ns, H1 = .9, H2 = .7?
Expected access time = H1 * T1 + (1 - H1) * (H2 * T2 + (1 - H2) * (T2 + T3)) = (0.9 * 0.5) + 0.1 * (0.7 * 5 + 0.3 * (5 + 10)) = 0.45 + 0.1 * (3.5 + 4.5) = 0.45 + 0.8 = 1.25 ns.
Calculate the speedup of a parallel processing system with 8 cores if 60% of the program can be parallelized, according to Amdahl's Law.
Speedup = 1 / (S + (1-S)/N) = 1 / (0.4 + (0.6/8)) = 1 / (0.4 + 0.075) = 1 / 0.475 = 2.105. The speedup is approximately 2.11.
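Amdahl's Law as used above, with S as the serial fraction, can be sketched as follows. The function names `amdahl_speedup` and `amdahl_limit` are illustrative, not from the notes.

```python
def amdahl_speedup(serial_fraction, cores):
    """Speedup on `cores` processors when `serial_fraction` of the work stays serial."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def amdahl_limit(serial_fraction):
    """Upper bound on speedup as the number of cores grows without limit."""
    return 1.0 / serial_fraction


print(round(amdahl_speedup(0.4, 8), 3))  # 2.105
print(round(amdahl_limit(0.4), 2))       # 2.5
```

With 40% serial code, 8 cores already reach most of the 2.5x ceiling, so adding more cores brings little further gain.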
A program spends 30% of its time on multiplication, 20% on division, and 50% on addition. If you can speed up multiplication by a factor of 3, what is the overall speedup?
Speedup = 1 / ((1 - fraction_enhanced) + (fraction_enhanced / improvement_factor)) = 1 / ((1 - 0.3) + (0.3 / 3)) = 1 / (0.7 + 0.1) = 1 / 0.8 = 1.25. The overall speedup is 1.25.
What is the total execution time for processing 1000 tasks through a 4-stage pipeline, where each stage takes 2ns?
Total time = (k + n - 1) * T = (4 + 1000 - 1) * 2ns = 1003 * 2ns = 2006ns. The total execution time is 2006 ns.
If a 6-stage pipeline has a clock cycle of 3ns, and you need to execute 2000 instructions, what is the total execution time?
Total time = (k + n - 1) * T = (6 + 2000 - 1) * 3ns = 2005 * 3ns = 6015ns. Therefore, the total execution time is 6015 ns.
A CPU has a clock rate of 3 GHz. How long does it take to execute an instruction that requires 4 clock cycles?
Time per cycle = 1 / clock rate = 1 / (3 * 10^9 Hz) = 0.333 ns. Execution time = cycles * time per cycle = 4 * 0.333 ns = 1.332 ns. The execution time is approximately 1.33 ns.
Calculate the CPI (cycles per instruction) for a program that executes 500 instructions and takes 1200 clock cycles.
CPI = clock cycles / instruction count = 1200 / 500 = 2.4. The CPI is 2.4.
A processor has a base CPI of 1. If branch instructions make up 20% of the code and incur a penalty of 3 cycles when taken, what is the effective CPI?
Effective CPI = Base CPI + (branch frequency * branch penalty) = 1 + (0.20 * 3) = 1 + 0.6 = 1.6. The effective CPI is 1.6.
If a cache has a hit rate of 95% and a hit time of 2ns, while the miss penalty is 50ns, what is the average memory access time?
Average access time = (hit rate * hit time) + (miss rate * miss penalty) = (0.95 * 2ns) + (0.05 * 50ns) = 1.9ns + 2.5ns = 4.4ns. The average memory access time is 4.4ns.
A system uses a virtual address space of 48 bits and a physical address space of 36 bits. What is the maximum size of virtual memory and physical memory?
Virtual memory size = 2^48 bytes = 256 TB. Physical memory size = 2^36 bytes = 64 GB.
You have a TLB with a hit rate of 80%. A TLB hit takes 1ns, and a TLB miss takes 10ns plus an additional 100ns to access memory. What is the average memory access time?
Average access time = (TLB hit rate * TLB hit time) + (TLB miss rate * (TLB miss time + memory access time)) = (0.8 * 1ns) + (0.2 * (10ns + 100ns)) = 0.8ns + 0.2 * 110ns = 0.8ns + 22ns = 22.8ns. Average memory access time is 22.8ns.
Consider a disk with a rotation speed of 7200 RPM. What is the average rotational latency?
Rotational speed = 7200 RPM = 7200 / 60 RPS = 120 RPS. One rotation takes 1 / 120 seconds = 8.33 ms. Average latency = 1/2 * rotation time = 1/2 * 8.33 ms = 4.165 ms. Average rotational latency is approximately 4.17 ms.
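A minimal sketch of the rotational-latency calculation, assuming the usual convention that the average latency is half of one full rotation. The function name is hypothetical.

```python
def average_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: half the time of one rotation."""
    rotation_time_ms = 60_000 / rpm  # 60,000 ms per minute divided by rotations per minute
    return rotation_time_ms / 2


print(round(average_rotational_latency_ms(7200), 2))  # 4.17
```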
A disk has a data transfer rate of 100 MB/s. How long does it take to read a 10 MB file?
Time to read = file size / transfer rate = 10 MB / 100 MB/s = 0.1 seconds = 100 ms. It takes 100 ms to read the file.
Calculate the total number of bits required for a cache with 64KB of data, 4-byte blocks, and a 2-way set-associative mapping.
Cache size = 64KB = 65536 bytes. Block size = 4 bytes. Number of blocks = cache size / block size = 65536 / 4 = 16384 blocks. Number of sets = number of blocks / associativity = 16384 / 2 = 8192 sets. Index bits = log2(number of sets) = log2(8192) = 13 bits. Offset bits = log2(block size) = log2(4) = 2 bits. Tag bits = address bits - index bits - offset bits. Using 32 bit addressing, tag bits = 32 - 13 - 2 = 17 bits. Total bits per block = data bits + tag bits + valid bit = (4 * 8) + 17 + 1 = 32 + 17 + 1 = 50 bits. Total bits for cache = number of blocks * bits per block = 16384 * 50 = 819200 bits.
For a direct-mapped cache with 128 blocks and a block size of 16 bytes, how many bits are needed for the tag, index, and offset, assuming a 32-bit address?
Number of blocks = 128. Number of sets = 128 (since it's direct-mapped). Index bits = log2(128) = 7 bits. Offset bits = log2(block size) = log2(16) = 4 bits. Tag bits = total address bits - index bits - offset bits = 32 - 7 - 4 = 21 bits.
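The tag/index/offset split can be computed mechanically. The sketch below assumes byte addressing and power-of-two sizes; `address_fields` is an illustrative name, not something defined in the notes.

```python
def address_fields(num_blocks, block_size, associativity=1, address_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a cache.

    Assumes byte addressing and that the set count and block size are powers of two.
    """
    num_sets = num_blocks // associativity
    for value in (num_sets, block_size):
        if value < 1 or value & (value - 1) != 0:
            raise ValueError("set count and block size must be powers of two")
    index_bits = num_sets.bit_length() - 1     # log2(num_sets)
    offset_bits = block_size.bit_length() - 1  # log2(block_size)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


print(address_fields(128, 16))                    # (21, 7, 4)
print(address_fields(16384, 4, associativity=2))  # (17, 13, 2)
```

The second call reproduces the 64KB, 2-way set-associative example: 8192 sets give 13 index bits and a 17-bit tag.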
A CPU has a voltage of 1.2V and consumes 30W of power at 3 GHz. What is its dynamic power consumption if the frequency is increased to 3.6 GHz?
Dynamic power is proportional to frequency so new power = old power * (new frequency / old frequency) = 30W * (3.6 GHz / 3 GHz) = 30W * 1.2 = 36W.
If you reduce the voltage of a processor from 1.2V to 1.0V, what is the percentage reduction in power consumption, assuming the frequency remains constant?
Power is proportional to V^2, so percentage reduction = 1 - (new voltage / old voltage)^2 = 1 - (1.0 / 1.2)^2 = 1 - (0.833)^2 = 1 - 0.694 = 0.306 = 30.6%. The power consumption is reduced by approximately 30.6%.
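The V^2 scaling can be checked with a short sketch, assuming frequency and capacitance stay constant. The helper name `power_reduction` is hypothetical.

```python
def power_reduction(v_old, v_new):
    """Fractional drop in dynamic power when only the supply voltage changes (P ~ V^2)."""
    return 1.0 - (v_new / v_old) ** 2


print(round(power_reduction(1.2, 1.0) * 100, 1))  # 30.6 (percent)
```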
Calculate the energy consumption of a device that operates at 100mA and 5V for 10 seconds.
Power = voltage * current = 5V * 0.1A = 0.5W. Energy = power * time = 0.5W * 10s = 5 Joules.
Assume a server consumes 200W under normal load and 300W under peak load. If it operates at normal load for 16 hours a day and peak load for 8 hours, what is its average daily energy consumption?
Normal load energy = 200W * 16 hours = 3200 Wh. Peak load energy = 300W * 8 hours = 2400 Wh. Total daily energy = normal load energy + peak load energy = 3200 Wh + 2400 Wh = 5600 Wh = 5.6 kWh.
What is the speedup if you parallelize code that originally took 20 seconds to run serially, and the parallel version takes 5 seconds on 4 cores?
Speedup = serial execution time / parallel execution time = 20 seconds / 5 seconds = 4.
If a program has 60% data dependencies, estimate the maximum achievable speedup using parallel processing with an infinite number of processors.
Maximum speedup = 1 / data dependencies = 1 / 0.6 = 1.67
A sorting algorithm takes O(n^2) time. If sorting 1000 items takes 1 second, how long will it take to sort 2000 items?
Time scales with n^2, so T2 / T1 = (n2 / n1)^2. Thus, T2 = T1 * (n2 / n1)^2 = 1 * (2000 / 1000)^2 = 1 * 4 = 4 seconds.
A search algorithm runs in O(log n) time. If searching 100 items takes 0.1 seconds, how long will it take to search 10000 items?
Time scales with log n, so T2 / T1 = log(n2) / log(n1). Thus, T2 = T1 * (log(n2) / log(n1)) = 0.1 * (log(10000) / log(100)) = 0.1 * (4 / 2) = 0.1 * 2 = 0.2 seconds.
An algorithm has a time complexity of O(n log n). If processing 1000 items takes 2 seconds, how long will it take to process 4000 items?
Time scales with n log n, so T2 = T1 * (n2 * log(n2)) / (n1 * log(n1)) = 2 * (4000 * log(4000)) / (1000 * log(1000)). With log(4000) = 3.602 and log(1000) = 3, T2 = 2 * (4 * 3.602) / 3 = 2 * 4.802 = 9.604, or approximately 9.6 seconds.
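The three scaling cards above share one pattern: multiply the known time by the ratio of the growth function at the new and old sizes. A minimal sketch, with `scaled_time` as an illustrative name and the simplifying assumption that constant factors and lower-order terms are ignored:

```python
import math


def scaled_time(t1, n1, n2, growth):
    """Estimate the run time at size n2 from a measurement t1 at size n1."""
    return t1 * growth(n2) / growth(n1)


quadratic = scaled_time(1.0, 1000, 2000, lambda n: n ** 2)
logarithmic = scaled_time(0.1, 100, 10_000, math.log10)
linearithmic = scaled_time(2.0, 1000, 4000, lambda n: n * math.log10(n))

print(round(quadratic, 1))     # 4.0 seconds
print(round(logarithmic, 1))   # 0.2 seconds
print(round(linearithmic, 1))  # 9.6 seconds
```

The base of the logarithm does not matter here because it cancels in the ratio.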
Calculate the instruction fetch rate for a processor with a clock rate of 4 GHz and an average CPI of 2.
Instruction fetch rate = (Clock rate) / (CPI) = (4 * 10^9) / 2 = 2 * 10^9 instructions per second (2000 MIPS). Note that the unit is instructions per second, not GHz.
If the miss penalty is 50 cycles, the hit time is 1 cycle, and the miss rate is 2%, what is the average memory access time in cycles?
Average memory access time in cycles = Hit time + (Miss rate * Miss penalty) = 1 + (0.02 * 50) = 1 + 1 = 2 cycles.
What is the area of a square chip with sides of 10 mm, and what is its cost if fabrication costs $5 per square mm?
Area = side * side = 10 mm * 10 mm = 100 sq mm. Cost = area * cost per sq mm = 100 sq mm * $5/sq mm = $500.
If the defect density is 0.5 defects per square cm, what is the number of defects expected on a chip that is 2 cm x 3 cm?
Area of chip = 2 cm * 3 cm = 6 sq cm. Expected number of defects = defect density * area = 0.5 * 6 = 3 defects.
Calculate the reliability of a system with 1000 components, each with a failure rate of 10^-6 failures per hour, operating for 100 hours.
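Failure rate (λ) = N * individual failure rate = 1000 * 10^-6 = 0.001 failures per hour. Unreliability (Q(t)) = λ * t = 0.001 * 100 = 0.1. Reliability (R(t)) = 1 - Q(t) = 1 - 0.1 = 0.9. This is the linear approximation for small λt; the exponential model R(t) = e^(-λt) gives e^(-0.1) = 0.905.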
If a hard drive has a MTTF (mean time to failure) of 500,000 hours, what is its annual failure rate?
Annual failure rate = 1 / MTTF * hours per year = 1 / 500,000 hours * 8760 hours/year = 0.01752 = 1.752%.
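A short sketch of the same conversion, assuming a constant failure rate and 8760 powered-on hours per year. The function name is an illustrative assumption.

```python
HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def annual_failure_rate(mttf_hours):
    """Approximate annual failure rate as hours per year divided by MTTF."""
    return HOURS_PER_YEAR / mttf_hours


print(f"{annual_failure_rate(500_000):.3%}")  # 1.752%
```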
Given T1 = 1ns, T2 = 10ns, H1 = .95, H2 = .8, calculate the average access time.
Using H1 * T1 + (1 - H1) * (H2 * T2 + (1 - H2) * T2) = (0.95 * 1) + 0.05 * (0.8 * 10 + 0.2 * 10) = 0.95 + 0.05 * (8 + 2) = 0.95 + 0.05 * 10 = 0.95 + 0.5 = 1.45 ns.
What is the formula for calculating expected access time with 3 cache levels T1 = .5ns, T2 = 5ns, T3 = 10ns, H1 = .9, H2 = .7, H3 = .2?
Expected access time = H1 * T1 + (1 - H1) * (H2 * T2 + (1 - H2) * (H3 * T3 + (1 - H3) * (T2 + T3))) = (0.9 * 0.5) + 0.1 * (0.7 * 5 + 0.3 * (0.2 * 10 + 0.8 * (5 + 10))) = 0.45 + 0.1 * (3.5 + 0.3 * (2 + 12)) = 0.45 + 0.1 * (3.5 + 0.3 * 14) = 0.45 + 0.1 * (3.5 + 4.2) = 0.45 + 0.1 * 7.7 = 0.45 + 0.77 = 1.22 ns.
A program takes 10 seconds to execute on a single core. If 60% is parallelizable, what is the speedup when run on 4 cores?
Amdahl's law is 1 / (P/N + S) where P is parallelizable and S is serial. 1 / (.6/4 + .4) = 1 / (.15 + .4) = 1 / .55 = 1.818
How many bytes in a Megabyte?
1 KB = 2^10 = 1024 bytes and 1 MB = 1024 KB, so 1 MB = 1024 * 1024 = 2^20 = 1,048,576 bytes.
Convert 16 MB to Kilobytes.
16 MB = 16 * 1024 KB = 16384 KB
A processor executes 5000 instructions in 25 seconds. Calculate the execution rate.
5000 / 25 = 200. 200 Instructions per second.
A program executes 1000 instructions with a CPI of 2.5, how many cycles does the program take?
Number of cycles = instructions * CPI = 1000 * 2.5 = 2500 cycles. The program takes 2500 cycles.
A program has 1000 instructions and an instruction execution rate of 500 instructions/second. How long does it take to execute?
Execution time = Number of instructions / instruction execution rate = 1000 / 500 = 2 seconds
In 100 memory accesses, there are 80 hits. What is the hit ratio?
Hit ratio = Number of hits / total accesses = 80 / 100 = 0.8 (or 80%)
A cache has a hit ratio of 95%. What is the miss rate?
Miss rate = 1 - hit ratio = 1 - 0.95 = 0.05 (or 5%)
A memory has a hit time of 1 ns, a miss rate of 10%, and a miss penalty of 10 ns. Calculate the effective access time.
Effective Access Time = Hit Time + Miss Rate * Miss Penalty = 1 + 0.1 * 10 = 1 + 1 = 2 ns. Effective access time is 2 ns.
A 128-byte cache has 32-byte blocks. How many sets are there, assuming direct mapping?
With Direct mapping, Number of sets = cache size / block size = 128 / 32 = 4
A program takes 10 seconds to execute on a 2 GHz processor. How many clock cycles?
Clock cycles = execution time / clock period = 10 / (1 / (2 * 10^9)) = 2 * 10^10 = 20,000,000,000 cycles.
Calculate the dynamic power consumption of a processor with a 2 GHz clock, 1V voltage, and 0.25 active factor
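Dynamic power consumption = a * C * V^2 * f, where a is the activity factor and C is the switched capacitance. The question does not state a capacitance; the answer of 500 mW corresponds to C = 1 nF: 0.25 * (1 * 10^-9) * (1)^2 * (2 * 10^9) = 0.5 W = 500 mW.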
Reducing voltage from 3V to 1.5V reduces power consumption by what percentage?
Going from 3V to 1.5V is a 50% reduction in voltage. Since power is proportional to V^2, percentage reduction in power = 1 - (1.5 / 3)^2 = 1 - 0.25 = 0.75, or 75%.
Calculate static power consumption with a 5V supply and 0.1mA leakage current.
Static power = current * voltage = (0.1 * 10^-3 A) * 5 V = 0.5 mW
A device consumes 10W for 30 minutes. How much energy does it consume?
Energy = Power * Time = 10 W * 0.5 h = 5 Wh. In joules, 10 W * 1800 s = 18,000 J (18 kJ).
If fabricating a chip costs $5,000 per batch, with 25 wafers each yielding 50 chips at 80% yield. What is the cost per chip?
Cost per chip = total cost / (wafers per batch * chips per wafer * yield) = 5,000 / (25 * 50 * 0.8) = 5,000 / 1,000 = $5
Calculate die yield for a chip of area 2 cm^2 with a defect density of 1/cm^2, assuming perfect wafer yield
Die yield = Wafer yield * e^-(Die area * Defect density) = 1 * e^-(2 * 1) = e^-2 = 0.135, or 13.5%
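The exponential yield model used in this card can be evaluated directly. This is a sketch of that one model only (other yield models exist), and the function name `die_yield` is an assumption.

```python
import math


def die_yield(die_area_cm2, defects_per_cm2, wafer_yield=1.0):
    """Exponential yield model: wafer yield * exp(-(die area * defect density))."""
    return wafer_yield * math.exp(-(die_area_cm2 * defects_per_cm2))


print(round(die_yield(2, 1), 3))  # 0.135
```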
A program with 1 billion instructions, CPI of 2, executes in 20 seconds. What is the clock rate?
Execution time = Instruction count * CPI / Clock rate, so Clock rate = Instruction count * CPI / Execution time = (10^9 * 2) / 20 = 10^8 Hz = 100 MHz
A system processes 1000 tasks in 100 seconds. What is the throughput?
Throughput = number of tasks / time = 1000 / 100 = 10 tasks per second
An average of 2.5 jobs arrive every second and average queue length is 5. How long is average wait time?
Little's Law: L = lambda * W, i.e. queue length = arrival rate * average waiting time. 5 = 2.5 * W, so W = 2 seconds.
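Rearranging Little's Law for the waiting time gives W = L / lambda. A minimal sketch, with `average_wait` as a hypothetical helper name:

```python
def average_wait(queue_length, arrival_rate):
    """Solve Little's Law (L = lambda * W) for the average waiting time W."""
    if arrival_rate <= 0:
        raise ValueError("arrival rate must be positive")
    return queue_length / arrival_rate


print(average_wait(5, 2.5))  # 2.0 (seconds)
```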
A cache has 64 sets. How many index bits are needed?
2^6 = 64, so Index = log2(64) = 6 bits
The index has 6 bits and the block offset has 4 bits. With 32 bits, how many tag bits are there?
Tag bits = 32 - 6 - 4 = 22 bits
Physical memory is 32KB and the page size is 4 bytes. What is the number of frames?
Number of frames = Physical Memory Size / Page Size = 32 KB / 4 bytes = 32768 / 4 = 8192 frames
A virtual address has a 12-bit page offset. What is the page size?
Page size = 2^(offset bits) = 2^12 = 4096 bytes = 4KB
A program takes 100 seconds. 80% is parallelizable. On 8 cores with no overhead, what is the execution time?
New execution time = original * ((1 - p) + p / N) = 100 * ((1 - 0.8) + 0.8 / 8) = 100 * (0.2 + 0.1) = 20 + 10 = 30 seconds
Calculate the scaled speedup for 8 cores with 80% parallelizable code.
Scaled speedup (Gustafson's Law) = (1 - p) + p * N = (1 - 0.8) + 0.8 * 8 = 0.2 + 6.4 = 6.6
A cache has 512 blocks, each 16 bytes. What is its size?
Cache size = blocks * block size = 512 * 16 = 8192 bytes (8KB)
A 1024-byte cache is 4-way set associative with a block size of 1 byte. How many sets are there?
Number of sets = cache size / (block size * associativity) = 1024 / (1 * 4) = 256
How long to transfer 2GB file over 100MB/s link?
Time to transfer = Size / Transfer Rate = (2 * 10^9) / (100 * 10^6) = 20 seconds
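Memory runs at 200 MHz with an 8-byte wide bus. What is the bandwidth?
Bandwidth = frequency * bus width = (200 * 10^6 transfers/s) * 8 bytes = 1600 * 10^6 bytes/s = 1600 MB/s, assuming one transfer per clock cycle.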
Given a memory trace of 100 million cycles and average memory inter-arrival of 3,000 cycles. How Many accesses?
Accesses = trace duration / average inter-arrival time = (100 * 10^6) / 3000 = 33,333
A program has 1000 instructions, a CPI of 5, and a clock rate of 1 kHz. What is the execution time?
Execution time = Instructions * CPI / Clock rate = (1000 * 5) / 1000 Hz = 5000 cycles / (1000 cycles/s) = 5 seconds
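What does Amdahl's Law say about the effect of a small amount of serialization on speedup?
Amdahl's Law gives Speedup = 1 / (serial + parallel / N), where serial + parallel = 1. On one core this is 1 / (0.4 + 0.6) = 1, i.e. no speedup. As N grows, the speedup approaches 1 / serial, so even a small serial fraction caps the gain: 1% serial code limits speedup to 1 / 0.01 = 100 no matter how many cores are used.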