Purdue CS 250 Final Exam


151 Terms

1
New cards

The 32-bit 2’s complement integer 0x01234567 is stored in little endian memory. When reading bytes from lowest to highest address, what is the third byte read?

0x23

2
New cards

Why must a C language compiler be given the identity of the target processor?

To identify the intended machine instruction and ISA

3
New cards

Which assembly language instruction must appear in a snippet implementing a high-level language for loop?

Branch instruction

4
New cards

Improving latency for both SRAM and DRAM in a two-level memory hierarchy would reduce Average Memory Access Time by improving which factors?

Hit time and miss time

5
New cards

Using buffered I/O reduces the number of system calls by a factor of:

The number of data objects in the buffer

6
New cards

Which statement is FALSE regarding the properties of pipelining and virtual memory in a computer system?

a) Pipelining increases instruction throughput

b) Multiple virtual memory spaces can exist simultaneously

c) Pipelining reduces the execution time of each instruction

d) An application program can request a device interrupt at any time

c) Pipelining reduces the execution time of each instruction

7
New cards

Which branch prediction method is static, not changing during the runtime of a program?

a) Predict not taken

b) Predict using a 1-bit history

c) Use a branch target buffer

d) Dynamic branch prediction

a) Predict not taken

8
New cards

What changes must an application programmer make to a high-level language program before compiling it for a pipelined processor?

No changes are necessary

9
New cards

Which of the following is part of forwarding in a high-performance pipelined processor?

a) Comparison of operand addresses

b) Utilization of demultiplexers

c) Handling of branch instructions

d) Data cache access

a) Comparison of operand addresses

10
New cards

Which of the following techniques reduces the number of instructions that must be executed for a given high-level language application program?

a) Loop unrolling

b) Instruction scheduling by the compiler

c) Performance benchmarking

d) Effective branch prediction

a) Loop unrolling

11
New cards

The meaning of a bit string in a computer system is determined by:

The representation scheme it follows

12
New cards

What is the primary purpose of using a branch target buffer in dynamic branch prediction?

To cache the target addresses of branch instructions

13
New cards

What is the effect of setting a dirty bit in a write-allocate cache?

It signals that the cache line has been modified

14
New cards

What aspect of a cache’s operation is most directly affected by its write policy?

The decision on when to update main memory

15
New cards

Which of the following scenarios is most likely to see the greatest performance improvement from the use of dynamic branch prediction over static branch prediction?

a) A loop with a large number of iterations

b) A highly predictable set of conditional operations

c) Code segments where branch instructions occur infrequently

d) Code with deep nested conditional branches

d) Code with deep nested conditional branches

16
New cards

What does it mean when a CPU is described as having a Harvard architecture?

It has separate memory storage areas for data and instructions

17
New cards

A 64-bit architecture primarily benefits long-term performance in what way?

It enables larger amounts of virtual memory addressing

18
New cards

What type of cache miss is reduced by increasing the associativity of a cache?

Conflict misses

19
New cards

For a superscalar processor, what is the primary benefit of having multiple instruction dispatch ports?

Enables it to execute multiple instructions in parallel

20
New cards

In what situation can a directly mapped cache with MRU replacement policy outperform the LRU replacement policy?

a) When the workload exhibits strong temporal locality

b) When the workload shows looping behavior with loop size smaller than the cache size

c) When the access pattern regularly switches between two sets of data larger than the cache

d) When sequential memory blocks are frequently accessed

c) When the access pattern regularly switches between two sets of data larger than the cache

21
New cards

What feature of virtual memory allows for the execution of programs larger than the physical memory?

Demand paging

22
New cards

In a multiprocessor system, what is the role of the coherence protocol?

To ensure that all processors see the same memory value

23
New cards

What is the effect of increasing the block size in a cache memory?

Reduces compulsory misses but may increase capacity misses

24
New cards

What does the principle of locality refer to in the context of program execution?

Programs tend to reuse data and instructions they have recently used

25
New cards

What component is critical for handling interrupts in a computer system?

Interrupt service routine (ISR)

26
New cards

In a pipelined processor, forwarding is used to:

Reduce data hazards

27
New cards

In a 5-stage pipelined processor, which of the following stages is responsible for reading register operands?

a) Instruction fetch

b) Instruction decode

c) Execute

d) Memory

e) Write back

b) Instruction decode

28
New cards

In a virtual memory system with a page size of 4KB and a 32-bit virtual address space, how many bits are used for the page offset?

12 (4 KB = 2^12 bytes, therefore 12 offset bits)

29
New cards

Which of the following is an example of a data hazard that can be resolved using forwarding?

a) Read after write (RAW)

b) Write after read (WAR)

c) Write after write (WAW)

d) Control dependence

a) Read after write (RAW)

30
New cards

A processor has a 64KB direct-mapped cache with a block size of 32 bytes. How many bits are used for the cache index?

11

Explanation: # of index bits = log2(# of cache lines). The number of cache lines is cache size / block size = 2^16 / 2^5 = 2^11 lines. log2(2^11) = 11.

31
New cards

In a pipelined processor, which of the following can cause a control hazard?

a) Branch instructions

b) Data dependencies

c) Structural hazards

d) Exceptions

a) Branch instructions

32
New cards

In a set-associative cache, which of the following determines the set to which a memory block is mapped?

a) The tag bits

b) The block offset bits

c) The index bits

d) The valid bit

c) The index bits

33
New cards

Which of the following is true about the difference between a write-through and a write-back cache?

a) A write-through cache updates main memory on every write, while a write-back cache updates main memory only when a block is replaced

b) A write-back cache updates main memory on every write, while a write-through cache updates main memory only when a block is replaced

c) A write-through cache has a lower miss penalty than a write-back cache

d) A write-back cache has a lower hit time than a write-through cache

a) A write-through cache updates main memory on every write, while a write-back cache updates main memory only when a block is replaced

34
New cards

(T/F) The output depends on the previous state of the circuit for a combinational logic circuit.

False (a combinational circuit's output depends only on its current inputs; dependence on previous state is the defining property of sequential logic)

35
New cards

(T/F) The output can be expressed as a Boolean function of the inputs for a combinational logic circuit.

True (by definition, a combinational circuit's output is a Boolean function of its current inputs)

36
New cards

Which of the following is true about the one’s complement representation of negative numbers?

a) It has two representations for zero

b) It is asymmetric around zero

c) It is the most commonly used representation in modern computers

d) It requires a separate sign bit

a) It has two representations for zero

37
New cards

In a pipelined processor, which of the following is true about data forwarding?

a) It eliminates all data hazards

b) It reduces the number of stalls caused by data hazards

c) It increases the latency of the pipeline

d) It is only used in out-of-order execution

b) It reduces the number of stalls caused by data hazards

38
New cards

In a virtually indexed, physically tagged cache, what is the virtual address used to access?

The index portion of the virtual address is used to look up the cache tags and data in parallel with the TLB translation; the tag comparison itself uses the physical address

39
New cards

(T/F) Stalling the processor is a common technique for handling cache misses.

True

40
New cards

(T/F) Dropping the memory request is a common technique for handling cache misses.

False

41
New cards

In a demand-paged virtual memory system, what is true about the page fault rate?

It decreases as the locality of reference in the program increases

42
New cards

Which of the following is a common technique for improving the performance of a branch predictor?

a) Increasing the size of the BHT (branch history table)

b) Increasing the size of the BTB (branch target buffer)

c) Using a more sophisticated prediction algorithm

d) All of the above

d) All of the above

43
New cards

What is the primary function of the Arithmetic Logic Unit (ALU) in a CPU?

To perform arithmetic and logical operations

44
New cards

What does the term “pipelining” refer to in a CPU architecture?

Executing multiple instructions simultaneously

45
New cards

What is a “superscalar” processor capable of?

Executing more than one instruction during a single clock cycle

46
New cards

In a multi-level cache architecture, where is the Level 1 (L1) cache located?

Within the CPU, closest to the ALU

47
New cards

What is the primary advantage of using DRAM over SRAM?

Higher cost-effectiveness

48
New cards

What type of memory is used to implement the cache memory in a computer system?

Static RAM (SRAM)

49
New cards

How does a branch predictor improve CPU performance?

By guessing the outcomes of conditional operations

50
New cards

What role does the control unit play in a CPU?

It directs the operation of the processor

51
New cards

What best describes the function of a multiplexer?

It selects one of many data sources to send to the output

52
New cards

Which component of a computer system is responsible for carrying out the instructions of a computer program?

The central processing unit (CPU)

53
New cards

What role does speculative execution play in modern CPUs?

It executes code paths that might not be needed, filling otherwise idle CPU cycles, and it predicts the outcome of branches to speed up execution

54
New cards

Which of the following changes the mode of execution of a processor from a mode with low privilege to a mode with high privilege?

a) Execution of a branch instruction

b) A cache hit

c) Fetch of the default next instruction

d) Receipt of an interrupt during execution of a user application

e) None of the above

d) Receipt of an interrupt during execution of a user application

55
New cards

A cache has a dirty bit for each block. Which statement about this cache is true?

a) This cache is an L1 cache

b) This cache has one fewer bit in the tag field than if there were no dirty bit

c) This cache is a data cache

d) This cache is half as large as it would be if there were no dirty bit

e) None of the other answers is a true statement

c) This cache is a data cache

56
New cards

A computer has 64-bit virtual memory addresses and 1 Kibibyte pages. Page tables for this computer are implemented using three levels. What is the logarithm base two of the size of a table in the multi-level tree page table structure?

18

Explanation: A page is 1 Kibibyte, which is 2^10 bytes, so there are 10 offset bits. Virtual addresses are 64 bits, leaving 64 - 10 = 54 VPN (virtual page number) bits. Split evenly across 3 levels, that is 54/3 = 18 bits per level, so each table at a level needs 2^18 entries, giving log2(table size) = 18.

57
New cards

Which of the following is not a requirement for main memory?

a) Adequate capacity

b) Low access latency

c) Non-volatile

d) Low cost

e) Non-ROM

c) Non-volatile

58
New cards

Over time, as hardware becomes less and less expensive, what is an increasingly good design choice?

Interrupt-driven I/O

59
New cards

What is the main advantage of buffered I/O?

Reduction in the number of system calls to perform I/O

60
New cards

The equation Time = max(t1, t2) is an appropriate model for the execution time of:

An if-else statement on a MIMD computer (both paths execute in parallel, so the longer one determines the time)

61
New cards

What does loop unrolling reduce?

It reduces the number of overhead machine instructions only

62
New cards

An L1 cache circuit is direct mapped, has 1024 sets, receives 64-bit addresses from the processor, and receives 32-byte blocks from the memory hierarchy. What question do you need to ask and have answered before having enough information to compute the total number of overhead SRAM memory cell circuits for this cache?

Is this a write-back cache? (extra dirty bit is needed if write-back)

63
New cards

Reacting to an L1 cache miss by immediately loading the block containing the missed item means that caching policy is focused on:

Spatial locality

64
New cards

How many two-input NAND gates are necessary to build one two-input OR gate with active high inputs and active high output?

Three (two are used to invert the inputs)

65
New cards

Which branch prediction method is considered dynamic?

a) Predict taken

b) Predict not taken

c) 1-bit history

d) Branch target buffer (BTB)

e) None of the above

c) 1-bit history

66
New cards

What pipeline enhancement allows a register operand to be obtained from various locations in the pipeline?

Forwarding

67
New cards

What is a disadvantage of using branch prediction in a processor’s pipeline?

Increased complexity of the control unit

68
New cards

What does a miss penalty in the context of cache memory indicate?

The extra time required to service a cache miss

69
New cards

Which operation is performed by the ALU when executing the instruction “sub r1, r2, r3”?

r1 = r2 - r3

70
New cards

What does the term “write-through” refer to in a caching system?

Data is written to both the cache and backing store simultaneously

71
New cards

What is the primary function of a translation lookaside buffer (TLB)?

To store the recent address translations between virtual and physical addresses to decrease the latency of page translations

72
New cards

A performance model for a two-level memory hierarchy of SRAM and DRAM is Average Memory Access Time = Hit Rate x Hit Time + (1 – Hit Rate) x Miss Time

Improving latency for both SRAM and DRAM would reduce Average Memory Access Time by improving:

Hit time and miss time

73
New cards

Measurements of computer system performance should:

a) Be reproducible

b) Be similar to actual workloads

c) Be portable among many computer systems

d) Measure time

e) Each of the other answers is correct

e) Each of the other answers is correct

74
New cards

(T/F) Pipelining increases machine instruction processing latency.

True

75
New cards

(T/F) The register file is accessed by two stages of the classic 5-stage pipelined processor.

True

76
New cards

(T/F) Multiple virtual memory spaces may exist at the same time in a computer.

True

77
New cards

(T/F) An application program can request an output device to interrupt the processor on or during the next clock cycle.

False

78
New cards

(T/F) A multiplexer ignores an input bus.

True

79
New cards

A program requests to read information from an I/O device and no devices respond. The type of bus event that has occurred is:

An unassigned address error

80
New cards

What function used to implement an I/O buffer may have no effect when called?

Flush

81
New cards

What must be making use of instruction level parallelism?

Pipelined processor

82
New cards

What technique reduces the number of instructions that must be executed for a given HLL application program?

Loop unrolling

83
New cards

Which part of a virtual address is not stored in a single-level page table?

Offset (it is used to address within a page, not in the page table)

84
New cards

What digital logic circuit is used to select a single ALU function unit output based on the opcode?

Multiplexer

85
New cards

What performance impact does increasing the size of a CPU’s branch predictor table have?

Improves accuracy of prediction, potentially increasing performance

86
New cards

What is the result of a successful cache hit?

Data is retrieved from the cache

87
New cards

What occurs when an interrupt is raised while the CPU is in the middle of executing another interrupt service routine?

Depends on the priority of the new interrupt; if it’s higher, it may pre-empt the current routine

88
New cards

What is the primary benefit of pipelining in a computer processor?

Increases the number of instructions completed per unit time by overlapping execution stages

89
New cards

What is George Adams’ favorite pastime?

Torturing students

90
New cards

(T/F) There may exist multiple virtual memory address spaces at the same time in a running computer.

True

91
New cards

(T/F) A bus error occurs when two or more interfaces respond to an address.

True

92
New cards

What circuit implements the pointing function?

Decoder

93
New cards

Because a processor circuit performs the fetch-execute cycle when it is supplied with electrical power, software must ensure that there:

Is always a next instruction

94
New cards

Control flow in a machine language program is:

Performed by the ALU

95
New cards

An operand value that is known at compile time is:

Stored in a machine instruction if its magnitude is small

96
New cards

The speedup provided by an enhancement is H. What is the upper bound on the overall speedup possible for a system due to this enhancement?

H

97
New cards

The execution trace of a for loop of N iterations includes execution of N branch instructions. Now the loop is unrolled k times where k divides N. The execution trace of the unrolled loop includes just 20% as many branch instructions as without unrolling. What is the value of k?

5 (unrolling by k leaves one branch per k original iterations, so the fraction of branches is 1/k = 0.2, giving k = 5)

98
New cards

Which of the following are NOT true in a pipelined processor?

a) Bypassing can handle all RAW hazards.

b) Register renaming can eliminate all register carried WAR hazards.

c) Control hazard penalties can be eliminated by dynamic branch prediction

a) Bypassing can handle all RAW hazards and c) Control hazard penalties can be eliminated by dynamic branch prediction

99
New cards

In designing a computer’s cache system, the cache block (or cache line) size is an important parameter. Which one of the following statements is correct in this context?

a) A smaller block size implies better spatial locality

b) A smaller block size implies a smaller cache tag and hence lower cache tag overhead

c) A smaller block size implies a larger cache tag and hence lower cache hit time

d) A smaller block size incurs a lower cache miss penalty

d) A smaller block size incurs a lower cache miss penalty

100
New cards

(T/F) Unless enabled, a CPU will not be able to process interrupts.

True