When the address in the PC is sent to the memory to “fetch” the instruction to be executed, what does fetching mean?
The memory reads the instruction at that address and sends it back to the CPU.
When can the address in the PC be incremented by the CPU?
After it is sent to the memory
What is the format of each one-byte instruction in a simple accumulator machine?
Op code: The leftmost 3 bits represent the op code, which specifies the operation to be performed.
Memory address: The rightmost 5 bits represent the memory address for the memory operand. The word size is 1 byte.
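A minimal Python sketch of splitting such a one-byte instruction into its two fields; the example instruction value is made up for illustration:

```python
def decode(instruction: int) -> tuple[int, int]:
    """Split a one-byte instruction into (op code, memory address)."""
    opcode = (instruction >> 5) & 0b111   # leftmost 3 bits: the op code
    address = instruction & 0b11111       # rightmost 5 bits: the operand address
    return opcode, address

# Example: 0b101_00110 -> op code 5, operand address 6
print(decode(0b10100110))  # (5, 6)
```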
What is the role of the PC (Program Counter) in the simple accumulator machine?
The PC holds the address of the next instruction to be fetched from memory.
What does INC (Incrementer) do in the simple accumulator machine?
The INC increases the value of the program counter (PC) to point to the next instruction.
What is the function of the ACC (Accumulator) in the simple accumulator machine?
The ACC is a register that stores intermediate arithmetic and logic results.
What is the role of the ALU (Arithmetic Logic Unit) in the simple accumulator machine?
The ALU performs arithmetic and logical operations on data stored in the accumulator.
What does the IR (Instruction Register) do in the simple accumulator machine?
The IR holds the currently fetched instruction before it is decoded and executed.
What happens during the DECODE phase in the simple accumulator machine?
The DECODE phase interprets the instruction stored in the IR and determines the next steps for execution.
What is the purpose of the MAR (Memory Address Register) in the simple accumulator machine?
The MAR holds the address of a memory location that is being accessed for reading or writing.
What does the MDR (Memory Data Register) do in the simple accumulator machine?
The MDR temporarily holds data being transferred to or from memory.
What is the role of the Bus in the simple accumulator machine?
The internal CPU bus transfers data between different components of the CPU, separate from the memory bus.
How are the MAR, MDR, and read enable line used to read data or an instruction from memory in the simple accumulator machine?
4 steps (ARWE)
The CPU puts the address of the memory word in the MAR (memory address register).
The CPU sets the "read enable" bit.
The CPU waits for a certain number of clock cycles.
At the end of that wait, the value at that memory address appears in the MDR (memory data register)
How are the MAR, MDR, and read enable line used to write data to memory in the simple accumulator machine?
4 steps
The CPU puts the address of the memory word in the MAR (memory address register).
The CPU puts the value to be written in the MDR (memory data register).
The CPU sets the "write enable" bit.
The CPU waits for a certain number of clock cycles, and at that point, the value is in memory at the specified address
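A toy Python sketch walking through both sequences of steps above; the class name, the 32-word memory size, and the wait placeholder are assumptions for illustration, not part of the cards:

```python
class SimpleMemoryInterface:
    """Toy model of the MAR/MDR read and write protocols."""

    def __init__(self, size=32):
        self.memory = [0] * size   # 32 one-byte words (5-bit addresses)
        self.mar = 0               # memory address register
        self.mdr = 0               # memory data register
        self.read_enable = False
        self.write_enable = False

    def read(self, address):
        self.mar = address                  # 1. put the address in the MAR
        self.read_enable = True             # 2. set the "read enable" bit
        self.wait_clock_cycles()            # 3. wait some clock cycles
        self.mdr = self.memory[self.mar]    # 4. the value appears in the MDR
        self.read_enable = False
        return self.mdr

    def write(self, address, value):
        self.mar = address                  # 1. put the address in the MAR
        self.mdr = value                    # 2. put the value in the MDR
        self.write_enable = True            # 3. set the "write enable" bit
        self.wait_clock_cycles()            # 4. wait; the value is now in memory
        self.memory[self.mar] = self.mdr
        self.write_enable = False

    def wait_clock_cycles(self):
        pass  # placeholder: a real CPU stalls here while the memory responds
```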
How many clock cycles does each instruction require in the simple accumulator machine?
Each instruction takes 6 clock cycles (4 for fetch/decode, and 2 for execution)
Why does the simple accumulator machine only allow sequential execution of instructions?
The simple accumulator machine only allows for sequential execution of instructions because the PC (program counter) is always incremented by 1. There are no branch/jump instructions, and no call or return statements.
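A sketch of the resulting fetch-decode-execute loop; the opcode assignments (LOAD/ADD/STORE) are hypothetical and chosen only to show that nothing ever changes the PC except the +1 increment:

```python
def run(memory, num_instructions):
    """Purely sequential execution: the PC only ever advances by 1."""
    pc = 0
    acc = 0
    for _ in range(num_instructions):
        ir = memory[pc]                 # fetch the instruction into the IR
        pc += 1                         # PC is always incremented by 1
        opcode = (ir >> 5) & 0b111      # decode: top 3 bits
        address = ir & 0b11111          # decode: bottom 5 bits
        if opcode == 0b001:             # hypothetical LOAD
            acc = memory[address]
        elif opcode == 0b010:           # hypothetical ADD
            acc += memory[address]
        elif opcode == 0b011:           # hypothetical STORE
            memory[address] = acc
        # no branch/jump/call opcodes exist, so execution stays sequential
    return acc
```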
How does use of a cache improve performance?
The CPU will not have to wait for data/instructions to come from memory if they are in the cache, and very often, they are.
About how large are CPU caches?
Typically less than 0.1% of the size of main memory/RAM
When data is moved to/from the cache, are single words moved?
No, whole blocks are moved; a block is usually 16 words or more.
How is the 16-bit address divided in a direct-mapped cache?
A 16-bit memory address is divided into three parts: tag bits, block bits, and word bits (byte bits).
What is the purpose of tag bits?
The tag bits identify which block of main memory is currently stored in the cache block; they are compared with the tag kept alongside the cache block to determine a hit or miss.
What is the purpose of block bits?
The block bits select which cache block (line) the memory address maps to.
What is the purpose of word bits?
Used to identify the specific word or data location within a block.
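A sketch of splitting a 16-bit address into the three fields; the field widths here are assumptions for illustration (4 word bits for a 16-word block, 7 block bits, and the remaining 5 bits as tag), since the cards above do not fix these widths:

```python
WORD_BITS = 4                             # assumed: 16 words per block
BLOCK_BITS = 7                            # assumed: 128 blocks in the cache
TAG_BITS = 16 - BLOCK_BITS - WORD_BITS    # remaining 5 bits are the tag

def split_address(addr: int) -> tuple[int, int, int]:
    """Return the (tag, block, word) fields of a 16-bit address."""
    word = addr & ((1 << WORD_BITS) - 1)
    block = (addr >> WORD_BITS) & ((1 << BLOCK_BITS) - 1)
    tag = addr >> (WORD_BITS + BLOCK_BITS)
    return tag, block, word

# Example: 0xBEEF -> (tag 23, block 110, word 15)
print(split_address(0xBEEF))
```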
What is cache hit rate?
Cache hit rate is the percentage of memory accesses by the CPU that hit in the cache.
What is cache miss rate (definition)?
A cache miss occurs when the CPU tries to access instructions or data that are not in the cache; the cache miss rate is the percentage of memory accesses that miss in the cache.
What are typical cache hit rates we mentioned in class for many types of software?
Hit rates in the 90% range are common in programs (though it depends on the type of software)
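A one-line computation of hit rate and miss rate from access counts; the counts below are made-up illustrative numbers:

```python
hits, total_accesses = 930, 1000          # hypothetical counts
hit_rate = hits / total_accesses * 100    # percentage of accesses that hit
miss_rate = 100 - hit_rate                # percentage that miss
print(hit_rate, miss_rate)                # 93.0 7.0 -- in the ~90% range
```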
What is the difference between write-through and write-back protocol for updating data in the cache in memory?
Write-through protocol: This protocol updates both the value in the cache and in the main memory simultaneously.
Write-back protocol: This protocol updates only the cache location, and sets the cache block's dirty bit to 1. The dirty bit tracks whether the cache block has been written/modified since it was brought into the cache. The memory copy will be updated/written when the cache block is replaced (evicted) from the cache.
Which protocol (write-through or write-back) uses a dirty bit, and how is the dirty bit used?
The write-back protocol uses a dirty bit to manage updates between the cache and main memory.
When data is modified in the cache, the write-back protocol updates only the cache location.
At the same time, the cache block's dirty bit is set to 1. This indicates that the cache block has been written to or modified since it was brought into the cache.
When the cache block needs to be replaced (evicted) to make room for new data, the dirty bit is checked.
If the dirty bit is 1, the modified data in the cache block is written back to main memory before the block is replaced. This ensures that the main memory is updated with the latest version of the data
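A minimal sketch of the write-back bookkeeping described above; the class structure and block layout are simplified assumptions:

```python
class WriteBackBlock:
    """One cache block under the write-back protocol."""

    def __init__(self, tag, data):
        self.tag = tag
        self.data = data        # the cached copy of the memory block
        self.dirty = False      # 0 until the block is modified in the cache

    def write(self, word_index, value):
        # Write-back: update only the cache copy and mark the block dirty.
        self.data[word_index] = value
        self.dirty = True

    def evict(self, memory, base_address):
        # On replacement, write the block back to memory only if it was modified.
        if self.dirty:
            for i, word in enumerate(self.data):
                memory[base_address + i] = word
            self.dirty = False
```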
What is the fundamental idea behind pipelining a CPU?
Overlap the stages of execution of various instructions, so that more than one instruction can be executed at any given time.
How many clock cycles are required to execute each instruction if a pipeline is not used in a 4-stage execution model?
Without a pipeline, each instruction requires 4 clock cycles to complete, one for each stage of execution.
How many instructions complete per clock cycle once the pipeline is fully loaded?
Once the pipeline is fully loaded, 1 instruction completes per clock cycle, as each stage processes a different instruction simultaneously.
What is the theoretical improvement in performance when using a pipeline in a 4-stage execution model?
The theoretical improvement in performance is a 4x speedup, as instructions can be completed in every clock cycle after the pipeline is fully loaded, compared to 4 clock cycles per instruction without the pipeline.
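A small sketch of the arithmetic behind that claim; the cycle-count formulas assume one stage per clock cycle and no hazards:

```python
def cycles_without_pipeline(n_instructions, n_stages=4):
    # Each instruction uses all stages before the next one starts.
    return n_instructions * n_stages

def cycles_with_pipeline(n_instructions, n_stages=4):
    # The first instruction fills the pipeline; then one completes per cycle.
    return n_stages + (n_instructions - 1)

n = 1000
speedup = cycles_without_pipeline(n) / cycles_with_pipeline(n)
print(f"speedup for {n} instructions: {speedup:.2f}x")  # approaches 4x
```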
What can we say about the general improvement in performance when a pipeline is used?
In general, a pipeline allows multiple instructions to be processed simultaneously at different stages, improving throughput. Once the pipeline is fully loaded, one instruction can be completed per clock cycle, leading to a significant performance boost.
Will a pipeline with more stages offer more improvement in performance, or less?
A pipeline with more stages can offer more improvement in performance by allowing more instructions to be processed simultaneously. However, the performance gain is not always linear.
What did we say is a typical number of stages in modern pipelined CPUs?
More than 20
What are the potential problems in a pipeline?
Instruction hazards and data hazards.
How much do pipeline hazards typically reduce the theoretical maximum performance improvement?
Less than 10%