Accumulator
Archaic term for register. On-line use of it as a synonym for "register" is a fairly reliable indication that the user has been around quite a while.
Accumulator Architecture
One operand of a binary operation is implicitly in the accumulator
Address
The location of a specific data item within a memory array.
address translation (address mapping)
the process by which a virtual address is mapped to an address used to access memory
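As a rough sketch of address translation, the Python below splits a virtual address into a virtual page number and a page offset, then looks the page number up in a table. The 4 KiB page size, the dictionary used as a page table, and the name translate are illustrative assumptions, not taken from any particular architecture.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages (illustrative choice)


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address.

    page_table is a hypothetical dict: virtual page number -> physical page number.
    """
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    if vpn not in page_table:
        # No mapping for this page: a real system would raise a page fault.
        raise LookupError(f"page fault: virtual page {vpn} is not mapped")
    # The offset within the page is unchanged by translation.
    return page_table[vpn] * PAGE_SIZE + offset


example_table = {0: 5, 1: 2}
print(hex(translate(0x1234, example_table)))  # virtual page 1 maps to physical page 2
```

Only the page number is translated; the offset is carried over unchanged.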
Aliasing
A situation in which two addresses access the same object; it can occur in the virtual memory when there are two virtual addresses for the same physical page
ALU control lines
0000 is AND
0001 is OR
0010 is ADD
0110 is SUBTRACT
0111 is PASS INPUT B
1100 is NOR
ALUOp
A 2-bit control field that, together with the opcode field of the instruction, is an input to a small control unit that generates the 4-bit ALU control input.
Amdahl's Law
A formula used to find the maximum improvement possible by improving a particular part of a system.
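A minimal sketch of the formula in its usual form: if a fraction f of execution time is sped up by a factor s, the overall speedup is 1 / ((1 - f) + f / s). The function and parameter names below are illustrative.

```python
def overall_speedup(affected_fraction, improvement_factor):
    """Amdahl's Law: overall speedup when only part of the work is improved."""
    unaffected = 1.0 - affected_fraction
    return 1.0 / (unaffected + affected_fraction / improvement_factor)


def max_speedup(affected_fraction):
    """Upper bound on speedup as the improvement factor grows without limit."""
    return 1.0 / (1.0 - affected_fraction)


# Speeding up 80% of the run time by 5x gives well under 5x overall.
print(overall_speedup(0.8, 5))
print(max_speedup(0.8))
```

The bound from max_speedup shows why the unimproved part of a system ends up dominating.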
Application binary interface (ABI)
The user portion of the instruction set plus the operating system interfaces used by application programmers. It defines a standard for binary portability across computers.
Architectural registers
The set of registers of a processor that are visible to the instruction set; for example, in MIPS, these are the 32 integer and 16 floating-point registers.
ARMv8 virtual memory
Uses 64-bit virtual addresses, but the upper 16 bits are not used, so only 48 bits take part in address translation.
Assembly Language
Programming language that has the same structure and set of commands as machine languages but allows programmers to use symbolic representations of numeric machine code.
Asserted
The signal is logically high or true
B.EQ
equal
B.GE
greater than or equal to
B.GT
greater than
B.LE
less than or equal to
B.LT
less than
B.MI
branch on minus
N = 1
B.NE
not equal
B.PL
branch on plus
N = 0
B.VC
branch on overflow clear
V = 0
B.VS
branch on overflow set
V = 1
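The conditional branches above test the condition flags N (negative), Z (zero), C (carry), and V (overflow) set by an earlier compare. The sketch below models a 64-bit compare as a subtraction and evaluates each condition using the standard ARM signed-comparison tests. The helper names compare_flags and branch_taken are hypothetical.

```python
def compare_flags(a, b, bits=64):
    """Return (N, Z, C, V) as set by computing a - b in two's complement."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    ua, ub = a & mask, b & mask
    result = (ua - ub) & mask
    n = 1 if result & sign else 0
    z = 1 if result == 0 else 0
    c = 1 if ua >= ub else 0  # carry set means no borrow occurred
    # Overflow: operands have different signs and the result's sign differs from a's.
    v = 1 if (ua ^ ub) & (ua ^ result) & sign else 0
    return n, z, c, v


def branch_taken(cond, n, z, c, v):
    """Evaluate a condition suffix such as 'LT' against the flags."""
    tests = {
        "EQ": z == 1,
        "NE": z == 0,
        "LT": n != v,
        "LE": not (z == 0 and n == v),
        "GT": z == 0 and n == v,
        "GE": n == v,
        "MI": n == 1,
        "PL": n == 0,
        "VS": v == 1,
        "VC": v == 0,
    }
    return tests[cond]  # c is unused by the signed conditions listed here


print(branch_taken("LT", *compare_flags(3, 5)))
```

Signed comparisons use N and V together, so the result stays correct even when the subtraction overflows.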
base address
starting address of an array in memory
base register
register that holds an array's base address
Big Endian
A CPU or memory architecture in which the most significant byte is stored at the lowest memory address.
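To make the byte ordering concrete, this small sketch uses Python's built-in int.to_bytes to lay out a 32-bit value in both orders. The value 0x0A0B0C0D and the helper name bytes_in_memory are arbitrary illustrations.

```python
def bytes_in_memory(value, size, byteorder):
    """Return the bytes of value in increasing address order."""
    return value.to_bytes(size, byteorder=byteorder)


value = 0x0A0B0C0D
# Index 0 of each result stands for the lowest memory address.
print(bytes_in_memory(value, 4, "big").hex())     # most significant byte first
print(bytes_in_memory(value, 4, "little").hex())  # least significant byte first
```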
Block (or line)
The minimum unit of information that can be either present or not present in a cache.
Blocking
a failure to retrieve information that is available in memory even though you are trying to produce it
Branch not taken (untaken branch)
A branch where the branch condition is false and the program counter (PC) becomes the address of the instruction that sequentially follows the branch.
Branch prediction
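A technique that predicts whether a branch will be taken, so the processor can proceed on that assumption instead of waiting for the actual outcome.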
Branch prediction buffer
Also called branch history table. A small memory that is indexed by the lower portion of the address of the branch instruction and that contains one or more bits indicating whether the branch was recently taken or not.
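One common way to realize such a buffer is with a 2-bit saturating counter per entry. The sketch below assumes 16 entries, 4-byte instructions, and an initial "weakly not taken" state; these choices and the class name are illustrative, not prescribed by the definition above.

```python
class BranchPredictionBuffer:
    """Table of 2-bit saturating counters indexed by low-order branch address bits."""

    def __init__(self, entries=16):
        self.entries = entries
        self.counters = [1] * entries  # 0-1 predict not taken, 2-3 predict taken

    def _index(self, branch_address):
        # Assumes 4-byte instructions, so the two lowest address bits are dropped.
        return (branch_address >> 2) % self.entries

    def predict(self, branch_address):
        return self.counters[self._index(branch_address)] >= 2

    def update(self, branch_address, taken):
        i = self._index(branch_address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


bpb = BranchPredictionBuffer()
for outcome in [True, True, False, True]:
    print(bpb.predict(0x40), outcome)
    bpb.update(0x40, outcome)
```

Because there are no tags, two branches whose addresses share the same low-order bits use the same counter and can disturb each other's predictions.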
Branch taken
A branch where the branch condition is satisfied and the program counter becomes the branch target. All unconditional branches are taken branches
Branch target address
the address specified in a branch, which becomes the new program counter if the branch is taken.
Branch target buffer
A structure that caches the destination PC or destination instruction for a branch. It is usually organized as a cache with tags, making it more costly than a simple prediction buffer.
branch-and-link instruction
An instruction that branches to an address and simultaneously saves the address of the following instruction in a register.
cache ready signal
Set in the Compare Tag state if the requested read or write is a hit.
Callee
A procedure that executes a series of stored instructions based on parameters provided by the caller and then returns control to the caller.
Caller
The program that instigates a procedure and provides the necessary parameter values.
capacity miss
A cache miss that occurs because the cache, even with full associativity, cannot contain all the blocks needed to satisfy the request.
CDC 6600
This system is widely considered to have been the first supercomputer. It was also the first load-store architecture.
Central processor unit (CPU)
The active part of the computer, which contains the datapath and control and which adds numbers, tests numbers, signals I/O devices to activate, and so on.
clocking methodology
defines when signals can be read and when they can be written
Coherence
ensures that a read of a data item returns the most recently written value of that data item. It defines what values can be returned by a read
Combinational element
An operational element, such as an AND gate or an ALU.
commit unit
The unit in a dynamic or out-of-order execution pipeline that decides when it is safe to release the result of an operation to programmer-visible registers and memory.
complementary metal-oxide semiconductor (CMOS)
Dominant technology for integrated circuits
compulsory miss (cold-start miss)
a cache miss caused by the first access to a block that has never been in the cache
Both large block sizes and prefetching may reduce compulsory misses
condition codes (flag)
Four bits are used: negative (N), zero (Z), carry (C), and overflow (V).
By contrast, in MIPS, two registers are compared and the result of the comparison is stored in a third register. A conditional branch instruction then tests the value of the third register to see whether the condition is true or false.
conflict miss (collision miss)
A cache miss that occurs in a set-associative or direct-mapped cache when multiple blocks compete for the same set and that are eliminated in a fully associative cache of the same size
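A small simulation can illustrate the difference. The sketch below counts misses for the same trace of block addresses in a direct-mapped cache and in a fully associative cache of equal size. LRU replacement, the 4-block capacity, and the function names are assumptions chosen for illustration.

```python
from collections import OrderedDict


def direct_mapped_misses(block_addresses, num_blocks):
    """Count misses when each block maps to exactly one cache location."""
    cache = [None] * num_blocks
    misses = 0
    for block in block_addresses:
        index = block % num_blocks
        if cache[index] != block:
            misses += 1
            cache[index] = block
    return misses


def fully_associative_misses(block_addresses, num_blocks):
    """Count misses when a block may go anywhere (assumed LRU replacement)."""
    cache = OrderedDict()  # least recently used entry is first
    misses = 0
    for block in block_addresses:
        if block in cache:
            cache.move_to_end(block)
        else:
            misses += 1
            if len(cache) == num_blocks:
                cache.popitem(last=False)  # evict the least recently used block
            cache[block] = True
    return misses


trace = [0, 4, 0, 4, 0, 4]  # blocks 0 and 4 collide in a 4-block direct-mapped cache
print(direct_mapped_misses(trace, 4), fully_associative_misses(trace, 4))
```

In this trace the fully associative cache sees only the two first-access misses; the additional misses in the direct-mapped cache are conflict misses.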
Consistency
Ensures that writes to a location by different processors are seen in the same order by all processors. It defines when written values will be returned by a read.
context switch
A changing of the internal state of the processor to allow a different process to use the processor that includes saving the state needed to return to the currently executing process.
Control
The component of the processor that commands the datapath, memory, and I/O devices according to the instructions of the program.
control hazard (branch hazard)
When the proper instruction cannot execute because the instruction that was fetched is not the one that is needed.
Control signal
A signal that directs actions within a system
conventional code
SOMETHING
Coprocessor
an additional chip that accelerates a portion of the work of a processor; in this case, it accelerated floating-point computation
Correlating predictor
A branch predictor that combines local behavior of a particular branch and global information about the behavior of some recent number of executed branches.
CPU Time Formula
(Instructions) x (CPI) x (Clock Cycle Time)
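The formula translates directly into code. In this sketch the function names and the sample numbers are illustrative; the clock cycle time is in seconds, and it can also be derived from the clock rate as 1 / rate.

```python
def cpu_time(instruction_count, cpi, clock_cycle_time):
    """CPU time in seconds = instructions x cycles per instruction x seconds per cycle."""
    return instruction_count * cpi * clock_cycle_time


def cpu_time_from_rate(instruction_count, cpi, clock_rate_hz):
    """Same formula, with the cycle time expressed as 1 / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# One billion instructions at CPI 2.0 on a 2 GHz clock (0.5 ns cycle): about 1 second.
print(cpu_time(1e9, 2.0, 0.5e-9))
print(cpu_time_from_rate(1e9, 2.0, 2e9))
```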
data hazard (pipeline data hazard)
When a planned instruction cannot execute in the proper clock cycle because data that are needed to execute the instruction are not yet available.
Data memory access (MEM)
the data memory (DM) may be read (for a load instruction) or written (for a store instruction). For load, the right half is shaded, indicating read. (For store, the left half would be shaded).
Data transfer instruction
A command that moves data between memory and registers.
Data-level parallelism
Parallelism achieved by performing the same operation on independent data
Datapath
The component of the processor that performs arithmetic operations
Deasserted
The signal is logically low or false.
Die
The individual rectangular sections that are cut from a wafer, more informally known as chips.
Digital Equipment Corporation (DEC)
A major American company in the computer industry from the 1950s to the 1990s
dirty bit
indicates if a page has been written since being read into memory
double precision
A floating-point value represented in 64 bits.
Dynamic branch prediction
prediction of branches at runtime using runtime information
Dynamic Random Access Memory (DRAM)
Memory built as an integrated circuit; it provides random access to any location.
Temporarily stores data that the computer is actively using, and must be refreshed constantly to keep that data.
It is volatile: once the power is off, everything is gone.
Edge-triggered clocking
any values stored in a sequential logic element are updated only on a clock edge
Exception Enable (Interrupt Enable)
A signal or action that controls whether the process responds to an exception or not; necessary for preventing the occurrence of exceptions during intervals before the processor has safely saved the state needed for restart.
Exception Syndrome Register (ESR)
record the cause of the exception
Execute (EX)
ALU is used to perform the instruction's operation or to compute an address, or an adder is used for branches; both are depicted using an ALU symbol.
FADDD, FSUBD, FMULD, FDIVD
Double-precision arithmetic
FADDS, FSUBS
Single-precision arithmetic
falling clock edge
1 to 0
FCMPS, FCMPD
Single- and double-precision comparison
Finite-state machine
A sequential logic function consisting of a set of inputs and outputs, a next-state function that maps the current state and the inputs to a new state, and an output function that maps the current state and possibly the inputs to a set of asserted outputs.
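As an illustration of the next-state and output functions, here is a small sketch of a machine whose output depends only on the current state. It asserts its output once it has seen two consecutive 1 inputs. The state names, the tables, and the function run are invented for this example.

```python
# Next-state function: (current state, input) -> new state.
NEXT_STATE = {
    ("idle", 0): "idle",
    ("idle", 1): "seen_one",
    ("seen_one", 0): "idle",
    ("seen_one", 1): "seen_two",
    ("seen_two", 0): "idle",
    ("seen_two", 1): "seen_two",
}

# Output function: current state -> output (1 means asserted).
OUTPUT = {"idle": 0, "seen_one": 0, "seen_two": 1}


def run(inputs, state="idle"):
    """Feed one input per clock tick and collect the output after each transition."""
    outputs = []
    for bit in inputs:
        state = NEXT_STATE[(state, bit)]
        outputs.append(OUTPUT[state])
    return outputs


print(run([1, 1, 0, 1, 1, 1]))
```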
five-stage pipeline
Up to five instructions will be in execution during any single clock cycle, one in each stage:
1. IF - instruction fetch
2. ID - instruction decode and register file read
3. EX - execution or address calculation
4. MEM - data memory access
5. WB - write back
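The overlap between stages can be sketched as a schedule. The code below assumes an ideal pipeline with no stalls or hazards, where instruction i enters IF in cycle i + 1; the function name pipeline_schedule is illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions):
    """Return {cycle: [(instruction index, stage), ...]} for an ideal pipeline."""
    schedule = {}
    for i in range(num_instructions):
        for s, stage in enumerate(STAGES):
            # Instruction i is in stage number s during cycle i + s + 1.
            schedule.setdefault(i + s + 1, []).append((i, stage))
    return schedule


for cycle, active in sorted(pipeline_schedule(6).items()):
    print(cycle, active)
```

With six instructions, the pipeline is full from cycle 5 onward, and the whole sequence finishes in 6 + 4 = 10 cycles instead of 30.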
Flash memory
A nonvolatile semiconductor memory. It is cheaper and slower than DRAM but more expensive per bit and faster than magnetic disks.
forwarding (bypassing)
Passes data from one pipeline stage to another without waiting for it to be written to the register file.
frame buffering
Holds an image or video frame in memory so it can be displayed.
Fully associative cache
A cache structure in which a block can be placed in any location in the cache.
Hit rate
The fraction of memory accesses found in a level of the memory hierarchy.
Hit time
The time required to access a level of the memory hierarchy, including the time needed to determine whether the access is a hit or a miss.
IBM 360/91
Introduced many new concepts, including dynamic detection of memory hazards, generalized forwarding, and reservation stations (Tomasulo's algorithm).
IBM 7030
AKA Stretch
Produced with the goal of being 100 times faster than the previous IBM 704
Imprecise interrupt
The unpopularity of imprecise interrupts led to the standard of commit units in dynamically scheduled pipelined processors
in-order commit
A commit in which the results of pipelined execution are written to the programmer-visible state in the same order that instructions are fetched.
Instruction decode (ID)
Pull apart the instruction, set up the operation in the ALU, and compute the source and destination operand addresses
Instruction Fetch (IF)
Move instruction from memory to the control unit
Instruction set architecture
Also called architecture. An abstract interface between the hardware and the lowest-level software that encompasses all the information necessary to write a machine language program that will run correctly, including instructions, registers, memory access, I/O, and so on.
Instruction Set Architecture (ISA)
The part of the computer architecture related to programming, including the native data types, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O.
Intel 4004
First microprocessor
L1 cache (primary cache)
A cache for the L2 cache: the smallest and fastest cache level, closest to the processor.
L2 cache
A cache for main memory
It is faster than memory, but tends to be larger and slower than the L1 cache
LDUR
load register (unscaled offset)
Doubleword from memory to register