CPU, System clock, primary memory, secondary memory, peripheral input and output devices, bus
Basic components of a computer
CPU
the brain of the computer.
executes instructions
controls the transfer of data across the bus
usually contained on a single microprocessor chip
Control unit, Arithmetic logic unit, registers
3 main parts of the CPU
control unit (CU)
Directs the execution of instructions
Loads an operation code (opcode) from primary memory into the Instruction Register (IR) via the bus ("the fetch")
Decodes the opcode to identify the operation
If necessary, transfers data between primary memory and registers
If necessary, directs the ALU to operate on data in registers
arithmetic logic unit (ALU)
Performs arithmetic and logical operations on data stored in registers
Eg: add numbers in 2 source registers, and store the result in a destination register
Eg: Do a bitwise AND using data in 2 registers
Registers
Binary storage units within the CPU
May contain:
Data
Addresses (memory locations, expressed as integers)
Instructions
Status information
The System Clock
Generates a clock signal to synchronize the CPU and other clocked devices
Is a square wave at a particular frequency (alternating between 0 volts and a high level of roughly 3.3 to 5 volts)
Devices coordinate on the rising or falling edges
Primary Memory
AKA RAM
Any byte in memory can be accessed directly if you know its address
Can be written to and read from
Volatile: data disappears after the device is powered off
Is used to store program instructions and program data (variables)
Consists of a sequence of addressable memory locations; each location is typically one byte long
Von Neumann
In _____________ architecture, RAM contains both data and programs (instructions)
Harvard
In _____________ architecture, there are separate memories for data and for programs (instructions)
bus
A set of parallel data/signal lines used to transfer info between computer components
Often subdivided into address, data and control busses
Address Bus
Specifies a memory location in RAM
From CPU to primary memory
Or sometimes a memory mapped I/O device
Data Bus
Used for bidirectional data transfer
Control Bus
Used to control or monitor any devices connected to the bus
E.g, the read/write signal for RAM
Expansion bus
May be connected to the computer's local bus
Makes it easy to connect additional I/O devices to the computer
USB, SCSI, PCIe
Secondary Memory
Is used to hold a computer's file system: stores files containing programs or data
Is non-volatile read/write memory
Usually an HDD, but sometimes an SSD
Peripheral I/O Devices
Allow communication between computer and the external environment
Input: Keyboard, mouse, microphone
Output: Speaker, monitor, printer
I/O: HDD, modem
Accumulator, load/store
The 2 basic CPU architectures
Accumulator machine
Operands for an instruction come from the accumulator register (ACC) and from a single location in RAM
ALU results are always put into the ACC
The ACC can be loaded from or stored to RAM
load/store machine
Only load and store instructions can access RAM
Other instructions operate on specified registers in the register file, not on RAM
Registers are more quickly accessed than RAM, so this is fast
Typical program sequence:
Load registers from memory
Execute an instruction using two source registers, putting the result into a destination register
Store the result back into memory
Reduced Instruction Set Computer (RISC)
Uses only simple instructions that can be executed in one machine cycle
Enables faster clock rates, thus faster overall execution
Makes programs larger and more complex (e.g. the original SPARC had no multiply instruction, so multiplication had to be done using repeated add-and-shift steps)
Machine instructions are always the same size, which makes decoding simpler and faster
E.g, ARMv8 instructions are always 32 bits wide
Complex Instruction Set Computer (CISC)
May have instructions that take many cycles to execute; these are provided for programmer convenience
Slows down overall execution speed
Machine instructions vary in length, and may be followed by "immediate" data
Makes decoding difficult and slow
Instruction cycle
The CPU executes each instruction in a series of small steps:
Fetch the instruction from memory into the instruction register (IR)
The Program Counter register (PC) contains its address
Increment PC to point to the next instruction
Decode the instruction
If the instruction uses an operand in RAM, calculate its address. Repeat if necessary
Fetch the operand. Repeat if necessary
Execute the instruction
If the instruction produces a result that is stored in RAM, calculate its address. Repeat if necessary
Store the result. Repeat if necessary
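The steps above can be sketched with a toy simulator. This is only an illustration: the opcode names (LOAD, ADD, STORE, HALT), the tuple encoding, and the accumulator-style design are invented here and do not correspond to any real instruction set.

```python
# Toy accumulator machine. Each instruction is (opcode, operand_address).
# Opcodes and encoding are hypothetical, chosen only to show the cycle.
def run(program, data):
    pc = 0    # program counter: index of the next instruction
    acc = 0   # accumulator register
    while True:
        ir = program[pc]         # fetch the instruction into the IR
        pc += 1                  # increment PC to point to the next instruction
        opcode, addr = ir        # decode
        if opcode == "LOAD":     # fetch the operand from memory
            acc = data[addr]
        elif opcode == "ADD":    # fetch the operand, then execute
            acc += data[addr]
        elif opcode == "STORE":  # store the result back to memory
            data[addr] = acc
        elif opcode == "HALT":
            return data
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

memory = run(
    [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)],
    [7, 5, 0],
)
print(memory)  # [7, 5, 12]
```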
Opcode
The field that denotes the operation and format of an instruction.
Operand
An element that identifies the values to be used in a calculation.
Pseudo OPs (assembler directives)
do not generate machine instructions, but give the assembler extra information
e.g .global start
EL0
ARMv8 Exception Level:
For normal user applications with limited privileges
Restricted access to a limited set of instructions and registers, and to certain parts of memory
Most programs work at this level
EL1
ARMv8 Exception Level:
For the OS kernel
Privileged access to instructions, registers, and memory
Accessed indirectly by user programs using system calls
EL2
ARMv8 Exception Level:
for a Hypervisor
Support virtualization, where the computer hosts multiple guest operating systems, each on its own virtual machine
EL3
ARMv8 Exception Level:
Includes the Secure Monitor
31
Number of general purpose registers AArch64 has
X0-X7
Register purposes:
used to pass arguments into a procedure, and return results
x8
Register purposes:
indirect result location register
Used for returning structures
X9-X15
Register purposes:
temporary caller-saved registers
x16, x17
Register purposes:
intra-procedure-call temporary registers (IP0, IP1)
x18
Register purposes:
platform register
X29
Register purposes:
frame pointer (FP) register
X30
Register purposes:
procedure link register (LR)
x19-x28
Register purposes:
Callee-saved registers
value is preserved by any function you call
the registers we use
stack pointer
Points to the top of the stack
is decremented when the stack grows
64 bits wide
Zero Register
gives value 0 when read from
discards value when written to
Program Counter
Holds the address of the currently executing instruction
Cannot be accessed directly as a named register
Is changed indirectly by branch and other instructions
Is used implicitly by PC-relative load/stores
branch-and-link instruction
An instruction (bl) that branches to an address and saves the return address in the link register (x30)
Arguments are placed into x0-x7 before the call; the return value is placed in x0
E.g, used for printf
boilerplate
        .global main
main:   stp x29, x30, [sp, -16]!    // saves state
        mov x29, sp                 // saves state
        . . .
        ldp x29, x30, [sp], 16      // restores state
        ret
multiplication
Uses 1 destination and 2 or 3 source registers
NO IMMEDIATES ALLOWED
mul Wd, Wn, Wm calculates: Wd = Wn x Wm
multiply add
madd Wd, Wn, Wm, Wa calculates: Wd = Wa + (Wn x Wm)
multiply subtract
msub Wd, Wn, Wm, Wa calculates: Wd = Wa - (Wn x Wm)
multiply negate
mneg Wd, Wn, Wm calculates: Wd = -(Wn x Wm)
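The arithmetic of the multiply family can be modelled in a few lines. This is a sketch, assuming 32-bit W registers whose results wrap modulo 2^32; the Python function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF  # a W register keeps only the low 32 bits of a result

def mul(wn, wm):
    return (wn * wm) & MASK32

def madd(wn, wm, wa):
    return (wa + wn * wm) & MASK32

def msub(wn, wm, wa):
    return (wa - wn * wm) & MASK32

def mneg(wn, wm):
    return (-(wn * wm)) & MASK32

print(madd(3, 4, 5))      # 17
print(msub(3, 4, 20))     # 8
print(hex(mneg(3, 4)))    # 0xfffffff4, the 32-bit pattern for -12
```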
division
NO IMMEDIATES ALLOWED
signed form: sdiv Wd, Wn, Wm calculates: Wd = Wn / Wm
unsigned form: udiv x21, x22, x23
Uses integer division
The remainder (or modulus) can be calculated using numerator - (quotient * denominator)
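A small model of the quotient-then-remainder pattern, as a sketch. It assumes sdiv truncates toward zero and yields 0 on division by zero (the AArch64 behaviour), and it ignores register-width wraparound. Note that Python's own // operator rounds toward negative infinity, so it is not used directly on signed values here.

```python
def sdiv(n, m):
    """Signed integer division that truncates toward zero; x / 0 gives 0."""
    if m == 0:
        return 0
    q = abs(n) // abs(m)
    return q if (n < 0) == (m < 0) else -q

def msub(n, m, a):
    """a - (n * m), as computed by the msub instruction."""
    return a - n * m

num, den = 17, 5
quotient = sdiv(num, den)             # 3
remainder = msub(quotient, den, num)  # 17 - (3 * 5) = 2
print(quotient, remainder)
```

With a negative numerator the remainder takes the sign of the numerator: sdiv(-17, 5) is -3, so the remainder is -17 - (-3 * 5) = -2.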
the four flags
Z: true if the result is 0
N: true if the result is negative
V: true if the result overflows
C: true if result generates a carry out
Condition Flags
may be used to store information about the result of an instruction
Are single bits in the CPU
Record process state (PSTATE) information
0 means false, 1 means true
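To make the four flags concrete, here is a sketch that computes them for an addition. The function name add_flags and the default 32-bit width are choices made for this example only.

```python
def add_flags(a, b, bits=32):
    """Return the N, Z, C, V flags produced by adding two bit patterns."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b             # may need one extra bit
    result = full & mask     # what actually fits in the register
    return {
        "N": int((result & sign) != 0),   # left-most bit of the result
        "Z": int(result == 0),
        "C": int(full > mask),            # carry out of the top bit
        # signed overflow: both inputs share a sign that the result lacks
        "V": int(((a ^ result) & (b ^ result) & sign) != 0),
    }

print(add_flags(0x7FFFFFFF, 1))  # {'N': 1, 'Z': 0, 'C': 0, 'V': 1}
print(add_flags(0xFFFFFFFF, 1))  # {'N': 0, 'Z': 1, 'C': 1, 'V': 0}
```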
2^n
An n bit register can hold ____ bit patterns
Unsigned integers
Are encoded using binary numbers
Range 0 to 2^N -1, where N is the number of bits
Signed integers
most commonly encoded using the two's complement representation
Range: -2^(N-1) to +2^(N-1) -1
E.g 4-bits: -8 to +7
E.g 8-bits: -128 to +127
E.g 16-bits: -32,768 to +32,767
All non-negative numbers have a 0 in the left-most bit, and all negative numbers have a 1
two's complement
Negate a number by taking the one's complement (toggle all 0's to 1's, and vice versa)
and then adding 1 to the result
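The two steps can be checked on small bit patterns. A minimal sketch; the helper names twos_complement and to_signed are made up for this example.

```python
def twos_complement(value, bits):
    """Negate a bit pattern: toggle every bit, then add 1."""
    mask = (1 << bits) - 1
    ones = ~value & mask          # one's complement within the given width
    return (ones + 1) & mask

def to_signed(pattern, bits):
    """Interpret a bit pattern as a two's complement signed integer."""
    if pattern & (1 << (bits - 1)):   # left-most bit set means negative
        return pattern - (1 << bits)
    return pattern

neg5 = twos_complement(0b0101, 4)
print(format(neg5, "04b"))   # 1011
print(to_signed(neg5, 4))    # -5
```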
Hexadecimal
Prefixed by 0x
0, 1, 2, ..., 9, A, B, C, D, E, F A = 10, B = 11, C = 12, D = 13, E = 14, F = 15
Octal
prefixed by 0
0 to 7
May be used as a shorthand for denoting bit patterns (each digit corresponds to a 3 bit pattern)
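A quick check of how the same bit pattern reads in hexadecimal (4 bits per digit) and octal (3 bits per digit). One caveat for this sketch: Python writes octal literals with the prefix 0o rather than a bare leading 0.

```python
pattern = 0b11010110   # 1101 0110 in groups of four, 11 010 110 in groups of three

hex_text = format(pattern, "X")     # 'D6'
octal_text = format(pattern, "o")   # '326'
print(hex_text, octal_text)

# Both spellings denote the same value.
print(int("D6", 16) == int("326", 8) == pattern)   # True
```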
logical shift right
lsr xd, xn, xm Xn: bit pattern to be shifted Xm: shift count
0 is shifted into the left most bit, shifted out bits are lost
quick way to do unsigned division by a power of two, i.e. by 2^n, where n is the shift count
DOES NOT WORK FOR NEGATIVE SIGNED INTEGERS, MUST USE ASR
Logical Shift Left
lsl xd, xn, xm Xn: bit pattern to be shifted Xm: shift count
0 is shifted into the right most bit, shifted out bits are lost
quick way to do multiplication by a power of two, i.e. by 2^n, where n is the shift count
Arithmetic Shift Right
Works like lsr (completes division by a power of 2), except the sign bit is duplicated when shifting
This duplication is called sign extension
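The three shifts can be compared side by side on 64-bit patterns. A sketch, assuming X registers (64 bits) and shift counts from 0 to 63; the Python functions only borrow the mnemonic names.

```python
MASK64 = (1 << 64) - 1

def lsl(x, n):
    return (x << n) & MASK64        # zeros enter on the right

def lsr(x, n):
    return (x & MASK64) >> n        # zeros enter on the left

def asr(x, n):
    x &= MASK64
    if x & (1 << 63):               # sign bit set: reinterpret as negative
        x -= 1 << 64
    return (x >> n) & MASK64        # Python's >> copies the sign bit

minus16 = -16 & MASK64              # 0xFFFFFFFFFFFFFFF0
print(lsl(5, 3))                    # 40, i.e. 5 * 2^3
print(lsr(40, 3))                   # 5, i.e. 40 / 2^3
print(hex(asr(minus16, 2)))         # 0xfffffffffffffffc, the pattern for -4
print(hex(lsr(minus16, 2)))         # 0x3ffffffffffffffc, not -4
```

The last two lines show why lsr cannot be used to divide a negative signed value.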
Signed Extend Byte
Form (32 bit): sxtb Wd, Wn Sign-extends bit 7 in Wn to bits 8-31
Signed Extend Halfword
Form (32 bit): sxth Wd, Wn Sign extends bit 15 in Wn to bits 16-31
Sign Extend Word
Form (64 bit): sxtw Xd, Wn Sign-extends bit 31 to bits 32-63
Unsigned Extend Byte
Form (32 bit ONLY): uxtb Wd, Wn Zero extends bits 8-31
Unsigned Extend Halfword
Form (32 bit): uxth Wd, Wn Zero extends bits 16-31
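The extend instructions can be modelled with masks. This sketch covers the 32-bit forms of sxtb, sxth, uxtb and uxth plus the 64-bit sxtw; the Python functions are illustrative stand-ins named after the mnemonics.

```python
def sxtb(w):
    b = w & 0xFF
    return b | 0xFFFFFF00 if b & 0x80 else b        # copy bit 7 into bits 8-31

def sxth(w):
    h = w & 0xFFFF
    return h | 0xFFFF0000 if h & 0x8000 else h      # copy bit 15 into bits 16-31

def sxtw(w):
    w &= 0xFFFFFFFF
    return w | 0xFFFFFFFF00000000 if w & 0x80000000 else w   # bit 31 into 32-63

def uxtb(w):
    return w & 0xFF          # bits 8-31 become zero

def uxth(w):
    return w & 0xFFFF        # bits 16-31 become zero

print(hex(sxtb(0x80)))         # 0xffffff80
print(hex(uxtb(0x12345680)))   # 0x80
print(hex(sxtw(0x80000000)))   # 0xffffffff80000000
```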
subtraction
_____________ is done in the ALU by negating the subtrahend, and then adding
e.g 7-5 = 7 + (-5) = 0111 + 1011 = (1) 0010, where we ignore the carry out
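The worked example can be reproduced in code. A sketch using a 4-bit width to match the example; the function name subtract is chosen for this illustration.

```python
def subtract(a, b, bits=4):
    """Compute a - b by adding the two's complement of b.

    Returns (result, carry_out); the carry out is ignored for the answer.
    """
    mask = (1 << bits) - 1
    neg_b = (~b + 1) & mask          # negate the subtrahend
    total = (a & mask) + neg_b       # then add
    return total & mask, total >> bits

result, carry = subtract(7, 5)
print(format(result, "04b"), carry)  # 0010 1
```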
Z == 1
Signed number branching conditions:
eq (equal)
Z == 0
Signed number branching conditions:
ne (not equal)
Z == 0 && N == V
Signed number branching conditions:
gt (greater than)
N == V
Signed number branching conditions:
ge (greater than or equal)
N != V
Signed number branching conditions:
lt (less than)
!(Z == 0 && N == V)
Signed number branching conditions:
le (less than or equal)
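The conditions above can be checked against ordinary signed comparison. This sketch assumes the flags come from a compare that computes a - b on 64-bit registers; cmp_flags and signed_conditions are names invented for the example.

```python
def cmp_flags(a, b, bits=64):
    """Return the (N, Z, V) flags set by computing a - b."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    n = int((result & sign) != 0)
    z = int(result == 0)
    # overflow: operands differ in sign and the result's sign differs from a's
    v = int(((a ^ b) & (a ^ result) & sign) != 0)
    return n, z, v

def signed_conditions(n, z, v):
    return {
        "eq": z == 1,
        "ne": z == 0,
        "gt": z == 0 and n == v,
        "ge": n == v,
        "lt": n != v,
        "le": not (z == 0 and n == v),
    }

conds = signed_conditions(*cmp_flags(-3, 5))
print(conds["lt"], conds["ge"])   # True False
```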
Byte
1 byte, 8 bits, char in C
halfword
2 bytes, 16 bits, short int in C
word
4 bytes, 32 bits, int in C
doubleword
8 bytes, 64 bits, long int in C, void * (pointers)
Load Register
ldr Xt, addr
loads register with 8 bytes read from RAM
Load Byte
ldrb Wt, addr
32 bit only!
loads 1 byte from RAM into low order part of Wt, zero-extending high order bits
Load signed byte
ldrsb Xt, addr
Loads 1 byte from RAM into low-order part of Wt or Xt, sign-extending high-order bits
load halfword
ldrh Wt, addr
32 bit only!
Loads 2 bytes from RAM into Wt, zero-extending high-order bits
Load Signed Halfword
ldrsh Xt, addr
Loads 2 bytes from RAM into Wt or Xt, sign extending high-order bits
Load Signed Word
ldrsw Xt, addr
Loads 4 bytes from RAM into Xt, sign-extending high-order bits
store register
str Xt, addr
Stores doubleword (8 bytes) in Xt to RAM
store byte
strb Wt, addr
32 bit only!
Stores low-order byte in Wt to RAM
Store halfword
strh Wt, addr
32 bit only!
Stores low-order halfword (2 bytes) in Wt to RAM
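The difference between the plain and signed loads, and the way stores keep only the low-order bytes, can be sketched with a bytearray standing in for RAM. Assumptions for this example: little-endian byte order (the usual AArch64 configuration), and hypothetical helpers load and store whose size and signed parameters play the role of the instruction variant.

```python
def load(mem, addr, size, signed=False):
    """signed=False models ldrb/ldrh/ldr; signed=True models ldrsb/ldrsh/ldrsw."""
    return int.from_bytes(mem[addr:addr + size], "little", signed=signed)

def store(mem, addr, size, value):
    """Models strb/strh/str: only the low-order `size` bytes are written."""
    low = value & ((1 << (8 * size)) - 1)
    mem[addr:addr + size] = low.to_bytes(size, "little")

ram = bytearray(16)
store(ram, 0, 1, 0x1234F0)             # like strb: only 0xF0 is stored
print(load(ram, 0, 1))                 # 240, zero-extended like ldrb
print(load(ram, 0, 1, signed=True))    # -16, sign-extended like ldrsb

store(ram, 8, 8, 0x1122334455667788)   # like str with an X register
print(hex(load(ram, 8, 2)))            # 0x7788, the low-order halfword
```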
dereferencing
square brackets are used for ____________ addresses
e.g ldr x21, [x20] (the address is in x20)
Pre-indexed addressing
address expression is calculated first
e.g ldr x21, [x20, 10]
address is x20 + 10
x20 does not change
e.g ldr x21, [x20, 10]!
address is x20 + 10
x20 changes to x20 + 10
Post-indexed addressing
address expression is calculated after the memory access
e.g ldr x21, [x20], 10
address is x20
x20 changes to x20 + 10
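The effect of each addressing form on the base register can be traced with a tiny model. In this sketch, registers are a dict, memory is a dict from address to value, and the function names are invented labels for the three forms of ldr.

```python
def ldr_offset(regs, mem, rt, rn, imm):   # ldr rt, [rn, imm]
    regs[rt] = mem[regs[rn] + imm]        # base register unchanged

def ldr_pre(regs, mem, rt, rn, imm):      # ldr rt, [rn, imm]!
    regs[rn] += imm                       # update the base first
    regs[rt] = mem[regs[rn]]              # then access memory

def ldr_post(regs, mem, rt, rn, imm):     # ldr rt, [rn], imm
    regs[rt] = mem[regs[rn]]              # access memory at the old base
    regs[rn] += imm                       # then update the base

mem = {1000: "A", 1010: "B"}

regs = {"x20": 1000, "x21": None}
ldr_offset(regs, mem, "x21", "x20", 10)
print(regs)   # {'x20': 1000, 'x21': 'B'}

regs = {"x20": 1000, "x21": None}
ldr_pre(regs, mem, "x21", "x20", 10)
print(regs)   # {'x20': 1010, 'x21': 'B'}

regs = {"x20": 1000, "x21": None}
ldr_post(regs, mem, "x21", "x20", 10)
print(regs)   # {'x20': 1010, 'x21': 'A'}
```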
Descending stack
grows from higher to a lower address
ascending stack
grows from a lower to a higher address
Full Stack
the stack pointer points at the top of the stack (last pushed item)
empty stack
the stack pointer points to the next free space on the top of the stack
Stack Memory
space in RAM provided by the OS to store data for functions/procedures/routines
stack frame
is created and pushed onto the stack when the function is called
Holds the function's parameters, local variables, and return values
Is a descending, full stack
contains FP and LR for the calling routine (restored when the current routine returns)
is popped (and destroyed) when the function returns
high memory
the stack uses ___________ and grows backwards (towards 0)
low memory
programs are loaded into _______________
just above the space reserved for the OS
heap
is used for dynamically allocated memory in a program
subtracting
A program can allocate stack memory by _____________ the number of bytes needed from SP
e.g sub sp, sp, 16
quadword aligned
the stack must be __________________________
the address in SP must be evenly divisible by 16
e.g to allocate 20 bytes: add sp, sp, -20 & -16 (the expression evaluates to -32)
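The masking trick works because -16 has its low four bits clear, so ANDing with it rounds a negative size down to the next multiple of 16. A sketch; the function name alloc_offset and the starting SP value are arbitrary choices for this example.

```python
def alloc_offset(nbytes):
    """Offset to add to SP so that at least nbytes are allocated, 16-byte aligned."""
    return -nbytes & -16

print(alloc_offset(20))   # -32
print(alloc_offset(16))   # -16

sp = 0x7FFFFFF0                 # hypothetical, already aligned stack pointer
sp += alloc_offset(20)
print(sp % 16)                  # 0, still quadword aligned
```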
frame pointer
register x29
Is used to point to local variables in a stack frame
Is stable, once set at the beginning of a function
In contrast, SP is unstable: it is allowed to change as the function executes