Architecture Basics Notes (CSCI 45)
Von Neumann Architecture (1945)
- John von Neumann was an absolute legend: mathematician, physicist, computer scientist, and engineer.
- Died at 53 from cancer, likely due to radiation exposure from working on the Manhattan Project.
- His architecture model is still the main one used today.
- The five components of von Neumann architecture:
- The processing unit executes program instructions.
- The control unit drives program instruction execution on the processing unit. Together, the processing and control units make up the CPU.
- The memory unit stores program data and instructions.
- The input unit(s) load program data and instructions onto the computer and initiate program execution.
- The output unit(s) store or receive program results.
System Buses
- A bus is a communication channel that transfers binary values between communication endpoints (e.g., CPU and memory).
- Types of buses:
- Control bus: sends control signals that request or notify other units of actions.
- Address bus: sends the memory address of a read or write request to the memory unit.
- Data bus: transfers data between units.
The Memory Hierarchy
- Idea: memory closer to the CPU is faster but smaller/expensive; memory farther away is larger/cheaper but slower.
- Core trend: faster access comes at a higher cost.
- Basic idea of levels:
- Registers: closest to CPU; extremely fast; expensive; small quantity.
- Caches: between CPU and main memory; faster than main memory; smaller than main memory.
- Main Memory (RAM): larger than caches; slower than caches.
- Secondary Storage (e.g., SSD, HDD): much larger; much slower.
- Remote/Network Storage: slowest; accessed over networks.
- A rough depiction of the memory hierarchy ordering by latency and capacity:
- Registers (on CPU) — lowest latency, smallest capacity.
- Caches (~MB) — low latency, small capacity.
- Main Memory (several GB on typical PCs, up to TBs on servers) — higher latency.
- Flash Disk / Traditional Disk — larger, much higher latency.
- Remote Secondary Storage (e.g., Internet) — highest latency, large capacity.
- Primary storage vs secondary storage terminology:
- Primary storage (RAM, caches, registers) is fast and close to the CPU.
- Secondary storage (SSD/HDD) is non-volatile but slower.
- Practical takeaway: as you move data farther from the CPU, the cost per byte falls, but latency to access data rises.
Persistent File Storage
- Question: Why not put everything in RAM?
- Explanation: persistent files are stored in secondary storage (e.g., hard disk) and stay there when power is off.
- Rule of thumb: farther storage from CPU is cheaper per bit, but data transfer to CPU is slower.
- Implication: system designers use layers of storage and caching to hide latency.
RAM
- RAM = main memory; also called primary storage.
- It marks the end of the primary storage region, with everything after RAM considered secondary storage.
- Programs you want to run are loaded from disk into RAM and then executed.
- RAM serves as the workspace where active data and instructions reside during execution.
Caching Intro
- Locality of memory access:
- Temporal Locality: programs tend to access the same data repeatedly over time.
- Spatial Locality: programs tend to access data nearby recently accessed data.
- Key takeaways:
- We gain speedups by storing commonly accessed memory closer to the CPU.
- We gain speedups by loading contiguous chunks of memory (not just single bytes) closer to the CPU.
- Caches are the memory between the CPU and main memory.
The Memory Hierarchy (cont.)
- A quick view of relative access times and capacities:
- Registers: access in ~1 cycle; tiny capacity; on-CPU storage.
- Caches: access in ~10 cycles; small capacity; typically on the CPU die, between the CPU and main memory.
- Main Memory (RAM): access in ~100 cycles; larger capacity.
- Local secondary storage, Flash/SSD: access in ~10^3 to ~10^5 cycles depending on technology.
- Local secondary storage, Disk (HDD): access in ~10^6 cycles; very large capacity.
- Remote Secondary Storage (e.g., Internet): access in even higher latency.
- Visualization of the progression from fast/expensive/small (near the CPU) to slow/cheap/large (far from the CPU) in terms of latency and capacity.
RAM (main memory) details
- RAM is also known as main memory.
- It marks the boundary between fast, expensive storage near the CPU and slower, cheaper storage further away.
- Programs are loaded from disk into RAM before execution.
What “N-bit” means
- A 32-bit processor uses 32-bit addresses and registers.
- Address width and register width usually correlate with memory capacity.
- Byte-addressed memory:
- With a 32-bit architecture, maximum addressable memory is 2^32 bytes = 4 GB.
- With 64-bit architectures, the address space expands to 2^64 bytes = 16 EB.
- These values illustrate why architectures evolved from 32-bit to 64-bit to accommodate more memory.
The Memory Hierarchy (revisited: recap)
- Memory maps addresses (binary) to byte values.
- Memory access patterns for multi-byte data: specify only the starting address; the instruction's width (e.g., ldr into a W vs. X register) tells the CPU to read 4 or 8 bytes.
ISAs (Instruction Set Architectures)
- Definition: An ISA is the encoding of instructions that the CPU understands; the language the CPU speaks.
- An ISA also defines:
- Supported data types.
- The registers available.
- The hardware support for managing main memory, etc.
- Practical use: view binary representations using tools (e.g., objdump -d prog) to see instruction encoding.
Accessing Memory in Assembly
- Memory is accessed via load/store instructions that operate on registers and memory addresses.
- Key concepts:
- Memory addresses are specified (64-bit in the example context).
- The CPU loads/stores values from/to memory using addressing modes.
The Giant Table that is memory (addressing and data)
- Memory maps addresses (64-bit numbers) to byte values (8-bit numbers).
- To load multi-byte data, specify the starting address; the CPU reads the required number of bytes (e.g., 4 or 8) from that address.
Fancy ldr & str (ARM-like addressing modes)
- Regular form (register indirect):
- ldr Xd, [Xn] — Xd = *Xn;
- Immediate offset:
- ldr Xd, [Xn, #4] — Xd = *(Xn + 4);
- Register offset:
- ldr Xd, [Xn, Xm] — Xd = *(Xn + Xm);
- Offset with write-back (exclamation point):
- ldr Xd, [Xn, #4]! — Xd = *(Xn + 4); Xn += 4;
- Examples of sizes:
- ldr Wd, [Xn] — loads a 32-bit value.
- ldr Xd, [Xn] — loads a 64-bit value.
ldrb & strb (byte access)
- Load/store a single byte; byte operations use the 32-bit W form of a register (e.g., ldrb Wd, [Xn] and strb Wd, [Xn]), not the X form.
- Example: ASCII math to convert the word "csci" to all caps using arithmetic on ASCII codes.
Lab Time
- Practical lab activities are scheduled (refer to the course outline) to reinforce the material.
Attendance
- Attendance is part of the course logistics (administrative item).
Types of Computer Architectures
- Course outline/outcomes include:
- Types of Computer Architectures
- The Memory Hierarchy
- Persistent File Storage
- System Buses
- DMA
- ISAs
Assigned Reading and Useful Resources
- Assigned Reading: Dive Into Systems, start Chapter 5; plan ~2 weeks to complete; it’s a chapter to read and not skim.
- Useful ARM resources:
- General Instructions (e.g., mov, add, bl, etc.):
- https://developer.arm.com/documentation/dui0801/l/A64-General-Instructions
- Data Transfer Instructions (e.g., ldr, str, etc.):
- https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions