Architecture Basics Notes (CSCI 45)

Von Neumann Architecture (1945)

  • John von Neumann was an absolute legend: mathematician, physicist, computer scientist, and engineer.
  • Died at 53 from cancer, likely due to radiation exposure from working on the Manhattan Project.
  • His architecture model is still the main one used today.
  • The five components of von Neumann architecture:
    • The processing unit executes program instructions.
    • The control unit drives program instruction execution on the processing unit. Together, the processing and control units make up the CPU.
    • The memory unit stores program data and instructions.
    • The input unit(s) load program data and instructions onto the computer and initiate program execution.
    • The output unit(s) store or receive program results.
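To make the five components concrete, here is a minimal sketch of a stored-program machine. The instruction names (LOAD, ADD, STORE, HALT), the single accumulator, and the memory layout are invented for illustration and do not correspond to any real ISA.

```python
def run(memory):
    """Toy stored-program machine: instructions and data share one memory."""
    pc = 0    # control unit state: address of the next instruction
    acc = 0   # processing unit state: a single accumulator register
    while True:
        op, arg = memory[pc]          # control unit fetches the instruction
        pc += 1
        if op == "LOAD":              # memory -> accumulator
            acc = memory[arg]
        elif op == "ADD":             # processing unit does the arithmetic
            acc += memory[arg]
        elif op == "STORE":           # accumulator -> memory
            memory[arg] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown instruction {op!r}")


# "Input": the program (cells 0-3) and its data (cells 4-6) are loaded into memory.
program = [
    ("LOAD", 4),
    ("ADD", 5),
    ("STORE", 6),
    ("HALT", None),
    2,   # cell 4: first operand
    3,   # cell 5: second operand
    0,   # cell 6: the result goes here
]

result = run(program)
print(result[6])  # "Output": the value left in cell 6
```

The key point is that the program and its data sit in the same memory, and the loop only ever does fetch, decode, execute.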

System Buses

  • A bus is a communication channel that transfers binary values between communication endpoints (e.g., CPU and memory).
  • Types of buses:
    • Control bus: sends control signals that request or notify other units of actions.
    • Address bus: sends the memory address of a read or write request to the memory unit.
    • Data bus: transfers data between units.
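As a rough illustration of how the three buses cooperate on a memory request, the sketch below models one bus transaction as a function call. The class name Memory, the method name cycle, and the "READ"/"WRITE" signal names are hypothetical; real buses carry electrical signals, not Python arguments.

```python
class Memory:
    """Toy memory unit that reacts to signals on three buses."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def cycle(self, control, address, data=None):
        # control bus: says which action is requested ("READ" or "WRITE")
        # address bus: says which memory cell the request targets
        # data bus: carries the byte being written, or the byte read back
        if control == "WRITE":
            self.cells[address] = data
            return None
        if control == "READ":
            return self.cells[address]
        raise ValueError(f"unknown control signal {control!r}")


ram = Memory(16)
ram.cycle("WRITE", address=0x3, data=0x2A)   # CPU -> memory
value = ram.cycle("READ", address=0x3)       # memory -> CPU
print(hex(value))
```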

The Memory Hierarchy

  • Idea: memory closer to the CPU is faster but smaller/expensive; memory farther away is larger/cheaper but slower.
  • Core trend: faster access comes at a higher cost.
  • Basic idea of levels:
    • Registers: closest to CPU; extremely fast; expensive; small quantity.
    • Caches: between CPU and main memory; faster than main memory; smaller than main memory.
    • Main Memory (RAM): larger than caches; slower than caches.
    • Secondary Storage (e.g., SSD, HDD): much larger; much slower.
    • Remote/Network Storage: slowest; accessed over networks.
  • A rough depiction of the memory hierarchy ordering by latency and capacity:
    • Registers (on CPU): lowest latency, smallest capacity.
    • Caches (~KB to MB): low latency, small capacity.
    • Main Memory (~GB on typical PCs, up to ~TB on large servers): higher latency.
    • Flash Disk / Traditional Disk: larger, much higher latency.
    • Remote Secondary Storage (e.g., Internet): highest latency, large capacity.
  • Primary storage vs secondary storage terminology:
    • Primary storage (RAM, caches, registers) is fast and close to the CPU.
    • Secondary storage (SSD/HDD) is non-volatile but slower.
  • Practical takeaway: as you move data farther from the CPU, the cost per byte falls, but latency to access data rises.

Persistent File Storage

  • Question: Why not put everything in RAM?
  • Explanation: RAM is volatile and loses its contents when power is off; persistent files are stored in secondary storage (e.g., hard disk) and stay there when power is off.
  • Rule of thumb: farther storage from CPU is cheaper per bit, but data transfer to CPU is slower.
  • Implication: system designers use layers of storage and caching to hide latency.

RAM

  • RAM = main memory; also called primary storage.
  • It marks the end of the primary storage region, with everything after RAM considered secondary storage.
  • Programs you want to run are loaded from disk into RAM and then executed.
  • RAM serves as the workspace where active data and instructions reside during execution.

Caching Intro

  • Locality of memory access:
    • Temporal Locality: programs tend to access the same data repeatedly over time.
    • Spatial Locality: programs tend to access data located near recently accessed data.
  • Key takeaways:
    • We gain speedups by storing commonly accessed memory closer to the CPU.
    • We gain speedups by loading contiguous chunks of memory (not just single bytes) closer to the CPU.
  • Caches are the memory between the CPU and main memory.
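The two kinds of locality can be seen in a small simulation. This is a sketch under stated assumptions: the 64-byte block size, the 8-block capacity, and the least-recently-used replacement rule are illustrative choices, not a description of any particular CPU.

```python
def hit_rate(addresses, block_size=64, num_blocks=8):
    """Simulate a tiny cache with LRU replacement; return the fraction of hits."""
    cached = []                       # block numbers, least recently used first
    hits = 0
    for addr in addresses:
        block = addr // block_size    # which contiguous chunk holds this byte
        if block in cached:
            hits += 1
            cached.remove(block)      # re-appended below as most recently used
        elif len(cached) == num_blocks:
            cached.pop(0)             # evict the least recently used block
        cached.append(block)
    return hits / len(addresses)


sequential = list(range(1024))                 # neighbouring bytes: spatial locality
repeated = [0, 8, 16, 24] * 256                # the same few bytes again: temporal locality
scattered = [i * 4096 for i in range(1024)]    # every access lands in a new block

print(hit_rate(sequential), hit_rate(repeated), hit_rate(scattered))
```

The sequential and repeated patterns hit almost every time because each miss brings in a whole block that later accesses reuse; the scattered pattern never reuses a block, so it never hits.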

The Memory Hierarchy (cont.)

  • A quick view of relative access times and capacities:
    • Registers: access in ~1 cycle; tiny capacity; on-CPU storage.
    • Caches: access in ~10 cycles; small capacity; on or near the CPU.
    • Main Memory (RAM): access in ~100 cycles; larger capacity.
    • Secondary Storage (e.g., Flash Disk): access in ~10^3 to ~10^5 cycles depending on technology.
    • Secondary Storage (Traditional Disk): access in ~10^6 cycles; very large capacity.
    • Remote Secondary Storage (e.g., Internet): even higher latency.
  • The progression runs from fast, small, and expensive (registers) to slow, large, and cheap (remote storage).
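To see why the gap between levels matters, the sketch below estimates the average cost of a memory access using the approximate cycle counts from the list above. It assumes a single cache level and that a miss simply costs one main-memory access; the hit rates are hypothetical inputs, not measurements.

```python
# Approximate access costs in CPU cycles, taken from the list above.
CACHE_CYCLES = 10
RAM_CYCLES = 100


def average_cycles(cache_hit_rate):
    """Average cost of one access when misses fall through to main memory."""
    return cache_hit_rate * CACHE_CYCLES + (1 - cache_hit_rate) * RAM_CYCLES


for rate in (0.0, 0.5, 0.9, 1.0):
    print(f"hit rate {rate:.0%}: about {average_cycles(rate):.0f} cycles per access")
```

Even under this simplified model, a cache that hits most of the time brings the average cost much closer to the cache latency than to the RAM latency.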

RAM (main memory) details

  • RAM is also known as main memory.
  • It marks the boundary between fast, expensive storage near the CPU and slower, cheaper storage further away.
  • Programs are loaded from disk into RAM before execution.

What “N-bit” means

  • A 32-bit processor uses 32-bit addresses and registers.
  • Address width determines how much memory can be addressed: N address bits can name 2^N distinct locations.
  • Byte-addressed memory:
    • With a 32-bit architecture, maximum addressable memory is $2^{32} \text{ bytes} = 4 \text{ GB}$.
    • With 64-bit architectures, address space expands to $2^{64} \text{ bytes} = 16 \text{ EB}$.
  • These values illustrate why architectures evolved from 32-bit to 64-bit to accommodate more memory.
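The arithmetic can be checked directly. This short sketch assumes byte-addressed memory and uses binary units (1 GiB = 2^30 bytes, 1 EiB = 2^60 bytes); the function and constant names are made up for the example.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses an N-bit address can name."""
    return 2 ** address_bits


GIB = 2 ** 30   # bytes in one gibibyte
EIB = 2 ** 60   # bytes in one exbibyte

print(addressable_bytes(32) // GIB, "GiB")   # 32-bit addresses
print(addressable_bytes(64) // EIB, "EiB")   # 64-bit addresses
```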

The Memory Hierarchy (revisited: recap)

  • Memory maps addresses (binary) to byte values.
  • Memory access patterns for multi-byte data: specify the starting address; the instruction and register size tell the CPU how many bytes to read (e.g., 4 for a W register or 8 for an X register with ldr).

ISAs (Instruction Set Architectures)

  • Definition: An ISA is the encoding of instructions that the CPU understands; the language the CPU speaks.
  • An ISA also defines:
    • Supported data types.
    • The registers available.
    • The hardware support for managing main memory, etc.
  • Practical use: view binary representations using tools (e.g., objdump -d prog) to see instruction encoding.

Accessing Memory in Assembly

  • Memory is accessed via load/store instructions that operate on registers and memory addresses.
  • Key concepts:
    • Memory addresses are 64-bit values (in the context of these examples), held in registers.
    • The CPU loads/stores values from/to memory using addressing modes.

The Giant Table that is memory (addressing and data)

  • Memory maps addresses (64-bit numbers) to byte values (8-bit numbers).
  • To load multi-byte data, specify the starting address; the CPU reads the required number of bytes (e.g., 4 or 8) from that address.
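A small model of the "giant table" may help. The addresses and byte values below are arbitrary, and the sketch assumes little-endian byte order (the lowest address holds the least significant byte), which is the usual configuration on ARM systems but is an assumption here.

```python
# A toy memory: each address maps to one byte value (0-255).
memory = {
    0x1000: 0x78,
    0x1001: 0x56,
    0x1002: 0x34,
    0x1003: 0x12,
}


def load(memory, start, num_bytes):
    """Read num_bytes consecutive bytes from start and combine them (little-endian)."""
    raw = bytes(memory[start + i] for i in range(num_bytes))
    return int.from_bytes(raw, byteorder="little")


print(hex(load(memory, 0x1000, 4)))   # four bytes read as one 32-bit value
print(hex(load(memory, 0x1000, 1)))   # a single byte
```

Only the starting address is given; the number of bytes to combine comes from the size of the load.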

Fancy ldr & str (ARM-like addressing modes)

  • Regular form with register:
    • ldr Xd, [Xn]
      Xd = *Xn;
  • Immediate offset:
    • ldr Xd, [Xn, #4]
      Xd = *(Xn + 4);
  • Register offset:
    • ldr Xd, [Xn, Xm]
      Xd = *(Xn + Xm);
  • Offset with write-back (exclamation point):
    • ldr Xd, [Xn, #4]!
      Xd = *(Xn + 4); Xn += 4;
  • Examples of sizes:
    • ldr Wd, [Xn] loads a 32-bit value.
    • ldr Xd, [Xn] loads a 64-bit value.
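The addressing modes can be mimicked with a dictionary of registers and a byte array. This is a simulation sketch, not ARM code: the helper names ldr and load64, the register contents, and the stored values are all hypothetical, and the offsets are 8 rather than 4 so that the 64-bit values in the toy memory do not overlap.

```python
import struct

memory = bytearray(64)                      # toy byte-addressed memory
for addr, value in ((16, 100), (24, 200), (32, 300)):
    struct.pack_into("<Q", memory, addr, value)   # store 64-bit values

regs = {"x0": 0, "x1": 16, "x2": 16}        # x1 is the base, x2 a register offset


def load64(addr):
    """Read the 64-bit little-endian value starting at addr."""
    return struct.unpack_from("<Q", memory, addr)[0]


def ldr(dst, base, offset=0, writeback=False):
    """Model ldr dst, [base, offset] and, with writeback, ldr dst, [base, offset]!"""
    addr = regs[base] + offset
    regs[dst] = load64(addr)
    if writeback:
        regs[base] = addr                   # the base register keeps the new address


ldr("x0", "x1")                             # ldr x0, [x1]       -> value at 16
plain = regs["x0"]
ldr("x0", "x1", offset=8)                   # ldr x0, [x1, #8]   -> value at 24
immediate = regs["x0"]
ldr("x0", "x1", offset=regs["x2"])          # ldr x0, [x1, x2]   -> value at 32
register = regs["x0"]
ldr("x0", "x1", offset=8, writeback=True)   # ldr x0, [x1, #8]!  -> value at 24, x1 becomes 24
print(plain, immediate, register, regs["x0"], regs["x1"])
```

Only the write-back form changes the base register; the other three leave x1 untouched.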

ldrb & strb (byte access)

  • ldrb and strb load/store a single byte. The register operand must be a W register: ldrb zero-extends the loaded byte into the 32-bit register, and strb stores the register's low 8 bits.
  • Example: ASCII math to convert the word "csci" to all caps using arithmetic on ASCII codes.
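The ASCII arithmetic behind that example can be sketched as follows. This is Python standing in for the byte-at-a-time loop one would write with ldrb and strb; it is not the lecture's assembly solution.

```python
word = bytearray(b"csci")        # four ASCII bytes, like a string in memory

for i in range(len(word)):
    byte = word[i]               # like ldrb: read one byte
    if ord("a") <= byte <= ord("z"):
        byte -= 32               # 'a' is 97 and 'A' is 65, so the gap is 32
    word[i] = byte               # like strb: write one byte back

print(word.decode("ascii"))
```

Because lowercase and uppercase letters are exactly 32 apart in ASCII, subtracting 32 from each lowercase byte is all the conversion needs.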

Lab Time

  • Practical lab activities are scheduled (refer to the course outline) to reinforce the material.

Attendance

  • Attendance is part of the course logistics (administrative item).

Types of Computer Architectures

  • Course outline/outcomes include:
    • Types of Computer Architectures
    • The Memory Hierarchy
    • Persistent File Storage
    • System Buses
    • DMA
    • ISAs

Assigned Reading and Useful Resources

  • Assigned Reading: Dive Into Systems, start Chapter 5; plan ~2 weeks to complete; it’s a chapter to read and not skim.
  • Useful ARM resources:
    • General Instructions (e.g., mov, add, bl, etc.):
    • https://developer.arm.com/documentation/dui0801/l/A64-General-Instructions
    • Data Transfer Instructions (e.g., ldr, str, etc.):
    • https://developer.arm.com/documentation/dui0801/l/A64-Data-Transfer-Instructions