Lecture B-1 Notes – Basic Concepts & System Buses

Computer Architecture vs. Computer Organization

  • Distinct but complementary perspectives on computer systems

    • Computer Architecture

      • Attributes visible to the programmer that directly influence the logical execution of a program

      • Examples

        • Instruction-set architecture (ISA): available instructions, operand types, addressing modes

        • Data representation (number of bits used for integers, floating-point values, characters, etc.)

        • I/O mechanisms (e.g., memory-mapped vs. isolated I/O)

        • Memory addressing techniques

    • Computer Organization

      • Operational units & interconnections that realize the architectural specification

      • Hardware details generally transparent to programmers

        • Control signals & micro-operations

        • Interfaces between the CPU, memory, and peripherals, and the memory technology used

        • Pipelining, caches, bus widths, and other microarchitecture choices

Structure & Function of a Computer System

  • Hierarchical decomposition

    • Complex systems are viewed as sets of interrelated subsystems forming a hierarchy → designers focus on one level at a time

  • Structure (the “what connects to what”)

    • Top-down snapshot (Fig 1.1):

      • Computer ⇢ {CPU, Main Memory, I/O} connected by System Bus

      • CPU ⇢ {Control Unit, ALU, Registers, Internal Bus}

      • Control Unit ⇢ {Sequencing Logic, Control Memory, Decoders}

  • Function (the “what it does”) — four generic categories

    1. Data processing (ALU operations, logic, shifting, etc.)

    2. Data storage (short-term registers, long-term memory)

    3. Data movement

      • I/O: local peripheral transfers

      • Communications: remote transfers

    4. Control: sequencing & resource management by Control Unit

Four Main Structural Components

  • CPU – executes instructions & controls overall operation

  • Main Memory – holds instructions/data

  • I/O Modules – interface to external world

  • System Interconnection (System Bus) – shared pathways enabling communication among CPU, memory, I/O

CPU Internal Structure

  • Control Unit (CU) – fetch/decode instructions, generate micro-operations

  • Arithmetic & Logic Unit (ALU) – integer & logical ops

  • Registers – high-speed on-chip storage (data, address, status)

  • CPU Interconnection – internal bus linking CU, ALU, registers

Multicore Computer Structure

  • Terminology

    • Processor (chip): physical silicon piece containing ≥1 cores

    • Core: independent computation engine (ALU + CU + registers). Specialized cores (GPUs, DSPs) possible.

    • Multicore Processor: chip with multiple general or specialized cores

  • Rationale: exploit parallelism and get better performance per watt than by raising clock rates further

  • Cache Hierarchy

    • L1: per-core, smallest & fastest

    • L2, L3: progressively larger/slower, may be shared among cores

    • Multilevel caching ⟹ reduced average memory latency

  • Example 8-Core Die (Slide 11)

    • Each core contains

      • Instruction logic (fetch/decode)

      • ALU

      • Load/Store unit interacting with L1/L2 caches

    • Shared L3 cache located centrally on the die
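
The claim that multilevel caching reduces average memory latency can be checked with the usual average-memory-access-time recurrence (hit time plus miss rate times the cost of going one level down). The sketch below is illustrative only: the function name `amat`, the hit times, and the miss rates are assumptions chosen for the example, not figures from the slides.

```python
def amat(levels, memory_latency):
    """Average memory access time for a cache hierarchy.

    levels: list of (hit_time, miss_rate) pairs ordered L1, L2, ...
    memory_latency: cycles needed to reach main memory after the last miss.
    """
    # Work backwards: the miss penalty of each level is the average
    # access time of everything below it.
    penalty = memory_latency
    for hit_time, miss_rate in reversed(levels):
        penalty = hit_time + miss_rate * penalty
    return penalty


# Hypothetical latencies (in cycles) and miss rates, for illustration only.
l1_only = amat([(4, 0.10)], memory_latency=200)
three_level = amat([(4, 0.10), (12, 0.40), (40, 0.50)], memory_latency=200)
print(l1_only, three_level)
```

With these assumed numbers the three-level hierarchy averages well under half the latency of L1 alone, because most L1 misses are caught by L2 or L3 instead of going to main memory.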

Von Neumann Architecture – Three Key Concepts

  1. Single read/write memory stores BOTH data & instructions

  2. Memory is addressed linearly by numeric location, independent of content type

  3. Program execution is inherently sequential (unless modified by control instructions)

  • Hardwired vs. Stored-program debate: flexibility won → software sequences of instruction codes replace physical rewiring

Hardware vs. Software Approaches (Fig 3.1)

  • Hardware programming: components are wired (control lines fixed) for one specific computation ➔ inflexible; a new task requires rewiring

  • Software programming: define instruction codes; CU interprets codes to generate control signals dynamically ➔ general-purpose

  • Components to support software approach

    • Instruction Interpreter (in CU)

    • General-purpose ALU

    • I/O modules for external interaction

Memory & I/O Registers

  • Memory Address Register (MAR) – holds the address of the next memory word to be read or written

  • Memory Buffer Register (MBR) – holds the data word just read from, or about to be written to, memory

  • I/O Address Register (I/OAR) – selects peripheral port/device

  • I/O Buffer Register (I/OBR) – data exchanged with I/O module

Basic Instruction Cycle

  • Loop of Fetch Cycle + Execute Cycle until HALT/error

    1. Fetch

      • Program Counter (PC) supplies the address, which is loaded into the MAR

      • Memory read → instruction loaded into Instruction Register (IR)

      • PC incremented to point to the next sequential instruction

    2. Execute – one of four action classes

      • Processor–memory transfer

      • Processor–I/O transfer

      • Data processing (ALU)

      • Control (alter PC/condition codes)

  • Processing for one instruction = Instruction Cycle (Fig 3.3)
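
The fetch half of the cycle can be written out as register transfers. This is a minimal sketch, assuming a word-addressed memory modeled as a Python dict; the `CPU` class, the `fetch` function, and the sample address and instruction word are hypothetical names and values introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class CPU:
    pc: int = 0   # Program Counter
    mar: int = 0  # Memory Address Register
    mbr: int = 0  # Memory Buffer Register
    ir: int = 0   # Instruction Register


def fetch(cpu, memory):
    """One fetch cycle: PC -> MAR, memory read -> MBR -> IR, then PC + 1."""
    cpu.mar = cpu.pc           # address of the next instruction
    cpu.mbr = memory[cpu.mar]  # memory read places the word in the MBR
    cpu.ir = cpu.mbr           # instruction is latched into the IR
    cpu.pc += 1                # point at the next sequential instruction
    return cpu.ir


# Assumed contents: one instruction word at address 0x300.
memory = {0x300: 0x1940}
cpu = CPU(pc=0x300)
fetch(cpu, memory)
print(hex(cpu.ir), hex(cpu.pc))
```

The execute half then inspects `cpu.ir` and performs one of the four action classes listed above.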

Detailed Instruction Cycle States (Mnemonic Codes)

  • iac → if → iod → [oac → of] → do → os (some states optional/repeated)

    • iac: calculate next instruction address

    • if: fetch instruction

    • iod: decode opcode & addressing mode

    • oac/of: obtain operand address/data if needed

    • do: perform operation

    • os: store result
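
One way to see which states are optional or repeated is to generate the state sequence for a given instruction shape. The sketch below follows the chain as written in these notes; the function `cycle_states` and its parameters `num_operands` and `stores_result` are hypothetical, introduced only for illustration.

```python
def cycle_states(num_operands, stores_result):
    """Return the instruction-cycle states visited by one instruction."""
    states = ["iac", "if", "iod"]    # always: address calc, fetch, decode
    for _ in range(num_operands):    # repeated once per memory operand
        states += ["oac", "of"]
    states.append("do")              # perform the operation
    if stores_result:                # optional: write the result back
        states.append("os")
    return states


print(cycle_states(1, False))
print(cycle_states(2, True))
```

An instruction with no memory operands and no stored result (for example a register-only operation) visits only iac, if, iod, and do.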

Program Execution Example (16-bit Simple Machine)

  • Word size: 16 bits; 4-bit opcode → 2^4 = 16 opcodes

  • Address field: 12 bits → 2^12 = 4096 words addressable

  • Key opcodes (hex)

    • 1: LOAD AC, 2: STORE AC, 5: ADD to AC

  • Sample fragment: AC ← M[940]; AC ← AC + M[941]; M[941] ← AC

    • Step-wise sequence (Slides 24–26) illustrates fetch/execute progression and PC updates
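
The fragment can be traced with a small simulator of this machine. This is a sketch under stated assumptions: addresses 940 and 941 are read as hexadecimal, the program is placed at 0x300, and the data values 3 and 2 are made up for the example; none of these specifics are given in the notes. The function name `run` is also hypothetical.

```python
LOAD, STORE, ADD = 0x1, 0x2, 0x5  # opcodes listed in the notes


def run(memory, pc, steps):
    """Execute `steps` instructions starting at `pc`; return (AC, PC)."""
    ac = 0
    for _ in range(steps):
        ir = memory[pc]        # fetch
        pc += 1
        opcode = ir >> 12      # top 4 bits
        address = ir & 0xFFF   # low 12 bits
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == STORE:
            memory[address] = ac
        elif opcode == ADD:
            ac = (ac + memory[address]) & 0xFFFF  # keep AC within 16 bits
        else:
            raise ValueError(f"unsupported opcode {opcode:#x}")
    return ac, pc


# Assumed layout: three instructions at 0x300, data at 0x940 and 0x941.
memory = {
    0x300: 0x1940,  # AC <- M[940]
    0x301: 0x5941,  # AC <- AC + M[941]
    0x302: 0x2941,  # M[941] <- AC
    0x940: 0x0003,
    0x941: 0x0002,
}
ac, pc = run(memory, 0x300, 3)
print(ac, hex(pc), memory[0x941])
```

Each loop iteration is one instruction cycle: the PC advances by one per fetch, and only the STORE changes memory.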

Interrupts – Motivation & Mechanism

  • Goal: improve CPU utilization by overlapping I/O latency with processing

  • I/O without Interrupts (Polling)

    • CPU waits idle or repeatedly tests device status

    • Example: printer write ⇒ hundreds/thousands wasted cycles

  • I/O with Interrupts

    • CPU issues command, continues user code

    • Device raises interrupt on completion or error

    • CPU state/context saved; control transfers to Interrupt Handler; after service, resume user code

  • Classes of Interrupts

    • Program (exceptions): overflow, divide-by-zero, illegal op, memory violation

    • Timer: periodic OS functions

    • I/O: device request/completion/error

    • Hardware failure: power, parity, etc.

  • Instruction Cycle with Interrupts

    • An interrupt stage is added after Execute; pending interrupts are checked only when interrupts are enabled

    • If a pending interrupt is detected: save PC & status, load the handler address, then either disable further interrupts or allow nesting, depending on the scheme

  • Multiple Interrupt Handling

    • Sequential (simple) vs. Nested (priority-based) processing (Fig 3.13, 3.14)
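
The instruction cycle with an interrupt stage can be sketched as a loop that checks for pending requests after each execute. The code below models only the sequential scheme (interrupts disabled while a handler runs, so requests are served one after another); the function `run`, the `arrivals` mapping, and the device names are hypothetical, chosen for illustration.

```python
from collections import deque


def run(user_program, arrivals, handler_steps=2):
    """Trace execution; arrivals maps an instruction index to the devices
    whose interrupts become pending while that instruction executes."""
    pending = deque()
    trace = []
    pc = 0
    while pc < len(user_program):
        trace.append(("user", user_program[pc]))  # fetch + execute
        pending.extend(arrivals.get(pc, []))
        pc += 1
        # Interrupt stage: serve pending requests one at a time.
        while pending:
            device = pending.popleft()
            saved_pc = pc                  # save context
            for _ in range(handler_steps): # handler runs, interrupts disabled
                trace.append(("isr", device))
            pc = saved_pc                  # restore context, resume user code
    return trace


trace = run(["i0", "i1", "i2"], {0: ["printer"], 1: ["disk", "timer"]},
            handler_steps=1)
print(trace)
```

A nested scheme would differ in the inner loop: a higher-priority request arriving during a handler would preempt it instead of waiting in the queue.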

Direct Memory Access (DMA)

  • Offloads bulk data transfers from CPU

    • CPU grants bus mastership to DMA controller

    • Controller performs block read/write memory ↔ device

    • CPU interrupted only at completion ➔ minimal involvement
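
To make "minimal involvement" concrete, the sketch below counts how often the CPU must intervene when a block is moved word by word under interrupt-driven I/O versus handed to a DMA controller. Both functions and the block size are hypothetical; real controllers also involve setup registers and bus arbitration, which are omitted here.

```python
def interrupt_driven_copy(memory, base, words):
    """CPU services one interrupt per word and moves the word itself."""
    cpu_interventions = 0
    for offset, word in enumerate(words):
        cpu_interventions += 1         # one interrupt per word
        memory[base + offset] = word
    return cpu_interventions


def dma_copy(memory, base, words):
    """DMA controller moves the whole block as bus master."""
    for offset, word in enumerate(words):
        memory[base + offset] = word   # controller drives the bus, not the CPU
    return 1                           # single completion interrupt


words = list(range(8))                 # assumed 8-word block
mem_a, mem_b = {}, {}
n_irq = interrupt_driven_copy(mem_a, 0x100, words)
n_dma = dma_copy(mem_b, 0x100, words)
print(n_irq, n_dma)
```

The memory contents end up identical; what changes is that the CPU is interrupted once per block instead of once per word.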

Interconnection Structures – Buses

  • System Bus: shared medium linking processor, memory, I/O

  • Transfer types supported

    • Memory↔Processor (instruction/data fetch, write-back)

    • Processor↔I/O (control, data)

    • I/O↔Memory (DMA)

  • Bus Lines

    • Data Bus – width (e.g., 32, 64, 128) determines parallel data transfer size ⇒ performance impact

    • Address Bus – identifies source/destination; width limits the maximum addressable memory (2^width locations). High-order bits select the module; low-order bits select the location/port within it.

    • Control Bus – command & timing (Read/Write, Memory/IO, Interrupt Requests, Bus Grant, Clock, etc.)

  • Operation protocol

    1. Bus request/grant arbitration

    2. Master places address & command on the bus; for a write it also drives the data lines, for a read it waits for the addressed module to return the data
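
The two address-bus points (capacity grows as 2^width, and high-order bits select the module) can be sketched as follows. The function names and the 8-bit example with a single module-select bit are assumptions made for illustration.

```python
def addressable_locations(width):
    """Number of distinct locations reachable with `width` address lines."""
    return 2 ** width


def split_address(address, width, module_bits):
    """Split an address into (module, offset) using the high-order bits."""
    offset_bits = width - module_bits
    module = address >> offset_bits                # high bits: which module
    offset = address & ((1 << offset_bits) - 1)    # low bits: location/port
    return module, offset


print(addressable_locations(12))          # 12-bit address field
print(addressable_locations(24))          # 24-bit address bus
print(split_address(0b10000011, 8, 1))    # assumed: top bit selects I/O
```

With an 8-bit bus and one select bit, addresses below 0b10000000 would go to module 0 and the rest to module 1, each seeing a 7-bit offset.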

Processor Architecture Examples

  • Intel x86 Family (CISC Evolution)

    • 8080 → first general-purpose microprocessor (8-bit)

    • 8086/8088 → 16-bit; prefetch queue; start of x86 ISA (IBM PC)

    • 80286 → 24-bit address, 16 MB addressable memory (2^24 bytes)

    • 80386 → 32-bit, multitasking support

    • 80486 → on-chip cache, pipelining, FPU

    • Pentium series → superscalar, branch prediction, MMX, SSE, speculative execution

    • Core/Core 2/… → multi-core (up to 10+), 64-bit extensions, AVX vectors

  • ARM Architecture (RISC)

    • Origin: Acorn RISC Machine, now Advanced RISC Machine

    • Emphasizes small die size, low power, and compact code: fixed-length 32-bit ARM instructions, 16-bit Thumb, and mixed 16/32-bit Thumb-2 encodings

    • Dominant in embedded/mobile markets

    • Cortex-M3 Microcontroller Example (Fig 1.16)

    • Core + NVIC (Nested Vectored Interrupt Controller)

    • Bus matrix linking flash, SRAM, peripherals

    • Integrated timers, ADC/DAC, DMA, debug (ETM, DAP), security & power mgmt

Embedded Systems Overview

  • Definition: dedicated computing within a larger product; billions shipped yearly

  • Characteristics

    • Tightly coupled to physical environment (sensors ↔ actuators) ⇒ real-time constraints (deadlines, precision, concurrency)

    • Must meet constraints on speed, energy, cost, reliability

  • Generic organization (Fig 1.14)

    • CPU + Memory

    • A/D, D/A converters interfacing sensors/actuators

    • Custom logic, human interface, diagnostics

Internet of Things (IoT)

  • Expansion of networked smart devices (appliances → micro-sensors)

  • Key ideas

    • Natural user interaction; device embedded invisibly in environment

    • Needs direct sensor/actuator interface ⇒ blend of hardware (signals) + software (interpretation/control)

    • Microcontroller executes firmware; OS often present for scheduling & networking

    • Networking (local MANETs to global Internet) enables cooperative cloud services & analytics

  • Application Domains

    • Environmental monitoring (air, water, soil quality)

    • Infrastructure management (bridges, rails, predictive maintenance)

    • Manufacturing (process control, statistics, maintenance)

    • Transportation (smart traffic lights, fleet logistics)

    • Home/Building automation (HVAC, energy metering, lighting)

Ethical, Practical, & Real-World Connections

  • Performance vs. power: multicore & RISC trends reflect energy constraints in modern workloads

  • Interrupts & DMA embody principle of asynchronous concurrency—vital in OS design & embedded real-time systems

  • Bus arbitration & shared-medium limits foreshadow modern interconnect debates (e.g., NoCs in many-core chips)

  • IoT security & privacy: the pervasiveness of embedded processors raises ethical concerns—secure boot, data protection, fail-safe operation on hardware faults (interrupt class: hardware failure)

Numerical & Formal Reference Highlights

  • Maximum opcodes with 4-bit field: 2^4 = 16

  • Direct addressable words with 12-bit field: 2^12 = 4096

  • Bus width ⇢ parallel transfer capacity: Bandwidth ∝ Width × Clock Rate (qualitatively)
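
As a rough numeric reading of the bandwidth relation, the sketch below computes a peak figure from bus width and clock rate. It is an upper bound under simplifying assumptions (one transfer per clock, no arbitration or wait states); the function name and the example bus parameters are hypothetical.

```python
def peak_bandwidth(width_bits, clock_hz, transfers_per_cycle=1):
    """Peak bytes per second for a bus of the given width and clock."""
    return width_bits * clock_hz * transfers_per_cycle // 8


# Assumed example buses at 100 MHz.
bw_64 = peak_bandwidth(64, 100_000_000)
bw_128 = peak_bandwidth(128, 100_000_000)
print(bw_64, bw_128)
```

Doubling the width doubles the peak at the same clock, which is the proportionality stated above; sustained bandwidth on a shared bus is lower.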

Study Tips / Conceptual Links

  • Connect ISA vs. Microarchitecture: Architecture (what instructions exist) guides programmer; Organization (how they run) guides implementer

  • Relate Fetch–Decode–Execute loop to pipeline stages in modern CPUs (IF, ID, EX, MEM, WB)

  • Map Interrupt cycle to OS context switch procedure you’ve seen in systems courses

  • When reviewing bus operations, practice hand-drawing timing diagrams (address valid ➔ command ➔ data valid) to solidify control vs. data phases

  • For embedded/IoT, recall that every abstract concept (interrupts, DMA, buses) physically manifests on microcontrollers like Cortex-M3—hands-on labs help.