Reduced Instruction Set Computers (RISC)

CISC vs. RISC Architectures

CISC (Complex Instruction Set Computer)

  • Characterized by a large and complex set of instructions.
  • Examples include IBM 370/168, VAX 11/780, and Intel 80486.
  • Aimed to simplify compilers and improve performance.
  • Advantages of smaller programs:
    • Less memory usage.
    • Improved performance due to fewer instruction bytes fetched.
    • Reduced page faults in a paging environment.
    • More instructions fit in cache.

RISC (Reduced Instruction Set Computer)

  • Examples include SPARC and MIPS R4000.
  • Focuses on simplifying the instruction set for better performance.
  • Key characteristics:
    • One machine instruction per machine cycle.
    • Simple LOAD and STORE operations for memory access.
    • Register-to-register operations.
    • Simple addressing modes.
    • Fixed instruction length aligned on word boundaries.
    • Simple instruction formats.

RISC Architecture Implications

  • Optimizing compilers can be more effective with primitive instructions.
  • Easier to move functions out of loops, reorganize code, and maximize register utilization.
  • Possible to compute parts of complex instructions at compile time.
  • Control units built for simple instructions can execute them faster.
  • Instruction pipelining is more effective.
  • More responsive to interrupts because they are checked between elementary operations.
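
As a small illustration of moving work out of a loop, the sketch below hoists a loop-invariant expression by hand, roughly as an optimizing compiler might. The function names and the multiply-by-a-factor computation are hypothetical, chosen only for illustration.

```python
def scale_naive(values, a, b):
    """Recomputes the loop-invariant product a * b on every iteration."""
    out = []
    for v in values:
        out.append(v * (a * b))   # a * b does not change inside the loop
    return out


def scale_hoisted(values, a, b):
    """Same result, with the invariant computed once before the loop."""
    factor = a * b                # hoisted out of the loop
    out = []
    for v in values:
        out.append(v * factor)
    return out
```

Both functions return the same list; the second simply performs one multiplication per element instead of two.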

Instruction Execution Characteristics

  • Operations performed determine processor functions and memory interaction.
  • Operand types and frequency influence memory organization and addressing modes.
  • Execution sequencing affects control and pipeline organization.
  • Emphasis is placed on optimizing the performance of the most time-consuming high-level language (HLL) features.

Weighted Relative Dynamic Frequency of HLL Operations

Operation   Pascal   C
ASSIGN      45%      38%
LOOP        5%       3%
CALL        15%      12%
IF          29%      43%
GOTO        -        3%
OTHER       6%       1%

Dynamic Percentage of Operands

Operand            Pascal   C     Average
Integer constant   16%      23%   20%
Scalar variable    58%      53%   55%
Array/Structure    26%      24%   25%

Procedure Arguments and Local Scalar Variables

  • Small nonnumeric programs (compiler, interpreter, and typesetter); percentage of executed procedure calls with:
    • More than 3 arguments: 0-7% (Pascal), 0-5% (C)
    • More than 5 arguments: 0-3% (Pascal), 0% (C)
    • More than 8 words of arguments and local scalars: 1-20% (Pascal), 0-6% (C)
    • More than 12 words of arguments and local scalars: 1-6% (Pascal), 0-3% (C)

Large Register File

  • Software Solution: Compiler allocates registers based on most used variables.
  • Hardware Solution: More registers to hold more variables.
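
The software approach can be sketched as a simple frequency count: the variables used most often get the available registers and the rest stay in memory. This is a deliberately simplified sketch; the function name and the sample variable names are hypothetical, and production compilers typically use more elaborate techniques such as graph coloring.

```python
from collections import Counter


def allocate_registers(variable_uses, num_registers):
    """Assign the most frequently used variables to registers r0, r1, ...

    variable_uses: iterable of variable names, one entry per use.
    Returns (assignment, in_memory): a dict from variable name to register
    name, and the list of variables that did not get a register.
    """
    counts = Counter(variable_uses)
    # Rank by descending use count; break ties by name for determinism.
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    assignment = {name: f"r{i}" for i, name in enumerate(ranked[:num_registers])}
    in_memory = ranked[num_registers:]
    return assignment, in_memory


uses = ["i", "sum", "i", "n", "sum", "i", "tmp"]
assignment, in_memory = allocate_registers(uses, 2)
print(assignment)   # {'i': 'r0', 'sum': 'r1'}
print(in_memory)    # ['n', 'tmp']
```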

Register Windows

  • Overlapping register windows for parameter passing and local storage.
  • Circular buffer organization of overlapped windows.
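
A minimal sketch of overlapping windows laid out as a circular buffer follows. The class name, the window sizes, and the "in"/"local"/"out" labels are assumptions made for illustration; window overflow (saving the oldest window to memory when the buffer is full) is intentionally left out.

```python
class RegisterWindows:
    """Overlapping register windows on a circular physical register file.

    Each window sees `shared` incoming registers, `local` private registers,
    and `shared` outgoing registers. The outgoing registers of one window
    are the same physical registers as the incoming ones of the next.
    """

    def __init__(self, num_windows=4, shared=2, local=3):
        self.num_windows = num_windows
        self.shared = shared
        self.local = local
        self.stride = shared + local          # physical registers per window
        self.regs = [0] * (num_windows * self.stride)
        self.cwp = 0                          # current window pointer

    def _index(self, kind, n):
        offsets = {"in": 0, "local": self.shared, "out": self.stride}
        limit = self.local if kind == "local" else self.shared
        if not 0 <= n < limit:
            raise IndexError(f"{kind} register {n} out of range")
        base = self.cwp * self.stride
        # Wrap around so the last window's outputs overlap window 0's inputs.
        return (base + offsets[kind] + n) % len(self.regs)

    def read(self, kind, n):
        return self.regs[self._index(kind, n)]

    def write(self, kind, n, value):
        self.regs[self._index(kind, n)] = value

    def call(self):
        self.cwp = (self.cwp + 1) % self.num_windows

    def ret(self):
        self.cwp = (self.cwp - 1) % self.num_windows


w = RegisterWindows()
w.write("out", 0, 42)      # caller places an argument
w.call()
arg = w.read("in", 0)      # callee sees it without any copying
w.write("in", 0, arg + 1)  # callee leaves a result in the shared register
w.ret()
print(w.read("out", 0))    # 43
```

Parameter passing costs no memory traffic here: a call only moves the window pointer.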

Global Variables

  • Global variables can be assigned memory locations by the compiler.
  • Alternative: Use global registers fixed in number and available to all procedures.

Large Register File vs. Cache

Large Register File                   Cache
All local scalars                     Recently-used local scalars
Individual variables                  Blocks of memory
Compiler-assigned global variables    Recently used global variables
Save/Restore based on depth           Save/Restore based on replacement alg.
Register addressing                   Memory addressing
Multiple operands per cycle           One operand per cycle

Code Size Relative to RISC I

                RISC I   VAX-11/780   M68000   Z8002   PDP-11/70
11 C Programs   1.0      0.8          0.9      1.2     0.9
12 C Programs   1.0      0.67         0.9      1.12    0.71
5 C Programs    1.0

Pipelining

  • Pipelining overlaps instruction execution to improve performance.
  • Optimization techniques:
    • Delayed Branch: The branch does not take effect until after the following instruction has executed.
    • Delayed Load: The target register is locked until the load completes; useful work that does not need that register can proceed while loading.
    • Loop Unrolling: Replicates the loop body to reduce overhead and increase parallelism.
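
Loop unrolling can be sketched by hand as below. The function names and the unroll factor of 4 are arbitrary choices for illustration; in Python the rewrite only shows the shape of the transformation, while the real benefit appears in compiled code running on a pipelined processor.

```python
def sum_rolled(data):
    """One loop test and one index update per element."""
    total = 0
    i = 0
    while i < len(data):
        total += data[i]
        i += 1
    return total


def sum_unrolled_by_4(data):
    """Loop body replicated four times: one test and update per four elements."""
    total = 0
    n = len(data)
    limit = n - (n % 4)        # largest multiple of 4 not exceeding n
    i = 0
    while i < limit:
        total += data[i] + data[i + 1] + data[i + 2] + data[i + 3]
        i += 4
    while i < n:               # clean-up loop for the 0-3 leftover elements
        total += data[i]
        i += 1
    return total
```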

Normal vs. Delayed Branch

Address   Normal Branch   Delayed Branch   Optimized Delayed Branch
100       LOAD X, rA      LOAD X, rA       LOAD X, rA
101       ADD 1, rA       ADD 1, rA        JUMP 105
102       JUMP 105        JUMP 106         ADD 1, rA
103       ADD rA, rB      NOOP             ADD rA, rB
104       SUB rC, rB      ADD rA, rB       SUB rC, rB
105       STORE rA, Z     SUB rC, rB       STORE rA, Z
106                       STORE rA, Z
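
The three instruction sequences in the table can be checked with a toy interpreter. This is a sketch under assumptions: the tuple encoding of instructions, the `run` function, and the operand order (source first, destination second) are invented here for illustration, and a delayed jump is modeled with exactly one delay slot.

```python
def run(program, memory, delayed):
    """Run {address: (op, src, dst)} and return (trace of addresses, memory).

    With delayed=True, a JUMP takes effect only after the instruction
    that follows it (the delay slot) has executed.
    """
    memory = dict(memory)
    regs = {"rA": 0, "rB": 0, "rC": 0}
    trace = []
    pc = min(program)
    pending = None                       # jump target waiting on a delay slot
    while pc in program:
        op, src, dst = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending is not None:          # this instruction is the delay slot
            next_pc, pending = pending, None
        if op == "LOAD":
            regs[dst] = memory[src]
        elif op == "STORE":
            memory[dst] = regs[src]
        elif op == "ADD":
            regs[dst] += src if isinstance(src, int) else regs[src]
        elif op == "SUB":
            regs[dst] -= regs[src]
        elif op == "JUMP":
            if delayed:
                pending = src
            else:
                next_pc = src
        # NOOP (or any other op) changes nothing.
        pc = next_pc
    return trace, memory


LOAD, ADD1, STORE = ("LOAD", "X", "rA"), ("ADD", 1, "rA"), ("STORE", "rA", "Z")
ADD_AB, SUB_CB, NOOP = ("ADD", "rA", "rB"), ("SUB", "rC", "rB"), ("NOOP", None, None)

normal = {100: LOAD, 101: ADD1, 102: ("JUMP", 105, None),
          103: ADD_AB, 104: SUB_CB, 105: STORE}
delayed = {100: LOAD, 101: ADD1, 102: ("JUMP", 106, None), 103: NOOP,
           104: ADD_AB, 105: SUB_CB, 106: STORE}
optimized = {100: LOAD, 101: ("JUMP", 105, None), 102: ADD1,
             103: ADD_AB, 104: SUB_CB, 105: STORE}

print(run(normal, {"X": 41}, delayed=False)[0])     # [100, 101, 102, 105]
print(run(delayed, {"X": 41}, delayed=True)[0])     # [100, 101, 102, 103, 106]
print(run(optimized, {"X": 41}, delayed=True)[0])   # [100, 101, 102, 105]
```

All three runs store the same value in Z; the optimized version fills the delay slot with useful work instead of a NOOP, so it executes one instruction fewer than the plain delayed version.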

MIPS R4000

  • RISC chip set developed by MIPS Technologies Inc.
  • Uses 64 bits for data paths, addresses, registers, and the ALU.
  • Partitioned into CPU and coprocessor for memory management.
  • Supports thirty-two 64-bit registers.
  • Provides up to 128 KB of high-speed cache.

MIPS Instruction Formats

  • I-type (Immediate): Operation, rs, rt, Immediate.
  • J-type (Jump): Operation, Target.
  • R-type (Register): Operation, rs, rt, rd, Shift, Function.
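
The three formats can be sketched as bit-packing functions. The field widths used here (6-bit operation, 5-bit register fields, 5-bit shift amount, 6-bit function code, 16-bit immediate, 26-bit target) are the standard 32-bit MIPS layout rather than something stated in the list above, and the function names are hypothetical.

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """R-type: op(6) rs(5) rt(5) rd(5) shift(5) function(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, immediate):
    """I-type: op(6) rs(5) rt(5) immediate(16)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (immediate & 0xFFFF)


def encode_j(op, target):
    """J-type: op(6) target(26)."""
    return (op << 26) | (target & 0x03FFFFFF)


def decode_r(word):
    """Split a 32-bit word back into its R-type fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


# add $8, $9, $10  (operation 0, function 0x20)
print(hex(encode_r(0, 9, 10, 8, 0, 0x20)))   # 0x12a4020
# addi $8, $9, 5   (operation 8)
print(hex(encode_i(8, 9, 8, 5)))             # 0x21280005
```

Because every format is 32 bits wide with the operation field in the same position, decoding can start before the instruction type is fully known.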

Improving Pipeline Performance

  • Superscalar: Replicates each pipeline stage so that two or more instructions at the same stage can be processed in parallel.
  • Super-pipelined: Splits the pipeline into a larger number of finer-grained stages so that more instructions are in flight at once.
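
A rough way to compare the two approaches is an idealized cycle count. The sketch below assumes no hazards, stalls, or branch penalties, measures time in cycles of the base pipeline, and uses hypothetical function names; real machines fall short of these figures.

```python
import math


def base_cycles(stages, instructions):
    """Scalar pipeline: the first instruction takes `stages` cycles,
    and each later one finishes one cycle after its predecessor."""
    return stages + instructions - 1


def superscalar_cycles(stages, instructions, degree):
    """`degree` instructions enter the pipeline together each cycle."""
    return stages + math.ceil(instructions / degree) - 1


def superpipelined_cycles(stages, instructions, degree):
    """Each stage is split into `degree` substages, so a new instruction
    starts every 1/degree of a base cycle."""
    return stages + (instructions - 1) / degree


print(base_cycles(5, 12))               # 16
print(superscalar_cycles(5, 12, 2))     # 10
print(superpipelined_cycles(5, 12, 2))  # 10.5
```

Under these assumptions the two techniques give similar speedups; the superpipelined machine trails slightly because instructions still start one at a time.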

R3000 Pipeline Stages

  • IF (Instruction Fetch): Translate virtual address to physical address using TLB, send physical address to instruction cache.
  • RD (Read): Return instruction from cache, decode instruction, read register file, calculate branch target address.
  • ALU: Perform arithmetic/logical operation, decide branch, calculate data virtual address, translate data virtual address to physical.
  • MEM (Memory): Send physical address to data cache.
  • WB (Write Back): Write result to register file.
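
To visualize how the five stages overlap, the sketch below builds a timing diagram for an idealized pipeline. It assumes one instruction enters per clock cycle and nothing ever stalls; the function name is hypothetical and the stage names are taken from the list above.

```python
STAGES = ["IF", "RD", "ALU", "MEM", "WB"]


def timing_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; row[c] is the stage that
    instruction occupies in clock cycle c ('' when it is not in the pipe)."""
    total_cycles = len(stages) + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, name in enumerate(stages):
            row[i + s] = name        # instruction i enters IF in cycle i
        rows.append(row)
    return rows


for i, row in enumerate(timing_diagram(3)):
    print(f"I{i}: " + " ".join(f"{cell:>3}" for cell in row))
```

Each row is shifted one cycle to the right of the one above it, so three instructions finish in seven cycles instead of fifteen.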

R4000 Pipeline Stages

  • Instruction fetch (two halves).
  • Register file (decode, instruction cache tag check, operand fetch).
  • Instruction execute (ALU operation, address calculation, branch operation).
  • Data cache (two halves).
  • Tag check (cache tag checks for loads and stores).
  • Write back.

SPARC (Scalable Processor Architecture)

  • Architecture defined by Sun Microsystems.
  • Inspired by Berkeley RISC I.

SPARC Register Window Layout

  • Organized into register windows forming a circular stack.

SPARC Addressing Modes

Instruction Type       Addressing Mode     Algorithm       SPARC Equivalent
Register-to-register   Immediate           operand = A     S2
Load/store             Direct              EA = A          R0 + S2
Register-to-register   Register            EA = R          RS1, SS2
Load/store             Register Indirect   EA = (R)        RS1 + 0
Load/store             Displacement        EA = (R) + A    RS1 + S2
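
The load/store rows of the table all reduce to one effective-address computation, EA = RS1 + S2, as the sketch below shows. The class and function names are hypothetical, S2 is modeled as a plain integer for simplicity, and register 0 is assumed to always read as zero, as on SPARC.

```python
class RegisterFile:
    """Register file in which register 0 always reads as zero."""

    def __init__(self, size=32):
        self._regs = [0] * size

    def read(self, n):
        return 0 if n == 0 else self._regs[n]

    def write(self, n, value):
        if n != 0:                   # writes to register 0 are discarded
            self._regs[n] = value


def effective_address(regs, rs1, s2):
    """The single load/store form: EA = R[rs1] + S2."""
    return regs.read(rs1) + s2


def direct(regs, address):
    return effective_address(regs, 0, address)      # R0 + S2


def register_indirect(regs, r):
    return effective_address(regs, r, 0)            # RS1 + 0


def displacement(regs, r, offset):
    return effective_address(regs, r, offset)       # RS1 + S2


regs = RegisterFile()
regs.write(3, 0x1000)
print(hex(direct(regs, 0x40)))            # 0x40
print(hex(register_indirect(regs, 3)))    # 0x1000
print(hex(displacement(regs, 3, 8)))      # 0x1008
```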

Processor Organization for Pipelining

  • Enhancements:
    • Multiple reservation stations.
    • Forwarding.
    • Reorder buffer.
  • Instruction dispatching:
    • Issue from ID to reservation station.
    • Dispatch from reservation station to FU.
  • Data forwarding addresses read-after-write (RAW) delays.
  • Reorder buffer supports out-of-order execution (OoOE).
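
The role of the reorder buffer can be sketched as a queue that lets instructions finish in any order but retires them strictly in program order. The class, its method names, and the string tags are hypothetical; reservation stations and forwarding are not modeled.

```python
from collections import deque


class ReorderBuffer:
    """Instructions may complete out of order but commit in program order."""

    def __init__(self):
        self.entries = deque()       # oldest instruction on the left

    def issue(self, tag):
        self.entries.append({"tag": tag, "done": False, "value": None})

    def complete(self, tag, value):
        """Record a result produced by a functional unit, in any order."""
        for entry in self.entries:
            if entry["tag"] == tag:
                entry["done"] = True
                entry["value"] = value
                return
        raise KeyError(tag)

    def commit(self):
        """Retire finished instructions from the head of the buffer,
        stopping at the first one that has not completed."""
        retired = []
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            retired.append((entry["tag"], entry["value"]))
        return retired


rob = ReorderBuffer()
for tag in ("i1", "i2", "i3"):
    rob.issue(tag)
rob.complete("i3", 30)       # youngest instruction finishes first
print(rob.commit())          # [] because i1 is still outstanding
rob.complete("i1", 10)
print(rob.commit())          # [('i1', 10)]
rob.complete("i2", 20)
print(rob.commit())          # [('i2', 20), ('i3', 30)]
```

Holding results until all older instructions have committed keeps the architectural state consistent even though execution order differs from program order.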