CMSC 411 - Memory Architecture Notes

Memory Overview

In an ideal world, memory would be large, fast, and inexpensive simultaneously. In practice these goals conflict, so trade-offs must be made.

Memory Technologies

Two primary memory technologies are discussed:
- SRAM (Static RAM):
  - Retains data as long as power is supplied (static).
  - Does not need refresh cycles like DRAM.
  - Faster than DRAM, but each cell is physically larger, so density is lower and cost per bit is higher.
  - Uses 6 transistors per bit, built with high-speed CMOS technology.
- DRAM (Dynamic RAM):
  - Must be refreshed periodically; loses data if not refreshed (dynamic).
  - Smaller cell and slower access compared to SRAM.
  - Uses 1 transistor and 1 capacitor per bit, built with technology optimized for density.
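
To make the density difference concrete, the per-bit figures above can be turned into a rough device count. This is a minimal sketch that counts only the storage cells (no decoders, sense amplifiers, or other peripheral circuitry); the function names are hypothetical.

```python
SRAM_TRANSISTORS_PER_BIT = 6   # from the notes: 6 transistors per SRAM bit
DRAM_TRANSISTORS_PER_BIT = 1   # from the notes: 1 transistor (plus 1 capacitor) per DRAM bit


def sram_cell_devices(num_bytes):
    """Return the transistor count for the cells of an SRAM array of num_bytes bytes."""
    return num_bytes * 8 * SRAM_TRANSISTORS_PER_BIT


def dram_cell_devices(num_bytes):
    """Return (transistors, capacitors) for the cells of a DRAM array of num_bytes bytes."""
    bits = num_bytes * 8
    return bits * DRAM_TRANSISTORS_PER_BIT, bits


print(sram_cell_devices(1024))  # 49152
print(dram_cell_devices(1024))  # (8192, 8192)
```

For the same 1 KiB of storage, the SRAM cells need six times as many transistors, which is one reason DRAM is used where capacity matters most.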

DRAM Operation

DRAM Structure:
- Consists of a 2D array of memory cells organized in rows and columns.
- Uses decoders to access specific rows and columns within the array.
Read and Write Operations:
- Read Operation:
  - Must first access the entire row.
  - Data is read into a temporary row buffer.
  - The data in the original cells is lost and must be rewritten to maintain the integrity of the memory.
  - This process is known as READ-THEN-WRITE.
- Write Operation:
  - Similar to read; the entire row is read, changes are made in the row buffer, and the data is written back to the original memory cells.
Refresh Operation:
- DRAM cells gradually lose their contents and need periodic reading and rewriting to avoid data loss.
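
The read, write, and refresh behavior described above can be sketched as a toy software model. This is only an illustration under simplifying assumptions: the class name `DramBank` and its methods are hypothetical, a lost cell value is represented by `None`, and timing and charge leakage are not modeled.

```python
class DramBank:
    """Toy model of one DRAM array: a 2D grid of cells plus a single row buffer."""

    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]
        self.row_buffer = None
        self.open_row = None

    def _activate(self, row):
        # Reading a row moves its contents into the row buffer.
        # The read is destructive, so the cells no longer hold valid data.
        self.row_buffer = list(self.cells[row])
        self.cells[row] = [None] * len(self.row_buffer)
        self.open_row = row

    def _restore(self):
        # Write the row buffer back so the cells hold valid data again.
        self.cells[self.open_row] = list(self.row_buffer)

    def read(self, row, col):
        self._activate(row)
        value = self.row_buffer[col]
        self._restore()  # READ-THEN-WRITE
        return value

    def write(self, row, col, value):
        self._activate(row)           # the entire row is read first
        self.row_buffer[col] = value  # the change is made in the row buffer
        self._restore()               # the row is written back to the cells

    def refresh(self):
        # Refresh is a read followed by a rewrite of every row.
        for row in range(len(self.cells)):
            self._activate(row)
            self._restore()


bank = DramBank(rows=4, cols=8)
bank.write(2, 3, 42)
print(bank.read(2, 3))  # 42
```

Note that `read`, `write`, and `refresh` all share the same activate-then-restore pattern, which mirrors how the notes describe a write as "similar to read".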

DRAM Access Characteristics

Destructive Reads:
- Reading from DRAM erases the data in the original memory cells.
Row and Column Addresses:
- Access is based on a (row, column) address pair.
- Access patterns can be optimized by leveraging locality of reference (Fast Page Mode).
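
As a rough illustration of row and column addressing, the sketch below splits a flat cell address into a (row, column) pair. It assumes the low-order bits select the column and the remaining bits select the row; the function name `split_address` and the 10-bit column width are hypothetical choices, not taken from any particular device.

```python
def split_address(addr, col_bits):
    """Split a flat cell address into a (row, column) pair.

    Assumes the low col_bits bits select the column and the rest select the row.
    """
    row = addr >> col_bits
    col = addr & ((1 << col_bits) - 1)
    return row, col


# With 10 column bits there are 1024 columns per row.
print(split_address(0, 10))     # (0, 0)
print(split_address(1023, 10))  # (0, 1023)
print(split_address(1024, 10))  # (1, 0)
```

Under this mapping, consecutive addresses fall in the same row until a row boundary is crossed, which is the kind of locality that row-oriented access can exploit.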

Fast Page Mode (FPM)

If the same row is accessed multiple times in succession, the row stays open in the row buffer, so later accesses bypass the row decoder and need only a column access. This rewards locality and reduces latency between accesses.
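
The benefit can be illustrated by counting how many times a row must be opened for a given access sequence. This is a minimal sketch assuming a single row buffer that keeps the most recently opened row; the function name `count_row_activations` is hypothetical.

```python
def count_row_activations(accesses):
    """Count the row activations needed for a sequence of (row, col) accesses
    when the most recently opened row stays in the row buffer."""
    open_row = None
    activations = 0
    for row, _col in accesses:
        if row != open_row:
            activations += 1  # a different row must be opened
            open_row = row
    return activations


sequential = [(0, c) for c in range(4)] + [(1, c) for c in range(4)]
alternating = [(i % 2, 0) for i in range(8)]

print(count_row_activations(sequential))   # 2
print(count_row_activations(alternating))  # 8
```

Both sequences contain eight accesses, but the one with good row locality opens a row only twice.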

Memory Subsystem Organization

Consists of several hierarchical components:
- Channel
- DIMM (Dual In-line Memory Module)
- Rank
- Chip
- Bank

DIMM and Rank Structure

A DIMM contains one or more ranks, and each rank contains multiple chips:
- Rank-level parallelism allows multiple memory accesses to proceed concurrently.
Each chip is further divided into banks, which can service accesses independently of one another (bank-level parallelism).
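
One way to picture the hierarchy is as a set of bit fields carved out of a physical address. The sketch below is an assumption-laden illustration: the field widths and their ordering in `FIELD_BITS` are hypothetical, real memory controllers use many different mappings, and there is no chip field because the chips in a rank work together on each access.

```python
from collections import namedtuple

DramAddress = namedtuple("DramAddress", "channel rank bank row column")

# Hypothetical field widths in bits, listed from least to most significant.
FIELD_BITS = [("column", 10), ("channel", 1), ("bank", 3), ("rank", 1), ("row", 15)]


def decode(phys_addr):
    """Split a physical address into channel, rank, bank, row, and column fields."""
    fields = {}
    for name, bits in FIELD_BITS:
        fields[name] = phys_addr & ((1 << bits) - 1)  # take the low bits for this field
        phys_addr >>= bits                            # drop them before the next field
    return DramAddress(**fields)


print(decode(0))
print(decode(5 << 11))  # selects bank 5 under this mapping
```

Placing the channel and bank bits just above the column bits, as assumed here, spreads nearby addresses across channels and banks so that they can be accessed in parallel.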

Timing Terminology

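Row Address Strobe (RAS): Signal that latches the row address and opens (activates) a row. The minimum number of clock cycles between opening a row and accessing its columns is the RAS-to-CAS delay (tRCD).
Column Address Strobe (CAS): Signal that latches the column address. The CAS latency (CL) is the time between sending a column address and receiving data back.
Precharge (PRE): Closes the open row and prepares the bank for the next row activation. The time this takes is the row precharge time (tRP).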
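
These three delays combine differently depending on the state of the bank. The sketch below is a simplified model, not a full DRAM timing specification: `t_rcd` stands for the delay from opening a row until its columns can be accessed, `t_cas` for the column access delay, and `t_pre` for the precharge delay. The function name and the cycle counts in the example are hypothetical.

```python
def access_latency(open_row, target_row, t_rcd, t_cas, t_pre):
    """Return the latency in cycles of one access to target_row.

    open_row is the row currently held open in the bank, or None if the bank
    is already precharged (no row open).
    """
    if open_row == target_row:
        return t_cas                  # row hit: column access only
    if open_row is None:
        return t_rcd + t_cas          # row closed: open the row, then column access
    return t_pre + t_rcd + t_cas      # row conflict: close old row, open new row, column access


# Hypothetical timings of 14 cycles each.
print(access_latency(3, 3, 14, 14, 14))     # 14
print(access_latency(None, 3, 14, 14, 14))  # 28
print(access_latency(7, 3, 14, 14, 14))     # 42
```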

Open Page vs. Close Page Policies

Open Page Policy: Keeps a row open until a conflict occurs, speeding up repeated accesses to active rows.
Close Page Policy: Closes the row immediately after each access; more efficient when consecutive accesses rarely fall in the same row (e.g., server workloads).
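
The trade-off between the two policies can be sketched by totaling the latency of a sequence of row accesses to one bank. This is a minimal model under stated assumptions: under the close page policy the precharge is assumed to happen in the background after each access, so it does not add to the next access; the function name and the default timings of 14 cycles are hypothetical.

```python
def total_latency(rows, policy, t_rcd=14, t_cas=14, t_pre=14):
    """Sum the access latency in cycles for a sequence of row numbers in one bank.

    policy is "open" (keep the row open after each access) or
    "close" (precharge right after each access, assumed off the critical path).
    """
    if policy not in ("open", "close"):
        raise ValueError("policy must be 'open' or 'close'")
    open_row = None
    total = 0
    for row in rows:
        if open_row == row:
            total += t_cas                  # row hit
        elif open_row is None:
            total += t_rcd + t_cas          # bank already precharged
        else:
            total += t_pre + t_rcd + t_cas  # row conflict
        open_row = row if policy == "open" else None
    return total


local = [0, 0, 0, 0]      # repeated accesses to one row
scattered = [0, 1, 2, 3]  # every access goes to a different row

print(total_latency(local, "open"), total_latency(local, "close"))          # 70 112
print(total_latency(scattered, "open"), total_latency(scattered, "close"))  # 154 112
```

With these assumed timings, the open page policy wins when the same row is reused, and the close page policy wins when every access targets a different row.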

Summary of Latency Components

Various components contribute to overall memory latency:
- Transfer times between CPU and controller.
- Controller queuing and scheduling delays.
- Access time and transfer times to and from DRAM.
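
As a simple illustration, the components above can be added up to estimate the latency of a single request. The component names and the nanosecond values below are hypothetical placeholders, not measurements.

```python
def end_to_end_latency(components):
    """Return the total latency given a dict of component name -> latency in ns."""
    return sum(components.values())


# Hypothetical values in nanoseconds, for illustration only.
request = {
    "cpu_to_controller_transfer": 10,
    "controller_queuing_and_scheduling": 20,
    "dram_access": 30,
    "dram_data_transfer": 5,
    "controller_to_cpu_transfer": 10,
}

print(end_to_end_latency(request))  # 75
```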

Understanding memory architecture and its nuanced operations is crucial for optimizing performance and designing efficient systems in computer architecture.