Memory

Memory Organization Study Notes

Objectives

Master the concepts of hierarchical memory organization.
Understand how each level of memory contributes to system performance, and how the performance is measured.
Master the concepts behind cache memory, virtual memory, memory segmentation, paging, and address translation.

6.1 Introduction

Memory is a central component of the stored-program computer.
Prior chapters focused on the components of memory and how different Instruction Set Architectures (ISAs) access this memory.
This chapter will concentrate on memory organization and its impact on system performance.

6.2 Types of Memory

6.2.1 Main Memory Types

Two primary categories of main memory: Random Access Memory (RAM) and Read-Only Memory (ROM).
Within RAM, there are two main types:
- Dynamic RAM (DRAM)
  - DRAM consists of capacitors that slowly leak their charge over time, necessitating refreshes every few milliseconds to prevent data loss.
  - It is considered “cheap” memory due to its simple design.
- Static RAM (SRAM)
  - SRAM is built using circuits similar to D flip-flops.
  - It is much faster than DRAM and does not require refreshing, making it suitable for cache memory.
Read-Only Memory (ROM)
- ROM retains data without needing refreshes, storing permanent or semi-permanent data that persists even when the system is powered off.

6.3 The Memory Hierarchy

6.3.1 General Structure

Faster memory tends to be more expensive.
Memory is organized as a hierarchy to balance performance and cost.
Hierarchy Structure:
- Small, fast storage elements (registers, cache) are located in the CPU.
- Larger, slower main memory is accessed via a data bus.
- Disk and tape drives serve as large, long-term storage solutions, positioned furthest from the CPU.

6.3.2 Accessing Data

When accessing data, the CPU first checks the cache; on a miss it checks main memory, and if the data is not there either, it goes to disk.
Once the data is located, it is fetched into cache together with its surrounding elements, because the principle of locality suggests they will be needed soon.

6.3.3 Key Definitions

Hit: The data is found at a given memory level.
Miss: The data is not found at the current memory level.
Hit Rate: The percentage of times data is found at the memory level.
Miss Rate: The percentage of times data is not found at the memory level; defined mathematically as:
$\text{Miss Rate} = 1 - \text{Hit Rate}$
Hit Time: Time taken to access the data at a given memory level.
Miss Penalty: Time required to handle a miss, which includes accessing a new block plus delivering data to the processor.

6.3.4 Locality Principles

Locality refers to usage patterns observed in data access, which help optimize cache usage:
- Temporal Locality: Recently accessed data is likely to be accessed again soon.
- Spatial Locality: Access patterns tend to cluster; nearby data is often needed in succession.
- Sequential Locality: Instructions are generally accessed in a sequential manner.
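
The effect of locality on hit rate can be illustrated with a small sketch. It assumes a toy model that is not part of these notes: memory is fetched in 4-byte blocks, and the cache never evicts anything, so only the access pattern matters. The function name `hit_rate` and the three access patterns are hypothetical.

```python
def hit_rate(addresses, block_size=4):
    """Fraction of accesses that find their block already resident.

    Simplification: the cache never evicts, so only locality matters.
    """
    resident = set()  # block numbers fetched so far
    hits = 0
    for address in addresses:
        block = address // block_size
        if block in resident:
            hits += 1
        else:
            resident.add(block)  # miss: fetch the whole block
    return hits / len(addresses)


repeated = [0] * 64                     # temporal locality: the same byte again and again
sequential = list(range(64))            # spatial/sequential locality: neighbouring bytes
scattered = [i * 4 for i in range(64)]  # no locality: a different block every time

print(hit_rate(repeated))    # 0.984375
print(hit_rate(sequential))  # 0.75
print(hit_rate(scattered))   # 0.0
```

In this model, the sequential pattern misses once per block and then hits on the three neighbouring bytes, which is why fetching surrounding data pays off.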

6.4 Cache Memory

6.4.1 Purpose and Organization

The role of cache memory is to enhance access speeds by storing frequently accessed data closer to the CPU.
While smaller than main memory, its access time is significantly shorter.
Unlike main memory, which is accessed by address, cache memory is typically accessed by content; for this reason it is often called content-addressable memory.

6.4.2 Mapping Schemes

Direct-Mapped Cache:
- Simplest cache mapping scheme, defined for N cache blocks.
- Block X of main memory maps to cache block Y as:
  $Y = X \mod N$
- Example: In a cache of 10 blocks, cache block 7 may hold main memory block 7, 17, 27, etc., but only one of them at a time.
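
A minimal sketch of the direct-mapping rule, using the 10-block cache from the example. The function name `cache_block_for` is an assumption made for illustration.

```python
def cache_block_for(memory_block: int, num_cache_blocks: int) -> int:
    """Direct mapping: Y = X mod N."""
    return memory_block % num_cache_blocks


# Which of the first 30 memory blocks compete for cache block 7 in a 10-block cache?
competitors = [x for x in range(30) if cache_block_for(x, 10) == 7]
print(competitors)  # [7, 17, 27]
```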

6.4.3 Cache Structure

Cache mapping involves dividing the binary main memory address into fields to determine data placement:
- Offset Field: Identifies a specific address within a block.
- Block Field: Selects a cache block.
- Tag Field: The remaining high-order bits, stored with the cached block to identify which main memory block currently occupies it.
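
The field split can be sketched with shifts and masks. The widths used in the example call (an 8-bit address with 2 offset bits and 3 block bits) are assumptions chosen for illustration, not values from these notes; `split_address` is a hypothetical helper.

```python
def split_address(address: int, offset_bits: int, block_bits: int):
    """Split a main memory address into (tag, block, offset) fields."""
    offset = address & ((1 << offset_bits) - 1)                 # low-order bits
    block = (address >> offset_bits) & ((1 << block_bits) - 1)  # middle bits
    tag = address >> (offset_bits + block_bits)                 # remaining high-order bits
    return tag, block, offset


# 0b101_101_10 -> tag 0b101, block 0b101, offset 0b10
print(split_address(0b10110110, offset_bits=2, block_bits=3))  # (5, 5, 2)
```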

6.4.4 Example of Cache Mapping

Example 6.1:
- For a byte-addressable main memory of 4 blocks and a cache of 2 blocks (4 bytes per block), direct mapping gives:
  - Blocks 0 and 2 of main memory map to cache block 0.
  - Blocks 1 and 3 of main memory map to cache block 1.
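
A worked breakdown of the address fields for this example, as a sketch. It assumes the field widths follow directly from the sizes given above (4 blocks of 4 bytes in main memory, 2 cache blocks); the sample address 9 is chosen only for illustration.

$\text{address bits} = \log_2(4 \times 4) = 4$
$\text{offset bits} = \log_2 4 = 2$
$\text{block bits} = \log_2 2 = 1$
$\text{tag bits} = 4 - 2 - 1 = 1$

Address 9 is 1001 in binary, which splits into tag 1, block 0, and offset 01. It therefore lies in main memory block 2 and maps to cache block 0, in agreement with the mapping listed above.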

6.4.5 Replacement Policies

A replacement policy determines which cache block to evict when a new block must be brought in:
- Least Recently Used (LRU): Evicts the block which hasn't been used for the longest time; requires history management that can be complex.
- First-In, First-Out (FIFO): Evicts oldest blocks based solely on age in cache.
- Random Replacement: Randomly evicts a block, which avoids thrashing but may displace needed data.
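
The difference between LRU and FIFO can be sketched in software. This is an illustrative model only: real caches implement these policies in hardware, and the class names `LRUCache` and `FIFOCache` and the access sequence are assumptions.

```python
from collections import OrderedDict, deque


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # ordered from least to most recently used

    def access(self, block):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # a hit refreshes the block's recency
            return True
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block
        self.blocks[block] = True
        return False


class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()   # arrival order, oldest first
        self.resident = set()

    def access(self, block):
        """Return True on a hit, False on a miss (the block is then loaded)."""
        if block in self.resident:
            return True        # a hit does not change the block's age
        if len(self.queue) >= self.capacity:
            self.resident.discard(self.queue.popleft())  # evict the oldest block
        self.queue.append(block)
        self.resident.add(block)
        return False


sequence = [1, 2, 1, 3, 1]
lru, fifo = LRUCache(2), FIFOCache(2)
lru_hits = [lru.access(b) for b in sequence]
fifo_hits = [fifo.access(b) for b in sequence]
print(lru_hits)   # [False, False, True, False, True]
print(fifo_hits)  # [False, False, True, False, False]
```

With two blocks of capacity, LRU keeps block 1 because it was used recently, while FIFO evicts it because it arrived first.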

6.4.6 Performance Measurement

Cache memory performance is assessed through Effective Access Time (EAT):
$\text{EAT} = H \times \text{Access}_C + (1 - H) \times \text{Access}_{MM}$
where:
- $H$ = hit rate
- $\text{Access}_C$ = cache access time
- $\text{Access}_{MM}$ = main memory access time
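
A short sketch of the EAT formula above. The numbers in the example call (99% hit rate, 10 ns cache access, 200 ns main memory access) are assumed values for illustration, not measurements from these notes.

```python
def effective_access_time(hit_rate, cache_time, memory_time):
    """EAT = H * AccessC + (1 - H) * AccessMM, in the same time unit as the inputs."""
    return hit_rate * cache_time + (1 - hit_rate) * memory_time


eat = effective_access_time(hit_rate=0.99, cache_time=10, memory_time=200)
print(round(eat, 2))  # 11.9
```

Even with a 1% miss rate, the slow main memory contributes about 2 ns of the 11.9 ns average in this example.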

6.4.7 Write Policies

Write Through: Updates cache and main memory simultaneously during each write operation; can slow performance, but is straightforward.
Write Back (Copyback): Updates main memory only when data is evicted from the cache; reduces memory traffic at the risk of consistency issues.
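
The difference in memory traffic can be sketched with a toy model. It assumes dict-based stand-ins for cache and main memory, and it models eviction under write back as a single flush at the end; the function names and the write sequence are hypothetical.

```python
def write_through(writes):
    """Every write updates both the cache and main memory."""
    memory, cache, memory_writes = {}, {}, 0
    for address, value in writes:
        cache[address] = value
        memory[address] = value  # main memory is updated on each write
        memory_writes += 1
    return memory, memory_writes


def write_back(writes):
    """Writes update only the cache; main memory is updated on eviction."""
    memory, cache, dirty, memory_writes = {}, {}, set(), 0
    for address, value in writes:
        cache[address] = value
        dirty.add(address)       # main memory is now stale for this address
    for address in dirty:        # eviction, modeled here as one final flush
        memory[address] = cache[address]
        memory_writes += 1
    return memory, memory_writes


writes = [(0, 1), (0, 2), (0, 3), (4, 9)]
print(write_through(writes))  # ({0: 3, 4: 9}, 4)
print(write_back(writes))     # ({0: 3, 4: 9}, 2)
```

Both policies end with the same contents in main memory, but write back reaches it with fewer memory writes; until the flush, main memory holds stale data.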

6.4.8 Cache Architectures

Unified Cache: Stores both instructions and data.
Harvard Cache: Uses separate caches for data and instructions to improve performance.
Victim Cache: A small associative cache that holds recently evicted blocks, boosting efficiency.
Trace Cache: Holds decoded instructions for quicker access during branching.

6.4.9 Multilevel Cache Hierarchies

Systems often implement multilevel cache hierarchies with:
- Level 1 Cache (L1): 8KB to 64KB, access time ~4ns, located on processor.
- Level 2 Cache (L2): 64KB to 2MB, access time ~15-20ns, may be on the motherboard.
- Level 3 Cache (L3): 2MB to 256MB, situated between CPU and main memory.

6.4.10 Cache Inclusion Policies

Inclusive Cache: Same data can exist at multiple cache levels.
Exclusive Cache: Only one copy of the data exists across all cache levels.
The decision here impacts access time, memory size, and circuit complexity.

6.5 Virtual Memory

6.5.1 Concept

Virtual memory acts as an extension of main memory, providing more capacity without the need for increased physical RAM.
It uses disk space to swap memory pages as needed.

6.5.2 Virtual vs Physical Addresses

Physical Address: Actual memory address in RAM.
Virtual Address: Program-generated address whose mapping to a physical address is managed by the operating system.

6.5.3 Page Management

Main and virtual memory are divided into fixed-size pages:
- Pages can be in main memory or on disk.
- Paging can cause memory fragmentation: small, unusable areas of memory left over by the paging process.

6.5.4 Page Table Functionality

A page table maintains page locations for active processes.
- It helps translate virtual addresses into physical addresses.

6.5.5 Page Faults

A page fault occurs when a referenced page isn't in main memory; a page must be transferred from disk to RAM.

6.5.6 Address Translation Example

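Example: With a virtual address space of 8K and a physical address space of 4K, there are more virtual pages than physical page frames. The page table maps each resident virtual page to a frame in physical memory, and a virtual address is translated by replacing its page number with that frame number while keeping the offset unchanged.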
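
The translation step can be sketched as follows. The page size of 1K is an assumption (the notes do not state it), and the page table contents are hypothetical; with these assumptions the 8K virtual space has 8 pages and the 4K physical space has 4 frames.

```python
PAGE_SIZE = 1024  # assumed page size; not given in the notes

# Hypothetical page table: virtual page -> frame number, None = page is on disk
page_table = {0: None, 1: 3, 2: 0, 3: None, 4: 1, 5: None, 6: None, 7: 2}


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


def translate(virtual_address):
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page]
    if frame is None:
        raise PageFault(f"virtual page {page} must be loaded from disk")
    return frame * PAGE_SIZE + offset  # same offset, page number replaced by frame


print(translate(1 * PAGE_SIZE + 5))    # 3077 (page 1 -> frame 3)
print(translate(7 * PAGE_SIZE + 100))  # 2148 (page 7 -> frame 2)
```

Calling `translate(0)` raises `PageFault` in this sketch, because virtual page 0 is marked as not resident.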

6.5.7 TLB Usage

The Translation Lookaside Buffer (TLB) caches recently accessed page table entries, so that most address translations avoid an extra memory access to the page table.
On a TLB miss, the page table must be consulted; if it shows that the page is not in main memory, a page fault results.
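
A sketch of a TLB sitting in front of the page table. It assumes a 1K page size, an unbounded dict as the TLB, and a hypothetical page table; the class name `Translator` is invented for illustration, and a real TLB is a small hardware structure with its own replacement policy.

```python
PAGE_SIZE = 1024  # assumed page size, for illustration only


class Translator:
    def __init__(self, page_table):
        self.page_table = page_table  # virtual page -> frame (hypothetical contents)
        self.tlb = {}                 # recently used page table entries
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.tlb:
            self.tlb_hits += 1
            frame = self.tlb[page]
        else:
            self.tlb_misses += 1
            frame = self.page_table.get(page)  # slower path: consult the page table
            if frame is None:
                raise LookupError(f"page fault on virtual page {page}")
            self.tlb[page] = frame             # remember the entry for next time
        return frame * PAGE_SIZE + offset


mmu = Translator({0: 2, 1: 0, 2: 3})
print(mmu.translate(10))    # 2058 (TLB miss, page 0 -> frame 2)
print(mmu.translate(20))    # 2068 (TLB hit)
print(mmu.translate(1024))  # 0    (TLB miss, page 1 -> frame 0)
print(mmu.tlb_hits, mmu.tlb_misses)  # 1 2
```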

6.5.8 Adding Segmentation

Segmentation divides memory into variable length segments, enabling easier management of varying data sizes.
A segment table records the location and size of each segment.

6.5.9 Fragmentation in Paging vs Segmentation

Internal Fragmentation: Occurs in paging when a process does not fill its last page, leaving part of that page allocated but unused.
External Fragmentation: Occurs in segmentation when free memory is scattered among allocated segments.
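
Internal fragmentation is easy to quantify. The sketch below assumes a hypothetical process of 5000 bytes and a 1024-byte page size; neither number comes from these notes.

```python
def internal_fragmentation(process_size, page_size):
    """Return (pages needed, bytes wasted in the last page)."""
    pages = (process_size + page_size - 1) // page_size  # round up to whole pages
    wasted = pages * page_size - process_size
    return pages, wasted


print(internal_fragmentation(5000, 1024))  # (5, 120)
```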

6.5.10 Address Structure in Segmentation

When combining paging and segmentation, memory addresses incorporate fields for segment, page, and offset identifiers, allowing flexible data management.
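
A sketch of a combined segment/page/offset lookup. The field widths (4-bit segment, 6-bit page, 10-bit offset) and the table contents are assumptions chosen for illustration; actual architectures define their own layouts.

```python
SEGMENT_BITS, PAGE_BITS, OFFSET_BITS = 4, 6, 10  # assumed field widths

# Hypothetical tables: segment number -> that segment's page table (page -> frame)
segment_table = {3: {5: 9}}


def split(virtual_address):
    """Split a virtual address into (segment, page, offset) fields."""
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    page = (virtual_address >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
    segment = (virtual_address >> (OFFSET_BITS + PAGE_BITS)) & ((1 << SEGMENT_BITS) - 1)
    return segment, page, offset


def translate(virtual_address):
    segment, page, offset = split(virtual_address)
    frame = segment_table[segment][page]   # segment picks a page table, page picks a frame
    return (frame << OFFSET_BITS) | offset


va = (3 << 16) | (5 << 10) | 17  # segment 3, page 5, offset 17
print(split(va))      # (3, 5, 17)
print(translate(va))  # 9233
```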

6.6 A Real-World Example

The Intel Core i9 architecture supports both paging and segmentation, illustrating how the two methods can be combined.
Its multi-level caches (L1, L2, L3) trade capacity against proximity to the CPU: the smaller caches sit closer to the core and respond faster.

Conclusion

Memory is intricately organized in hierarchies to optimize access speed and efficiency.
Cache improves memory access time while virtual memory allows systems to utilize hard disk space as an extension of RAM.
Cache mapping schemes include direct-mapped, fully associative, and set-associative.
- The associative schemes need a replacement policy, such as LRU, FIFO, or LFU, to choose which block to evict; direct mapping needs none, because each memory block has only one possible cache location.