Memory

Memory Organization Study Notes

Objectives

  • Master the concepts of hierarchical memory organization.

  • Understand how each level of memory contributes to system performance, and how the performance is measured.

  • Master the concepts behind cache memory, virtual memory, memory segmentation, paging, and address translation.

6.1 Introduction

  • Memory is a central component of the stored-program computer.

  • Prior chapters focused on the components of memory and how different Instruction Set Architectures (ISAs) access this memory.

  • This chapter will concentrate on memory organization and its impact on system performance.

6.2 Types of Memory

6.2.1 Main Memory Types
  • Two primary categories of main memory: Random Access Memory (RAM) and Read-Only Memory (ROM).

  • Within RAM, there are two main types:

    • Dynamic RAM (DRAM)

      • DRAM consists of capacitors that slowly leak their charge over time, necessitating refreshes every few milliseconds to prevent data loss.

      • It is considered “cheap” memory due to its simple design.

    • Static RAM (SRAM)

      • SRAM is built using circuits similar to D flip-flops.

      • It is much faster than DRAM and does not require refreshing, making it suitable for cache memory.

  • Read-Only Memory (ROM)

    • ROM retains data without needing refreshes, storing permanent or semi-permanent data that persists even when the system is powered off.

6.3 The Memory Hierarchy

6.3.1 General Structure
  • Faster memory tends to be more expensive.

  • Memory is organized as a hierarchy to balance performance and cost.

  • Hierarchy Structure:

    • Small, fast storage elements (registers, cache) are located in the CPU.

    • Larger, slower main memory is accessed via a data bus.

    • Disk and tape drives serve as large, long-term storage solutions, positioned furthest from the CPU.

6.3.2 Accessing Data
  • When accessing data, the CPU first checks the cache, followed by main memory, and finally disk if data isn't found.

  • Upon a successful data retrieval, the data plus surrounding elements are fetched into cache due to the principle of locality.

6.3.3 Key Definitions
  • Hit: The data is found at a given memory level.

  • Miss: The data is not found at the current memory level.

  • Hit Rate: The percentage of times data is found at the memory level.

  • Miss Rate: The percentage of times data is not found at the memory level; defined mathematically as:
    $\text{Miss Rate} = 1 - \text{Hit Rate}$

  • Hit Time: Time taken to access the data at a given memory level.

  • Miss Penalty: Time required to handle a miss, which includes accessing a new block plus delivering data to the processor.

6.3.4 Locality Principles
  • Locality refers to usage patterns observed in data access, which help optimize cache usage:

    • Temporal Locality: Recently accessed data is likely to be accessed again soon.

    • Spatial Locality: Access patterns tend to cluster; nearby data is often needed in succession.

    • Sequential Locality: Instructions are generally accessed in a sequential manner.

6.4 Cache Memory

6.4.1 Purpose and Organization
  • The role of cache memory is to enhance access speeds by storing frequently accessed data closer to the CPU.

  • While smaller than main memory, its access time is significantly shorter.

  • Unlike main memory, which is accessed by address, cache memory is accessed by content; for this reason it is often called content-addressable memory.

6.4.2 Mapping Schemes
  • Direct-Mapped Cache:

    • Simplest cache mapping scheme, defined for N cache blocks.

    • Block X of main memory maps to cache block Y as:
      $Y = X \bmod N$

    • Example: In a cache of 10 blocks, block 7 may store memory blocks 7, 17, 27, etc.
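
The direct-mapped rule can be checked with a few lines of Python. This is only a sketch; the function name `direct_mapped_block` is made up for illustration and is not taken from the chapter.

```python
def direct_mapped_block(memory_block, num_cache_blocks):
    """Return the cache block Y = X mod N for main memory block X."""
    if num_cache_blocks <= 0:
        raise ValueError("the cache must have at least one block")
    return memory_block % num_cache_blocks


# The 10-block example from the notes: blocks 7, 17 and 27 all compete
# for cache block 7.
for x in (7, 17, 27):
    print(f"memory block {x} -> cache block {direct_mapped_block(x, 10)}")
```

Because several memory blocks share one cache block, only one of them can be resident at a time.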

6.4.3 Cache Structure
  • Cache mapping involves dividing the binary main memory address into fields to determine data placement:

    • Offset Field: Identifies a specific address within a block.

    • Block Field: Selects a cache block.

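    • Tag Field: The remaining high-order bits of the address; the tag is stored with the cache block to identify which main memory block it currently holds.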
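
As a rough illustration of the three fields, the sketch below splits an address for a hypothetical direct-mapped cache with 8 blocks of 16 bytes each. Those sizes, and the name `split_address`, are assumptions chosen for the example; the code also assumes both sizes are powers of two.

```python
def split_address(address, block_size, num_cache_blocks):
    """Split an address into (tag, block, offset) for a direct-mapped cache.

    Assumes block_size and num_cache_blocks are powers of two.
    """
    for value in (block_size, num_cache_blocks):
        if value <= 0 or value & (value - 1):
            raise ValueError("sizes must be powers of two")
    offset_bits = block_size.bit_length() - 1
    block_bits = num_cache_blocks.bit_length() - 1
    offset = address & (block_size - 1)                        # low-order bits
    block = (address >> offset_bits) & (num_cache_blocks - 1)  # middle bits
    tag = address >> (offset_bits + block_bits)                # remaining bits
    return tag, block, offset


# Hypothetical configuration: 16-byte blocks, 8 cache blocks.
print(split_address(0x1A7, block_size=16, num_cache_blocks=8))  # (3, 2, 7)
```

Here address 0x1A7 has offset 7 within its block, selects cache block 2, and carries tag 3.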

6.4.4 Example of Cache Mapping
  • Example 6.1:

    • For a byte-addressable main memory of 4 blocks and a cache of 2 blocks (4 bytes per block), direct mapping places the blocks as follows:

    • Blocks 0 and 2 of main memory map to cache block 0.

    • Blocks 1 and 3 of main memory map to cache block 1.
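
The configuration of Example 6.1 is small enough to simulate. The sketch below tracks which memory block each cache block holds and reports hits and misses for a sequence of byte addresses; the access sequence itself is invented for illustration.

```python
BLOCK_SIZE = 4    # bytes per block, as in Example 6.1
CACHE_BLOCKS = 2  # cache blocks, as in Example 6.1


def simulate(addresses):
    """Return 'hit' or 'miss' for each address in a direct-mapped cache."""
    resident = [None] * CACHE_BLOCKS  # memory block held by each cache block
    results = []
    for address in addresses:
        mem_block = address // BLOCK_SIZE
        slot = mem_block % CACHE_BLOCKS
        if resident[slot] == mem_block:
            results.append("hit")
        else:
            results.append("miss")
            resident[slot] = mem_block  # the new block replaces the old one
    return results


# Addresses 0 and 1 lie in memory block 0, address 8 in block 2,
# addresses 4 and 5 in block 1.
print(simulate([0, 1, 8, 0, 4, 5]))
# ['miss', 'hit', 'miss', 'miss', 'miss', 'hit']
```

The second access to address 0 misses because memory block 2 displaced block 0 from cache block 0 in the meantime.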

6.4.5 Replacement Policies
  • Policies are crucial for managing which cache blocks to evict when new data is stored:

    • Least Recently Used (LRU): Evicts the block which hasn't been used for the longest time; requires history management that can be complex.

    • First-In, First-Out (FIFO): Evicts the block that has been in the cache the longest, regardless of how recently it was used.

    • Random Replacement: Randomly evicts a block, which avoids thrashing but may displace needed data.
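
To see how LRU and FIFO can differ, the sketch below counts misses for a small cache in which any block may occupy any slot. The capacity and the access trace are made-up values, and the function names are illustrative only; a capacity of at least 1 is assumed.

```python
from collections import OrderedDict, deque


def count_misses_lru(trace, capacity):
    """Count misses when the least recently used block is evicted."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # mark as most recently used
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used
            cache[block] = True
    return misses


def count_misses_fifo(trace, capacity):
    """Count misses when the block that entered first is evicted."""
    cache = set()
    arrival_order = deque()
    misses = 0
    for block in trace:
        if block not in cache:
            misses += 1
            if len(cache) >= capacity:
                cache.discard(arrival_order.popleft())  # evict the oldest
            cache.add(block)
            arrival_order.append(block)
    return misses


trace = [1, 2, 3, 1, 4, 1, 2]
print(count_misses_lru(trace, capacity=3))   # 5
print(count_misses_fifo(trace, capacity=3))  # 6
```

On this trace FIFO evicts block 1 even though it was just reused, which costs one extra miss compared with LRU.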

6.4.6 Performance Measurement
  • Cache memory performance is assessed through Effective Access Time (EAT):
    $\text{EAT} = H \times \text{AccessC} + (1 - H) \times \text{AccessMM}$
    where:

    • H = Hit rate

    • AccessC = Cache access time

    • AccessMM = Main memory access time
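
The EAT formula is easy to evaluate directly. The numbers below (99% hit rate, 10 ns cache, 200 ns main memory) are assumed values for illustration, not figures from the chapter.

```python
def effective_access_time(hit_rate, cache_time, memory_time):
    """EAT = H * AccessC + (1 - H) * AccessMM, with H given as a fraction."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_time + (1.0 - hit_rate) * memory_time


# Assumed figures: H = 0.99, cache access 10 ns, main memory access 200 ns.
eat = effective_access_time(0.99, 10.0, 200.0)
print(f"EAT is about {eat:.1f} ns")
```

With these figures the result is about 11.9 ns, so even a 1% miss rate adds nearly 2 ns to the average access.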

6.4.7 Write Policies
  • Write Through: Updates cache and main memory simultaneously during each write operation; can slow performance, but is straightforward.

  • Write Back (Copyback): Updates main memory only when a modified block is evicted from the cache; this reduces memory traffic, but main memory may hold stale values until the block is written back.
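
The difference in memory traffic can be sketched by counting main memory writes for a sequence of writes to memory blocks. The model is deliberately simplified and rests on several assumptions: the cache is direct-mapped, every access in the trace is a write, and blocks still modified at the end are flushed once.

```python
def memory_writes_write_through(write_trace):
    """Write through: every write also goes to main memory."""
    return len(write_trace)


def memory_writes_write_back(write_trace, num_cache_blocks):
    """Write back: main memory is written only when a dirty block leaves."""
    resident = [None] * num_cache_blocks  # memory block held in each slot
    dirty = [False] * num_cache_blocks    # True if the slot was modified
    writes = 0
    for block in write_trace:
        slot = block % num_cache_blocks
        if resident[slot] != block:
            if dirty[slot]:
                writes += 1               # write the evicted block back
            resident[slot] = block
        dirty[slot] = True                # the write modifies the cached copy
    return writes + sum(dirty)            # final flush of modified blocks


trace = [0, 0, 0, 1, 1, 2]  # memory block written by each store (made up)
print(memory_writes_write_through(trace))   # 6
print(memory_writes_write_back(trace, 2))   # 3
```

Repeated writes to the same block are absorbed by the cache under write back, which is where the savings come from.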

6.4.8 Cache Architectures
  • Unified Cache: Stores both instructions and data.

  • Harvard Cache: Uses separate caches for data and instructions to improve performance.

  • Victim Cache: A small associative cache that holds recently evicted blocks, boosting efficiency.

  • Trace Cache: Holds decoded instructions for quicker access during branching.

6.4.9 Multilevel Cache Hierarchies
  • Systems often implement multilevel cache hierarchies with:

    • Level 1 Cache (L1): 8KB to 64KB, access time ~4ns, located on processor.

    • Level 2 Cache (L2): 64KB to 2MB, access time ~15-20ns, may be on the motherboard.

    • Level 3 Cache (L3): 2MB to 256MB, situated between CPU and main memory.

6.4.10 Cache Inclusion Policies
  • Inclusive Cache: Same data can exist at multiple cache levels.

  • Exclusive Cache: A given block of data resides in at most one cache level at a time.

  • The decision here impacts access time, memory size, and circuit complexity.

6.5 Virtual Memory

6.5.1 Concept
  • Virtual memory acts as an extension of main memory, providing more capacity without the need for increased physical RAM.

  • It uses disk space to swap memory pages as needed.

6.5.2 Virtual vs Physical Addresses
  • Physical Address: Actual memory address in RAM.

  • Virtual Address: Address generated by the program; the operating system manages its mapping to a physical address.

6.5.3 Page Management
  • Main memory is divided into page frames, and virtual memory into pages of the same size.

    • Pages can be in main memory or on disk.

    • Fragmentation can occur in multiple forms:

      • Memory Fragmentation: Small unusable memory blocks left over by the paging process.

6.5.4 Page Table Functionality
  • A page table records, for each active process, where each of its pages currently resides (in main memory or on disk).

    • It is used to translate virtual addresses into physical addresses.

6.5.5 Page Faults
  • A page fault occurs when a referenced page isn't in main memory; the page must then be transferred from disk to RAM before the access can complete.

6.5.6 Address Translation Example
  • Example: With an 8K virtual address space and a 4K physical address space, only some virtual pages can be resident at once; the page table maps each resident virtual page to a physical page frame, and the offset within the page carries over unchanged.
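
A minimal sketch of this translation follows. The notes do not give a page size or the page table contents, so the 1K page size and the table below are assumptions made up for the example, as are the names `translate` and `PageFault`.

```python
PAGE_SIZE = 1024               # assumed page size: 8 virtual pages, 4 frames
VIRTUAL_SPACE = 8 * PAGE_SIZE  # 8K virtual address space, as in the notes

# Hypothetical page table: virtual page -> physical frame for resident pages.
PAGE_TABLE = {0: 2, 3: 0, 5: 1, 7: 3}


class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""


def translate(virtual_address, page_table=PAGE_TABLE):
    """Translate a virtual address into a physical address."""
    if not 0 <= virtual_address < VIRTUAL_SPACE:
        raise ValueError("address outside the virtual address space")
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise PageFault(f"virtual page {page} is not resident")
    return page_table[page] * PAGE_SIZE + offset  # frame base plus offset


print(translate(5 * PAGE_SIZE + 7))  # page 5 is in frame 1, so 1031
```

A reference to a non-resident page, such as `translate(1 * PAGE_SIZE)`, raises `PageFault`; a real system would then load the page from disk and retry.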

6.5.7 TLB Usage
  • The Translation Lookaside Buffer (TLB) caches recently accessed page table entries, optimizing address lookups to mitigate access latency.

  • On a TLB miss the page table must be consulted; if the page is not in main memory, a page fault follows.
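
The TLB's role as a cache of page table entries can be sketched as follows. The class is hypothetical: the notes do not say how TLB entries are replaced, so LRU replacement is an assumption, and a `KeyError` from the page table stands in for a page fault.

```python
from collections import OrderedDict


class TLB:
    """Toy TLB that caches page -> frame entries with LRU replacement."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table  # dict: virtual page -> frame
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def lookup(self, page):
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)    # mark as most recently used
            return self.entries[page]
        self.misses += 1
        frame = self.page_table[page]         # KeyError stands in for a page fault
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used entry
        self.entries[page] = frame
        return frame


tlb = TLB(capacity=2, page_table={0: 5, 1: 6, 2: 7})
frames = [tlb.lookup(p) for p in (0, 0, 1, 2, 0)]
print(frames, tlb.hits, tlb.misses)  # [5, 5, 6, 7, 5] 1 4
```

Only the second lookup of page 0 hits; by the time page 0 is referenced again, its entry has been pushed out by pages 1 and 2.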

6.5.8 Adding Segmentation
  • Segmentation divides memory into variable length segments, enabling easier management of varying data sizes.

  • A segment table records the location and size of each segment.
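
Segment translation can be sketched as a table lookup plus a bounds check. The table contents below are invented, and the names `translate_segment` and `SegmentViolation` are hypothetical.

```python
# Hypothetical segment table: segment number -> (base address, size in bytes).
SEGMENT_TABLE = {0: (1000, 400), 1: (5000, 1200)}


class SegmentViolation(Exception):
    """Raised when an offset falls outside its segment."""


def translate_segment(segment, offset, table=SEGMENT_TABLE):
    """Return the physical address for (segment, offset)."""
    if segment not in table:
        raise SegmentViolation(f"unknown segment {segment}")
    base, size = table[segment]
    if not 0 <= offset < size:
        raise SegmentViolation(f"offset {offset} outside segment {segment}")
    return base + offset


print(translate_segment(1, 100))  # 5100
```

Because segments vary in length, the size must be stored and checked on every access, which is not needed for fixed-size pages.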

6.5.9 Fragmentation in Paging vs Segmentation
  • Internal Fragmentation: Occurs in paging when a process does not fill its last page, leaving part of that page unused.

  • External Fragmentation: Occurs in segmentation when free memory is scattered among allocated segments.
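
Both kinds of fragmentation can be put into numbers. In the sketch below the process size, page size, and free-hole sizes are made-up values, and the function names are illustrative only.

```python
def internal_fragmentation(process_size, page_size):
    """Return (pages needed, bytes wasted in the last page) under paging."""
    pages = -(-process_size // page_size)  # ceiling division
    wasted = pages * page_size - process_size
    return pages, wasted


def fits_contiguously(free_holes, request):
    """Under segmentation a segment needs one hole at least as big as itself."""
    return any(hole >= request for hole in free_holes)


# Paging: a 10,000-byte process with 4,096-byte pages.
print(internal_fragmentation(10_000, 4_096))  # (3, 2288)

# Segmentation: 1,000 bytes are free in total, yet a 600-byte segment
# does not fit because no single hole is large enough.
holes = [300, 200, 500]
print(sum(holes), fits_contiguously(holes, 600))  # 1000 False
```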

6.5.10 Address Structure in Segmentation
  • When combining paging and segmentation, memory addresses incorporate fields for segment, page, and offset identifiers, allowing flexible data management.
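
One way to picture the combined address is as three bit fields. The field widths below (4-bit segment, 8-bit page, 12-bit offset) are assumptions for illustration and do not describe any particular machine.

```python
def split_virtual_address(address, segment_bits, page_bits, offset_bits):
    """Split an address into (segment, page, offset) bit fields."""
    total_bits = segment_bits + page_bits + offset_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address does not fit in the given field widths")
    offset = address & ((1 << offset_bits) - 1)
    page = (address >> offset_bits) & ((1 << page_bits) - 1)
    segment = address >> (offset_bits + page_bits)
    return segment, page, offset


# Assumed layout: 4-bit segment | 8-bit page | 12-bit offset.
segment, page, offset = split_virtual_address(0xABCDEF, 4, 8, 12)
print(hex(segment), hex(page), hex(offset))  # 0xa 0xbc 0xdef
```

The segment field selects a page table, the page field selects an entry in it, and the offset is carried into the physical address unchanged.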

6.6 A Real-World Example

  • The Intel Core i9 architecture supports both paging and segmentation, illustrating how the two schemes can be combined.

  • Its multilevel caches (L1, L2, L3) differ in size and in proximity to the CPU, trading capacity against access speed.

Conclusion

  • Memory is intricately organized in hierarchies to optimize access speed and efficiency.

  • Cache improves memory access time while virtual memory allows systems to utilize hard disk space as an extension of RAM.

  • Cache mapping schemes include direct-mapped, fully associative, and set-associative.

    • The associative schemes need a replacement policy such as LRU, FIFO, or LFU to choose which block to evict; a direct-mapped cache does not, because each memory block has only one possible location.