Memory
Memory Organization Study Notes
Objectives
Master the concepts of hierarchical memory organization.
Understand how each level of memory contributes to system performance, and how the performance is measured.
Master the concepts behind cache memory, virtual memory, memory segmentation, paging, and address translation.
6.1 Introduction
Memory is a central component of the stored-program computer.
Prior chapters focused on the components of memory and how different Instruction Set Architectures (ISAs) access this memory.
This chapter will concentrate on memory organization and its impact on system performance.
6.2 Types of Memory
6.2.1 Main Memory Types
Two primary categories of main memory: Random Access Memory (RAM) and Read-Only Memory (ROM).
Within RAM, there are two main types:
Dynamic RAM (DRAM)
DRAM consists of capacitors that slowly leak their charge over time, necessitating refreshes every few milliseconds to prevent data loss.
It is considered “cheap” memory due to its simple design.
Static RAM (SRAM)
SRAM is built using circuits similar to D flip-flops.
It is much faster than DRAM and does not require refreshing, making it suitable for cache memory.
Read-Only Memory (ROM)
ROM retains data without needing refreshes, storing permanent or semi-permanent data that persists even when the system is powered off.
6.3 The Memory Hierarchy
6.3.1 General Structure
Faster memory tends to be more expensive.
Memory is organized as a hierarchy to balance performance and cost.
Hierarchy Structure:
Small, fast storage elements (registers, cache) are located in the CPU.
Larger, slower main memory is accessed via a data bus.
Disk and tape drives serve as large, long-term storage solutions, positioned furthest from the CPU.
6.3.2 Accessing Data
When accessing data, the CPU first checks the cache, followed by main memory, and finally disk if data isn't found.
Once the data is located, it is fetched into cache together with the surrounding data in its block, because the principle of locality suggests that nearby data will be needed soon.
6.3.3 Key Definitions
Hit: The data is found at a given memory level.
Miss: The data is not found at the current memory level.
Hit Rate: The percentage of times data is found at the memory level.
Miss Rate: The percentage of times data is not found at the memory level; defined mathematically as: Miss Rate = 1 − Hit Rate.
Hit Time: Time taken to access the data at a given memory level.
Miss Penalty: Time required to handle a miss, which includes accessing a new block plus delivering data to the processor.
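As a quick illustration of the hit rate and miss rate definitions, the sketch below derives both from raw access counts. The function name and the counts are made up for the example.

```python
def hit_and_miss_rate(hits, misses):
    """Return (hit_rate, miss_rate) as fractions of all accesses at one memory level."""
    total = hits + misses
    if total == 0:
        raise ValueError("no accesses recorded")
    hit_rate = hits / total
    return hit_rate, 1.0 - hit_rate  # miss rate is the complement of hit rate

hit_rate, miss_rate = hit_and_miss_rate(hits=950, misses=50)
print(f"hit rate = {hit_rate:.2%}, miss rate = {miss_rate:.2%}")
```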
6.3.4 Locality Principles
Locality refers to usage patterns observed in data access, which help optimize cache usage:
Temporal Locality: Recently accessed data is likely to be accessed again soon.
Spatial Locality: Access patterns tend to cluster; nearby data is often needed in succession.
Sequential Locality: Instructions are generally accessed in a sequential manner.
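The effect of spatial locality can be sketched with a toy cache model. The model below is an assumption made for illustration: it loads a whole block on every miss and never evicts, so it only shows how neighbouring addresses share a block.

```python
def count_hits(addresses, block_size):
    """Count hits for a toy cache that loads whole blocks and never evicts."""
    cached_blocks = set()
    hits = 0
    for addr in addresses:
        block = addr // block_size
        if block in cached_blocks:
            hits += 1
        else:
            cached_blocks.add(block)  # miss: bring in the entire block
    return hits

sequential = list(range(64))           # neighbouring addresses share blocks
strided = [i * 8 for i in range(64)]   # every access lands in a new block
print(count_hits(sequential, block_size=8))  # 56 of 64 accesses hit
print(count_hits(strided, block_size=8))     # 0 of 64 accesses hit
```

With the same number of accesses, the sequential pattern misses only once per block, while the strided pattern gains nothing from the extra data fetched with each block.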
6.4 Cache Memory
6.4.1 Purpose and Organization
The role of cache memory is to enhance access speeds by storing frequently accessed data closer to the CPU.
While smaller than main memory, its access time is significantly shorter.
Unlike main memory, which is accessed by address, cache is typically accessed by content; for this reason it is often called content-addressable memory.
6.4.2 Mapping Schemes
Direct-Mapped Cache:
Simplest cache mapping scheme, defined for N cache blocks.
Block X of main memory maps to cache block Y as: Y = X mod N.
Example: In a cache of 10 blocks, cache block 7 may hold memory blocks 7, 17, 27, etc.
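Direct mapping places main memory block X in cache block X mod N. A minimal sketch of that rule, using the 10-block cache from the example (the function name is illustrative):

```python
def cache_block_for(memory_block, num_cache_blocks):
    """Direct mapping: memory block X goes to cache block X mod N."""
    return memory_block % num_cache_blocks

for memory_block in (7, 17, 27):
    print(memory_block, "->", cache_block_for(memory_block, 10))
```

All three memory blocks compete for cache block 7, so only one of them can be resident at a time.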
6.4.3 Cache Structure
Cache mapping involves dividing the binary main memory address into fields to determine data placement:
Offset Field: Identifies a specific address within a block.
Block Field: Selects a cache block.
Tag Field: The remaining high-order bits, stored alongside the cache block to identify which main memory block currently occupies it.
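The field split can be sketched with shifts and masks. The field widths below (4 offset bits, 3 block bits) and the sample address are assumptions chosen only for illustration.

```python
def split_address(address, offset_bits, block_bits):
    """Split a main memory address into (tag, block, offset) fields."""
    offset = address & ((1 << offset_bits) - 1)                 # low-order bits
    block = (address >> offset_bits) & ((1 << block_bits) - 1)  # middle bits
    tag = address >> (offset_bits + block_bits)                 # remaining high-order bits
    return tag, block, offset

print(split_address(0x1ABC, offset_bits=4, block_bits=3))  # (53, 3, 12)
```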
6.4.4 Example of Cache Mapping
Example 6.1:
For a byte-addressable main memory of 4 blocks and a cache of 2 blocks, with 4 bytes per block, direct mapping gives:
Blocks 0 and 2 of main memory map to cache block 0.
Blocks 1 and 3 of main memory map to cache block 1.
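A small simulation using the geometry of Example 6.1 shows how blocks that share a cache block displace each other. The access sequence is made up for illustration, and the simulator tracks only tags, not data.

```python
BLOCK_SIZE = 4     # bytes per block (from Example 6.1)
CACHE_BLOCKS = 2   # number of cache blocks (from Example 6.1)

def simulate(addresses):
    """Return 'hit' or 'miss' for each byte address in a direct-mapped cache."""
    tags = [None] * CACHE_BLOCKS  # tag held by each cache block, None = empty
    outcomes = []
    for addr in addresses:
        memory_block = addr // BLOCK_SIZE
        index = memory_block % CACHE_BLOCKS   # which cache block to check
        tag = memory_block // CACHE_BLOCKS    # which memory block is wanted
        if tags[index] == tag:
            outcomes.append("hit")
        else:
            tags[index] = tag                 # replace whatever was there
            outcomes.append("miss")
    return outcomes

print(simulate([0, 1, 8, 0, 4, 5]))
```

Address 8 lies in memory block 2, which shares cache block 0 with memory block 0, so the second access to address 0 misses even though that block was loaded earlier.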
6.4.5 Replacement Policies
Policies are crucial for managing which cache blocks to evict when new data is stored:
Least Recently Used (LRU): Evicts the block which hasn't been used for the longest time; requires history management that can be complex.
First-In, First-Out (FIFO): Evicts oldest blocks based solely on age in cache.
Random Replacement: Randomly evicts a block, which avoids thrashing but may displace needed data.
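The difference between LRU and FIFO can be sketched by counting misses on a short reference string. The fully associative 3-block cache and the reference string are assumptions for illustration only.

```python
from collections import OrderedDict

def count_misses(references, capacity, policy):
    """Count misses in a small fully associative cache; policy is 'LRU' or 'FIFO'."""
    if policy not in ("LRU", "FIFO"):
        raise ValueError(f"unknown policy: {policy}")
    cache = OrderedDict()  # resident blocks, next eviction victim first
    misses = 0
    for block in references:
        if block in cache:
            if policy == "LRU":
                cache.move_to_end(block)  # only LRU updates order on a hit
            continue
        misses += 1
        if len(cache) >= capacity:
            cache.popitem(last=False)     # evict the block at the front
        cache[block] = True
    return misses

refs = [1, 2, 3, 1, 4, 1, 2]
print(count_misses(refs, 3, "LRU"))   # 5
print(count_misses(refs, 3, "FIFO"))  # 6
```

FIFO evicts block 1 despite its recent reuse, which costs one extra miss here; on other reference strings the two policies may tie or FIFO may do better.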
6.4.6 Performance Measurement
Cache memory performance is assessed through Effective Access Time (EAT): EAT = H × AccessC + (1 − H) × AccessMM, where:
H = Hit rate
AccessC = Cache access time
AccessMM = Main memory access time
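A worked calculation of EAT as the hit-rate-weighted average of the two access times, EAT = H × AccessC + (1 − H) × AccessMM. This form assumes the cache and main memory are probed in parallel; the 99% hit rate and the 10 ns and 200 ns access times are illustrative values, not measurements.

```python
def effective_access_time(hit_rate, cache_time_ns, memory_time_ns):
    """EAT = H * AccessC + (1 - H) * AccessMM, assuming overlapped accesses."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_time_ns + (1.0 - hit_rate) * memory_time_ns

eat = effective_access_time(hit_rate=0.99, cache_time_ns=10, memory_time_ns=200)
print(f"EAT = {eat:.1f} ns")  # 0.99 * 10 + 0.01 * 200 = 11.9 ns
```

If the main memory access only starts after the cache miss is detected, the miss term would instead be (1 − H) × (AccessC + AccessMM).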
6.4.7 Write Policies
Write Through: Updates cache and main memory simultaneously during each write operation; can slow performance, but is straightforward.
Write Back (Copyback): Updates main memory only when data is evicted from the cache; reduces memory traffic at the risk of consistency issues.
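The memory-traffic difference between the two policies can be sketched by counting main memory writes. The model is a deliberate simplification and an assumption of this example: the cache is large enough that modified blocks are written back only once, when they are finally evicted.

```python
def count_memory_writes(written_blocks, policy):
    """Count main memory writes for a sequence of writes to cache blocks."""
    if policy == "write-through":
        return len(written_blocks)       # every write also goes to memory
    if policy == "write-back":
        return len(set(written_blocks))  # each dirty block is written once, on eviction
    raise ValueError(f"unknown policy: {policy}")

writes = [0, 0, 0, 1, 1]  # block numbers being written, in order
print(count_memory_writes(writes, "write-through"))  # 5
print(count_memory_writes(writes, "write-back"))     # 2
```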
6.4.8 Cache Architectures
Unified Cache: Stores both instructions and data.
Harvard Cache: Uses separate caches for data and instructions to improve performance.
Victim Cache: A small associative cache that holds recently evicted blocks, boosting efficiency.
Trace Cache: Holds decoded instructions for quicker access during branching.
6.4.9 Multilevel Cache Hierarchies
Systems often implement multilevel cache hierarchies with:
Level 1 Cache (L1): 8KB to 64KB, access time ~4ns, located on processor.
Level 2 Cache (L2): 64KB to 2MB, access time ~15-20ns, may be on the motherboard.
Level 3 Cache (L3): 2MB to 256MB, situated between CPU and main memory.
6.4.10 Cache Inclusion Policies
Inclusive Cache: The same data can exist at multiple cache levels at once.
Exclusive Cache: A given piece of data resides in at most one cache level at a time.
The decision here impacts access time, memory size, and circuit complexity.
6.5 Virtual Memory
6.5.1 Concept
Virtual memory acts as an extension of main memory, providing more capacity without the need for increased physical RAM.
It uses disk space to swap memory pages as needed.
6.5.2 Virtual vs Physical Addresses
Physical Address: Actual memory address in RAM.
Virtual Address: Program-generated address whose mapping to a physical address is managed by the operating system.
6.5.3 Page Management
Main memory is divided into fixed-size page frames, and virtual memory into pages of the same size.
A page can reside in main memory or on disk.
Fragmentation can occur in multiple forms:
Memory Fragmentation: Small unusable memory blocks left over by the paging process.
6.5.4 Page Table Functionality
A page table records the location of each page belonging to an active process.
It is used to translate virtual addresses into physical addresses.
6.5.5 Page Faults
A page fault occurs when a referenced page isn't in main memory; a page must be transferred from disk to RAM.
6.5.6 Address Translation Example
Example: With a virtual address space of 8K and a physical address space of 4K, virtual addresses are 13 bits wide and physical addresses are 12 bits wide; the page table maps each virtual page to a physical page frame or marks it as residing on disk.
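A minimal sketch of page-table translation for the 8K virtual and 4K physical address spaces of the example above. The 1K page size and the page table contents are assumptions made for illustration; they give 8 virtual pages and 4 page frames.

```python
PAGE_SIZE = 1024  # assumed page size; the example only fixes the 8K and 4K spaces

class PageFault(Exception):
    """Raised when the referenced page is not in main memory."""

# Hypothetical page table: virtual page number -> frame number, None = on disk
page_table = {0: None, 1: 3, 2: 0, 3: None, 4: None, 5: 1, 6: None, 7: 2}

def translate(virtual_address):
    """Translate a virtual address to a physical address via the page table."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise ValueError("address outside the virtual address space")
    frame = page_table[page]
    if frame is None:
        raise PageFault(f"page {page} must be brought in from disk")
    return frame * PAGE_SIZE + offset  # the offset is unchanged by translation

print(translate(5459))  # page 5, offset 339 -> frame 1 -> 1363
```

Only the page number is translated; the offset within the page carries over unchanged.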
6.5.7 TLB Usage
The Translation Lookaside Buffer (TLB) caches recently used page table entries so that most address translations avoid a page table lookup.
A TLB miss requires a lookup in the page table, which in turn may result in a page fault if the page is not in main memory.
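The lookup order (TLB first, then page table, then page fault) can be sketched as follows. The class name, the 2-entry TLB, the LRU eviction of TLB entries, the page size, and the page table contents are all assumptions of this example.

```python
from collections import OrderedDict

PAGE_SIZE = 1024  # hypothetical page size

class Translator:
    """Toy address translator with a small LRU-managed TLB in front of a page table."""

    def __init__(self, page_table, tlb_entries):
        self.page_table = page_table  # virtual page -> frame, None = not resident
        self.tlb = OrderedDict()      # cached subset of the page table
        self.tlb_entries = tlb_entries
        self.tlb_hits = 0
        self.tlb_misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.tlb:
            self.tlb_hits += 1
            self.tlb.move_to_end(page)        # mark as most recently used
            frame = self.tlb[page]
        else:
            self.tlb_misses += 1
            frame = self.page_table.get(page)  # slower page table lookup
            if frame is None:
                raise LookupError(f"page fault on page {page}")
            if len(self.tlb) >= self.tlb_entries:
                self.tlb.popitem(last=False)   # evict least recently used entry
            self.tlb[page] = frame
        return frame * PAGE_SIZE + offset

mmu = Translator({0: 2, 1: 0, 2: 3, 3: None}, tlb_entries=2)
for address in (0, 4, 1030, 8, 2050, 1):
    mmu.translate(address)
print(mmu.tlb_hits, mmu.tlb_misses)  # 3 3
```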
6.5.8 Adding Segmentation
Segmentation divides memory into variable length segments, enabling easier management of varying data sizes.
A segment table records the location and size of each segment.
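Segment-based translation can be sketched as a base-plus-offset computation with a bounds check. The table contents and names are hypothetical.

```python
# Hypothetical segment table: segment number -> (base address, length)
segment_table = {0: (1000, 400), 1: (5000, 120)}

def translate(segment, offset):
    """Translate (segment, offset) to a physical address, checking segment bounds."""
    base, length = segment_table[segment]
    if not 0 <= offset < length:
        raise IndexError("offset outside segment bounds")
    return base + offset

print(translate(0, 53))  # 1053
```

Because segments vary in length, the size stored in the table is what allows an out-of-range offset to be rejected.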
6.5.9 Fragmentation in Paging vs Segmentation
Internal Fragmentation: Occurs in paging when a process does not fill its last page, leaving part of that page allocated but unused.
External Fragmentation: Occurs in segmentation when free memory is scattered among allocated segments.
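Both kinds of fragmentation can be illustrated with simple arithmetic. The process size, page size, and free-hole sizes below are made-up values.

```python
def internal_fragmentation(process_size, page_size):
    """Bytes left unused in the last page when a process is split into pages."""
    pages_needed = -(-process_size // page_size)  # ceiling division
    return pages_needed * page_size - process_size

def fits_contiguously(free_holes, request):
    """A segment needs a single free hole at least as large as itself."""
    return any(hole >= request for hole in free_holes)

# Paging: a 10,300-byte process in 4,096-byte pages occupies 3 pages.
print(internal_fragmentation(10_300, 4096))  # 1988 bytes unused

# Segmentation: 1,000 bytes are free in total, but no hole can hold 600 bytes.
holes = [300, 200, 500]
print(sum(holes), fits_contiguously(holes, 600))  # 1000 False
```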
6.5.10 Address Structure in Segmentation
When combining paging and segmentation, memory addresses incorporate fields for segment, page, and offset identifiers, allowing flexible data management.
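A sketch of how a combined address might be decoded and translated. The field widths (2 segment bits, 3 page bits, 8 offset bits) and the per-segment page tables are hypothetical; real architectures differ.

```python
SEGMENT_BITS, PAGE_BITS, OFFSET_BITS = 2, 3, 8  # hypothetical field widths

def split_virtual_address(address):
    """Split an address into (segment, page, offset) fields."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    page = (address >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
    segment = (address >> (OFFSET_BITS + PAGE_BITS)) & ((1 << SEGMENT_BITS) - 1)
    return segment, page, offset

# Hypothetical per-segment page tables: segment -> {page -> frame}
page_tables = {2: {3: 5}}

def translate(address):
    """Pick the segment's page table, map the page to a frame, keep the offset."""
    segment, page, offset = split_virtual_address(address)
    frame = page_tables[segment][page]  # a KeyError here stands in for a fault
    return (frame << OFFSET_BITS) | offset

address = 0b10_011_00001111  # segment 2, page 3, offset 15
print(split_virtual_address(address))  # (2, 3, 15)
print(translate(address))              # 1295
```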
6.6 A Real-World Example
The Intel Core i9 architecture uses both paging and segmentation, illustrating how the two techniques can be combined.
Its multilevel caches (L1, L2, L3) trade off size against proximity to the CPU to optimize access speed.
Conclusion
Memory is intricately organized in hierarchies to optimize access speed and efficiency.
Cache improves memory access time while virtual memory allows systems to utilize hard disk space as an extension of RAM.
Cache mapping schemes include direct-mapped, fully associative, and set-associative.
The associative schemes require a replacement policy such as LRU, FIFO, or LFU to choose which block to evict, while write policies (write through, write back) govern consistency between cache and main memory.