CMPSC 311 – Caching & Memory Hierarchy
Definition & Purpose of Caches
Cache (general definition)
Smaller, faster storage device that temporarily holds a subset of data from a larger, slower device
Acts as a staging area between two hierarchy levels
Memory-hierarchy principle
Level k (fast/small) serves as a cache for level (k+1) (slow/large)
Because of locality, programs access level k data far more often than level (k+1) data
Net effect: system feels almost as fast as the top level while costing nearly as little as the bottom level
Processor Cache Levels
Modern CPUs implement multiple on-chip cache levels
L1: extremely fast, very small, located adjacent to the CPU core
L2: slower than L1, but bigger
L3: slower/bigger still; may be shared among cores or even off-chip
Main memory (DRAM): slowest, cheapest per bit
Example (AMD Ryzen 5 5600X)
384 KB L1, 3 MB L2, 32 MB L3
L1 ≈ 100× faster than DRAM; L2 ≈ 25× faster
Instruction vs Data caches: typically split at L1 for parallel fetch & execution paths
Visual Example of CPU Cache Sizes
Intel Core i5-3570K image illustrates distinct L1/L2/L3 regions and relative proportions (link given in slides)
Full Memory Hierarchy Recap
Ordered from smallest ⇢ largest (and fastest ⇢ slowest)
L0: CPU registers
L1: L1 cache (SRAM)
L2: L2 cache (SRAM)
L3: Main memory (DRAM)
L4: Local secondary storage (SSDs/HDDs)
L5: Remote secondary storage (network, tape, web servers)
Each level caches blocks/records from the next
Locality Principles
Spatial Locality
Future accesses likely near recently accessed address
Strategy: fetch/store data in blocks larger than the requested byte/word
Temporal Locality
Recently accessed data likely to be accessed again soon
Strategy: retain recently used blocks longer in cache
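To make the two kinds of locality concrete, here is a minimal C sketch (not from the slides; the array size and names are arbitrary): the sequential traversal exhibits spatial locality, while the repeatedly reused sum exhibits temporal locality.

```c
#include <stdio.h>

#define N 1024

int main(void) {
    static int data[N];   /* contiguous array: neighboring elements share cache blocks */
    long sum = 0;         /* reused on every iteration: temporal locality */

    for (int i = 0; i < N; i++)
        data[i] = i;

    /* Sequential traversal exhibits spatial locality: once the block holding
       data[i] is fetched, the next few elements are already in the cache. */
    for (int i = 0; i < N; i++)
        sum += data[i];

    printf("sum = %ld\n", sum);
    return 0;
}
```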
General Cache Structure & Terminology
Cache line / block: minimum transfer unit between levels (fixed size)
Underlying memory imagined as an array of numbered blocks
Hit: requested block already in cache – served immediately
Miss: requested block absent – fetched from lower level (miss penalty), then placed in cache
Basic miss-handling actions (illustrated):
Detect miss
Read block from memory
Choose placement line (placement policy)
Possibly evict old block (replacement policy)
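A minimal C sketch of the four miss-handling actions above, assuming a tiny fully associative cache with round-robin (FIFO-style) replacement; the line count, block size, and fetch_from_memory stub are illustrative, not from the slides.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define LINES 4
#define BLOCK_SIZE 64

struct line {
    bool valid;
    int block_no;                          /* which memory block this line holds */
    unsigned char data[BLOCK_SIZE];
};

static struct line cache[LINES];
static int next_victim;                    /* round-robin (FIFO-style) replacement */

/* Stand-in for the slower level: just stamps the block with its number. */
static void fetch_from_memory(int block_no, unsigned char *dst) {
    memset(dst, block_no & 0xff, BLOCK_SIZE);
}

static unsigned char *lookup(int block_no) {
    /* 1. Detect hit or miss. */
    for (int i = 0; i < LINES; i++)
        if (cache[i].valid && cache[i].block_no == block_no)
            return cache[i].data;          /* hit: served immediately */

    /* 3./4. Choose a placement line and evict whatever currently occupies it. */
    int victim = next_victim;
    next_victim = (next_victim + 1) % LINES;

    /* 2. Read the block from the slower level (this is the miss penalty). */
    fetch_from_memory(block_no, cache[victim].data);
    cache[victim].valid = true;
    cache[victim].block_no = block_no;
    return cache[victim].data;
}

int main(void) {
    int trace[] = {1, 4, 3, 1, 5};
    for (int i = 0; i < 5; i++)
        printf("block %d -> first byte %d\n", trace[i], lookup(trace[i])[0]);
    return 0;
}
```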
Placement (Mapping) Policies
Fully associative
New block can go anywhere in the cache
Pros: eliminates conflict misses; Cons: expensive hardware (many comparators)
Direct-mapped
Exactly one possible cache line for each memory block: \text{line} = (\text{block #} \bmod t), where t = number of lines (index computation sketched after this list)
Simple/cheap but high conflict-miss risk
n-way set associative
Cache divided into sets; each set holds n lines
Block maps to one set, but may occupy any of the n ways within that set
Balances cost and flexibility
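A small sketch of the index computations implied by the placement policies above, under an assumed geometry (8 lines, 2-way set associative) and an arbitrary block number.

```c
#include <stdio.h>

/* Assumed geometry (illustrative): 8 lines total, organized as 2-way set associative. */
#define NUM_LINES 8
#define WAYS      2
#define NUM_SETS  (NUM_LINES / WAYS)

int main(void) {
    int block = 13;                         /* arbitrary memory block number */

    int dm_line = block % NUM_LINES;        /* direct-mapped: exactly one possible line */
    int set     = block % NUM_SETS;         /* set associative: one set, any of WAYS ways */

    printf("block %d -> direct-mapped line %d\n", block, dm_line);
    printf("block %d -> set %d (any of %d ways)\n", block, set, WAYS);
    return 0;                               /* fully associative: any line, no index needed */
}
```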
Types of Cache Misses
Cold / compulsory
First reference to a block – cache initially empty
Capacity
Working set size > cache size; block evicted for lack of space
Conflict (only for direct-mapped & set-associative caches)
Two or more frequently used blocks map to same line or set → continuous thrashing
Example sequence 0, 8, 0, 8, … with mapping (i \bmod 4) misses every time
Slide illustration shows Level K/Level K+1 thrashing between blocks A & B
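The slide's 0, 8, 0, 8, … conflict example can be traced with a few lines of C: with a 4-line direct-mapped cache both blocks map to line 0 (i mod 4 = 0), so every access misses. The trace length below is arbitrary.

```c
#include <stdio.h>

#define LINES 4

int main(void) {
    int tags[LINES] = {-1, -1, -1, -1};     /* -1 marks an empty (invalid) line */
    int trace[] = {0, 8, 0, 8, 0, 8};
    int misses = 0;

    for (int i = 0; i < 6; i++) {
        int blk  = trace[i];
        int line = blk % LINES;             /* blocks 0 and 8 both map to line 0 */
        if (tags[line] == blk) {
            printf("access %d -> hit  (line %d)\n", blk, line);
        } else {
            printf("access %d -> miss (line %d)\n", blk, line);
            tags[line] = blk;               /* the two blocks keep evicting each other */
            misses++;
        }
    }
    printf("%d misses out of 6 accesses\n", misses);
    return 0;
}
```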
Replacement (Eviction) Policies
Least Recently Used (LRU)
Evict block unused for longest time
Least Frequently Used (LFU)
Evict block with lowest access count
First In First Out (FIFO)
Round-robin order of arrival
Performance measured via hit ratio: \text{hit ratio} = \frac{\text{hits}}{\text{total accesses}}
Measured cost metrics (avg. latency, bandwidth)
Efficiency highly workload-dependent ("working set")
Cache Performance Metrics
Hit ratio: fraction of memory references served from cache
Average memory access time (AMAT)
\text{AMAT} = \text{hit time} + \text{miss ratio} \times \text{miss penalty}
where \text{miss ratio} = 1 - \text{hit ratio}
Numerical example (slide)
\text{hit time} = 25\,\mu s
\text{miss penalty} = 250\,\mu s
\text{hit ratio} = 0.8 \Rightarrow \text{miss ratio} = 0.2
\text{AMAT} = 25 + 0.2 \times 250 = 75\,\mu s
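The slide's numbers plug directly into the AMAT formula; the small sketch below simply evaluates it and prints 75 µs.

```c
#include <stdio.h>

/* AMAT = hit time + miss ratio * miss penalty, with miss ratio = 1 - hit ratio. */
static double amat(double hit_time, double hit_ratio, double miss_penalty) {
    return hit_time + (1.0 - hit_ratio) * miss_penalty;
}

int main(void) {
    /* Slide numbers: 25 us hit time, 0.8 hit ratio, 250 us miss penalty. */
    printf("AMAT = %.1f us\n", amat(25.0, 0.8, 250.0));   /* prints 75.0 */
    return 0;
}
```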
Worked Example: 4-Line LRU Cache
Memory access sequence (times 0-9): 1, 4, 3, 1, 5, 1, 4, 0, 3, 1
Using 4-line cache + LRU replacement
Result: 6 misses, 4 hits → \text{miss ratio} = 0.6
Assume \text{hit time} = 100\,\mu s, \text{penalty} = 1000\,\mu s
\text{AMAT} = 100 + 0.6 \times 1000 = 700\,\mu s
Performance is poor due to high miss ratio caused by limited lines relative to working set
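One simple way to reproduce the slide's result is a timestamp-based LRU simulation (a sketch, not the course's reference implementation): it reports 6 misses, 4 hits, miss ratio 0.6, and AMAT = 700 µs with the assumed 100 µs hit time and 1000 µs penalty.

```c
#include <stdio.h>

#define LINES 4
#define ACCESSES 10

int main(void) {
    int tag[LINES] = {0}, last_used[LINES] = {0}, valid[LINES] = {0};
    int trace[ACCESSES] = {1, 4, 3, 1, 5, 1, 4, 0, 3, 1};   /* access times 0-9 */
    int misses = 0;

    for (int t = 0; t < ACCESSES; t++) {
        int blk = trace[t], line = -1;

        /* Hit check: is the block already in one of the 4 lines? */
        for (int i = 0; i < LINES; i++)
            if (valid[i] && tag[i] == blk) { line = i; break; }

        if (line < 0) {                    /* miss: use an empty line, else evict the LRU line */
            misses++;
            for (int i = 0; i < LINES; i++)
                if (!valid[i]) { line = i; break; }
            if (line < 0) {
                line = 0;
                for (int i = 1; i < LINES; i++)
                    if (last_used[i] < last_used[line]) line = i;
            }
            valid[line] = 1;
            tag[line] = blk;
        }
        last_used[line] = t;               /* refresh recency on every access */
    }

    double miss_ratio = (double)misses / ACCESSES;
    printf("misses = %d, hits = %d, miss ratio = %.1f\n",
           misses, ACCESSES - misses, miss_ratio);
    printf("AMAT = %.0f us\n", 100.0 + miss_ratio * 1000.0);   /* 100 us hit, 1000 us penalty */
    return 0;
}
```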
Application-Level Caching Strategies
Read-aside (Cache-aside)
Application checks cache → on hit: return value
On miss: read from database/backing store
Manually write result into cache for future use
Gives application full control; common in many libraries
Read-through
Cache itself performs lookup to backing store on miss, stores result, returns to caller
Application sees unified interface; simplifies code
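A minimal sketch of the read-aside (cache-aside) pattern described above; the db_read stub stands in for a real database or backing store, and the key-to-slot mapping is an arbitrary simplification.

```c
#include <stdbool.h>
#include <stdio.h>

#define SLOTS 4

/* Tiny key -> value cache managed by the application itself (read-aside pattern). */
struct entry { bool used; int key; int value; };
static struct entry cache[SLOTS];

/* Stand-in for the slow backing store; a real system would query a database here. */
static int db_read(int key) { return key * 100; }

static int cached_read(int key) {
    int slot = key % SLOTS;

    /* 1. The application checks the cache first. */
    if (cache[slot].used && cache[slot].key == key)
        return cache[slot].value;              /* hit: return the cached value */

    /* 2. On a miss, read from the database / backing store... */
    int value = db_read(key);

    /* 3. ...and manually write the result into the cache for future use. */
    cache[slot] = (struct entry){ .used = true, .key = key, .value = value };
    return value;
}

int main(void) {
    printf("%d\n", cached_read(7));   /* miss: goes to db_read */
    printf("%d\n", cached_read(7));   /* hit: served from the cache */
    return 0;
}
```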
Write Policies for Caches & Databases
Write-through
Update cache and backing store simultaneously
Simpler, safer; good when writes are relatively infrequent
Write-back (copy-back)
Update only the cache on write; mark the line dirty
Dirty line written to memory later (on eviction or flush)
Reduces write traffic but complicates coherence & failure recovery
Write-around
Bypass cache; write goes directly to memory/store
Used when written data unlikely to be read again soon (avoids polluting cache)
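A toy sketch contrasting write-through and write-back, assuming a small key/value cache and an in-memory array as the backing store (all names here are illustrative): write-through updates the store immediately, while write-back defers it until the dirty line is evicted.

```c
#include <stdbool.h>
#include <stdio.h>

#define SLOTS 4

struct line { bool valid; bool dirty; int key; int value; };
static struct line cache[SLOTS];
static int backing_store[64];                 /* stand-in for the slower level */

/* Write-through: update the cache and the backing store at the same time. */
static void write_through(int key, int value) {
    int s = key % SLOTS;
    cache[s] = (struct line){ .valid = true, .dirty = false, .key = key, .value = value };
    backing_store[key] = value;               /* store is updated immediately */
}

/* Write-back: update only the cache and mark the line dirty; the store is
   updated later, when the dirty line is evicted (or explicitly flushed). */
static void write_back(int key, int value) {
    int s = key % SLOTS;
    if (cache[s].valid && cache[s].dirty && cache[s].key != key)
        backing_store[cache[s].key] = cache[s].value;   /* write the old dirty block out */
    cache[s] = (struct line){ .valid = true, .dirty = true, .key = key, .value = value };
}

int main(void) {
    write_through(1, 10);
    write_back(2, 20);
    printf("store[1] = %d (already written), store[2] = %d (still stale)\n",
           backing_store[1], backing_store[2]);
    return 0;
}
```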
Ethical, Practical & Design Implications
Hardware cost vs. performance trade-off: more associative caches raise silicon area & power
Predictable real-time systems may sacrifice hit ratio for determinism (direct-mapped + known latency)
Database/web engineers must choose read/write strategies balancing freshness, throughput, and consistency
Multi-core systems add complexity (cache coherence), making write-back policies more intricate