Volatile Memory
Requires power to retain data; faster memory access, e.g., RAM
Non-volatile Memory(NVM)
Retains information without power; used for long-term storage, e.g., ROM, Flash memory, SSD
2 types of RAM
SRAM and DRAM
SRAM
Higher cost than DRAM, faster, and overall better performance; 4-6 transistors per bit
DRAM
Lower cost than SRAM, overall worse performance; 1 transistor per bit
What form of memory do caches use
SRAM (faster, but more costly)
SDRAM
Synchronous DRAM uses a conventional clock signal and allows reuse of the same row addresses
DDR SDRAM
Double Data-Rate Synchronous DRAM uses double-edge clocking, which sends two bits per cycle per pin; it is the standard for most modern computer systems
Solution to CPU Memory Performance Gap
Memory Hierarchy
Locality: software or hardware solution?
Software
Locality
Helps with the CPU-memory performance gap: programs tend to use data and instructions with addresses near or equal to those they have used recently
Memory Performance Gap
a problem where the CPU waits for memory to return data/instructions
Temporal Locality
Recently referenced items are likely to be referenced again in the near future
Spatial Locality
Items with nearby addresses tend to be referenced close together in time
What unit of data do registers hold
words
What unit of data do caches hold
cache lines
What unit of data do off-chip memories hold
pages
Registers and Caches are
on chip
Main Memory, Local Storage, and Remote Storage (cloud) are
off chip
Benefit of a 3rd-level (L3) cache
reduces the miss penalty
words are how many bytes
4 or 8
cache lines are how many bytes
64
pages are how many bytes
4 KB (4096 bytes)
Cache Memory
small, fast memory built from SRAM
Cache memories and main memory are partitioned into
equal-size blocks called cache lines (or cache blocks)
3 Types of Cache Misses
Cold(compulsory), Conflict, Capacity
Cold Cache Miss
Occurs because the cache is empty at the beginning of program execution
Conflict Cache Miss
Two or more memory locations map to the same cache set; as a result, an access may find the set already occupied by data from one of the conflicting addresses
Capacity Cache Miss
Occurs when the set of active cache blocks is larger than the cache (i.e., the program needs more cache blocks than can fit in the cache)
Direct Mapped Cache
1 cache line per set
E-way set Associative Cache
E cache lines per set
How to determine which cache line to access in an associative/E-way cache
compare tag bits
how to determine what data in a cache line needs to be accessed
the block offset and the data type (e.g., a short at offset 0 occupies bytes 0 and 1 of the line)
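A minimal sketch (not from the cards) of how an address is split for a set lookup, assuming the 64-byte lines above and a hypothetical 64-set cache: the set index selects the set, the tag is compared against each line in it, and the offset picks the bytes.

#include <stdio.h>
#include <stdint.h>

#define OFFSET_BITS 6   /* 2^6 = 64-byte cache lines, matching the card above */
#define SET_BITS    6   /* 64 sets (an assumed value for illustration) */

int main(void) {
    uint64_t addr   = 0x7ffd1234;                                       /* example address (assumed) */
    uint64_t offset = addr & ((1ULL << OFFSET_BITS) - 1);               /* byte within the line */
    uint64_t set    = (addr >> OFFSET_BITS) & ((1ULL << SET_BITS) - 1); /* which set to look in */
    uint64_t tag    = addr >> (OFFSET_BITS + SET_BITS);                 /* compared against each line's tag */
    printf("tag=%#llx set=%llu offset=%llu\n",
           (unsigned long long)tag, (unsigned long long)set, (unsigned long long)offset);
    return 0;
}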
Fully Associative Cache
All cache lines are in a single set, so there is no set index
Which cache is separated into d-cache and i-cache
L1
d-cache
data cache (half of L1 cache)
i-cache
instruction cache(half of L1 cache)
Is Main memory on or off chip
off chip
What cache miss does a fully associative cache not have
Conflict Miss
The L3 Cache Is
a shared last level cache
L1 access time
4 cycles
L2 access time
11 cycles
L3 access time
30-40 cycles
L1 and L2 caches are both
8-way caches
L3 is what type of E-way cache
16-way
Which block is replaced when there are multiple victim candidates
Least Recently Used Block
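A minimal sketch of LRU victim selection, assuming each line in a set carries a last-used stamp (the struct and field names are illustrative, not from the cards):

#include <stdio.h>

typedef struct { int valid; unsigned tag; unsigned long last_used; } line_t;

/* Return the index of the least recently used line in a set of 'ways' lines. */
int lru_victim(const line_t set[], int ways) {
    int victim = 0;
    for (int i = 1; i < ways; i++)
        if (set[i].last_used < set[victim].last_used)   /* older stamp means used longer ago */
            victim = i;
    return victim;
}

int main(void) {
    line_t set[4] = { {1, 0xA, 40}, {1, 0xB, 12}, {1, 0xC, 85}, {1, 0xD, 7} };
    printf("victim line: %d\n", lru_victim(set, 4));    /* prints 3: stamp 7 is the oldest */
    return 0;
}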
advantages of splitting L1 cache
helps with locality and allows data and instructions to be accessed at the same time
miss rate equation
1 - (hit rate)
Hit Time
time it takes to deliver a line from cache to processor
Miss Penalty
the additional time required because of a miss
typical L1 hit time
1-2 clock cycles
typical L2 hit time
5-20 clock cycles
typical miss penalty
50-200 cycles for main memory
Average Memory Access Time Equation
AMAT=Hit time+(miss rate*miss penalty)
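A worked example using the typical numbers from these cards (1-cycle L1 hit time, and a 100-cycle miss penalty assumed from the 50-200 range): with a 3% miss rate, AMAT = 1 + (0.03 * 100) = 4 cycles; with a 1% miss rate, AMAT = 1 + (0.01 * 100) = 2 cycles.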
99% cache hits is twice as good as 97% T/F
True (see the worked AMAT example above: 2 cycles vs. 4 cycles)
3 ways to optimize cache
reduce miss rate, miss penalty, and hit time
advantages of increasing cache block size
reduces miss rate
disadvantages of increasing cache block size
increases the miss penalty; also increases conflict/capacity misses if the cache is small
advantages of larger cache
reduces capacity misses
disadvantages of larger cache
longer hit time, higher cost and power
advantage of higher associativity
reduces conflict misses
advantage of multilevel caches
reduces miss penalty
cache stride pattern
the distance between consecutive accesses, e.g., stride-1: A[0]→A[1]→A[2]→A[3]…
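A small illustration (assumed, not from the cards) of stride in C: the same loop gives a stride-1 pattern when stride is 1 and a stride-8 pattern when stride is 8.

#include <stddef.h>

/* Sum every stride-th element of an array of n floats. */
float sum_stride(const float *a, size_t n, size_t stride) {
    float s = 0.0f;
    for (size_t i = 0; i < n; i += stride)   /* stride-1 touches every element in order */
        s += a[i];
    return s;
}
/* sum_stride(a, n, 1) is stride-1 (best spatial locality);
   sum_stride(a, n, 8) is stride-8, touching only one float per 32 bytes. */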
key idea of writing cache friendly code
our qualitative notion of locality is quantified through our understanding of cache memories
Writing cache friendly code:
90/10 rule; focus on the inner loops of core functions (loop unrolling); minimize misses in inner loops; repeated references to data are good (temporal locality); stride-1 reference patterns are good (spatial locality). See the loop sketch below.
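A standard illustration of these rules (a sketch; the array size is assumed): C stores 2-D arrays row-major, so summing by rows is stride-1 while summing by columns is stride-N.

#define N 1024

/* Cache-friendly: the inner loop walks along a row, so accesses are stride-1. */
double sum_rows(double a[N][N]) {
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Cache-unfriendly: the inner loop walks down a column, so accesses are stride-N. */
double sum_cols(double a[N][N]) {
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}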
90/10 rule
90% of a program's execution time is spent in about 10% of its code
Benefits of Virtual Memory
Makes programming much easier, uses DRAM as a cache, simplifies memory management, isolates address spaces (easier memory protection)
Virtual Memory
an array of N contiguous bytes; this is the view of memory the program (and compiler) uses, while the contents are stored on disk and cached in DRAM
T/F Disk is about 10,000x slower than DRAM
True
The enormous page-fault penalty comes from data movement between where?
main memory and disk
Page Table
an array of page table entries(PTEs) that maps virtual pages to physical pages
Page Hit
physical main memory already holds the page the CPU requests
Page Fault
Physical main memory does NOT hold the page the CPU requests
T/F page fault causes an exception
True
In case of page fault what happens
a victim page is evicted, the requested page is paged in from disk, and the offending instruction is restarted
Why does virtual memory work
Locality
Working Set
the set of virtual pages that a program is actively accessing
if (working set size < main memory size)
good performance after compulsory misses
if(working set size > main memory size)
bad performance with capacity misses
worst case for virtual memory locality
Thrashing: performance meltdown where pages are swapped in and out continuously
Key idea of virtual memory
each process has its own virtual address space
T/F Mapping function scatters addresses through physical memory
True
Memory allocation
each virtual page can be mapped to any physical page
T/F a virtual page cannot be stored in different physical pages at different times
False
Mapping virtual pages to the same physical page allows for
multiple processes to access the same code
Steps of address translation for page hit
1: processor sends virtual address to MMU
2-3: MMU fetches PTE from page table in memory
4: MMU sends physical address to cache/memory
5: Cache/memory sends data word to processor
Steps of address translation for page fault
1: Processor sends virtual address to MMU
2-3: MMU fetches PTE from page table in memory
4: Valid bit is zero, so MMU triggers page fault exception
5: Handler identifies victim (and, if dirty, pages it out to disk)
6: Handler pages in new page and updates PTE in memory
7: Handler returns to original process, restarting faulting instruction
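A toy, self-contained simulation of the fault path in steps 4-7 (the array sizes, names, and round-robin replacement are all assumed for illustration, not a real OS interface):

#include <stdio.h>
#include <string.h>

#define NPAGES   8      /* virtual pages (toy numbers) */
#define NFRAMES  2      /* physical frames */
#define PAGESIZE 4

typedef struct { int valid; int frame; } pte_t;

char  disk[NPAGES][PAGESIZE];   /* simulated swap space */
char  mem[NFRAMES][PAGESIZE];   /* simulated physical memory */
pte_t page_table[NPAGES];
int   frame_owner[NFRAMES];     /* which virtual page occupies each frame */
int   next_victim = 0;          /* trivial round-robin replacement */

/* Translate a virtual page; on a fault, evict a victim and page in from disk. */
char *access_page(int vp) {
    if (!page_table[vp].valid) {                 /* valid bit 0: page fault */
        int f = next_victim;                     /* handler picks a victim frame */
        next_victim = (next_victim + 1) % NFRAMES;
        int old = frame_owner[f];
        if (old >= 0) {                          /* page out victim (a real handler
                                                    writes back only dirty pages) */
            memcpy(disk[old], mem[f], PAGESIZE);
            page_table[old].valid = 0;
        }
        memcpy(mem[f], disk[vp], PAGESIZE);      /* page in the requested page */
        page_table[vp].valid = 1;                /* update PTE */
        page_table[vp].frame = f;
        frame_owner[f] = vp;
    }
    return mem[page_table[vp].frame];            /* the access is then retried */
}

int main(void) {
    for (int i = 0; i < NFRAMES; i++) frame_owner[i] = -1;
    for (int i = 0; i < NPAGES; i++)  disk[i][0] = 'A' + i;
    printf("%c %c %c\n", access_page(0)[0], access_page(5)[0], access_page(0)[0]);
    return 0;
}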
TLB
Translation Lookaside Buffer: a small hardware cache in the MMU that contains complete page table entries (PTEs) for a small number of pages
purpose of TLB
speeds up translation by eliminating a memory access for the most used pages
consequence of TLB
on a TLB miss, the MMU has to access both the TLB and the page table in main memory
T/F TLB misses are very common
False
Programmer’s view of virtual memory
each process has its own private linear address space that cannot be corrupted by other processes
System view of virtual memory
uses memory efficiently by caching virtual memory pages (efficient because of locality), simplifies memory management and programming, simplifies protection by providing a convenient interpositioning point to check permissions
Pipeline Speedup Equation
Pipelined Execution Time = Non-Pipelined Execution Time / number of stages
maximum speedup of pipeline
number of stages
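Worked example (numbers assumed): if a non-pipelined execution takes 1000 ns and the datapath is split into 5 stages, the pipelined execution time is roughly 1000 / 5 = 200 ns, a speedup of 5, matching the maximum speedup of "number of stages".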
3 Hazard types for pipeline
Structural Hazard, Data Hazard, Control Hazard
Structural Hazard
a required resource does not exist or is busy
Data Hazard
an instruction needs to wait for a previous instruction to complete its data read/write
Control Hazard
deciding on the control-flow action (e.g., whether a branch is taken) depends on the result of a previous instruction
Forwarding(a.k.a. Bypassing)
Use a result as soon as it's computed to resolve a data hazard (requires extra hardware circuit connections)
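A small C-level illustration (assumed) of the data hazard that forwarding resolves: the second statement consumes the result the first statement has just produced.

#include <stdio.h>

int main(void) {
    int a = 3, b = 4, c = 5;
    int x = a + b;    /* first instruction: produces x in its execute stage */
    int y = x - c;    /* next instruction needs x immediately: a data hazard;
                         forwarding routes the ALU result of the previous instruction
                         straight to this one instead of stalling until write-back */
    printf("y = %d\n", y);
    return 0;
}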