Lecture_16-CSCI_U310_01-Jahangir_Majumder-Spring_2025

Introduction to Computer Architecture

Course Title:

CSCI U310 01 Introduction to Computer Architecture
Instructor: AKM Jahangir A Majumder, PhD
Date: Spring 2025 - Lecture 16, March 18, 2025
Note: Slides adapted from previous instructors and textbook images.

Course Review and Learning Outcomes

Today’s Focus: Memory Technology

  1. Discuss Memory Technology

  2. Direct-Mapped Cache Terminology

  3. Memory Access with and without Cache

  4. Cache Performance

Administrative Notes:

  • Quiz 4 grades and key are available on Blackboard.

  • Homework 5 is posted on Blackboard, due today by 5 PM.

  • Upcoming Quiz 5 on Thursday, March 20 (covering material from lectures 13-14).

Direct-Mapped Cache

Cache Location Allocation:

  • Cache location 0 holds data from memory block addresses that are multiples of 4 (e.g., 0, 4, 8). This establishes a pattern where each memory address maps to exactly one designated cache slot.

Example of Cache Addressing:

  • In a Direct-Mapped Cache with 4 blocks, memory locations are segmented into indices that represent specific slots within the cache where data can be stored. The addressing format allows for efficient retrieval of stored data.
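The mapping described above can be sketched in a few lines of Python; this is an illustrative helper (`cache_index` is a hypothetical name, not from the lecture), assuming a 4-block cache where the slot is simply the block address modulo the number of blocks.

```python
# Minimal sketch: mapping memory block addresses to slots in a
# 4-block direct-mapped cache (slot = block address mod 4).
NUM_BLOCKS = 4

def cache_index(block_address: int) -> int:
    """Return the cache slot a memory block maps to."""
    return block_address % NUM_BLOCKS

# Block addresses 0, 4, and 8 all compete for cache slot 0:
print([cache_index(b) for b in (0, 4, 8)])   # [0, 0, 0]
print([cache_index(b) for b in (1, 5, 9)])   # [1, 1, 1]
```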

Direct-Mapped Cache Terminology

Key Elements of Cache:

  • Index: Indicates the cache index to be checked to find corresponding data. Each index corresponds to a specific cache line.

  • Offset: Specifies which byte within the block is being accessed after the correct block has been located in cache.

  • Tag: Remaining bits used to identify all memory addresses that map to the same cache location, ensuring that data retrieval is accurate even when multiple addresses share the same index.

Format Representation:

  • Tag (tttt...) | Index (iiii...) | Offset (oooo)
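The tag/index/offset split above can be expressed directly with bit operations. The sketch below is illustrative (the function name `split_address` is my own); it assumes the field widths are given, as they depend on the cache geometry.

```python
# Splitting an address into tag, index, and offset fields,
# given the number of index and offset bits.
def split_address(addr: int, index_bits: int, offset_bits: int):
    offset = addr & ((1 << offset_bits) - 1)            # low-order bits
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)            # remaining high bits
    return tag, index, offset

# A cache with 4 blocks (2 index bits) and 4-byte blocks (2 offset bits):
print(split_address(0b110110, index_bits=2, offset_bits=2))  # (3, 1, 2)
```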

Tags and Valid Bits

Purpose of Tags:

  • Tags are critical as they store the block addresses alongside the cache data, allowing the cache to differentiate between different data that may share the same index. Tags consist of the high-order bits necessary for this identification.

Valid Bits:

  • Valid bits indicate whether a cache line holds valid data: 1 means valid data is present, and 0 means no valid data is stored in that line. All valid bits default to 0 when the cache is initialized.

Direct-Mapped Cache Example

Cache Specifications (32-bit Architecture)

  • 16KB Cache with 4 Word Blocks:

Offset Calculation:

  • Each block consists of 4 words = 16 bytes, which requires 4 bits for addressing (2^4 = 16), allowing for effective selection of bytes within each block of data.

Index Calculation:

  • Cache capacity = 16KB (2^14 bytes).

  • The total number of blocks is derived as follows:

    • Number of blocks = 16KB / 16 bytes per block (2^14 / 2^4) = 2^10 blocks (1024 blocks).

    • Therefore, 10 bits are required for row specification in the cache.

Tag Calculation:

  • The length of the tag is calculated as:

    • Tag length = Total address length - Offset - Index = 32 - 4 - 10 = 18 bits.

  • The leftmost 18 bits of the memory address serve as the tag, used to uniquely identify the data block stored in the cache.
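The field widths derived above can be checked with a short computation; the variable names below are my own, but the numbers (16 KB cache, 4-word blocks, 32-bit addresses) come from the example.

```python
# Field widths for a 16 KB direct-mapped cache with 4-word (16-byte)
# blocks on a 32-bit machine.
from math import log2

cache_bytes = 16 * 1024       # 2^14 bytes of cache
block_bytes = 4 * 4           # 4 words * 4 bytes/word = 16 bytes

offset_bits = int(log2(block_bytes))                # 4
index_bits = int(log2(cache_bytes // block_bytes))  # 10 (1024 blocks)
tag_bits = 32 - index_bits - offset_bits            # 18

print(offset_bits, index_bits, tag_bits)  # 4 10 18
```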

Cache Organization Mnemonic

Dan’s Cache Mnemonic:

  • AREA (cache size, B) = HEIGHT (# of blocks) * WIDTH (size of one block, B/block)

Relationship:

  • For a fixed AREA (cache size), HEIGHT and WIDTH trade off directly: doubling the block size (WIDTH) halves the number of blocks (HEIGHT). This trade-off shapes cache behavior, since wider blocks exploit spatial locality while more blocks reduce competition for slots.
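The mnemonic is ordinary arithmetic, as this small illustrative check shows (values are examples, not from the lecture):

```python
# AREA = HEIGHT * WIDTH, so for a fixed cache size the dimensions trade off.
area = 16 * 1024             # cache size in bytes
width = 16                   # block size in bytes/block
height = area // width       # number of blocks
print(height)                # 1024

# Doubling the block size halves the number of blocks:
print(area // (2 * width))   # 512
```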

Memory Access Without Cache

Process Overview:

  • When a Load Word instruction (e.g., lw $t0, 0($t1), where $t1 contains the address 1022 in decimal) is executed, the processor interacts directly with the main memory.

Memory Interaction Steps:
  1. The processor issues the address 1022 to the memory subsystem.

  2. Memory retrieves the word at that particular address.

  3. The retrieved data is sent back to the processor.

  4. Finally, the processor loads this value into register $t0 for further computations.

Memory Access With Cache

Cache Enhanced Loading Instruction:

Steps when accessing data via cache:
  1. The processor issues the address to the cache instead of memory.

  2. The cache checks for a match:

    • If Hit: Data is successfully retrieved from the cache and sent directly to the processor (e.g., sending the value 99 directly to $t0).

    • If Miss: The cache then communicates with memory to load the required data:

      1. Memory reads the necessary data.

      2. Data is sent back to the cache, updating the cache with new information.

      3. The cache subsequently sends this data to the processor.
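The hit/miss flow above can be sketched as a tiny simulation. This is a minimal illustration, not the lecture's cache: it assumes a 4-line cache with one-word blocks and a dictionary standing in for main memory, and the function name `load` is my own.

```python
# Minimal direct-mapped cache in front of a fake "main memory".
NUM_LINES = 4

memory = {addr: addr * 10 for addr in range(32)}   # fake main memory
cache = [{"valid": False, "tag": None, "data": None} for _ in range(NUM_LINES)]

def load(addr):
    index = addr % NUM_LINES
    tag = addr // NUM_LINES
    line = cache[index]
    if line["valid"] and line["tag"] == tag:
        return line["data"], "hit"          # hit: data comes from the cache
    # miss: read memory, update the cache line, then return the data
    line.update(valid=True, tag=tag, data=memory[addr])
    return line["data"], "miss"

print(load(5))   # (50, 'miss')  -- first access misses
print(load(5))   # (50, 'hit')   -- same address now hits
print(load(9))   # (90, 'miss')  -- same index (9 % 4 == 1), different tag
```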

Cache Performance Metrics

  • Hit Rate: Defined as the fraction of memory accesses that successfully locate the required data in the cache, which is a critical metric for measuring cache effectiveness.

  • Miss Rate: Calculated as 1 - Hit Rate, revealing how often data that is needed is not found in the cache.

  • Miss Penalty: The time required to fetch a block from the next lower level of the memory hierarchy into the cache and deliver it to the processor; this is the extra delay incurred whenever a miss occurs.

  • Hit Time: The time necessary to access the cache, which includes the overhead for tag comparison, and contributes significantly to system performance.
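These metrics combine into the standard average memory access time formula, AMAT = hit time + miss rate * miss penalty. The numbers below are illustrative, not from the lecture:

```python
# Average memory access time from the metrics above:
#   AMAT = hit_time + miss_rate * miss_penalty
hit_time = 1          # cycles to access the cache (including tag check)
miss_rate = 0.05      # 1 - hit rate
miss_penalty = 100    # cycles to fetch a block from memory

amat = hit_time + miss_rate * miss_penalty
print(amat)           # average cycles per memory access
```

Even a small miss rate dominates here: 5% of accesses paying 100 cycles adds 5 cycles to every access on average.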

Address Structure in Cache

Subdivision Example:

  • Provides a clear breakdown of memory addresses, specifying positions for the tag, index, and offset, aiding in understanding cache access patterns.

Example Memory Access:

  • Illustrations of bit positions to visualize how addresses are processed within the cache.

Block Size Considerations

Effect of Block Size:

  • Utilizing larger blocks can reduce miss rates by taking advantage of spatial locality, where nearby data is often accessed together.

Challenges include:

  • In a fixed-size cache, larger blocks mean fewer blocks overall, increasing competition for cache lines and inhibiting the performance gain.

  • Larger blocks also raise the miss penalty (more data transferred per miss) and can pollute the cache with data that is never used, countering the advantages of having larger blocks.

Solutions:

  • Employ strategies such as early restart and critical-word-first to mitigate potential issues arising from larger block sizes.

Accessing Data in Direct Mapped Cache

Example Accesses:

  • Data can be accessed through the following memory addresses:

    • 0x00000014, 0x0000001C, 0x00000034, 0x00008014.

Breakdown of Addressing:

  • Each hexadecimal address is converted into tag, index, and byte-offset fields for processing, showing how the memory hierarchy locates data efficiently.
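The four example addresses can be decomposed with the field widths from the earlier 16 KB, 4-word-block example (18-bit tag, 10-bit index, 4-bit offset); the helper name `split` is my own. Note that 0x00000014 and 0x00008014 land on the same index with different tags, so they would conflict in a direct-mapped cache.

```python
# Decompose addresses using an 18-bit tag, 10-bit index, 4-bit offset.
def split(addr):
    offset = addr & 0xF                # low 4 bits
    index = (addr >> 4) & 0x3FF        # next 10 bits
    tag = addr >> 14                   # remaining high bits
    return tag, index, offset

for addr in (0x00000014, 0x0000001C, 0x00000034, 0x00008014):
    print(hex(addr), split(addr))
# 0x14   -> (0, 1, 4)
# 0x1c   -> (0, 1, 12)   same block as 0x14, different byte offset
# 0x34   -> (0, 3, 4)
# 0x8014 -> (2, 1, 4)    same index as 0x14, different tag: conflict
```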

Summary of Key Concepts

Memory Hierarchy Importance:

  • Critical for increasing processing speed. By keeping frequently used data in faster but more expensive cache layers, systems can optimize performance significantly.

Future Discussions:

  • Further exploration of performance issues regarding cache efficiency and its impact on overall system architecture will continue as we advance in this course.