3.2.2 Shared Address Space Programming - Basics
Introduction to Shared Address Space Programming
This section focuses on programming multicore processors.
Understanding the key concepts below helps improve performance and program efficiency.
Processes and Address Spaces
Overview
Essential to know about:
Processes
Address spaces and their link to physical memory.
Division of address space: global variables, stack, and heap.
Operating system (OS) scheduler.
Threads and their interactions.
Basic challenges in shared address space programming.
Address Space Visualization
Virtual Memory components:
Process memory viewed as an address space.
Pages from virtual memory mapped to physical memory.
The context-switching mechanism allows the OS to interleave the execution of multiple processes.
Process Time Sharing
The process maintains a logical or virtual view of memory:
Memory is divided internally.
Context switches occur, saving and restoring registers, including program counter (PC) and stack pointer (SP).
Context switches must preserve each process's memory state; they also pollute the caches, since the incoming process evicts the cached data of the outgoing one.
Threads in Operating Systems
Understanding Threads
To execute parallel programs:
Use lightweight processes (LWPs) and pthreads for efficient thread management.
Each process can host multiple pthreads sharing the same address space.
Pthreads behave like lightweight processes but are managed collectively within a single process.
Thread Management and Performance
Performance-oriented parallel programs require binding threads to specific cores to minimize cache pollution.
Over-subscription (more pthreads than cores) should be carefully managed.
At short time scales, oversubscription degrades compute-bound performance, but threads that spend most of their time waiting (e.g., on I/O) can share a core at little cost.
Simultaneous Multi-threading (SMT)
Overview
In two-way simultaneous multithreading (SMT), each physical core appears as two logical cores.
Each logical core has its own register set, allowing independent instruction streams.
Shared resources such as floating-point units and caches.
Speedups arise when one thread's instructions can fill execution slots left idle by another thread's stalls (e.g., memory latency).
Hardware Hierarchy
Key Concepts
A node's structure consists of:
Multiple processors or sockets, each containing multiple cores.
Memory hierarchy includes:
L1 cache (core-private)
L2 cache (may be private or shared)
L3 cache (shared across the cores of a processor)
Node Connectivity: Connected by high-speed networks (InfiniBand, Ethernet).