3.2.2 Shared Address Space Programming - Basics
Introduction to Shared Address Space Programming
This section focuses on programming multicore processors.
Understanding the key concepts below helps improve performance and program efficiency.
Processes and Address Spaces
Overview
Essential to know about:
Processes
Address spaces and their link to physical memory.
Division of address space: global variables, stack, and heap.
Operating system (OS) scheduler.
Threads and their interactions.
Basic challenges in shared address space programming.
Address Space Visualization
Virtual Memory components:
Process memory viewed as an address space.
Pages from virtual memory mapped to physical memory.
The context-switching mechanism allows the OS to interleave the execution of multiple processes.
Process Time Sharing
The process maintains a logical or virtual view of memory:
Memory is divided internally.
Context switches occur, saving and restoring registers, including program counter (PC) and stack pointer (SP).
Context switches must preserve each process's memory state; they also pollute the caches, since the incoming process evicts the cached data of the outgoing one.
Threads in Operating Systems
Understanding Threads
To execute parallel programs:
Use lightweight processes (LWPs) and pthreads for efficient thread management.
Each process can host multiple pthreads sharing the same address space.
Pthreads behave like lightweight processes but are managed collectively within a single process.
Thread Management and Performance
Performance-oriented parallel programs require binding threads to specific cores to minimize cache pollution.
Over-subscription (more pthreads than cores) should be carefully managed.
At short time scales, oversubscription degrades compute-bound performance, but threads that spend most of their time waiting (e.g., on I/O) can share a core at little cost.
Simultaneous Multi-threading (SMT)
Overview
In two-way simultaneous multithreading (SMT), each physical core appears as two logical cores.
Each logical core has its own register set, allowing independent instruction streams.
Shared resources such as floating-point units and caches.
Speedups arise when one thread's instructions can fill execution slots left idle by another thread's stalls (e.g., memory latency).
Hardware Hierarchy
Key Concepts
A node's structure consists of:
Multiple processors or sockets, each containing multiple cores.
Memory hierarchy includes:
L1 cache (core-private)
L2 cache (may be private or shared)
L3 cache (shared across the cores of a processor)
Node Connectivity: Connected by high-speed networks (InfiniBand, Ethernet).