3.2.2 Shared Address Space Programming: Basics

Introduction to Shared Address Space Programming

  • Focus on programming multicore processors.

  • Understanding key concepts can enhance performance and program efficiency.

Processes and Address Spaces

Overview

  • Essential to know about:

    • Processes

    • Address spaces and their link to physical memory.

    • Division of address space: global variables, stack, and heap.

    • Operating system (OS) scheduler.

    • Threads and their interactions.

    • Basic challenges in shared address space programming.

Address Space Visualization

  • Virtual Memory components:

    • Process memory viewed as an address space.

    • Pages from virtual memory mapped to physical memory.

    • Context switching allows the OS to interleave the execution of multiple processes.

Process Time Sharing

  • The process maintains a logical or virtual view of memory:

    • Memory is divided internally.

    • Context switches occur, saving and restoring registers, including program counter (PC) and stack pointer (SP).

    • A context switch must preserve memory safety, and the incoming process typically finds the caches filled with the previous process's data (cache pollution), which slows it down until the caches are refilled.

Threads in Operating Systems

Understanding Threads

  • To execute parallel programs:

    • Use lightweight processes (LWPs) and pthreads for efficient thread management.

    • Each process can host multiple pthreads sharing the same address space.

    • Pthreads are scheduled much like processes, but they live within, and are managed collectively by, a single process.

Thread Management and Performance

  • Performance-oriented parallel programs require binding threads to specific cores to minimize cache pollution.

    • Over-subscription (more pthreads than cores) should be carefully managed.

    • Threads that block (e.g. waiting on I/O) and the shorter effective time slices caused by oversubscription both affect performance.

Simultaneous Multi-threading (SMT)

Overview

  • Definition of two-way simultaneous multi-threading (SMT): each core considered as two logical cores.

    • Each logical core has its own register set, allowing two independent instruction streams per physical core.

    • Shared resources such as floating-point units and caches.

    • SMT pays off when one thread can use the execution resources while the other is stalled, e.g. waiting on a memory access.

Hardware Hierarchy

Key Concepts

  • A node consists of:

    • Multiple processors or sockets, each containing multiple cores.

    • Memory hierarchy includes:

      • L1 cache (core-private)

      • L2 cache (may be private/shared)

      • L3 cache (shared across all cores of a processor)

  • Node connectivity: nodes are connected by high-speed networks (InfiniBand, Ethernet).