Thread of control
A sequence of instructions that is executed one instruction at a time, one after the other, during the execution of a program.
Thread
The set of all resources necessary to represent a single thread of control through a program's executable code.
Resources
Each thread within a process has its own:
- Unique thread ID
- Program counter (PC)
- Register set
- User stack
All threads share many process resources, but the most important are:
- the executable code, i.e. the text segment
- the heap memory, where dynamically allocated data is stored
- the data segments, containing initialized and uninitialized data
- the command line arguments, environment variables, and open files
Reasons for multi-threading
- code to handle async events can be executed by a separate thread
- threads automatically have access to the same memory address space
- on a multiprocessor architecture, many threads may run in parallel on separate cores, speeding up the computation
- performance can be improved by putting calls to system functions with expected long waits into separate threads
- by handling I/O in separate threads, the application can respond quickly, even when time-consuming operations are taking place in other threads
Concurrent
Two threads are ___ if their computations can overlap in time.
Parallel
Two threads execute in ___ if they execute at the same time on different processors.
Parallel program
Contains instruction sequences that can be executed in parallel
Concurrent program
If a parallel program is executed on a single processor, the threads within it can be interleaved in time on the processor; it therefore contains concurrency.
Challenges in Parallel Programming
Functional decomposition, load balancing, data decomposition, data dependency analysis, testing and debugging
Functional decomposition
Decomposing a single task into multiple, independent, parallel subtasks. There is no general algorithm for doing this.
Load balancing
Decomposing a problem into subtasks in such a way that each has roughly the same amount of computation, to maximize each core's utilization.
Data decomposition
Dividing the data among the subtasks to minimize IPC and divide the work equally, and to do so in a scalable way.
Data dependency analysis
When data is distributed to the various subtasks, identifying which data in each subtask depends on the computation performed by a different subtask.
Testing and debugging
Unlike sequential software, parallel programs can have errors that depend on the relative rates of execution of the subtasks. Designing tests is very difficult, and debugging when failures occur is even harder.
Data parallelism
Exists when a task can be decomposed into many subtasks that perform identical operations on different sets of data.
Task parallelism (functional parallelism)
Exists when a task can be decomposed into multiple subtasks such that each performs a different function, on the same or different data sets.
Speed-up
The ___ of a parallel program with respect to a sequential program, on a computer with N identical processors and the exact same input data, is the sequential program's running time on one processor divided by the parallel program's running time on N processors.
Amdahl's law
Every program has some fraction of operations in it that must be executed sequentially.
Serial fraction
Let f be the fraction of operations in a sequential program's computation on a given input that are inherently sequential. f is often called the ___.
Parallel fraction
Let N be the number of processing cores on the computer on which a parallel version of this same program is run. The speed-up of this parallel version using all N processors has an upper bound given by the formula speedup <= 1 / (f + (1 - f)/N). The value (1 - f) is defined as the ___; it represents the fraction of the computation that can be run in parallel.
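As a quick worked example of the bound above (the numbers are illustrative, not from the source): with f = 0.1 and N = 8, speedup <= 1 / (0.1 + 0.9/8) ≈ 4.71. A minimal C sketch of the same arithmetic:

```c
#include <stdio.h>

/* Amdahl's law: upper bound on speedup given serial fraction f
   and N processors. */
double amdahl_bound(double f, int n) {
    return 1.0 / (f + (1.0 - f) / n);
}

int main(void) {
    /* Illustrative values: 10% inherently serial code, 8 cores. */
    printf("max speedup: %.2f\n", amdahl_bound(0.1, 8));  /* ~4.71 */
    return 0;
}
```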
Multithreading models
Two different methods (threads implemented in user space and threads implemented in kernel space) by which computer systems allow programmers to create multithreaded programs.
User level threads
Threads that are managed by a user level thread library. The library contains code for creating and destroying threads, for inter-thread communication, scheduling, saving and restoring thread contexts, and all other thread management operations.
Three most commonly used thread libraries
POSIX Threads, Windows Threads, and Java Threads
Kernel level threads
Threads that are managed entirely within the kernel. There is no thread management code in user space. Creating, destroying, scheduling, coordinating, and otherwise managing threads is performed completely within kernel code. Sometimes called a lightweight process (LWP).
POSIX threads or Pthreads
One of the most common user thread libraries standardized by POSIX. Implemented in almost all modern operating systems.
pthread_exit()
Terminates the calling thread.
pthread_create()
Creates a new thread, called the child thread. The caller is the parent thread.
pthread_join()
Makes the calling thread (here, the main program) wait until the child thread terminates, e.g., by calling pthread_exit().
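A minimal sketch tying the three calls together (function and variable names are illustrative); build with e.g. gcc -pthread:

```c
#include <pthread.h>
#include <stdio.h>

/* The child thread's start routine; terminates via pthread_exit(). */
static void *child(void *arg) {
    printf("child thread: %s\n", (const char *)arg);
    pthread_exit(NULL);              /* terminate the calling thread */
}

int main(void) {
    pthread_t tid;
    /* Parent creates the child thread... */
    pthread_create(&tid, NULL, child, (void *)"hello");
    /* ...and waits until the child calls pthread_exit(). */
    pthread_join(tid, NULL);
    return 0;
}
```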
User level threads: pros
- Threads can be created very quickly, depending on the underlying implementation. Few, if any, system calls are needed.
- Switching from one thread to another is also very fast, since no context switch is required (because all threads are in user space).
- The kernel does not need to have any support for threads for user programs to be multithreaded. All threads run in the process's context.
- Because the library is in user space, code written to run against its API can be run on any computer system for which the library has been implemented.
User level threads: cons
- A process with dozens of threads may get the same amount of time on the processor as one with a single thread, depending on the implementation, so the fact that it has many threads does not give it more processor time.
- A program with multiple threads may not be able to take advantage of multiple processors, since all threads may be mapped to a single processor, depending on the implementation.
- The application programmer generally has no control over how threads are mapped to processors.
Kernel threads
Threads that run as part of the kernel, in its address space (i.e., inside kernel space). Not the same thing as kernel level threads.
Tasks
What both processes and threads are to the Linux kernel; each is represented by a task_struct. (Threads can share resources, such as their address space, whereas processes do not.)
'flags' parameter
Has many possible values that tell the kernel which resources are shared.
malloc()
In the clone() example, allocates memory for the child's stack; because stacks grow downward, the address passed to clone() is the starting address of the allocation plus the stack size.
clone()
The Linux system call that creates a new task. Called with the CLONE_VM flag, it tells the kernel that the parent's memory is to be shared with the child rather than copied.
SIGCHLD
Tells the kernel that the child will call exit() and the parent is going to wait() for the ___ signal to receive the child's termination status.
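A minimal sketch of the clone() usage the last few cards describe, assuming Linux and glibc (the stack size and function names are illustrative):

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

#define STACK_SIZE (1024 * 1024)     /* illustrative size */

static int child_fn(void *arg) {
    (void)arg;
    printf("child task sharing the parent's memory\n");
    return 0;
}

int main(void) {
    /* malloc() provides the child's stack; stacks grow downward, so
       clone() is passed the start address plus the stack size. */
    char *stack = malloc(STACK_SIZE);
    if (!stack) return 1;

    /* CLONE_VM: share the parent's memory rather than copying it.
       SIGCHLD: the parent will wait() for this signal on child exit. */
    int pid = clone(child_fn, stack + STACK_SIZE,
                    CLONE_VM | SIGCHLD, NULL);
    if (pid == -1) return 1;

    waitpid(pid, NULL, 0);           /* receive the termination status */
    free(stack);
    return 0;
}
```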
Kernel level threads: pros
- Kernel level threads from a single process can be scheduled simultaneously on multiple processors, taking advantage of the hardware.
- A thread that blocks as a result of a service request does not prevent other threads in the same process from being scheduled to run.
- The kernel can allocate more processor time to processes with larger numbers of threads.
- The kernel itself can be multithreaded.
Kernel level threads: cons
- It is slower to create kernel level threads and more work to manage them because there is more work in the kernel.
- Switching between threads in the same process requires kernel intervention and is therefore slower.
- Representing threads within the kernel requires a complete PCB.
Implementation of user level threads
If the kernel has no support for threading, the many-to-one model; if it does, the one-to-one model and the many-to-many model.
Many-to-one (M:1)
Requires no support for multithreading at the kernel level and is the most portable, but it has significant performance problems. All threads are implemented in user space within the process, which appears to the kernel to be an ordinary, single-threaded process.
Many-to-one (M:1): pros
It is portable and does not require any support from the underlying kernel.
Many-to-one (M:1): cons
- One drawback is that it requires all blocking system calls to be simulated in the library by non-blocking calls to the kernel, which slows down system calls significantly.
- A second is a consequence of the first. Some blocking system calls cannot be simulated by non-blocking system calls; as a result, when a thread makes such a call the entire process is blocked.
- A major con is that a program cannot take advantage of more than a single processor because the kernel sees it as a single-threaded process (as a single schedulable unit).
One-to-one (1:1)
Assigns a kernel level thread to each user level thread. Implementations of Pthreads in current Linux versions and Windows systems use this approach. Each thread is seen by the kernel as a separately schedulable entity.
One-to-one (1:1): pros
- It is the simplest to implement within a library.
- Provides the greatest possible concurrency because it can use as many processors as are available, up to the number of threads.
- One thread's blocking does not block other threads.
One-to-one (1:1): cons
- Creating each thread is more expensive in terms of kernel resources.
- Thread management is slower because most operations on threads require system calls.
- It is dependent on the multithreading model of the underlying kernel.
Many-to-many (M:N)
Most flexible of the models. The library creates multiple kernel level threads and schedules user level threads on top of them. Most libraries will dynamically allocate as many kernel level threads as necessary to service the user level threads that are ready to run.
Many-to-many (M:N): pros
- It does not use kernel resources for user level threads that are not actually runnable.
- The library-level scheduler can switch between threads much faster because it does not make system calls.
- It performs better than the others when user level threads synchronize with each other.
- Applications that create large numbers of threads that run only occasionally perform better on this model than on the others.
Many-to-many (M:N): cons
- It has more overhead and consumes more system resources, because scheduling takes place both in the kernel, among the kernel level threads for the process, and in the library, among the user level threads.
- User level threads that are bound to the same kernel level thread can still be blocked when the thread that is running makes a system call.
Scheduler activations
The kernel provides the process with the abstraction of virtual processors.
Virtual processors
Acts like an actual processor on which a user level thread can run, but is just a data structure, sometimes called a lightweight process (LWP).
Upcalls
Kernel events that could affect the number of runnable threads, such as blocking system calls, are communicated directly to the user process using ____, which are messages sent to the user level thread manager.
Upcall handler
Handles upcalls within the thread manager (like a signal handler). Must run on a virtual processor.
Scheduler activation use example sequence
1. An executing thread makes a blocking system call.
2. The kernel blocks the calling user level thread as well as the kernel level thread used to execute the user level thread.
3. Scheduler activation: The kernel allocates a new virtual processor to the process.
4. Upcall: The kernel notifies the user level thread manager that the user level thread blocked and that a new virtual processor is available for other threads.
5. The user process runs an upcall handler on the new virtual processor. It saves the state of the blocked thread and frees the virtual processor on which that thread was running.
6. The user level thread manager moves the other threads to the new virtual processor and schedules one of the ready threads to run on it.
Implicit threading
Refers to any of several different methods of multithreading a program in which compilers and/or libraries create and manage concurrent threads with little or no explicit guidance from the programmer.
Overcome difficulties of writing multi threaded programs
- offload some of the programming tasks from developers to compilers and run-time libraries.
- what is implicit in implicit threading is the creation and management of threads, not the specification of parallelism.
Parallelizing compilers
Analyze the code and determine automatically where fine-grained parallelism exists. (Ex: some C++ and Fortran compilers)
Well-known implicit threading systems
OpenMP, Grand Central Dispatch, Intel Threading Building Blocks, Java Concurrency
OpenMP (Open Multi-Processing)
An API for programs written in C/C++ and Fortran that may be used to explicitly specify multithreaded, shared-memory parallelism. It includes compiler directives, runtime library routines, and environment variables.
Grand Central Dispatch (GCD)
A technology developed by Apple for its macOS and iOS operating systems. Like OpenMP, it includes a run-time library, an API, and language extensions that allow developers to identify sections of code to run in parallel.
Intel Threading Building Blocks (TBB)
A template library that supports the design of multithreaded, shared-memory programs in C++.
Java Concurrency
Refers to the multithreading, concurrency, and parallelism available from the Java Virtual Machine, which is entirely thread-based. The java.util.concurrent package provides this concurrency.
Worker thread
Rather than constantly creating and deleting threads, the implementation can maintain a pool of threads, like a collection of workers sitting in a room waiting to be assigned work. When work comes in, it is assigned to an available worker; when the worker finishes, it goes back to the waiting room.
Thread pools
Used to improve the performance of implicit threading systems.
- A thread pool is initialized with a number of threads, where they wait for work.
- When a process needs a new thread to perform a task, it requests one from the thread pool.
- If there is an available thread in the pool, it is awakened and assigned to the process to execute the task.
- If the pool contains no available threads, the task is queued until one becomes free.
- Once a thread completes its task, it returns to the pool and awaits more work.
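A minimal fixed-size thread pool sketch in C with Pthreads, following the mechanics above (all names, sizes, and the bounded queue are illustrative assumptions, not from the source):

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_WORKERS 4
#define QUEUE_CAP   16

typedef void (*task_fn)(void *);
typedef struct { task_fn fn; void *arg; } task_t;

static task_t queue[QUEUE_CAP];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* Each worker sleeps until a task is queued, runs it, then waits again. */
static void *worker(void *unused) {
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&nonempty, &lock);
        task_t t = queue[head];
        head = (head + 1) % QUEUE_CAP;
        count--;
        pthread_mutex_unlock(&lock);
        t.fn(t.arg);                      /* execute the task */
    }
    return NULL;
}

/* Queue a task; if the queue is full, report failure
   (a fuller implementation would block or grow the queue). */
static int pool_submit(task_fn fn, void *arg) {
    int ok = -1;
    pthread_mutex_lock(&lock);
    if (count < QUEUE_CAP) {
        queue[tail] = (task_t){ fn, arg };
        tail = (tail + 1) % QUEUE_CAP;
        count++;
        ok = 0;
        pthread_cond_signal(&nonempty);   /* wake one idle worker */
    }
    pthread_mutex_unlock(&lock);
    return ok;
}

static void hello(void *arg) { printf("task %d\n", *(int *)arg); }

int main(void) {
    pthread_t workers[NUM_WORKERS];
    static int ids[4] = { 0, 1, 2, 3 };

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pool_submit(hello, &ids[i]);

    pthread_exit(NULL);   /* exit main thread; workers keep running */
}
```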
Benefits of thread pools
- Servicing a request with an existing thread is faster than creating a thread.
- A thread pool limits the number of threads that exist at any one point.
- Separating the task to be performed from the mechanics of creating the task means different strategies can be used for running and scheduling the task.
Fork-join model/parallelism
Threads are created by a parent thread, and their executions reach a join point after which only one thread continues.
OpenMP Overview
- Designed for multiprocessor/core, shared memory machines. The underlying architecture can be either UMA or NUMA.
- Has 3 primary API components: Compiler directives (used by programmers to define parallel regions), Runtime library routines (which extend the language with OpenMP functions), and Environment variables (used to control the behavior of OpenMP programs)
- Parallelism is exclusively through the use of threads.
- Uses fork-join model of parallel execution.
Master thread
Executes sequentially until the first parallel region is encountered.
OpenMP key features
- Defines parallel regions as blocks of code that may run in parallel.
- Programmers insert compiler directives into their code at parallel regions; these directives instruct the OpenMP run-time library to execute the region in parallel.
- Parallel regions are specified using compiler directives known as pragmas, e.g. #pragma omp parallel, which specifies that the following statement or block is to be executed by some number of threads in parallel.
- Most data within a parallel region is shared by default (because OpenMP is a shared-memory programming model).
- #pragma omp parallel shared(a) private(i) - a C/C++ pragma that specifies the start of a parallel region with a variable 'a' shared by all threads and a thread-private variable named 'i' of which each thread has a private copy.
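A minimal sketch of the shared(a) private(i) pragma from the last card (the array size and work split are illustrative); compile with a flag such as gcc -fopenmp:

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    int a[8];    /* shared: one copy seen by all threads */
    int i;       /* private: each thread gets its own copy */

    #pragma omp parallel shared(a) private(i)
    {
        /* Illustrative work split: each thread fills a strided
           slice of the shared array. */
        int tid = omp_get_thread_num();
        for (i = tid; i < 8; i += omp_get_num_threads())
            a[i] = i * i;
    }

    for (i = 0; i < 8; i++)
        printf("%d ", a[i]);
    printf("\n");
    return 0;
}
```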
OpenMP Run-time Routines
The run-time library has routines for such things as:
- setting and querying the number of threads
- querying a thread's unique identifier (thread ID)
- querying if execution is within a parallel region, and at what level
- setting and querying nested parallelism
- setting, initializing and terminating locks and nested locks
- querying clock time
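A small sketch exercising a few of these routines (the thread count is chosen arbitrarily):

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    omp_set_num_threads(4);                          /* set the thread count */
    printf("in parallel? %d\n", omp_in_parallel());  /* 0 here: serial code */

    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0)               /* master thread only */
            printf("team size: %d\n", omp_get_num_threads());
    }
    return 0;
}
```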
Grand Central Dispatch (GCD) Overview
- Provides support for programs written in various languages, including C, C++, Objective-C, and Swift.
- An implementation of task parallelism based on thread pools. The library manages the threads without the programmer's involvement; the programmer specifies the tasks that can be run in parallel.
- Task can be expressed either as a function or as a block.
- Queues the designated tasks for execution and schedules them onto an available thread to run on any available processor.
- Underlying support for libdispatch is the POSIX threads library, but most of the support comes from non-POSIX compliant Apple extensions.
Block
An extension to the syntax of C, C++, and Objective-C that encapsulates code and data into a single object. A block is specified by a caret ^ inserted in front of a pair of braces { }: ^{ printf("block"); }
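A minimal GCD sketch submitting a block to a global queue, assuming macOS and clang's blocks support (the semaphore is just one way to keep main alive for the sketch):

```c
#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void) {
    /* A concurrent global queue; GCD schedules submitted blocks
       onto threads from its internal pool. */
    dispatch_queue_t q =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    dispatch_semaphore_t done = dispatch_semaphore_create(0);

    /* Submit a block (the ^{ } syntax from the card above). */
    dispatch_async(q, ^{
        printf("running on a GCD worker thread\n");
        dispatch_semaphore_signal(done);
    });

    /* Keep main alive until the block has run. */
    dispatch_semaphore_wait(done, DISPATCH_TIME_FOREVER);
    return 0;
}
```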
Signal
An empty message sent to a process because an event occurred. It has a type and nothing more. May be received either synchronously or asynchronously.
Synchronous signal examples
Traps and other exceptions
Async signal examples
Timer interrupts (SIGALRM), keyboard interrupts Control-C (SIGINT), and terminal disconnections (SIGHUP)
Sequence of actions (with any type of signal)
1. A signal is generated by some event.
2. The signal is sent to a process by the kernel.
3. The signal is pending until the next step.
4. The signal is delivered to the process when the process takes some action with respect to it, which is either performing the default action, ignoring it, or catching the signal with a signal handler. The disposition of the signal is how the process behaves when the signal is delivered.
Signal handler
A function that is run when a signal is delivered. Signal handlers are registered with the kernel. If there is no user-defined signal handler, a default action is taken.
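A minimal sketch of registering a handler for SIGINT with sigaction() (the handler and flag names are illustrative):

```c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;

/* Only async-signal-safe work belongs in a handler, so just set a flag. */
static void on_sigint(int signo) {
    (void)signo;
    got_sigint = 1;
}

int main(void) {
    struct sigaction sa;
    sa.sa_handler = on_sigint;       /* the signal's new disposition */
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGINT, &sa, NULL);    /* register the handler */

    while (!got_sigint)
        pause();                     /* sleep until a signal is delivered */
    printf("caught SIGINT\n");
    return 0;
}
```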
Pending flag
Indicates whether a signal type is pending or not.
Blocked flag
Indicates whether a signal type is blocked or not.
Thread cancellation
Act of terminating a thread that has not yet terminated itself. With it, one thread can try to terminate another.
Canceled thread
What a thread is called once it has been terminated by cancellation.
Target thread
The thread to be canceled.
Asynchronous cancellation
A thread can cancel the target thread immediately.
Deferred cancellation
A thread attempts to cancel the target thread, but the target thread is not canceled immediately. Instead, it terminates when it reaches a point in its execution at which it is safe to do so.
Cancellation points
Calls to selected functions. When the target thread calls one of these functions while a cancellation request is pending, it terminates.
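A minimal deferred-cancellation sketch with Pthreads (the sleep() call doubles as the cancellation point; timings are arbitrary):

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* The target thread: loops forever; sleep() is a cancellation point,
   so a pending cancel request takes effect there. */
static void *target(void *unused) {
    (void)unused;
    int old;
    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old);  /* the default */
    for (;;) {
        /* ... work that must not be interrupted ... */
        sleep(1);                    /* cancellation point */
    }
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, target, NULL);
    sleep(2);
    pthread_cancel(t);               /* request cancellation */
    pthread_join(t, NULL);           /* returns once the target terminates */
    printf("target thread canceled\n");
    return 0;
}
```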