Chapter 6: Synchronization Tools - Comprehensive Notes

This chapter discusses synchronization tools, including background concepts, the critical-section problem, Peterson’s solution, hardware support, mutex locks, semaphores, monitors, liveness, and evaluation of these tools.

Objectives

  • Describe the critical-section problem and illustrate a race condition.

  • Illustrate hardware solutions using memory barriers, compare-and-swap operations, and atomic variables.

  • Demonstrate how mutex locks, semaphores, monitors, and condition variables solve the critical section problem.

  • Evaluate tools solving the critical-section problem in different contention scenarios.

Background

  • Processes can execute concurrently and may be interrupted at any time.

  • Concurrent access to shared data may lead to data inconsistency.

  • Mechanisms are required to ensure orderly execution of cooperating processes to maintain data consistency.

  • Chapter 4 illustrated the Bounded Buffer problem, which leads to a race condition when a counter is updated concurrently by a producer and consumer.

Race Condition

  • Processes P0 and P1 create child processes using the fork() system call.

  • A race condition occurs on the kernel variable next_available_pid, which represents the next available process identifier.

  • Without a mechanism to prevent simultaneous access, P0 and P1 might assign the same PID to two different processes.

Critical Section Problem

  • Consider a system of n processes {p0, p1, …, pn-1}.

  • Each process has a critical section segment of code where it may be changing common variables, updating a table, writing a file, etc.

  • When one process is in its critical section, no other process may be in its critical section.

  • The critical section problem involves designing a protocol to manage this.

  • A process must request permission to enter its critical section in the entry section, followed by the critical section, an exit section, and a remainder section, as illustrated in the provided diagram.

Critical Section Structure

The general structure of a process Pi involves an entry section, critical section, exit section, and remainder section.

Requirements for Critical-Section Problem Solutions

  1. Mutual Exclusion: If process Pi is executing in its critical section, then no other processes can be executing in their critical sections.

  2. Progress: If no process is executing in its critical section and some processes wish to enter, the selection of the process that will enter the critical section next cannot be postponed indefinitely.

  3. Bounded Waiting: A limit must exist on the number of times other processes can enter their critical sections after a process has requested to enter its critical section and before that request is granted.

    • Assume each process executes at a non-zero speed.

    • No assumptions are made about the relative speeds of the n processes.

Interrupt-based Solution

  • Entry section: disable interrupts.

  • Exit section: enable interrupts.

  • Issues:

    • If the critical section's code runs for an extended period, it can be problematic.

    • Processes might starve, never entering their critical section.

    • This approach is unsuitable for systems with multiple CPUs.

Software Solution 1

  • Two-process solution.

  • Assumes load and store machine-language instructions are atomic.

  • Shared variable:

    • int turn;

  • The variable turn indicates whose turn it is to enter the critical section. Initialized to i.

Algorithm for Process Pi

The algorithm for process Pi:

while (true) {
    while (turn == j);
    /* critical section */
    turn = j;
    /* remainder section */
}

Correctness of the Software Solution

  • Mutual exclusion is preserved:

    • Pi enters the critical section only if turn = i.

    • turn cannot be both 0 and 1 simultaneously.

  • Progress requirement: Not guaranteed.

  • Bounded-waiting requirement: Not guaranteed.

Peterson’s Solution

  • Two-process solution.

  • Assumes load and store machine-language instructions are atomic.

  • Shared variables:

    • int turn; (indicates whose turn it is to enter the critical section)

    • boolean flag[2]; (indicates if a process is ready to enter the critical section)

      • flag[i] = true implies that process Pi is ready.

Algorithm for Process Pi

while (true) {
    flag[i] = true;
    turn = j;
    while (flag[j] && turn == j);
    /* critical section */
    flag[i] = false;
    /* remainder section */
}

Correctness of Peterson’s Solution

  • The three critical section requirements are met:

    1. Mutual exclusion: Pi enters CS only if either flag[j] = false or turn = i.

    2. Progress: Guaranteed.

    3. Bounded-waiting: Guaranteed.

Peterson’s Solution and Modern Architecture

  • Peterson’s Solution is not guaranteed to work on modern architectures due to potential reordering of operations by processors and compilers to improve performance.

  • This reordering can lead to race conditions.

  • Reordering is acceptable for single-threaded processes but can produce inconsistent or unexpected results in multithreaded scenarios.

Modern Architecture Example

  • Two threads share data:

    • boolean flag = false;

    • int x = 0;

  • Thread 1:
    while (!flag); print x;

  • Thread 2:
    x = 100; flag = true;

  • Expected output: 100

Modern Architecture Example (Cont.)

  • Since flag and x are independent, the instructions in Thread 2 (flag = true; and x = 100;) may be reordered.

  • If reordered, the output may be 0.

Peterson’s Solution Revisited

  • Instruction reordering can cause Peterson’s Solution to fail, allowing both processes into their critical sections simultaneously.

  • To ensure correct operation on modern architectures, Memory Barriers must be used.

Memory Barrier

  • Memory model: memory guarantees a computer architecture makes to application programs.

  • Types:

    • Strongly ordered: Memory modification of one processor is immediately visible to all others.

    • Weakly ordered: Memory modification of one processor may not be immediately visible to all others.

  • A memory barrier is an instruction that forces any change in memory to be propagated (made visible) to all other processors.

Memory Barrier Instructions

  • When a memory barrier instruction is performed, the system ensures all loads and stores are completed before any subsequent load or store operations are performed.

  • Even if instructions were reordered, the memory barrier ensures that store operations are completed in memory and visible to other processors before future load or store operations are performed.

Memory Barrier Example

  • Using the previous example:

  • Thread 1:
    while (!flag) memory_barrier(); print x;

  • Thread 2:
    x = 100; memory_barrier(); flag = true;

  • For Thread 1, the value of flag is loaded before the value of x.

  • For Thread 2, the assignment to x occurs before the assignment to flag.

Synchronization Hardware

  • Many systems provide hardware support for implementing critical section code.

  • Uniprocessors:

    • Could disable interrupts.

    • Currently running code would execute without preemption.

    • Generally too inefficient on multiprocessor systems and not broadly scalable.

  • Three forms of hardware support:

    1. Memory barriers

    2. Hardware instructions

    3. Atomic variables

Hardware Instructions

  • Special hardware instructions that allow atomic (uninterrupted) testing and modification of a word's content or swapping the contents of two words.

    • Test-and-Set instruction

    • Compare-and-Swap instruction

The test_and_set Instruction

  • Definition:

    boolean test_and_set (boolean *target) {
        boolean rv = *target;
        *target = true;
        return rv;
    }
    
  • Properties:

    • Executed atomically.

    • Returns the original value of the passed parameter.

    • Sets the new value of the passed parameter to true.

Solution Using test_and_set()

  • Shared boolean variable lock, initialized to false.

  • Solution:

    do {
        while (test_and_set(&lock))
            ;
        /* critical section */
        lock = false;
        /* remainder section */
    } while (true);
    

The compare_and_swap Instruction

  • Definition:

    int compare_and_swap(int *value, int expected, int new_value) {
        int temp = *value;
        if (*value == expected)
            *value = new_value;
        return temp;
    }
    
  • Properties:

    • Executed atomically.

    • Returns the original value of the passed parameter value.

    • Sets the variable value to the value of the passed parameter new_value only if *value == expected is true.

Solution Using compare_and_swap()

  • Shared integer lock initialized to 0.

  • Solution:

    while (true) {
        while (compare_and_swap(&lock, 0, 1) != 0)
            ;
        /* critical section */
        lock = 0;
        /* remainder section */
    }
    

Bounded-waiting with compare-and-swap

while (true) {
    waiting[i] = true;
    key = 1;
    while (waiting[i] && key == 1)
        key = compare_and_swap(&lock,0,1);
    waiting[i] = false;
    /* critical section */
    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;
    if (j == i)
        lock = 0;
    else
        waiting[j] = false;
    /* remainder section */
}

Atomic Variables

  • Instructions like compare-and-swap are used as building blocks for other synchronization tools.

  • An atomic variable provides atomic (uninterruptible) updates on basic data types like integers and booleans.

  • Example:

    • Let sequence be an atomic variable.

    • Let increment() be an operation on sequence.

    • The command increment(&sequence); ensures sequence is incremented without interruption.

Atomic Variables

  • The increment() function can be implemented as follows:

    void increment(atomic_int *v) {
        int temp;
        do {
            temp = *v;
        } while (temp != compare_and_swap(v, temp, temp + 1));
    }
    
    

Mutex Locks

  • Previous solutions are complicated and generally inaccessible to application programmers.

  • OS designers build software tools to solve the critical section problem.

  • Simplest tool is a mutex lock, a boolean variable indicating if the lock is available or not.

  • Protect a critical section by:

    • First acquire() a lock.

    • Then release() the lock.

  • Calls to acquire() and release() must be atomic, usually implemented via hardware atomic instructions such as compare-and-swap.

  • This solution requires busy waiting and is called a spinlock.

Solution to CS Problem Using Mutex Locks

while (true) {
    acquire lock
    critical section
    release lock
    remainder section
}

Semaphore

  • A synchronization tool that provides more sophisticated ways (than Mutex locks) for processes to synchronize their activities.

  • Semaphore S: integer variable

  • Accessed via two indivisible (atomic) operations: wait() and signal() (originally called P() and V()).

  • Definition of the wait() operation:

    wait(S) {
        while (S <= 0)
            ;
        S--;
    }
    
  • Definition of the signal() operation:

    signal(S) {
        S++;
    }
    

Semaphore (Cont.)

  • Counting semaphore: integer value can range over an unrestricted domain.

  • Binary semaphore: integer value can range only between 0 and 1 (same as a mutex lock).

  • Can implement a counting semaphore S as a binary semaphore.

  • Semaphores can solve various synchronization problems.

Semaphore Usage Example

  • Solution to the CS Problem:

    /* create a semaphore mutex initialized to 1 */
    wait(mutex);
    /* critical section */
    signal(mutex);

  • Consider P1 and P2 with statements S1 and S2, respectively, and the requirement that S1 must happen before S2:

    /* create a semaphore synch initialized to 0 */
    P1:  S1;  signal(synch);
    P2:  wait(synch);  S2;

Semaphore Implementation

  • Must guarantee that no two processes can execute the wait() and signal() operations on the same semaphore at the same time.

  • The implementation becomes the critical section problem, where the wait and signal code are placed in the critical section.

  • Could have busy waiting in the critical section implementation, but the implementation code is short, resulting in little busy waiting if the critical section is rarely occupied.

  • Applications may spend lots of time in critical sections, making this implementation less ideal.

Semaphore Implementation with no Busy waiting

  • Each semaphore has an associated waiting queue.

  • Each entry in a waiting queue has two data items:

    • value (of type integer)

    • pointer to the next record in the list

  • Two operations:

    • block: place the process invoking the operation on the appropriate waiting queue.

    • wakeup: remove one of the processes in the waiting queue and place it in the ready queue.

Implementation with no Busy waiting (Cont.)

  • Waiting queue:

    typedef struct {
        int value;
        struct process *list;
    } semaphore;
    

Implementation with no Busy waiting (Cont.)

wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block();
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}

Problems with Semaphores

  • Incorrect use of semaphore operations:

    • signal(mutex) …. wait(mutex)

    • wait(mutex) … wait(mutex)

    • Omitting wait(mutex) and/or signal(mutex)

  • These are examples of issues that can occur when semaphores are used incorrectly.

Monitors

  • A high-level abstraction that provides a convenient and effective mechanism for process synchronization.

  • Abstract data type whose internal variables are accessible only by code within the monitor's procedures.

  • Only one process may be active within the monitor at a time.

  • Pseudocode syntax of a monitor:

    monitor monitor-name {
        // shared variable declarations
        procedure P1 (…) { … }
        procedure P2 (…) { … }
        procedure Pn (…) { … }
        initialization code (…) { … }
    }
    

Schematic View of a Monitor

(Diagram illustrating processes interacting with a monitor, with only one process active inside the monitor at any given time)

Monitor Implementation Using Semaphores

  • Variables:

    semaphore mutex; // (initially = 1)
    
  • Each procedure P is replaced by:

    wait(mutex);
    …
    body of P;
    …
    signal(mutex);
    
  • Mutual exclusion within a monitor is ensured.

Condition Variables

  • condition x, y;

  • Two operations allowed on a condition variable:

    • x.wait(): a process that invokes the operation is suspended until x.signal().

    • x.signal(): resumes one of the processes (if any) that invoked x.wait(). If no x.wait() on the variable, then it has no effect on the variable.

Monitor with Condition Variables

(Diagram illustrating a monitor with condition variables and processes waiting on these conditions)

Usage of Condition Variable Example

  • Consider P1 and P2 that need to execute two statements S1 and S2, respectively, and the requirement that S1 happen before S2.

    • Create a monitor with two procedures F1 and F2 invoked by P1 and P2, respectively.

    • One condition variable x.

    • One Boolean variable done (initially false).

    • F1:
      S1; done = true; x.signal();

    • F2:
      if (!done) x.wait(); S2;

Monitor Implementation Using Semaphores

  • Variables:

    semaphore mutex;    // (initially = 1)
    semaphore next;     // (initially = 0)
    int next_count = 0; // number of processes waiting inside the monitor
    
  • Each function P will be replaced by:

    wait(mutex);
    …
    body of P;
    …
    if (next_count > 0)
        signal(next);
    else
        signal(mutex);
    
  • Mutual exclusion within a monitor is ensured.

Implementation – Condition Variables

  • For each condition variable x, we have:

    semaphore x_sem;   // (initially = 0)
    int x_count = 0;
    
  • The operation x.wait() can be implemented as:

    x_count++;
    if (next_count > 0)
        signal(next);
    else
        signal(mutex);
    wait(x_sem);
    x_count--;
    

Implementation (Cont.)

  • The operation x.signal() can be implemented as:

    if (x_count > 0) {
        next_count++;
        signal(x_sem);
        wait(next);
        next_count--;
    }
    

Resuming Processes within a Monitor

  • If several processes are queued on condition variable x, and x.signal() is executed, which process should be resumed?

  • FCFS (First-Come, First-Served) is frequently inadequate.

  • Use the conditional-wait construct of the form x.wait(c) where:

    • c is an integer (called the priority number).

    • The process with the lowest number (highest priority) is scheduled next.

Single Resource Allocation

  • Allocate a single resource among competing processes using priority numbers that specify the maximum time a process plans to use the resource.

  • The process with the shortest time is allocated the resource first.

  • Let R be an instance of type ResourceAllocator (next slide). Access to ResourceAllocator is done via:

    R.acquire(t);
    …
    access the resource;
    …
    R.release();

  • Where t is the maximum time a process plans to use the resource.

A Monitor to Allocate Single Resource

monitor ResourceAllocator {
    boolean busy;
    condition x;
    void acquire(int time) {
        if (busy)
            x.wait(time);
        busy = true;
    }
    void release() {
        busy = false;
        x.signal();
    }
    initialization code() {
        busy = false;
    }
}

Single Resource Monitor (Cont.)

  • Usage:

    acquire
    …
    release
    
  • Incorrect use of monitor operations:

    • release() … acquire()

    • acquire() … acquire()

    • Omitting acquire() and/or release()

Liveness

  • Processes may have to wait indefinitely while trying to acquire a synchronization tool such as a mutex lock or semaphore.

  • Waiting indefinitely violates the progress and bounded-waiting criteria discussed earlier.

  • Liveness refers to a set of properties that a system must satisfy to ensure processes make progress.

  • Indefinite waiting is an example of a liveness failure.

Liveness

  • Deadlock: two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes.

  • Let S and Q be two semaphores initialized to 1:

    P0          P1
    wait(S);    wait(Q);
    wait(Q);    wait(S);
    …           …
    signal(S);  signal(Q);
    signal(Q);  signal(S);
    
  • If P0 executes wait(S) and P1 executes wait(Q), then when P0 executes wait(Q), it must wait until P1 executes signal(Q).

  • However, P1 is waiting until P0 executes signal(S).

  • Since these signal() operations will never be executed, P0 and P1 are deadlocked.

Liveness

  • Other liveness failures:

    • Starvation: indefinite blocking. A process may never be removed from the semaphore queue in which it is suspended.

    • Priority Inversion: Scheduling problem when a lower-priority process holds a lock needed by a higher-priority process.

      • Solved via priority-inheritance protocol.