Deadlock Detection

Deadlock Detection

Deadlock detection is a technique used in operating systems to determine if a deadlock exists. Unlike deadlock prevention or avoidance, detection allows the system to enter a deadlocked state and then attempts to recover from it.

When to Invoke a Deadlock Detection Algorithm

The frequency of invoking a deadlock detection algorithm depends on two factors:

  1. How often a deadlock is likely to occur: If deadlocks are frequent, the algorithm should be invoked more often.

  2. How many processes will be affected by a deadlock when it occurs: If many processes could be affected, frequent detection is beneficial to minimize impact.

It can be invoked:

  • Whenever a request for a resource cannot be granted immediately: This is a common approach to detect deadlocks as soon as they might form.

  • Periodically: For example, every few hours, or when CPU utilization drops below a certain threshold (indicating processes might be waiting for resources indefinitely).

Deadlock Detection for Single Instance of Each Resource Type

If there is only one instance of each resource type, a deadlock can be detected by constructing a resource-allocation graph and checking for cycles.

  • Nodes: Represent processes (P) and resource types (R).

  • Edges:

    • Request Edge: From process P<em>iP<em>i to resource R</em>jR</em>j (denoted P<em>iR</em>jP<em>i \rightarrow R</em>j) signifies P<em>iP<em>i has requested R</em>jR</em>j and is currently waiting for it.

    • Assignment Edge: From resource R<em>jR<em>j to process P</em>iP</em>i (denoted R<em>jP</em>iR<em>j \rightarrow P</em>i) signifies R<em>jR<em>j has been allocated to P</em>iP</em>i.

  • Cycle Detection: If the resource-allocation graph contains a cycle, there is a deadlock. This is an O(N2)O(N^2) operation for NN nodes.

Deadlock Detection for Multiple Instances of Each Resource Type

When there are multiple instances of each resource type, the resource-allocation graph with cycle detection is insufficient. A variant of the Banker's Algorithm can be used. This algorithm typically requires:

  • Available: A vector of length mm indicating the number of available instances of each resource type. Available[j]=kAvailable[j] = k means kk instances of resource type RjR_j are available.

  • Allocation: An n×mn \times m matrix defining the number of resources of each type currently allocated to each process. Allocation[i,j]=kAllocation[i, j] = k means process P<em>iP<em>i is currently allocated kk instances of resource type R</em>jR</em>j.

  • Request: An n×mn \times m matrix indicating the current request of each process. Request[i,j]=kRequest[i, j] = k means process P<em>iP<em>i is requesting kk instances of resource type R</em>jR</em>j.

The algorithm works as follows:

  1. Initialize Work=AvailableWork = Available and Finish[i]=falseFinish[i] = false for all i=0,,n1i = 0, \ldots, n-1.

  2. Find an index ii such that:

    • Finish[i]=falseFinish[i] = false

    • Request<em>iWorkRequest<em>i \le Work (Process P</em>iP</em>i's request can be satisfied with current available resources).
      If no such ii exists, go to step 4.

  3. If such an ii is found:

    • Work=Work+Allocation<em>iWork = Work + Allocation<em>i (Resources allocated to P</em>iP</em>i are released and added to WorkWork).

    • Finish[i]=trueFinish[i] = true

    • Go to step 2.

  4. If for all ii, Finish[i]=trueFinish[i] = true, then the system is in a safe state (no deadlock). Otherwise, if Finish[i]=falseFinish[i] = false for some ii, then process PiP_i is deadlocked. The processes for which Finish[i]=falseFinish[i] = false are deadlocked.

Deadlock Recovery

Once a deadlock is detected, a recovery mechanism must be employed to break the deadlock. Common strategies include:

1. Process Termination

This method involves eliminating deadlocked processes.

  • Terminate all deadlocked processes: This simple approach quickly breaks the deadlock but can be very costly if a lot of computation has been done by those processes.

  • Terminate one process at a time until the deadlock is eliminated: This is more gradual. After each termination, the deadlock detection algorithm is run again to check if the deadlock is resolved. This minimizes cost but involves more overhead.

When choosing which process to terminate, several factors can be considered:

  • Priority of the process.

  • How long the process has computed and how much longer it will compute.

  • Resources the process has used.

  • Resources the process needs.

  • How many processes will need to be terminated.

  • Whether the process is interactive or batch.

2. Resource Preemption

This method involves taking resources away from processes and giving them to others.

  • Selecting a victim: Identify which resources and which processes to preempt. This choice should aim to minimize the cost (e.g., selecting processes that have already done minimal work).

  • Rollback: If a resource is preempted from a process, the process must be rolled back to some safe state and restarted from that state. This requires the system to maintain information about previous states of processes. A complete rollback means aborting the process and restarting it from the beginning.

  • Starvation: Ensure that resources are not always preempted from the same process, which could lead to starvation. The rollback cost should be included in the selection of the victim.