Disk Management and RAID Structures
Magnetic Disk Overview
Magnetic disks are the primary form of secondary storage in modern computers.
Disk drives rotate rapidly, with speeds ranging from 60 to 250 rotations per second.
Transfer rate: The speed at which data moves between the drive and the computer.
Positioning time (random-access time): The time required to move the disk arm to the correct cylinder (seek time) plus the time it takes for the desired sector to rotate under the disk head (rotational latency).
Disks can be removable.
Drives connect to the computer via an I/O bus.
Common bus types include ATA, SATA, USB, and Fibre Channel.
The host controller in the computer uses the bus to communicate with the disk controller, which is built into the drive or storage array.
Disk Scheduling
Efficiency is crucial for I/O devices due to the speed difference between the processor and the devices.
Disk scheduling is a key part of disk management, prioritizing I/O requests to the disk.
Disk scheduling algorithms, disk formatting, handling bad sectors, swap-space management, and RAID structures are all important aspects of disk management.
Importance of Disk Scheduling
Disk scheduling optimizes I/O requests, especially in multi-programming environments.
In multi-programming, multiple processes may request I/O operations on the disk.
Since the disk can service only one I/O request at a time, other requests wait in a disk queue.
The disk arm's current position affects access time; requests far from the current position increase seek time.
Addressing Random Requests
Disk scheduling handles random requests from users, such as those from web servers and database servers.
Requests can vary in size and may not be on the same cylinder.
Scheduling optimizes request order to improve system performance.
Disk-Scheduling Criteria
Disk I/O is crucial, and scheduling optimizes disk arm movement to efficiently service I/O requests.
Logical blocks are translated into physical disk blocks (cylinder, track, sector).
Key Criteria for Disk-Scheduling Algorithms
Seek Time:
Time taken by the disk head to move between cylinders.
Dependent on seek length (distance between current and next head position).
Example: Moving from cylinder 13 to 34 has a seek length of 21.
Average seek time is calculated based on the average time to move between cylinders.
Seek Time Formula (a common model, since seeks are limited by the arm's acceleration): Ts = s + k × √n
Where:
n is the seek distance (number of cylinders traversed).
s is the mechanical settling time.
k is the acceleration factor.
Algorithms with minimal average seek time are preferred.
Rotational Latency:
Time for the addressed sector to rotate under the read/write head.
Algorithms minimizing rotational latency are superior.
Transfer Time:
Time for data transfer between disk and application.
Depends on the disk's rotational speed and the amount of data transferred.
Disk Access Time:
Total time to perform a disk operation.
Sum of seek time, rotational latency, and transfer time.
Disk Bandwidth:
Total bytes transferred divided by total time between first request and last data transfer.
Algorithms providing high bandwidth are preferred.
Throughput:
Total disk requests serviced per unit of time.
Algorithms should maximize throughput.
Response Time:
Average time a request spends waiting.
Predictable response time is crucial.
Two parameters: average response time and variance of response time.
Variance of Response Time:
Measure of how individual requests are serviced relative to average system performance.
Low variance indicates consistent service times, reflecting predictability and fairness.
Algorithms should minimize variance to ensure consistent service.
FCFS (First-Come, First-Served) Disk-Scheduling Algorithm
Simplest algorithm; requests are serviced in the order they arrive.
Ensures fairness, preventing indefinite postponement.
May not be optimal due to potentially high seek times and low throughput if the first request is far from the current head position.
Performance is good when disk load is low and requests are uniformly distributed.
Performance degrades with increased disk load due to higher average response time.
Provides low variance of response time, since requests are serviced strictly in arrival order and none is favored over another.
FCFS Example
Disk queue with I/O requests for cylinders: 54, 97, 73, 128, 15, 44, 110, 34, 45.
Disk head starts at Cylinder 23.
The head moves sequentially through the requests: 23 -> 54 -> 97 -> 73 -> 128 -> 15 -> 44 -> 110 -> 34 -> 45.
Inefficient due to large head movements (e.g., 73 to 128 and 128 to 15).
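The FCFS total for this queue can be tallied with a short sketch (the function name `fcfs_movement` is illustrative, not a standard API):

```python
def fcfs_movement(head, requests):
    """Sum the absolute seek distances, servicing requests in arrival order."""
    total = 0
    for cylinder in requests:
        total += abs(cylinder - head)  # distance to the next request
        head = cylinder                # head is now parked there
    return total

queue = [54, 97, 73, 128, 15, 44, 110, 34, 45]
print(fcfs_movement(23, queue))  # 448 cylinders of head movement
```

The 448 cylinders of movement here is the baseline that the later algorithms improve upon.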
SSTF (Shortest Seek Time First) Disk-Scheduling Algorithm
Services the request with the shortest seek time first.
Calculates seek time for all requests, selects the nearest one, and schedules it.
Improves system performance by servicing nearby requests first.
Increases throughput and reduces average response time compared to FCFS.
Incurs computational cost to calculate seek times.
Can cause starvation for requests far from the current head position if continuous requests with shorter seek times arrive.
Concentrated requests in one region are served more, leading to starvation elsewhere and high variance in response time.
Best suited for batch-processing systems needing high throughput and a good average response time.
Potentially inefficient for interactive and online systems requiring predictable, low response times.
SSTF Example
Head movement is based on the shortest seek time.
The next request is determined by the shortest distance from the current head position.
The head movement is drastically reduced compared to FCFS.
The head changes direction less frequently, improving system performance.
If a new request arrives at Cylinder 75 while processing Cylinder 73, the head moves to 75 due to the shortest seek length.
This can lead to starvation for requests like 110 and 128 if shorter seek length requests continue to arrive.
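A minimal SSTF simulation of the same queue (the helper name `sstf_order` is hypothetical) shows the greedy service order and the reduced movement:

```python
def sstf_order(head, requests):
    """Greedily pick the pending request nearest the current head position."""
    pending = list(requests)
    order, total = [], 0
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
        order.append(nearest)
    return order, total

order, total = sstf_order(23, [54, 97, 73, 128, 15, 44, 110, 34, 45])
print(order)  # [15, 34, 44, 45, 54, 73, 97, 110, 128]
print(total)  # 121 cylinders, versus 448 under FCFS
```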
SCAN Disk-Scheduling Algorithm
The disk arm moves in one direction (outermost to innermost cylinder) and reverses direction upon reaching the end of the disk.
Services requests in the direction of head movement until the end of the disk is reached, then reverses direction.
The preferred direction must be determined initially, servicing all requests in that direction.
New requests arriving in the path of the head are also serviced.
Known as the "elevator algorithm" due to the analogy of servicing floors in one direction and then reversing.
Provides high throughput and a good average response time, similar to SSTF.
Offers lower variance of response time compared to SSTF by servicing all requests in a given direction.
Midrange tracks are visited more frequently than outer or inner tracks.
New requests arriving in front of the head are serviced, potentially causing starvation for requests behind the head.
SCAN Example
Disk queue with I/O requests: 6, 10, 12, 54, 97, 73, 128, 15, 44, 110, 34, 45.
The disk head is at Cylinder 23, moving towards decreasing cylinder numbers. The disk has 150 cylinders.
The head services requests 15, 12, 10, and 6, then moves to Cylinder 0.
It reverses direction and services requests until Cylinder 150.
Total head movement: 23 + 150 = 173.
Movements from Cylinders 6 to 0 and 128 to 150 are unnecessary as there are no requests in those paths.
If new requests arrive at Cylinders 60, 65, and 70 while processing Cylinder 54, they are serviced immediately after 54.
These new requests increase waiting time for pending requests.
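Under this example's assumptions (head moving toward Cylinder 0 first, sweep continuing to Cylinder 150), the SCAN movement figure can be checked with a sketch:

```python
def scan_movement(head, requests, disk_end=150):
    """SCAN with the head sweeping toward cylinder 0 first: travel to the
    boundary, reverse, then sweep to the far end of the disk."""
    total = head                          # e.g. 23 -> 0, servicing 15, 12, 10, 6
    if any(c > head for c in requests):
        total += disk_end                 # 0 -> 150, servicing the rest on the way
    return total

queue = [6, 10, 12, 54, 97, 73, 128, 15, 44, 110, 34, 45]
print(scan_movement(23, queue))  # 23 + 150 = 173
```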
C-SCAN (Circular SCAN) Disk-Scheduling Algorithm
In SCAN, the head reverses direction after reaching the end of the disk, re-scanning serviced requests.
C-SCAN overcomes drawbacks of SCAN by having the head return to the beginning of the disk without servicing requests and then scanning in one direction.
Treats cylinders as a circular queue, scanning repeatedly in one direction.
Provides high throughput similar to SCAN.
Offers lower variance in response time compared to SCAN, as jobs on the other side of the disk are executed sooner, and center cylinders are not favored.
Starvation can still occur if new requests arrive on the same cylinder or in front of the head.
C-SCAN Example
The disk head sweeps down to Cylinder 0, jumps back to Cylinder 150 without servicing any requests, and then services the pending requests on its downward path.
Total head movement is reduced to 139 compared to 173 in SCAN.
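Treating the return jump as free, as the worked example does, the C-SCAN figure can be reproduced (helper name is illustrative):

```python
def cscan_movement(head, requests, disk_end=150):
    """C-SCAN: sweep down to cylinder 0, jump (uncounted) to the far end,
    then sweep down to the last remaining request."""
    total = head                                  # 23 -> 0
    remaining = [c for c in requests if c > head] # serviced after the jump
    if remaining:
        total += disk_end - min(remaining)        # 150 -> 34 in this example
    return total

queue = [6, 10, 12, 54, 97, 73, 128, 15, 44, 110, 34, 45]
print(cscan_movement(23, queue))  # 139, versus 173 under SCAN
```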
F-SCAN and N-step SCAN Disk-Scheduling Algorithms
Variations of the SCAN algorithm that address the starvation problem caused by new I/O requests arriving while the disk head is servicing pending ones.
F-SCAN
Uses two queues: one for arriving requests, and another initially empty.
The disk arm services requests from the first queue.
New requests arriving while processing the first queue are placed in the second queue.
The second queue requests are deferred until all pending requests from the first queue are processed.
Freezing of requests continues until the disk head reaches the end of the disk.
As the disk head reaches the last cylinder and changes direction, the second queue is merged with the first queue and sorted for optimum service.
The disk head then services requests according to this new queue.
F-SCAN Example
Disk queue with I/O requests: 54, 97, 73, 128, 15, 44, 110, 34, 45.
The disk head is at Cylinder 23 and moving in the direction of decreasing number of cylinders.
The disk consists of total 150 cylinders.
Two queues are maintained.
While the head is servicing the request at Cylinder 15, a new request arrives at Cylinder 10; it is added to the second queue because the first queue is frozen.
When the head reaches Cylinder 0 and reverses direction, the second queue is merged with the first and sorted in ascending order, and an empty second queue begins collecting arrivals for the next sweep.
While the head is servicing the request at Cylinder 73, a new request arrives at Cylinder 75 and is added to the second queue.
While it is servicing the request at Cylinder 97, new requests arrive at Cylinders 35, 99, and 100 and are also added to the second queue.
However, none of these new requests will be processed until the head reaches the end of the disk and reverses direction, as the current queue must be serviced entirely first.
Once it does, the current queue is merged with the second queue, sorted, and serviced again.
N-Step-SCAN
Serves requests in groups of N from the queue while the head moves in one direction.
For example, if N = 4, the first four requests are serviced first.
When the sweep is complete, the next four requests in the queue are considered for service.
A large queue is divided into sub-queues, each containing N requests.
New requests appearing while a sub-queue is being processed are added at the end of the original queue, helping to avoid starvation.
The value of N should be chosen carefully for maximum algorithm performance.
If N = 1, the algorithm degenerates to first-in-first-out (FIFO).
If N is very large, it behaves like the SCAN algorithm.
Does not postpone the service of any old pending requests, reducing the variance of the response times compared to SCAN algorithm.
F-SCAN may prove to be better compared to N-SCAN algorithm.
N-SCAN algorithm may decrease the throughput, as it adds the new arriving requests at the end of request queue, and these cannot be serviced until all sub-queues have been serviced.
F-SCAN algorithm considers the new requests in its original queue after every sweep when the disk head changes its preferred direction.
Benefits of F-SCAN over N-SCAN:
High throughput
Low variance of the response time
Good average response time
Reduced starvation
N-Step-SCAN Example
Disk queue with I/O requests on the following cylinders in their arriving order: 6, 10, 12, 54, 97, 73, 128, 15, 44, 110, 34, 45.
The disk head is assumed to be at Cylinder 23 and moving in the direction of decreasing number of cylinders.
The disk consists of total 150 cylinders.
Suppose when the head is at 15, a request arrives at Cylinder 11, and when it is at 73, a request arrives at Cylinder 75. While at 97, new requests arrive at Cylinders 35, 99, and 100. Apply the N-step-SCAN algorithm for this situation for N = 4.
Since N = 4, the first four requests in the queue, that is, 6, 10, 12, and 54, are considered for processing.
The head processes requests 12, 10, and 6 in its scan towards 0 and then reverses its direction for the next sweep.
The request at Cylinder 54, though, could not be serviced in that sweep, as it did not lie in the head's path toward Cylinder 0.
When the head was processing request at 73, a request for Cylinder 75 arrived but was added at the end of the queue and was not serviced.
Similarly, while processing at 97, new requests arrived for Cylinders 35, 99, and 100 but could not be processed and were added at the end of the original queue.
The head again reverses its direction.
The next sub-queue, consisting of 45, 75, 35, and 99, is serviced.
Finally, we have only two requests, 100 and 11, in the queue, which are serviced.
LOOK and C-LOOK Disk-Scheduling Algorithms
Modifications of SCAN algorithms that prevent unnecessary head movement to the last cylinder.
LOOK
The disk head looks ahead in its direction of movement for the last request.
It moves to the last request and changes its preferred direction there without moving to the last cylinder.
Eliminates unnecessary seek operations, decreasing average response time and lowering variance of response time.
LOOK Example
Disk queue with I/O requests: 6, 10, 12, 54, 97, 73, 128, 15, 44, 110, 34, 45.
The disk head is assumed to be at Cylinder 23 and moving in the direction of decreasing number of cylinders.
The disk head services requests at 15, 12, 10, and 6.
It reverses its direction at 6 and services other requests in its path.
The head stops at 128 without continuing to the last cylinder and reverses its direction if there are any new requests pending.
Total head movement is 139.
C-LOOK
Designed like the LOOK algorithm but treats the request queue as circular, as C-SCAN does.
Similar to C-SCAN but moves to the last request instead of the last cylinder.
Achieves the benefits of both C-SCAN and LOOK scheduling algorithms.
Lowers the variance of the response time compared to the LOOK algorithm, but at the cost of reduced throughput.
It may also increase the average response time.
C-LOOK Example
The head services requests 15, 12, 10, and 6 and changes its direction.
It jumps toward the other end of the disk, but only as far as the farthest request there, that is, Cylinder 128.
At Cylinder 128, it reverses its direction and services the requests in its path.
Total head movement in this algorithm will be reduced to 111, compared to 139 of the LOOK algorithm.
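Both totals quoted above can be reproduced under this example's assumptions (head moving toward Cylinder 0 first, requests on both sides of the head, and the end-to-end jump in C-LOOK left uncounted); the helper names are illustrative:

```python
def look_movement(head, requests):
    """LOOK: sweep down only to the lowest request, then reverse and
    sweep up to the highest request."""
    lo, hi = min(requests), max(requests)
    return (head - lo) + (hi - lo)           # 23 -> 6, then 6 -> 128

def clook_movement(head, requests):
    """C-LOOK: sweep down to the lowest request, jump (uncounted) to the
    highest request, then sweep down to the last one still pending."""
    lo, hi = min(requests), max(requests)
    above = [c for c in requests if c > head]  # serviced after the jump
    return (head - lo) + (hi - min(above))     # 23 -> 6, then 128 -> 34

queue = [6, 10, 12, 54, 97, 73, 128, 15, 44, 110, 34, 45]
print(look_movement(23, queue))   # 17 + 122 = 139
print(clook_movement(23, queue))  # 17 + 94 = 111
```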
S-LOOK
An extension of the LOOK algorithm that handles the cases where the disk head is located among the far-end requests.
The algorithm decides which direction should be served first instead of only continuing to seek in the same direction before the new requests have arrived.
It calculates the total seek distance for the requests on both sides.
After calculating the seek distances in both directions, it continues to service the requests in the direction where the total seek distance is minimum.
S-LOOK provides better average seek time than the LOOK algorithm.
RAID (Redundant Array of Independent/Inexpensive Disks) Structure
The disk's slow access speed compared to the processor and memory is a critical factor in overall computer system performance.
RAID uses an array of disks accessed simultaneously to improve access speed and reliability.
Multiple disks operate independently and in parallel.
Data organization and redundancy increase reliability.
The RAID structure consists of multiple physical disk drives and a RAID controller, presenting a single logical drive to the OS.
Organized in seven levels (0 to 6).
The data is distributed among various disks such that the data is accessed simultaneously from multiple drives, thereby having quick access.
Aims of RAID Structure
Increased Performance:
Multiple disks can be accessed in parallel.
A file may be organized in such a way that the data is spread across multiple disks.
Files are accessed from multiple disks simultaneously, improving on traditional single-disk access.
If two requests belong to two different disks, they can be served in parallel.
More requests can be serviced compared to a single-disk system, increasing system throughput.
Response time for accessing large amounts of data is reduced.
Increased Reliability:
Redundant copies of data are stored on multiple disk drives to improve reliability.
Addresses redundancy to reduce the chance of data loss.
Improvement in Performance and Reliability
Data Striping:
Data is stored on separate disks as a strip.
A strip is a fixed-size block on the disk.
A file is split among multiple disks in the form of strips.
The contiguous strips of a file are stored on separate disks at the same location.
The set of strips stored on the disks in the array is known as a stripe.
The OS views the array of multiple disks as a logical/virtual single disk.
If there are three disks in a stripe, the speed of accessing the data may increase by three times.
Strip size may affect access time, throughput, and data transfer time.
Smaller strips are known as fine-grained strips.
These reduce the access time compared to a single disk, increasing the throughput and data transfer rate.
Large-sized strips are known as coarse-grained strips.
These allow more requests to be serviced in parallel but reduce the data transfer rate for an individual request.
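Round-robin strip placement can be sketched as a simple mapping from a logical strip number to a (disk, stripe) position; the helper `locate_strip` is illustrative:

```python
def locate_strip(logical_strip, num_disks):
    """Round-robin striping: logical strip k lives on disk k mod N,
    in stripe (row) k // N of the array."""
    return logical_strip % num_disks, logical_strip // num_disks

# With 3 disks, the first six strips of a file land as:
for k in range(6):
    print(k, locate_strip(k, 3))
# strip 0 -> (disk 0, stripe 0), strip 1 -> (disk 1, stripe 0),
# strip 2 -> (disk 2, stripe 0), strip 3 -> (disk 0, stripe 1), ...
```

Strips 0, 1, and 2 form one stripe and can be read from the three disks in parallel.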
Disk Mirroring:
A duplicate of a strip is placed on a separate disk so that every disk has a mirror copy.
If one of the disks fails, then the mirrored copy is accessed.
Mirroring reduces the effective storage capacity of the RAID structure.
RAID Controller:
Manages data striping, disk mirroring, and redundancy.
Breaks requests into different read commands, mapping them on multiple disks.
RAID Levels
RAID Level 0 (Striping):
It is the simplest form of RAID structure wherein the data is distributed across all the disks in the form of strips.
Maps strips into an array of consecutive disks in a round-robin manner.
Does not provide any redundancy.
If one of the disks in the array is not accessible, then the data or strips in that disk are lost.
This level is not considered a true member of RAID structure.
Increases the performance of disk access by a factor of N, where N is the number of disks in the disk array.
RAID Level 1 (Mirroring):
Includes striping and redundancy via disk mirroring.
For a read, the copy on the disk with the shorter seek time and rotational latency can be accessed.
In a write request, every strip needs to be updated, but this can be done in parallel.
Has good fault tolerance as the data may be recovered from the backup disk in case of any disk failure.
RAID Level 2 (Fine-grained Striping and ECC Parity):
Uses fine-grained strips.
Implements an ECC scheme rather than disk mirroring, reducing storage overhead and cost.
Hamming ECC is used, calculating parity bits across corresponding bits on each disk.
In case of a read operation, the Hamming code is recalculated and compared with the values retrieved from the parity disks.
RAID Level 3 (Fine-grained Striping and Single-parity Disk):
Based on bit-level or byte-level striping.
Uses XOR ECC.
The parity for the ith strips is calculated by applying the XOR operation to the contents of the ith strip of each data disk.
Uses only a single-parity disk to store the parity of strips, thereby reducing the redundant storage space.
RAID Level 3 Example
There are three data disks D1, D2, and D3, each consisting of a strip of a single bit. The parity bit of the ith strip is stored on the separate parity disk D4 and can be calculated as
D4(i) = D1(i) ⊕ D2(i) ⊕ D3(i).
Suppose disk D2 fails, and the data on it is lost. In this case, the data of D2 can be regenerated as follows:
D2(i) = D4(i) ⊕ D1(i) ⊕ D3(i).
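The XOR arithmetic of the example can be checked directly in a few lines (the bit values are chosen arbitrarily for illustration):

```python
# Single-bit strips on three data disks; values are illustrative.
d1, d2, d3 = 1, 0, 1

parity = d1 ^ d2 ^ d3        # stored on the parity disk D4

# If D2 fails, its bit is rebuilt by XOR-ing the survivors with the parity:
rebuilt_d2 = parity ^ d1 ^ d3
print(parity, rebuilt_d2)    # rebuilt_d2 equals the lost d2
assert rebuilt_d2 == d2
```

The same identity holds per bit position for byte- or block-sized strips, which is why a single parity disk suffices to survive any one disk failure.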
RAID Level 4 (Block Level XOR ECC Parity):
Uses large-sized strips, and the data is striped as fixed-size blocks.
Allows multiple I/O requests to be serviced in parallel.
Uses exclusive-OR ECC as used in RAID level 3.
The parity bit for the strip on each disk is calculated and stored on the parity disk.
RAID Level 5 (Block Level Distributed XOR ECC Parity):
Each disk in the disk array participates in storing the parity strip in a round-robin fashion.
The parity bits can be updated in parallel.
Methods to reduce the parity-update overhead:
Cache the recently accessed data and parity strip.
Store only the difference between the old parity and new parity bits in the memory, which is known as parity logging.
Reduce the I/O operations during writes by deferring parity generation when there are multiple small write operations at once and calculating parity only when the system load is light. This technique is known as AFRAID (A Frequently Redundant Array of Independent Disks).
RAID Level 6 (Block Level Dual Parity):
Includes two types of parity calculations, one of which is XOR.
The other calculation method is based on an algorithm that is independent of data stored on the disks, providing higher reliability on the disks' failure.
The two types of parity strips (say, P1 and P2) calculated are stored on two separate disks, thereby increasing the redundant storage.
Combined RAID Levels
RAID levels are combined to obtain better features of different RAID levels.
RAID levels 0 and 1 have been combined to obtain RAID 0 + 1, which gives better performance and reliability.
RAID 1 + 0 (mirrored pairs that are then striped) provides better reliability than RAID 0 + 1.