34 - RAID - 3.4

Hard Drives and Data Storage

  • Importance of Data Storage

    • Hard drives, SSDs, and other storage devices are essential for storing large amounts of information.

    • Protecting against data loss is essential, because any storage device can malfunction.

  • Mechanics of Hard Drives

    • Hard drives are physical devices with moving parts:

    • Platters spin to read/write data.

    • Actuator arms move to access different data regions.

    • Failure of any one of these components can make the data inaccessible.

RAID (Redundant Array of Independent Disks)

  • Definition of RAID

    • RAID stands for Redundant Array of Independent Disks; it is also known as Redundant Array of Inexpensive Disks.

    • RAID is not a backup solution but a method to create redundancy in data storage.

  • Importance of Separate Backup

    • Users must maintain a completely separate backup process even when using RAID configurations.

Types of RAID Configurations

  1. RAID 0 (Striping)

    • Overview: Requires at least two physical drives.

    • Data Distribution: Data split across multiple drives:

      • E.g., a single file can be divided into eight blocks that alternate between two drives:

      • Drive A: Block 1a, Block 3a, etc.

      • Drive B: Block 2a, Block 4a, etc.

    • Speed: Improves read and write throughput because the drives work in parallel.

    • Redundancy: Has zero redundancy; loss of one drive results in complete data loss.
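The block distribution described for RAID 0 can be sketched in a few lines of Python. This is only an illustrative model, not how a real controller is implemented: the names `stripe` and `unstripe`, the 2-byte block size, and the in-memory lists standing in for drives are all assumptions made for the example.

```python
def stripe(data: bytes, num_drives: int = 2, block_size: int = 2) -> list:
    """Split data into fixed-size blocks and deal them round-robin across drives."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    drives = [[] for _ in range(num_drives)]  # each list stands in for one drive
    for index, block in enumerate(blocks):
        drives[index % num_drives].append(block)
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the data by reading one block from each drive in turn."""
    out = []
    longest = max(len(drive) for drive in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


# Sixteen bytes in 2-byte blocks gives eight blocks, alternating between two drives.
drives = stripe(b"ABCDEFGHIJKLMNOP", num_drives=2, block_size=2)
restored = unstripe(drives)
```

Because every drive holds only part of each file, removing either list from `drives` leaves nothing that `unstripe` could turn back into the original data, which is the zero-redundancy property noted above.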

  2. RAID 1 (Mirroring)

    • Overview: Requires at least two drives with identical data stored on both.

    • Data Storage:

      • Data on Disk 0 has an exact duplicate on Disk 1.

    • Storage Requirement: Doubles the raw storage needed, because every block is written twice; usable capacity is half of the raw capacity.

    • Redundancy: Offers redundancy; if one drive fails, the other can still provide access to data.

    • Recovery Process: Replace the failed drive promptly so the mirror can be rebuilt.
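A minimal sketch of the mirroring behavior, assuming each drive can be modeled as a Python dictionary of blocks. The class name `Mirror` and its methods are hypothetical and exist only for this illustration.

```python
class Mirror:
    """Toy model of RAID 1: every write goes to all working drives."""

    def __init__(self, num_drives: int = 2):
        self.drives = [dict() for _ in range(num_drives)]
        self.failed = set()  # indexes of drives that have failed

    def write(self, block_id, data):
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                drive[block_id] = data

    def read(self, block_id):
        # Any surviving drive can serve the read.
        for index, drive in enumerate(self.drives):
            if index not in self.failed:
                return drive[block_id]
        raise IOError("all mirror members have failed")

    def fail(self, index):
        self.failed.add(index)
        self.drives[index] = {}  # contents of the failed drive are gone

    def replace(self, index):
        # Rebuild the new drive by copying from a surviving member.
        survivors = [d for i, d in enumerate(self.drives) if i not in self.failed]
        if not survivors:
            raise IOError("no surviving drive to rebuild from")
        self.drives[index] = dict(survivors[0])
        self.failed.discard(index)


mirror = Mirror()
mirror.write(0, b"payroll")
mirror.fail(0)
still_readable = mirror.read(0)  # served by the surviving drive
mirror.replace(0)                # mirror is whole again
```

Until `replace` runs, the array has a single copy of the data, which is why the failed drive should be swapped quickly.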

  3. RAID 5 (Striping with Parity)

    • Overview: Requires a minimum of three physical drives.

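    • Data and Parity: Similar to RAID 0, but each stripe also includes a parity block, and the parity blocks are distributed across all drives rather than kept on one dedicated drive:

      • E.g., with four physical drives, each stripe holds three data blocks and one parity block, and the drive that holds the parity rotates from stripe to stripe.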

    • Efficiency:

      • Stores parity instead of a full duplicate of the data, so only one drive's worth of capacity goes to redundancy.

    • Data Recovery: If one drive fails, its contents can be rebuilt from the remaining data and parity.

    • Performance Implication: Parity calculation adds CPU overhead, most noticeably while data is being reconstructed after a drive failure.
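The recovery idea behind parity can be shown with XOR, the operation commonly used for RAID 5 parity. The sketch below is a simplified model under stated assumptions: the helper names `xor_blocks` and `parity_drive` are hypothetical, the sample bytes are arbitrary, and the rotation formula is just one plausible layout, since real controllers differ in how they place parity.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_drive(stripe_index: int, num_drives: int) -> int:
    """One possible rotation (an assumption): parity moves one drive left per stripe."""
    return (num_drives - 1 - stripe_index) % num_drives


# One stripe on a four-drive array: three data blocks plus one parity block.
data_blocks = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = xor_blocks(data_blocks)

# Simulate losing the drive that held the second data block,
# then rebuild it from the surviving data blocks and the parity block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

# Which drive holds parity for the first five stripes of a four-drive array.
placement = [parity_drive(s, 4) for s in range(5)]
```

XOR-ing the survivors works because every byte of the missing block cancels out of the parity exactly once. The rebuild has to read every remaining drive, which is where the extra overhead during recovery comes from.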

  4. RAID 6 (Striping with Double Parity)

    • Overview: Similar to RAID 5, but each stripe carries a second parity block; requires at least four drives.

    • Data Protection: Can withstand the loss of two drives without data loss:

      • Both parity blocks allow for reconstruction of lost data.

    • Capacity Implication: Two drives' worth of capacity goes to parity, so the extra drive added for redundancy does not increase usable storage.

  5. RAID 10 (RAID 1+0) - Nested RAID

    • Overview: Combines the features of RAID 1 (mirroring) and RAID 0 (striping).

    • Drive Requirements: Requires a minimum of four drives.

    • Data Storage:

      • Data is striped across mirrored sets, so every block in a stripe is stored on two drives.

    • Redundancy: Can survive multiple drive failures, as long as they fall in different mirrors:

      • E.g., losing one drive from each of two separate mirrored sets still leaves all data accessible from the surviving drives.

      • Losing both drives in the same mirror, however, results in data loss.
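The RAID 10 failure rule reduces to one check: every mirrored set must keep at least one working drive. A short sketch, where the function name `raid10_survives` and the drive labels are made up for the example:

```python
def raid10_survives(mirror_sets, failed_drives) -> bool:
    """Return True if every mirrored set still has at least one working drive."""
    failed = set(failed_drives)
    return all(
        any(drive not in failed for drive in mirror)
        for mirror in mirror_sets
    )


# Four drives arranged as two mirrored pairs, with data striped across the pairs.
mirrors = [("A1", "A2"), ("B1", "B2")]

one_per_mirror = raid10_survives(mirrors, {"A1", "B2"})  # True: each pair keeps a copy
same_mirror = raid10_survives(mirrors, {"A1", "A2"})     # False: pair A is gone entirely
```

Both scenarios lose two drives, but only the second loses data, so the outcome depends on which drives fail, not just how many.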

Summary of RAID Features

  • RAID offers various configurations with differing:

    • Speed and efficiency

    • Levels of data redundancy

    • Storage capacity implications

  • Users must carefully select RAID types based on their specific needs for speed, redundancy, and available storage capacity.
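To compare the capacity trade-offs side by side, the rules above can be collected into one small calculator. This is a rough sketch that assumes all drives are the same size, that RAID 1 keeps a full copy on every drive, and that RAID 10 uses two-way mirrors; the function name `usable_capacity` is hypothetical.

```python
MIN_DRIVES = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}


def usable_capacity(level: int, num_drives: int, drive_size_tb: float) -> float:
    """Usable capacity in TB for equal-size drives (simplified model)."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == 0:
        return num_drives * drive_size_tb        # no redundancy, all space usable
    if level == 1:
        return drive_size_tb                     # every drive holds the same data
    if level == 5:
        return (num_drives - 1) * drive_size_tb  # one drive's worth of parity
    if level == 6:
        return (num_drives - 2) * drive_size_tb  # two drives' worth of parity
    # RAID 10: assumes two-way mirrors, so half the drives hold copies
    if num_drives % 2:
        raise ValueError("RAID 10 with two-way mirrors needs an even number of drives")
    return (num_drives // 2) * drive_size_tb


# Four 2 TB drives under each level that supports four drives.
comparison = {level: usable_capacity(level, 4, 2.0) for level in (0, 5, 6, 10)}
```

With four 2 TB drives, the result is 8 TB for RAID 0, 6 TB for RAID 5, and 4 TB for both RAID 6 and RAID 10, which shows how added redundancy is paid for in usable space.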