RAID Configurations and Concepts

Drive Redundancy and RAID

Introduction

  • Hard drives, SSDs, and other storage devices are used to store large amounts of important information.

  • Data loss is a concern because these are physical devices that can fail; hard drives in particular have constantly moving components.

  • Failure of any component can render all data on the drive inaccessible.

  • Redundancy can be created by combining multiple drives.

  • RAID (Redundant Array of Independent Disks) is not a backup solution; a separate backup process is still necessary.

RAID Definition

  • RAID stands for Redundant Array of Independent Disks (sometimes also referred to as Redundant Array of Inexpensive Disks).

  • It involves different methods to implement redundancy.

  • Some RAID methods provide redundancy even if a drive is lost, while others do not.

RAID Levels

  • RAID 0 (Striping)

  • RAID 1 (Mirroring)

  • RAID 5 (Striping with Parity)

  • RAID 6 (Striping with Dual Parity)

  • Nested RAID (RAID 10 or RAID 1+0, Stripe of Mirrors)

RAID 0 (Striping)

  • Requires at least two physical drives.

  • Data is split across multiple storage devices.

  • A single file is divided into multiple parts, with each part stored on a different drive.

  • Example: Drive 1 has Block 1A, Drive 2 has Block 2A, Drive 1 has Block 3A, and so on.

  • Known for its speed because data is written to multiple drives simultaneously.

  • Losing one drive results in data loss due to the unavailability of parts of the file.

  • RAID 0 has zero redundancy.
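
The striping described above can be sketched in a few lines of Python. This is only an illustration, not how a real RAID controller is written: the function names `stripe_blocks` and `read_back` are made up for this example, and each "drive" is just a Python list.

```python
def stripe_blocks(blocks, num_drives):
    """Distribute blocks round-robin across num_drives drives (RAID 0 style)."""
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    drives = [[] for _ in range(num_drives)]
    for index, block in enumerate(blocks):
        drives[index % num_drives].append(block)
    return drives


def read_back(drives):
    """Reassemble the original block order; a missing drive (None) is fatal."""
    if any(drive is None for drive in drives):
        raise OSError("a drive is missing, so parts of the file are gone")
    total = sum(len(drive) for drive in drives)
    return [drives[i % len(drives)][i // len(drives)] for i in range(total)]


drives = stripe_blocks(["1A", "2A", "3A", "4A", "5A"], 2)
print(drives)             # [['1A', '3A', '5A'], ['2A', '4A']]
print(read_back(drives))  # ['1A', '2A', '3A', '4A', '5A']
```

Calling `read_back([drives[0], None])` raises an error, which mirrors the point that losing a single drive in RAID 0 loses the file.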

RAID 1 (Mirroring)

  • Requires at least two drives.

  • Data is duplicated across multiple drives.

  • Disk 0 has an exact duplicate of the information on Disk 1.

  • Requires twice as much storage space to store the same information.

  • If a drive fails, the other drive containing the exact duplicate of the data remains available.

  • The failed drive should be replaced and the mirror recreated, but data remains accessible during this process.

RAID 5 (Striping with Parity)

  • Uses parity to provide redundancy.

  • Striping is similar to RAID 0, where a file is split into pieces and distributed across multiple drives.

  • In each stripe, one drive holds parity data instead of file data.

  • In a RAID 5 array with four drives, the equivalent of three drives stores data, and the equivalent of one drive stores parity data.

  • Parity is distributed across multiple physical drives to make the recovery process more efficient.

  • Doesn't require duplicating all data, thus saving drive space.

  • If a drive fails, the data can be recreated using the remaining data and parity information.

  • Parity calculation requires processing overhead, and performance is affected most during recovery, while the lost data is being recalculated.
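
As a rough sketch of how parity-based recovery can work, the example below uses XOR, the operation commonly used for single parity. The block contents and the helper name `xor_blocks` are invented for illustration; a real array works on much larger blocks and does this in the controller or the operating system.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]  # three data blocks in one stripe
parity = xor_blocks(data)           # parity block for this stripe

# Simulate losing the drive that held the second block, then rebuild it
# from the surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # True
```

The same calculation has to run across every stripe on the array during a rebuild, which is why recovery is the slow part.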

RAID 6 (Striping with Dual Parity)

  • Similar to RAID 5 but adds an additional storage drive with an additional parity block.

  • Can withstand the failure of two drives.

  • If one drive fails, it functions like RAID 5.

  • If two drives fail, the lost data can be recreated using the existing parity data.

  • Losing two physical drives in a RAID 6 array still allows access to all data.

  • Requires the equivalent of one more physical drive to store the additional parity data.

  • The extra drive does not add usable capacity, only redundancy.

RAID 10 (RAID 1+0, Stripe of Mirrors)

  • Combines RAID 0 (Striping) with RAID 1 (Mirroring).

  • With RAID 0, a single file is split into blocks and distributed across multiple drives.

  • RAID 0 has zero redundancy, so losing a drive results in data loss.

  • RAID 1+0 adds mirroring to the striped set of drives.

  • Requires at least four drives.

  • Drives are paired into mirrors, and data is striped across those mirrored pairs.

  • Can withstand the loss of multiple drives and still remain operational.

  • For example, in one scenario, losing one drive per mirror still keeps the system running.
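
Whether a RAID 10 array survives depends on which drives fail, not just how many. The sketch below models that idea under the assumption of simple two-drive mirrors; the drive names and the function `raid10_survives` are hypothetical.

```python
def raid10_survives(mirror_pairs, failed_drives):
    """Return True if every mirrored pair still has at least one working drive."""
    failed = set(failed_drives)
    return all(any(drive not in failed for drive in pair) for pair in mirror_pairs)


pairs = [("d0", "d1"), ("d2", "d3")]
print(raid10_survives(pairs, ["d0", "d3"]))  # True: one drive lost per mirror
print(raid10_survives(pairs, ["d0", "d1"]))  # False: a whole mirror is gone
```

Both calls lose two drives, but only the second one loses data, because both copies of one mirror are gone.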

Practical Considerations and Redundancy Cost

  • RAID 0:

    • No redundancy

    • All disk space is usable

    • If you have two 5 TB drives, that is 10 TB of available space.

  • RAID 1:

    • Fault tolerance of 1/2 (one half): one drive in each two-drive mirror can fail.

    • If you have 20 TB of raw space made up of 5 TB drives, you will only be able to save 10 TB of files.

  • RAID 5:

    • Needs a minimum of three drives.

    • Loses one disk's worth of space to parity.

    • If you have four 5 TB drives, you will only get 15 TB, as one drive's worth of space is dedicated to parity.

  • RAID 6:

    • Needs a minimum of four drives.

    • Two disks' worth of space is dedicated to parity.

  • RAID 10:

    • Needs a minimum of four drives, and an even number of them, to create the sets of mirrors.

    • It's expensive as you must lose half the space for redundancy.
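
The capacity rules above can be turned into a small calculator. This is a simplified sketch that assumes equal-sized drives, treats RAID 1 as two-way mirrored pairs (matching the 20 TB to 10 TB example), and assumes RAID 10 uses an even number of drives. The function name `usable_capacity_tb` is made up for this example.

```python
MIN_DRIVES = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}


def usable_capacity_tb(level, drive_count, drive_size_tb):
    """Usable space in TB for equal-sized drives, using the simple rules above."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported RAID level: {level}")
    if drive_count < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level in ("1", "10") and drive_count % 2:
        raise ValueError(f"RAID {level} is modeled here with drives in mirrored pairs")
    if level == "0":
        data_drives = drive_count       # no redundancy, everything is usable
    elif level in ("1", "10"):
        data_drives = drive_count // 2  # half the space holds the mirror copies
    elif level == "5":
        data_drives = drive_count - 1   # one drive's worth of parity
    else:
        data_drives = drive_count - 2   # RAID 6: two drives' worth of parity
    return data_drives * drive_size_tb


print(usable_capacity_tb("0", 2, 5))   # 10
print(usable_capacity_tb("1", 4, 5))   # 10
print(usable_capacity_tb("5", 4, 5))   # 15
print(usable_capacity_tb("6", 4, 5))   # 10
print(usable_capacity_tb("10", 4, 5))  # 10
```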

Questions and Parity Explained

  • Once set up, RAID is automatic.

  • RAID is usually about creating redundancy across disks so that if one disk fails, the array can rebuild.

  • The array can rebuild itself once the failed disk has been replaced.

  • Parity is a mathematical computation (commonly XOR) applied to the data that produces the parity bits.

  • Parity bits allow you to rebuild your array: the missing data is recalculated from the surviving data and the parity.

  • With parity, instead of a full physical copy, you store a much smaller piece of calculated data. The parity still needs its own space in the array.

  • Parity is distributed throughout the array.

  • With distributed parity, no single disk is designated as the parity disk; instead, all disks hold some of the parity.

  • Fault tolerance means how many disks can be lost while the system still functions.

  • Hot swappable means a drive can be switched out while the system is still on.

  • Minimum disks: every RAID level has a minimum number of disks, and with RAID 0, that number is 2.
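
To picture distributed parity, the sketch below prints which drive holds the parity block (P) and which hold data (D) for each stripe. The rotation order used here is an assumption chosen for illustration; real implementations use several different layouts. The function names are hypothetical.

```python
def parity_drive_for_stripe(stripe_index, num_drives):
    """Pick the drive that holds parity for a stripe, rotating one drive per stripe."""
    return (num_drives - 1 - stripe_index) % num_drives


def layout(num_stripes, num_drives):
    """Build one row per stripe, marking the parity drive with P and the rest with D."""
    rows = []
    for stripe in range(num_stripes):
        parity_drive = parity_drive_for_stripe(stripe, num_drives)
        rows.append(["P" if drive == parity_drive else "D" for drive in range(num_drives)])
    return rows


for row in layout(4, 4):
    print(" ".join(row))
# D D D P
# D D P D
# D P D D
# P D D D
```

Every drive ends up holding some parity, so no single drive is the dedicated parity disk.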

Final Words
  • Study RAID, as it can show up on the test and be a PBQ (performance-based question).

Introduction
  • Hard drives, SSDs, and other storage devices store large amounts of important information.

  • Data loss is a concern because these are physical devices that can fail; hard drives in particular have constantly moving components and magnetic platters that can degrade over time.

  • Component failure can render all data on the drive inaccessible. This includes electronic components and mechanical parts.

  • Redundancy can be created by combining multiple drives into a single logical unit.

  • RAID (Redundant Array of Independent Disks) is not a backup solution; a separate backup process is still necessary to protect against data loss due to other types of failures.

RAID Definition
  • RAID stands for Redundant Array of Independent Disks (sometimes also referred to as Redundant Array of Inexpensive Disks).

  • It employs various methods to implement redundancy, which can include mirroring, striping, and parity.

  • Some RAID methods provide redundancy even if a drive is lost, while others do not, making it essential to choose the right RAID level for your needs.

RAID Levels
  • RAID 0 (Striping)

  • RAID 1 (Mirroring)

  • RAID 5 (Striping with Parity)

  • RAID 6 (Striping with Dual Parity)

  • Nested RAID (RAID 10 or RAID 1+0, Stripe of Mirrors)

RAID 0 (Striping)
  • Requires at least two physical drives to operate.

  • Data is split across multiple storage devices to increase performance.

  • A single file is divided into multiple parts, and each part is stored on a different drive.

  • Example: Drive 1 has Block 1A, Drive 2 has Block 2A, Drive 1 has Block 3A, and so on, allowing for parallel read and write operations.

  • Known for its speed because data is written to multiple drives simultaneously, effectively multiplying the read and write speeds.

  • Losing one drive results in data loss due to the unavailability of parts of the file, making it unsuitable for critical data.

  • RAID 0 has zero redundancy, meaning there is no fault tolerance.

RAID 1 (Mirroring)
  • Requires at least two drives to create a mirror.

  • Data is duplicated across multiple drives, providing complete redundancy.

  • Disk 0 has an exact duplicate of the information on Disk 1, ensuring data availability in case of a drive failure.

  • Requires twice as much storage space to store the same information, effectively halving the usable storage capacity.

  • If a drive fails, the other drive containing the exact duplicate of the data remains available, allowing for continuous operation.

  • The failed drive should be replaced, and the mirror recreated, but data remains accessible during this process with minimal downtime.

RAID 5 (Striping with Parity)
  • Uses parity to provide redundancy without fully duplicating the data.

  • Striping is similar to RAID 0, where a file is split into pieces and distributed across multiple drives.

  • In each stripe, one drive holds parity data instead of file data; the parity is calculated from the data on the other drives.

  • In a RAID 5 array with four drives, the equivalent of three drives stores data and the equivalent of one drive stores parity data, with the parity placed on the drives in a rotating manner.

  • Parity is distributed across multiple physical drives to make the recovery process more efficient and balanced.

  • Doesn't require duplicating all data, thus saving drive space compared to RAID 1.

  • If a drive fails, the data can be recreated using the remaining data and parity information, ensuring data availability.

  • Parity calculation requires processing overhead, and performance is affected most during recovery, while the lost data is being recalculated, potentially slowing down the system.

RAID 6 (Striping with Dual Parity)
  • Similar to RAID 5 but adds an additional storage drive with an additional parity block, providing enhanced fault tolerance.

  • Can withstand the failure of two drives simultaneously without data loss.

  • If one drive fails, it functions like RAID 5, using the remaining data and parity to maintain operation.

  • If two drives fail, the lost data can be recreated using the existing parity data from the other drives.

  • Losing two physical drives in a RAID 6 array still allows access to all data, making it suitable for critical systems.

  • Requires the equivalent of one more physical drive to store the additional parity data, reducing the overall usable storage capacity.

  • The extra drive does not add usable capacity, only redundancy, ensuring data protection.

RAID 10 (RAID 1+0, Stripe of Mirrors)
  • Combines RAID 0 (Striping) with RAID 1 (Mirroring) to offer both performance and redundancy.

  • With RAID 0, a single file is split into blocks and distributed across multiple drives, increasing read and write speeds.

  • RAID 0 has zero redundancy, so losing a drive results in data loss, which is mitigated by the mirroring in RAID 10.

  • RAID 1+0 adds mirroring to the striped set of drives, creating a highly available and fast storage solution.

  • Requires at least four drives to implement, with each pair of drives forming a mirrored set.

  • Drives are paired into mirrors, and data is striped across those mirrored pairs, providing redundancy and improving performance.

  • Can withstand the loss of multiple drives and still remain operational, depending on which drives fail.

  • For example, in one scenario, losing one drive per mirror still keeps the system running, ensuring high availability.

Practical Considerations and Redundancy Cost
  • RAID 0:

    • No redundancy

    • All disk space is usable

    • If you have two 5 TB drives, that is 10 TB of available space.

  • RAID 1:

    • Fault tolerance of 1/2 (one half): one drive in each two-drive mirror can fail.

    • If you have 20 TB of raw space made up of 5 TB drives, you will only be able to save 10 TB of files.

  • RAID 5:

    • Needs a minimum of three drives.

    • Loses one disk's worth of space to parity.

    • If you have four 5 TB drives, you will only get 15 TB, as one drive's worth of space is dedicated to parity.

  • RAID 6:

    • Needs a minimum of four drives.

    • Two disks' worth of space is dedicated to parity.

  • RAID 10:

    • Needs a minimum of four drives, and an even number of them, to create the sets of mirrors.

    • It's expensive as you must lose half the space for redundancy.

Questions and Parity Explained
  • Once set up, RAID is automatic, managed by the RAID controller.

  • RAID is usually about creating redundancy across disks so that if one disk fails, the array can rebuild, ensuring data integrity.

  • The array can rebuild itself once the failed disk has been replaced, often in the background, without interrupting system operations.

  • Parity is a mathematical computation (commonly XOR) applied to the data that produces the parity bits, which are used to reconstruct lost data.

  • Parity bits allow you to rebuild your array: the missing data is recalculated from the surviving data and the parity, restoring the data.

  • With parity, instead of a full physical copy, you store a much smaller piece of calculated data. The parity still needs its own space in the array.

  • Parity is distributed throughout the array, enhancing fault tolerance.

  • With distributed parity, no single disk is designated as the parity disk; instead, all disks hold some of the parity, improving performance.

  • Fault tolerance means how many disks can be lost while the system still functions, indicating the system's resilience.

  • Hot swappable means a drive can be switched out while the system is still on, minimizing downtime.

Final Words
  • Study RAID, as it can show up on the test and be a PBQ (performance-based question).