Data Redundancy (OBJ 3.4)

Data Redundancy

  • Definition of Data Redundancy
    • Data redundancy is the practice of storing identical copies of data on multiple physical storage devices so that the data remains available and intact if a device fails.

RAID (Redundant Array of Independent Disks)

  • Definition of RAID
    • RAID is a technology that combines multiple physical storage devices into a single logical storage device recognized by the operating system.
  • Purpose of RAID
    • To create data redundancy, improve performance, and increase data availability; which of these goals are met depends on the RAID level.
  • Common RAID Types
    • RAID 0, RAID 1, RAID 5, RAID 6, RAID 10

RAID 0

  • Definition
    • RAID 0 provides data striping across multiple disks to enhance performance.
  • Key Feature
    • Striping: Data is split across several drives.
  • Characteristics
    • No redundancy or fault tolerance. If one drive fails, all data is lost.
    • Ideal for scenarios where performance is prioritized over data protection, such as in high-end video editing workstations.
  • Example
    • Striping raw video files across two hard disk drives so they can be read and written faster than on a single drive.
  • Requirements
    • Minimum of two disks configured to work together.
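
A minimal sketch of striping, assuming a toy model in which each disk is a Python list and the chunk size is 4 bytes. The function names `stripe` and `read_back` are hypothetical and chosen for illustration only; real controllers work on much larger chunks at the block-device level.

```python
def stripe(data: bytes, num_disks: int, chunk_size: int = 4) -> list:
    """Split data into fixed-size chunks and deal them round-robin across disks."""
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % num_disks].append(data[offset:offset + chunk_size])
    return disks


def read_back(disks: list) -> bytes:
    """Reassemble the data by reading one chunk from each disk in turn."""
    out = []
    depth = max(len(disk) for disk in disks)
    for row in range(depth):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"raw video frames", num_disks=2)
restored = read_back(disks)
```

Because every disk holds only part of each file, losing any one list in `disks` leaves gaps that cannot be filled, which is why RAID 0 offers no fault tolerance.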

RAID 1

  • Definition
    • RAID 1 provides redundancy through data mirroring across two storage devices.
  • Key Feature
    • Mirroring: Data is duplicated identically on both drives.
  • Characteristics
    • Offers high availability; if one disk fails, the other continues to operate uninterrupted.
    • Provides minimal downtime as there's always a full copy of the data available.
  • Example
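    • Using a RAID 1 configuration for storing completed videos on a high-end video editing workstation, so that a second, always-current copy stays online if one drive fails. A mirror does not replace a separate backup, because deletions and corruption are mirrored too.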
  • Requirements
    • Minimum of two physical storage devices, presented as a single logical storage unit with the usable capacity of one device.
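
The mirroring behaviour described above can be sketched as follows, assuming a toy model in which two dictionaries stand in for the two physical drives. The `Mirror` class and its method names are hypothetical, not part of any real RAID tool.

```python
class Mirror:
    """Toy RAID 1: every write goes to both disks; reads use any healthy disk."""

    def __init__(self):
        self.disks = [{}, {}]            # two dicts stand in for two drives
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Duplicate the block on every disk that is still healthy.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Serve the read from the first healthy disk.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both mirrors have failed")

    def fail(self, index: int) -> None:
        # Simulate a drive failure: mark it failed and drop its contents.
        self.failed[index] = True
        self.disks[index] = {}


array = Mirror()
array.write(0, b"final_cut.mp4")
array.fail(0)
recovered = array.read(0)    # still served by the surviving mirror
```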

RAID 5

  • Definition
    • RAID 5 is characterized by data striping with parity across multiple disks.
  • Key Feature
    • Striping with Parity: Data and parity information are distributed across the drives.
  • Characteristics
    • Requires a minimum of three disks.
    • Fault tolerant; if one drive fails, its contents can be reconstructed from the data and parity on the remaining disks.
    • Hot-swapping, where the hardware supports it, allows a failed disk to be replaced while the array remains operational.
  • Example
    • A server using RAID 5 keeps running after a single disk failure and rebuilds the lost data onto the replacement disk.
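
A minimal sketch of how parity allows reconstruction, assuming XOR parity over a single stripe with two data blocks and one parity block. The helper name `xor_blocks` is hypothetical, and a real RAID 5 array also rotates the parity block across the disks from stripe to stripe, which this sketch leaves out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a three-disk array: two data blocks plus their parity.
data_blocks = [b"AAAA", b"BBBB"]
parity = xor_blocks(data_blocks)
stripe = data_blocks + [parity]

# Simulate losing the disk that held the second data block.
lost_index = 1
survivors = [block for i, block in enumerate(stripe) if i != lost_index]

# XOR of everything that survived yields the missing block.
rebuilt = xor_blocks(survivors)
```

The same XOR recovers whichever single block is missing, data or parity, but it cannot recover two missing blocks from one stripe, which is the limit RAID 6 addresses.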

RAID 6

  • Definition
    • RAID 6 is an extension of RAID 5 that adds a second set of parity for enhanced reliability.
  • Key Feature
    • Striping with Double Parity: Two independent sets of parity data are distributed across the drives.
  • Characteristics
    • Requires at least four storage devices.
    • Can handle the failure of two disks without data loss or downtime.
  • Example
    • A four-disk RAID 6 array keeps serving data even if two disks fail at the same time, a situation in which a RAID 5 array would lose its data.

RAID 10 (RAID 1+0)

  • Definition
    • RAID 10 combines features of RAID 1 and RAID 0, offering both mirroring and striping.
  • Key Feature
    • Striped Array of Mirrored Arrays: Data is both mirrored and striped, providing performance and redundancy.
  • Characteristics
    • Minimum of four disks, configured as two mirrored RAID 1 pairs that are then striped together.
    • Allows for fault tolerance while providing improved read/write speeds.
    • Can withstand the failure of two disks as long as they are from different mirrored sets.
  • Example
    • If one disk from each mirrored set fails, the array continues operating, but no redundancy remains until the failed disks are replaced.
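
The survival rule above can be sketched as a small check, assuming each mirrored set is written as a tuple of disk names. The function `raid10_survives` and the disk names are hypothetical and used only for illustration.

```python
def raid10_survives(mirror_sets, failed_disks):
    """Return True if every mirrored set still has at least one healthy disk."""
    failed = set(failed_disks)
    return all(
        any(disk not in failed for disk in pair)
        for pair in mirror_sets
    )


# Four disks arranged as two mirrored pairs that are striped together.
mirror_sets = [("disk0", "disk1"), ("disk2", "disk3")]

one_per_set = raid10_survives(mirror_sets, {"disk0", "disk2"})   # different sets
same_set = raid10_survives(mirror_sets, {"disk0", "disk1"})      # same set
```

Here `one_per_set` is `True` and `same_set` is `False`: the array tolerates two failures only when they land in different mirrored sets.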

Classification of RAID Systems

Failure-Resistant Systems

  • Definition
    • Designed to withstand specific hardware malfunctions without data loss.
  • Components
    • Achieved by mirroring data across multiple devices as in RAID 1 or RAID 10.

Fault-Tolerant Systems

  • Definition
    • Systems that allow continued operation without downtime during a hardware failure.
  • Components
    • Include mirroring or striping with parity, as in RAID 1, 5, 6, and 10.
  • Benefits
    • Data can be rebuilt quickly from the healthy devices, which keeps the system resilient.

Disaster-Tolerant Systems

  • Definition
    • Provide a broader level of protection, covering catastrophic events rather than only individual hardware failures.
  • Components
    • Use two or more independent zones, each with full access to the data, so a complete copy remains available if one zone is lost.
  • Examples
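    • RAID 1 and RAID 10 can be considered disaster-tolerant due to their mirroring capabilities, provided the mirrored copies sit in independent zones; mirrors housed in the same enclosure do not protect against a site-wide event.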

Summary of Key RAID Functions

  • RAID 0: Enhances performance via striping but lacks redundancy.
  • RAID 1: Maintains exact data copies for redundancy and availability.
  • RAID 5: Stripes data with distributed parity for performance and tolerance of a single disk failure.
  • RAID 6: Similar to RAID 5 but adds a second set of parity, so it survives two disk failures.
  • RAID 10: Combines RAID 1 and RAID 0 to improve performance and redundancy, continuing to operate as long as each mirrored set keeps at least one healthy drive.
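
As a rough way to compare the levels summarized above, the following sketch computes usable capacity and the number of disk failures each level is guaranteed to survive. It assumes equal-sized disks and a two-disk mirror for RAID 1, as in these notes; the function name `raid_profile` and its return format are hypothetical.

```python
def raid_profile(level: int, disks: int, disk_tb: float) -> dict:
    """Return usable capacity and guaranteed failure tolerance for one array."""
    # level -> (minimum disks, data-bearing disks, failures always survived)
    rules = {
        0: (2, disks, 0),           # striping only, no redundancy
        1: (2, 1, 1),               # assumes a two-disk mirror
        5: (3, disks - 1, 1),       # one disk's worth of space holds parity
        6: (4, disks - 2, 2),       # two disks' worth of space hold parity
        10: (4, disks // 2, 1),     # a second failure must hit another mirrored set
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")
    minimum, data_disks, tolerated = rules[level]
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")
    if level == 10 and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    return {"usable_tb": data_disks * disk_tb, "failures_tolerated": tolerated}


# Four 2 TB disks under three different levels.
raid5 = raid_profile(5, disks=4, disk_tb=2.0)
raid6 = raid_profile(6, disks=4, disk_tb=2.0)
raid10 = raid_profile(10, disks=4, disk_tb=2.0)
```

With four 2 TB disks, RAID 5 yields 6 TB usable and survives one failure, while RAID 6 and RAID 10 both yield 4 TB; RAID 6 survives any two failures, whereas RAID 10 is only guaranteed to survive one.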