Chapter 21 - Business Continuity, Disaster Recovery, and Change Management

Change Management

  • Definition: The structured process of managing the transition of large software systems from development to implementation, including tracking changes during development and operations.

  • Key Aspects:

    • Ensures controlled and documented changes.

    • Minimizes disruptions during transitions.


Configuration Management

  • Definition: The application of processes to manage changes to both software and hardware configurations.

  • Key Aspects:

    • Tracks system configurations.

    • Ensures consistency across environments.


Business Continuity (BC)

  • Definition: Keeping a business operational during and after disruptive events through advance planning.

  • Key Components:

    • Disaster Recovery Plan (DRP): Focuses on recovery after a disaster, prioritizing human safety and critical systems.

    • Business Impact Analysis (BIA):

      • Documents the impact of disruptions on operations.

      • Identifies critical systems and prioritizes backups/restoration.

    • Single Points of Failure: Any one component (e.g., a router or power supply) whose failure alone would take down a system; identify these and mitigate them with redundancy.
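One way to spot candidate single points of failure is to count how many components serve each role. The sketch below is a minimal illustration only: the inventory format, role names, and component names are hypothetical, and real redundancy analysis also has to consider shared dependencies such as a common power feed.

```python
from collections import Counter

def single_points_of_failure(inventory):
    """Return the roles that are served by exactly one component."""
    counts = Counter(role for role, _name in inventory)
    return sorted(role for role, n in counts.items() if n == 1)

# Hypothetical inventory of (role, component name) pairs.
inventory = [
    ("router", "edge-rtr-1"),
    ("power_supply", "psu-a"),
    ("power_supply", "psu-b"),
    ("database", "db-primary"),
]

# Roles with no redundant partner are flagged for mitigation.
print(single_points_of_failure(inventory))
```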


Backups

  • Key Considerations:

    • Frequency: How often backups occur.

    • Extent: What data/applications are backed up.

    • Storage: Where backups are kept (geographic diversity recommended).

    • Retention: How long backups are stored. A common practice is the "Rule of Three": keep the three most recent backups and overwrite the oldest when a new one is made.

    • Responsibility: Who ensures backups are created/maintained.
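The retention rule can be sketched as a simple rotation. This example assumes the "Rule of Three" means keeping the three most recent backups, and it identifies each backup by a hypothetical ISO date label (which sorts correctly as plain text). It only decides what to keep; deleting or overwriting media is left out.

```python
def rotate_backups(existing, new_backup, keep=3):
    """Add new_backup and return (kept, discarded), oldest first.

    Labels are assumed to be ISO dates (YYYY-MM-DD), so sorting the
    strings also sorts the backups chronologically.
    """
    all_backups = sorted(existing + [new_backup])
    return all_backups[-keep:], all_backups[:-keep]

existing = ["2024-01-01", "2024-01-08", "2024-01-15"]
kept, discarded = rotate_backups(existing, "2024-01-22")
print("keep:", kept)
print("discard:", discarded)
```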

Backup Types:
  1. Full Backup: Copies all files/software.

  2. Differential Backup: Copies files changed since the last full backup.

  3. Incremental Backup: Copies only files changed since the last backup of any type (full or incremental). A delta backup goes further and copies only the changed portions of files.

  4. Snapshots: Point-in-time copies of virtual machines (VMs).
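The difference between full, differential, and incremental backups comes down to which cutoff time is used when selecting files. The following is a minimal sketch under simplifying assumptions: modification times are plain numbers, and the function name and file names are hypothetical. Snapshots are not modeled.

```python
def files_to_copy(kind, mtimes, last_full, last_backup):
    """Select the files a backup run would copy.

    mtimes:      {path: modification time}
    last_full:   time of the last full backup
    last_backup: time of the most recent backup of any type
    """
    if kind == "full":
        return sorted(mtimes)          # everything, regardless of age
    if kind == "differential":
        cutoff = last_full             # changes since the last full backup
    elif kind == "incremental":
        cutoff = last_backup           # changes since the last backup of any type
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(p for p, t in mtimes.items() if t > cutoff)

mtimes = {"a.txt": 5, "b.txt": 12, "c.txt": 25}
for kind in ("full", "differential", "incremental"):
    print(kind, files_to_copy(kind, mtimes, last_full=10, last_backup=20))
```

With a full backup at time 10 and an incremental at time 20, the differential run still copies everything changed since time 10, while the incremental run copies only what changed after time 20. This is why differentials grow over time but need only two sets to restore, and incrementals stay small but need the whole chain.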

Backup Challenges:
  • Long-Term Storage: Magnetic media degrades; rotate/update backups.

  • Encryption: Encrypted backups are unrecoverable without their keys; make sure more than one authorized person can access the decryption keys.

  • Legal/Compliance: Data sovereignty laws may require backups in specific countries.


Alternative Sites for Disaster Recovery

  Site Type    Cost        Recovery Speed         Complexity
  Hot Site     High        Immediate/Few Hours    Low
  Warm Site    Moderate    Days                   Moderate
  Cold Site    Low         Weeks                  High

  • Mutual Aid Agreement: Organizations agree to support each other during a disaster (risky if the same event hits both).


Restoration Priorities

  1. Dependencies: Systems required for others to function.

  2. Critical Infrastructure: Core business systems.
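Restoring dependencies first is a topological ordering problem. The sketch below uses Python's standard `graphlib` module (Python 3.9+); the system names and the dependency map are hypothetical examples, not taken from the chapter.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each system -> the systems it needs before it can run.
depends_on = {
    "web_app": {"database", "dns"},
    "database": {"storage"},
    "dns": set(),
    "storage": set(),
}

# static_order() yields every system after all of its dependencies.
restore_order = list(TopologicalSorter(depends_on).static_order())
print(restore_order)
```

Systems with no mutual dependency (here, dns and storage) may appear in either order; business criticality can then break the tie.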


Utilities & Power

  • Short Outages: Use UPS (Uninterruptible Power Supply).

  • Extended Outages: Deploy backup generators.


Secure Recovery

  • Ensures access to critical data/files remotely during disruptions.


Continuity of Operations Planning (COOP)

  • Defines which operations must continue during disruptions.


Disaster Recovery Plan (DRP)

  • Key Elements:

    • Recovery Time Objective (RTO): Target time to resume operations (shorter = higher cost).

    • Recovery Point Objective (RPO): Maximum acceptable data loss (dictates backup frequency).

  • Testing: Regularly rehearse the plan.
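The link between these objectives and day-to-day operations can be checked with simple arithmetic. This sketch assumes that worst-case data loss is roughly one full backup interval and that restore time has been estimated from a rehearsal; the function names and the numbers are hypothetical.

```python
def meets_rpo(backup_interval_hours, rpo_hours):
    """Worst-case data loss is about one backup interval,
    so the interval must not exceed the RPO."""
    return backup_interval_hours <= rpo_hours

def meets_rto(estimated_restore_hours, rto_hours):
    """The rehearsed restore time must fit inside the RTO."""
    return estimated_restore_hours <= rto_hours

# Nightly backups cannot satisfy a 4-hour RPO; backups every 4 hours can.
print(meets_rpo(backup_interval_hours=24, rpo_hours=4))
print(meets_rpo(backup_interval_hours=4, rpo_hours=4))

# A 6-hour rehearsed restore fits an 8-hour RTO.
print(meets_rto(estimated_restore_hours=6, rto_hours=8))
```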


Configuration Management Processes

  1. Configuration Identification: Tagging assets (e.g., hardware/software) as Configuration Items (CIs).

  2. Baseline: A stable reference point for comparison.

  3. Configuration Status Accounting: Tracking CI changes.

  4. Configuration Auditing: Verifying compliance with policies.
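Comparing a configuration item against its baseline can be sketched as a dictionary diff. The setting names and values below are hypothetical, and the sketch treats a missing setting the same as a setting whose value is None.

```python
def config_drift(baseline, current):
    """Return {setting: (baseline_value, current_value)} for each difference.

    A setting absent from one side is reported with None on that side.
    """
    keys = baseline.keys() | current.keys()
    return {
        k: (baseline.get(k), current.get(k))
        for k in sorted(keys)
        if baseline.get(k) != current.get(k)
    }

# Hypothetical baseline and observed configuration for one CI.
baseline = {"ssh_root_login": "no", "ntp_server": "10.0.0.1"}
current = {"ssh_root_login": "yes", "ntp_server": "10.0.0.1", "telnet": "enabled"}

for setting, (expected, found) in config_drift(baseline, current).items():
    print(f"{setting}: baseline={expected!r} current={found!r}")
```

An audit would flag both the changed value and the setting that does not exist in the baseline at all.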


Change Control

  • Change Control Board (CCB): Approves/reviews changes.

  • System Problem Report (SPR): Tracks change requests.

  • Backout Plan: Reverts changes if issues arise.
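A backout plan amounts to capturing a known-good state before the change and returning to it if verification fails. The following is a minimal in-memory sketch; the function name, the configuration dictionary, and the verification check are all hypothetical, and a real backout also has to cover data, schemas, and dependent systems.

```python
import copy

def apply_change(config, change, verify):
    """Apply `change` to a copy of `config`.

    Returns (resulting_config, status). If verify() rejects the new
    configuration, the untouched snapshot is returned instead.
    """
    snapshot = copy.deepcopy(config)      # backout point
    candidate = copy.deepcopy(config)
    candidate.update(change)
    if verify(candidate):
        return candidate, "applied"
    return snapshot, "backed out"

config = {"max_connections": 100}

def verify(cfg):
    # Hypothetical post-change check: the service must accept connections.
    return cfg["max_connections"] > 0

print(apply_change(config, {"max_connections": 200}, verify))
print(apply_change(config, {"max_connections": 0}, verify))
```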


Capability Maturity Model Integration (CMMI)

  • Levels:

    1. Initial: Chaotic processes.

    2. Managed: Planned processes.

    3. Defined: Standardized processes.

    4. Quantitatively Managed: Metrics-driven.

    5. Optimizing: Continuous improvement.


Development Environments

  1. Development: Where code is written.

  2. Test: Mirrors production for validation.

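  3. Staging: Optional; a final pre-production step where a tested release is held and rehearsed before deployment, typically used when there are multiple production environments.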

  4. Production: Live systems with real data.


Security Practices

  • Secure Baseline: A known-good system state with required patches and security settings applied, used as the reference for new deployments.

  • Sandboxing: Isolates systems to prevent issues from spreading.

  • Integrity Measurement: Detects unauthorized changes.
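Integrity measurement is commonly done by hashing content and comparing the result to a recorded baseline. The sketch below uses SHA-256 from Python's standard `hashlib`; the file names and contents are hypothetical and held in memory to keep the example self-contained. It detects changed and newly added items, not items that have been removed.

```python
import hashlib

def measure(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as hex."""
    return hashlib.sha256(data).hexdigest()

def changed_items(baseline_hashes, current_contents):
    """Names whose current hash differs from (or is missing in) the baseline."""
    return sorted(
        name
        for name, data in current_contents.items()
        if baseline_hashes.get(name) != measure(data)
    )

# Record the baseline from known-good contents.
known_good = {"sshd_config": b"PermitRootLogin no\n", "motd": b"Welcome\n"}
baseline_hashes = {name: measure(data) for name, data in known_good.items()}

# Later measurement: one file has been altered.
current = {"sshd_config": b"PermitRootLogin yes\n", "motd": b"Welcome\n"}
print(changed_items(baseline_hashes, current))
```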


Key Takeaways

  • Plan Ahead: BIA and DRP are critical.

  • Test Backups/DRP: Ensure they work when needed.

  • Avoid Single Points of Failure: Redundancy is key.

  • Document Everything: Changes, configurations, and recovery steps.