Lesson 13: Explain the Importance of Resilience and Recovery in Security Architecture

Last updated 9:11 AM on 5/28/26

72 Terms

1

Lesson 13

Backup and Recovery

Backup and recovery processes ensure that accurate and reliable copies of data and system configurations are created, maintained, and tested.

Traditional
Online
Replication
Automation

2

Traditional (Backup and Recovery)

Traditional backup and recovery generally uses removable media or a local disk library.

3

Online (Backup and Recovery)

Online backup and recovery preserves data by creating copies of it and storing them in an online or cloud-based environment.

4

Replication

Replication is the process of creating and maintaining multiple copies of data across different locations.

5

Automation (Backup and Recovery)

An automation approach relies on automated provisioning and replacement of systems.

6

Traditional Backup Strategies

Full Backup
Differential
Incremental

7

Full Backup

Backs up all files. Resets the archive bit. Restore requires only the full backup media.

8

Differential

Backs up all files created or modified since the last full backup. Does not reset the archive bit. Restore requires the full backup + the most recent differential.

9

Incremental

Backs up all files created or modified since the last backup of any type (full or incremental). Resets the archive bit. Restore requires the full backup + all subsequent incrementals.
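
The restore rules for the three strategies can be sketched in code. This is a minimal illustration, not a backup tool: the `Backup` record and the `restore_chain` function are hypothetical names, and the sketch assumes a backup cycle uses either differentials or incrementals after the full backup, never both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # day the backup was taken (hypothetical schedule index)
    kind: str  # "full", "differential", or "incremental"


def restore_chain(backups: list) -> list:
    """Return the backups needed for a restore, oldest first."""
    ordered = sorted(backups, key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("a restore needs at least one full backup")
    start = full_positions[-1]          # most recent full backup
    later = ordered[start + 1:]         # everything taken after it
    kinds = {b.kind for b in later}
    if len(kinds) > 1:
        # Assumption of this sketch: one strategy per backup cycle.
        raise ValueError("mixed differential/incremental cycles not handled")
    chain = [ordered[start]]
    if kinds == {"differential"}:
        chain.append(later[-1])         # full + most recent differential
    elif kinds == {"incremental"}:
        chain.extend(later)             # full + every incremental since
    return chain
```

With a full backup on day 0 and three later backups, the differential schedule restores from two sets of media while the incremental schedule needs all four, which is the trade-off between backup time and restore time.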

10

Disk-to-Disk Options

Network-attached Storage (NAS)
Storage Area Network (SAN)

11

Network-attached Storage (NAS)

A NAS is a dedicated file-level storage device.
- Connects over Ethernet
- Relatively inexpensive to add more NAS devices

12

Storage Area Network (SAN)

A SAN is a specialized high-speed network (e.g., Fibre Channel) that provides network access to storage devices. A SAN can create an image by mirroring a production disk to another disk inside the storage array.

13

Online Backup Strategies

Cloud
Disk Shadowing
Electronic Vaulting
Remote Journaling

14

Cloud (Online Backup)

Scheduled backup to a cloud location or cloud backup service.

15

Disk Shadowing

In disk shadowing, data is written to (and read from) two or more independent disks. The process is transparent to the user.

16

Electronic Vaulting

Electronic vaulting copies files as they change and periodically transmits them to a secure backup location.

17

Remote Journaling

Remote journaling copies and periodically transmits logs to a backup location.

18

Replication Strategies

Point-in-Time
Asynchronous
Synchronous

19

Point-in-Time (Replication)

Periodic snapshots are taken and replicated.

20

Asynchronous (Replication)

Asynchronous replication is an automated process that streams copies of data. The write is considered complete as soon as local storage commits. Remote storage is updated with a slight time lag.

21

Synchronous (Replication)

Synchronous replication is an automated process that streams copies of data. Both the local and the remote write must complete successfully before the system can proceed, so no acknowledged write is lost (zero data loss), at the cost of added write latency.
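
The difference between the two streaming modes comes down to when a write is acknowledged. The sketch below models that with in-memory dictionaries; `ReplicatedStore`, `drain`, and `at_risk` are hypothetical names chosen for illustration, and real replication products handle ordering, failures, and networking that this toy ignores.

```python
from collections import deque


class ReplicatedStore:
    """Toy model of a primary store with one remote replica."""

    def __init__(self, synchronous: bool):
        self.synchronous = synchronous
        self.local = {}
        self.remote = {}
        self.pending = deque()  # writes committed locally, not yet remote

    def write(self, key, value) -> str:
        self.local[key] = value
        if self.synchronous:
            # Synchronous: the remote write completes before we acknowledge.
            self.remote[key] = value
        else:
            # Asynchronous: acknowledge now, ship the write later.
            self.pending.append((key, value))
        return "ack"

    def drain(self) -> None:
        """Apply queued writes at the remote site (the 'slight time lag')."""
        while self.pending:
            key, value = self.pending.popleft()
            self.remote[key] = value

    def at_risk(self) -> set:
        """Keys that would be lost if the primary failed right now."""
        return {k for k, v in self.local.items() if self.remote.get(k) != v}
```

In this model a synchronous store never has keys at risk, while an asynchronous store has them until `drain()` runs.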

22

Automation Strategies

Infrastructure-as-Code (IaC)
Immutable System

23

Infrastructure-as-Code (IaC)

IaC uses code (configuration files) to manage configurations and automate the provisioning of infrastructure.
- Addresses the problem of configuration drift.
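
Configuration drift is the gap between the state declared in code and the state actually running. A minimal sketch of drift detection follows; the function names and the settings in the usage note are hypothetical, and real IaC tools compare far richer resource models.

```python
def detect_drift(desired: dict, actual: dict) -> dict:
    """Report every setting whose actual value differs from the declared one."""
    drift = {}
    for key in desired.keys() | actual.keys():
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = {"desired": want, "actual": have}
    return drift


def converge(desired: dict, actual: dict) -> dict:
    """Return the state after re-applying the declared configuration.

    Settings not present in the declaration are dropped, so the result
    always equals the declared state.
    """
    return dict(desired)
```

For example, if the declaration says `{"ssh_root_login": "no", "ntp": "on"}` and a server reports `{"ssh_root_login": "yes", "ntp": "on"}`, `detect_drift` flags only `ssh_root_login`, and `converge` restores the declared value.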

24

Immutable System

Immutability is the principle that resources should not be changed, only created and destroyed (e.g., an image of a system preconfigured to a desired "known good" state). Uses automation to replace rather than fix.

25

Cost Balancing

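The Cost of Disruption and the Cost to Recover are inversely related: the faster (and therefore more expensive) the recovery solution, the lower the losses from a disruption. The goal is to balance the two.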
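
One way to reason about the balance is to add the cost of the recovery solution to the expected loss during the outage and pick the smallest total. The sketch below does that arithmetic; every figure in `OPTIONS` and every hourly loss rate is an invented, illustrative number, not data from the lesson.

```python
# Hypothetical options: name -> (recovery time in hours, solution cost).
OPTIONS = {
    "traditional": (48, 10_000),
    "enhanced": (24, 25_000),
    "rapid": (2, 80_000),
    "continuous": (0.01, 300_000),
}


def total_cost(recovery_hours: float, solution_cost: float, loss_per_hour: float) -> float:
    """Cost to recover plus cost of disruption for one outage."""
    return solution_cost + recovery_hours * loss_per_hour


def cheapest_option(options: dict, loss_per_hour: float) -> str:
    """Name of the option with the lowest combined cost."""
    return min(options, key=lambda name: total_cost(*options[name], loss_per_hour))
```

Under these made-up numbers, a business losing 500 per hour of downtime is best served by the traditional option, one losing 5,000 per hour by rapid recovery, and one losing 1,000,000 per hour by continuous availability.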

26

Traditional Recovery

  • Tape backup
    - Low complexity
    - Low cost
    - Recovery measured in hours to days
    - More recoverable

27

Enhanced Recovery

  • Automated solutions
    - Medium complexity
    - Low cost
    - Recovery measured in hours to days
    - More recoverable

28

Rapid Recovery

  • Asynchronous replication
    - High complexity
    - Moderate cost
    - Recovery measured in minutes to hours

29

Continuous Availability

  • High complexity
  • High cost
  • Recovery measured in seconds

30

Resiliency

Resiliency is the capability to continue operating despite a disruption or abnormal operating conditions.

Redundancy is the duplication of critical components or functions with the intention of increasing reliability and mitigating the risks associated with a single point of failure (SPOF).

- Fault tolerance is the capability of a system to continue to operate in the event of failure of one or more system components (redundancy).
- Categories of resiliency include system, storage, power, transmission, and site.

31

System Resiliency

Load Balancing
Clustering
High Availability
Fail-secure

32

Load Balancing

Load balancing involves distributing incoming network traffic across multiple independent systems to ensure that no single server becomes overwhelmed with requests.
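
A minimal sketch of one common distribution policy, round robin, with unhealthy servers skipped. The class and method names are hypothetical; production load balancers add health probes, weighting, and connection tracking.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._ring = cycle(self.servers)

    def mark_down(self, server) -> None:
        self.healthy.discard(server)

    def mark_up(self, server) -> None:
        if server in self.servers:
            self.healthy.add(server)

    def route(self):
        """Return the next healthy server in the rotation."""
        if not self.healthy:
            raise RuntimeError("no healthy servers available")
        # One full lap of the ring is enough to find a healthy server.
        for _ in range(len(self.servers)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy servers available")
```

With servers `a`, `b`, and `c`, requests rotate through all three; after `mark_down("b")`, traffic alternates between the remaining two, so no request is sent to the failed system.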

33

Clustering

Clustering involves grouping multiple systems together to form a single logical unit or cluster.

34

High Availability

High Availability (HA) is an automatic failover capability that reduces or eliminates the need to manually activate redundant hardware.
- Asymmetric (active/passive)
- Symmetric (active/active)
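
The asymmetric (active/passive) arrangement can be sketched as a pair of nodes where the standby is promoted automatically when the active node fails. `HAPair` and its methods are hypothetical names; a real cluster would detect failure through heartbeats rather than an explicit call.

```python
class HAPair:
    """Toy active/passive pair with automatic failover."""

    def __init__(self, primary: str, standby: str):
        self.health = {primary: True, standby: True}
        self.active = primary
        self.standby = standby

    def report_failure(self, node: str) -> None:
        """Record a failure; promote the standby if the active node died."""
        self.health[node] = False
        if node == self.active and self.health[self.standby]:
            self.active, self.standby = self.standby, self.active

    def handle(self, request: str) -> str:
        if not self.health[self.active]:
            raise RuntimeError("no healthy node available")
        return f"{self.active} served {request}"
```

After `report_failure` on the active node, the next call to `handle` is served by the former standby with no operator action, which is the behavior the definition describes. In a symmetric (active/active) design both nodes would serve traffic at the same time instead.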

35

Fail-secure

The principle that a failure should leave the system in a secure or trustworthy state.
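
In code, fail-secure usually means that an error in a security check results in denial rather than access. A minimal sketch, assuming a hypothetical `policy_lookup` callable supplied by the caller:

```python
def is_authorized(user: str, resource: str, policy_lookup) -> bool:
    """Allow access only on an explicit True; deny on any error or odd result."""
    try:
        return policy_lookup(user, resource) is True
    except Exception:
        # Fail-secure: if the policy service is broken, deny access.
        return False
```

If the lookup raises (for example, because the policy service is unreachable) or returns anything other than `True`, the request is denied.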

36

RAID

Redundant Array of Independent Disks (RAID) is a data storage virtualization technology.

- RAID combines multiple disk drive components into one or more logical units for the purposes of fault tolerance (data redundancy) and/or performance improvement.
- RAID can be configured for mirroring, striping, or both.
- Disk mirroring is the process of writing the same data to two partitions on separate disks.
- Disk striping is the process of dividing data into blocks and spreading the data blocks across multiple storage devices.
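
Mirroring and striping can be illustrated with byte strings standing in for disks. This is a sketch of the two layouts only (commonly known as RAID 1 and RAID 0); the function names are hypothetical and there is no parity or real device I/O.

```python
def stripe(data: bytes, disks: int, block_size: int) -> list:
    """Split data into blocks and deal them across disks in rotation."""
    layout = [[] for _ in range(disks)]
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    for index, block in enumerate(blocks):
        layout[index % disks].append(block)
    return layout


def unstripe(layout: list) -> bytes:
    """Reassemble striped data by reading blocks back in rotation."""
    disks = len(layout)
    total_blocks = sum(len(disk) for disk in layout)
    return b"".join(layout[i % disks][i // disks] for i in range(total_blocks))


def mirror(data: bytes, copies: int = 2) -> list:
    """Write identical copies of the data, one per disk."""
    return [bytes(data) for _ in range(copies)]
```

Striping `b"ABCDEFGH"` across two disks with a block size of 2 places `AB` and `EF` on the first disk and `CD` and `GH` on the second, so losing either disk loses the data. Mirroring keeps a full copy on each disk, so one disk can fail without data loss.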

37

Power Resiliency

Redundancy
UPS Battery Backup
Generator
Supplier Diversity

38

Redundancy (Power)

Component level: Having two or more power supplies and fans.

39

UPS Battery Backup

An uninterruptible power supply (UPS) provides backup power when a regular power source fails or voltage drops to an unacceptable level. Battery runtime is finite.

40

Generator

A generator is a standby, secondary, limited source of electrical power when the power grid is down or inaccessible. Fuel must be available.

41

Supplier Diversity

More than one supplier and/or access to multiple power grids.

42

Transmission Routing Resiliency

Alternate Routing
Diverse Routing
Last-mile Circuit Protection

43

Alternate Routing

Multiple paths for data to travel between two points. The network can automatically reroute traffic to an alternate path if the primary path becomes unavailable or congested.
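
Automatic rerouting can be sketched as a path search that ignores failed links. The breadth-first search below is one simple way to do it; the topology and the `find_path` name are hypothetical, and real networks rely on routing protocols rather than a central search.

```python
from collections import deque


def find_path(links, source, target, failed=frozenset()):
    """Return the shortest hop path from source to target, or None.

    links:  iterable of (node, node) pairs, treated as bidirectional
    failed: set of links that are currently unavailable
    """
    graph = {}
    for a, b in links:
        if (a, b) in failed or (b, a) in failed:
            continue  # skip links that are down
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)

    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

With links A-B, B-D, A-C, C-E, and E-D, traffic from A to D normally takes A, B, D. If the B-D link fails, the same call returns the alternate path A, C, E, D.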

44

Diverse Routing

Data is transmitted over multiple geographically diverse paths or routes.

45

Last-mile Circuit Protection

Redundant last-mile circuits, such as multiple fiber optic or copper cables, to provide backup paths for data transmission in case of a failure or outage on the primary circuit.

46

Alternate Physical Sites

Cold Site
Warm Site
Hot Site
Mirrored

47

Cold Site

A cold site has basic HVAC infrastructure. No server-related or communications equipment.

48

Warm Site

A warm site has HVAC, servers, and communications infrastructure and equipment. Systems need to be configured (updated). Data needs to be restored.

49

Hot Site

A hot site has HVAC, servers, and communications infrastructure and equipment. Fully configured and ready to operate. Data has been replicated.

50

Mirrored

A mirrored site is an identical (or nearly identical) site that is operational in concert with the primary site on a load-balancing basis.

51

Alternate Third-Party Sites

Mobile Site
Reciprocal Site
DRaaS

52

Mobile Site

A mobile site is a transportable modular unit with pre-ordered hardware and software. The delivery site must provide access roads, water, waste disposal, power, and connectivity.

53

Reciprocal Site

A reciprocal site is based on an agreement to have access to, or use of, another organization's facilities.

54

DRaaS

Cloud-based Disaster-Recovery-as-a-Service offers full recovery in a cloud-based environment.

55

Continuity of Operations

In its simplest form, continuity of operations is the capability of a business to continue to operate in adverse (disaster) conditions.

- In a business context, disasters are disruptive events that significantly impact an organization's capability to operate.
- The impact could be to people, technology, facilities, or any combination thereof.

56

Adverse Conditions

External
Infrastructure
Human

57

External (Adverse Conditions)

Large-scale geological or meteorological events such as earthquakes, storms, floods, hurricanes, tornadoes, and wildfires. Environmental events such as pollution and sea-level rise. Public health events such as a pandemic.

58

Infrastructure (Adverse Conditions)

Loss of services such as electricity, HVAC, or water. Technical issues such as equipment or communications failure.

59

Human (Adverse Conditions)

Workplace accidents, walkouts, strikes, civil disturbance, cyber attacks, cyber warfare, war, and terrorism.

60

Continuity of Operations Governance

Continuity of operations is a shared responsibility.

- The Board of Directors (or equivalent) is responsible for approval of continuity of operations policies and oversight of strategy development and testing.
- Management is responsible for the development of strategic and tactical plans and procedures, external relationships, training, testing, and audit.
- Business units are responsible for developing unit-specific procedures.

61

Continuity of Operations Planning

The objective of continuity of operations planning is to prepare for continued operation.

Disaster Recovery Plans (DRP)
Business Continuity Plans (BCP)

62

Disaster Recovery Plans (DRP)

DRPs focus on the recovery and restoration of technology, physical plant, and personnel.

63

Business Continuity Plans (BCP)

BCPs focus on the overall strategy for sustaining business activities during a disaster (or smaller interruption) and the subsequent recovery period.

64

Continuity of Operations Planning Workflow

1. Project Initiation
2. Business Impact Analysis
3. Plan Development
4. Procedure Development
5. Training
6. Testing
7. Auditing
8. Maintenance Review & Update

65

Plan Readiness

Continuity of operations plans (DRP, BCP) should be maintained in a state of readiness.

- Personnel trained to fulfill their roles and responsibilities within the plan.
- Plans and strategies exercised to validate their content.
- Systems and system components tested on a scheduled basis to ensure their recovery and operability.
- Plan examination and auditing to ensure compliance with business objectives.

66

Testing Objectives

The objective of testing should be to evaluate continuity of operations strategies, plans, and procedures, not institutional knowledge.

- The outcome of testing should be strategy, plan, and procedure modifications (if necessary), and an enhanced participant familiarity with all the facets of the plan.

67

Testing Approaches

Tabletop
Failover
Simulation

68

Tabletop (Testing)

Tabletop testing is a group workshop built around a hypothetical scenario. It focuses on the application of plans and procedures and on identifying gaps in preparedness.

69

Failover (Testing)

Failover testing is performed to evaluate the ability of a system or application to recover from a failure and switch to a backup or secondary system or component seamlessly.

70

Simulation (Testing)

In a simulation, DRPs and/or BCPs are executed in a controlled environment (e.g., staging) to simulate a real-world disaster or outage. The simulation can be done at different levels of granularity.

71

Parallel Processing

In business continuity, parallel processing is a complex and costly strategy for ensuring uninterrupted business operations during unexpected events or disruptions.

- It requires duplicate systems that can handle critical business functions simultaneously (in parallel) with the primary systems, thereby minimizing the impact of disruptions on overall operations.

72

Plan Audit

A plan audit provides management with an independent assessment of the effectiveness of plans, procedures, training, and testing, as well as strategic alignment assurance.

- The type and the extent of auditing performed depend on the risks involved, management's assurance requirements, and the availability of audit resources.