Chapter 16 High Availability and Disaster Recovery

39 Terms

Fault Tolerance

The ability of a system to continue operating properly in the event of a failure of one or more components.

High Availability

Techniques and configurations designed to ensure a system or service remains accessible and operational with minimal downtime.

Load Balancing

Distributes network or application traffic across multiple servers to improve responsiveness and redundancy.
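
As an illustration of the distribution idea, here is a minimal round-robin sketch. Round robin is only one of several possible scheduling methods, and the server names are hypothetical.

```python
from itertools import cycle

def round_robin(servers):
    """Return a function that hands out servers in strict rotation."""
    pool = cycle(servers)
    return lambda: next(pool)

# Hypothetical backend names, used only for illustration.
pick = round_robin(["web1", "web2", "web3"])
assignments = [pick() for _ in range(5)]
print(assignments)  # ['web1', 'web2', 'web3', 'web1', 'web2']
```

Real load balancers usually add health checks so that a failed server is taken out of the rotation.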

NIC Teaming

Combines multiple network interface cards into one logical interface to improve bandwidth and provide redundancy.

Multipathing

Uses multiple physical paths between a computer and a storage device to ensure fault tolerance and load balancing.

Switch/Router/Firewall Clustering

Groups multiple devices to work together for redundancy and high availability.

Active-Active Configuration

All systems are operational and handle traffic simultaneously.

Active-Passive Configuration

One system is active while the others are on standby and become active only during failure.
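
The failover behavior can be sketched as choosing the first healthy node in priority order. This is a simplified model, not any vendor's election logic, and the node names are hypothetical.

```python
def select_active(nodes):
    """Active-passive: return the first healthy node in priority order."""
    for name, healthy in nodes:
        if healthy:
            return name
    return None  # no healthy node is available

# Hypothetical nodes, listed primary first.
normal = [("fw-primary", True), ("fw-standby", True)]
failed = [("fw-primary", False), ("fw-standby", True)]

print(select_active(normal))  # fw-primary
print(select_active(failed))  # fw-standby
```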

FHRP (First Hop Redundancy Protocol)

Family of protocols that lets two or more routers share a virtual IP address as the default gateway, so hosts keep a working gateway if one router fails (e.g., HSRP, VRRP).

HSRP (Hot Standby Router Protocol)

Cisco proprietary FHRP that allows a backup router to take over if the primary fails.

VRRP (Virtual Router Redundancy Protocol)

Open-standard alternative to HSRP, originally defined in RFC 2338 and later updated by RFC 3768 and RFC 5798 (VRRPv3).

HSRP MAC Address Group Number

For HSRP version 1, the virtual MAC address is 0000.0c07.acXX, where XX (the last two hexadecimal digits) is the group number (e.g., 0000.0c07.ac0a = group 10).
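
The hexadecimal-to-group conversion can be checked with a short sketch. It assumes the HSRP version 1 MAC format 0000.0c07.acXX; the function name is hypothetical.

```python
HSRP_V1_PREFIX = "00000c07ac"  # assumed HSRP version 1 virtual MAC prefix

def hsrp_v1_group(mac: str) -> int:
    """Extract the group number from an HSRP version 1 virtual MAC."""
    digits = mac.replace(".", "").replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or not digits.startswith(HSRP_V1_PREFIX):
        raise ValueError(f"not an HSRP version 1 virtual MAC: {mac}")
    # The last two hex digits encode the group number.
    return int(digits[-2:], 16)

print(hsrp_v1_group("0000.0c07.ac0a"))  # 10
```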

Uninterruptible Power Supply (UPS)

Provides temporary power during outages to maintain uptime until generators activate or systems shut down properly.

Standby UPS

Switches to battery power when utility power fails.

Line-Interactive UPS

Keeps the battery charged and regulates voltage during minor sags and surges without switching to battery; handles small fluctuations better than a standby UPS.

Online UPS

Powers the load continuously through a battery-backed inverter, so there is no transfer time when utility power fails; provides constant, clean power and is best for critical data centers.

Generator

Provides long-term backup power after UPS battery is exhausted.

Power Distribution Unit (PDU)

Distributes and manages power in server racks.

Power Load

The total power drawn by all connected devices.
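
A rough sketch of the arithmetic: sum the draw of each device and compare it with the capacity of the supply. All device names and wattages below are hypothetical.

```python
def total_load_watts(devices):
    """Sum the power draw (in watts) of all connected devices."""
    return sum(devices.values())

# Hypothetical rack contents and UPS rating, for illustration only.
devices = {"server-a": 450, "server-b": 450, "switch": 150}
ups_capacity_watts = 1500

load = total_load_watts(devices)
utilization = load / ups_capacity_watts
print(load, f"{utilization:.0%}")  # 1050 70%
```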

Environmental Controls

Include temperature, humidity, and fire suppression systems to maintain optimal conditions in data centers.

HVAC

Heating, ventilation, and air conditioning system critical for temperature control.

Humidity

Must be controlled: air that is too dry promotes electrostatic discharge, and air that is too humid causes condensation.

Fire Suppression (Clean Agent)

Gas-based systems (like FM-200) that extinguish fire without damaging electronics.

Deluge System

Water-based system with open sprinkler heads that discharge from every head at once when triggered; not suitable for electronics.

Dry Pipe / Preaction Systems

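Pipes hold pressurized air or nitrogen instead of water; a preaction system also requires a detection event (such as a smoke or heat alarm) before water is admitted, which reduces the risk of accidental water discharge.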

Disaster Recovery Site

Alternate location used to restore operations in case the primary site becomes unusable.

Cold Site

Provides only basic infrastructure (space, power, cooling) with no equipment or data in place; takes the longest to bring online after a disaster.

Warm Site

Preconfigured with some systems and data; faster recovery than cold site.

Hot Site

Fully operational duplicate of primary site with live data and systems; fastest recovery.

Cloud Site

Virtual DR site hosted in the cloud; mimics the on-premises network.

RPO (Recovery Point Objective)

Maximum tolerable amount of data loss (i.e., how far back in time the recovery point can be).

RTO (Recovery Time Objective)

Maximum tolerable downtime: the target time within which systems must be restored after a disruption.
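
RPO and RTO can be compared against a recovery plan with a small check. This sketch assumes that worst-case data loss equals the backup interval; the function name and all durations are hypothetical.

```python
from datetime import timedelta

def meets_objectives(backup_interval, estimated_recovery, rpo, rto):
    """Return (rpo_ok, rto_ok) for a plan.

    Assumes the worst-case data loss equals the backup interval.
    """
    return backup_interval <= rpo, estimated_recovery <= rto

# Hypothetical plan: nightly backups, two-hour restore.
rpo_ok, rto_ok = meets_objectives(
    backup_interval=timedelta(hours=24),
    estimated_recovery=timedelta(hours=2),
    rpo=timedelta(hours=4),
    rto=timedelta(hours=8),
)
print(rpo_ok, rto_ok)  # False True
```

In this example the restore is fast enough for the RTO, but nightly backups could lose up to 24 hours of data, which exceeds the 4-hour RPO.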

MTTR (Mean Time to Repair)

Average time to restore a failed system.

MTBF (Mean Time Between Failures)

Expected (average) operating time between system failures.
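
MTBF and MTTR are often combined into an availability estimate. The formula below, MTBF / (MTBF + MTTR), is a common steady-state approximation and is not part of the original cards; the input values are hypothetical.

```python
def availability(mtbf_hours, mttr_hours):
    """Steady-state availability estimate: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Hypothetical values: one failure every 999 hours, 1 hour to repair.
a = availability(mtbf_hours=999, mttr_hours=1)
print(f"{a:.1%}")  # 99.9%
```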

Tabletop Exercise

A discussion-based simulation where team members walk through a hypothetical disaster scenario to test response procedures.

Validation Test

Involves running procedures and configurations in a test environment to confirm readiness.

Testing

Includes exercises and system tests to ensure DR plans are functional and complete.

Stacking

Combines multiple physical switches into a single logical switch to simplify management and increase reliability.

Diverse Paths (ISP Redundancy)

Using multiple ISPs or physical paths to avoid a single point of failure in connectivity.