Chapter 8: Diagnostics, Fault & Change Management Flashcards


Description and Tags

This set of vocabulary flashcards covers concepts from Chapter 8, including fault tolerance, network theory, causal analysis, probabilistic modeling, and change management principles.

Last updated 6:00 AM on 6/16/26

22 Terms

1

Fault Tolerance

A system property that keeps operations running despite hardware, software, or network failures, using mechanisms such as redundancy.

2

Principle 43 (Predictable Failure)

The design principle stating that systems should fail predictably so they can be recovered quickly using standardized protocols and QA procedures.

3

Distributed Mesh Network

A network topology consisting of many redundant paths that provides the highest fault tolerance but is complex to administer.

4

Small-World Networks

Networks that are locally clustered yet maintain globally short distances, often characterized by 'Six Degrees' of separation.

5

Weak Links

Long-range shortcuts in a network that provide vital connectivity between distant systems but may also accelerate fault propagation and security breaches.

6

Preferential Attachment

The tendency in scale-free networks for new nodes to connect to already popular nodes, leading to the accumulation of highly connected hubs.
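
The mechanism on this card can be illustrated with a small simulation. This is a minimal sketch, not a reference model: the function names `grow_network` and `degrees`, the single-link-per-new-node rule, and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter

def grow_network(n_nodes, seed=0):
    """Grow a network one node at a time; each new node links to one
    existing node chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]      # start from a single link between nodes 0 and 1
    endpoints = [0, 1]    # each node appears once per link it has
    for new in range(2, n_nodes):
        # Picking uniformly from the endpoint list favors popular nodes,
        # because they appear in the list more often.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges

def degrees(edges):
    """Count how many links touch each node."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count

if __name__ == "__main__":
    deg = degrees(grow_network(1000))
    print("largest hubs:", deg.most_common(3))
```

Running the sketch typically shows a few early nodes collecting far more links than the rest, which is the hub accumulation the card describes.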

7

Principle 44 (Causality)

The concept that every change or effect in a system happens in response to a cause that precedes it.

8

System Boundary

Not a physical box, but a probability sphere including all potential fault causes such as users, network, temperature, and third-party dependencies.

9

Principle 45 (Diagnostics)

The strategy to always eliminate the obvious first ('horses, not zebras'), focusing on simple causes like loose cables or wrong permissions.

10

Cause Tree

A visual Root Cause Analysis (RCA) tool that starts from an observed fault and branches into immediate causes and sub-causes.

11

Event Tree Analysis (ETA)

A documentation method that maps out every possible true/false pathway an event could take.
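
Enumerating the true/false pathways can be sketched in a few lines. The check names below are hypothetical, and the sketch assumes every check is a simple binary outcome evaluated in a fixed order.

```python
from itertools import product

def event_tree_paths(checks):
    """Return every true/false pathway through an ordered list of checks."""
    return [dict(zip(checks, outcome))
            for outcome in product([True, False], repeat=len(checks))]

# Hypothetical checks that follow an initiating event such as a power cut.
checks = ["backup_power_starts", "failover_succeeds"]
paths = event_tree_paths(checks)
for path in paths:
    print(path)
```

With n binary checks the tree has 2^n pathways, so a full enumeration like this only stays readable for a handful of checks.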

12

Primary Faults

Failures that occur when a component fails while operating strictly within its design limits, such as a server rated for $50\,\text{tx/s}$ failing at $30\,\text{tx/s}$.

13

Secondary Faults

Failures that occur when a component is pushed outside its design specifications, such as a server rated for $50\,\text{tx/s}$ failing at $90\,\text{tx/s}$.
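
The primary/secondary distinction reduces to comparing the load at failure with the rated limit. A minimal sketch, assuming a single numeric rating; `classify_fault` is a hypothetical helper, and treating a failure exactly at the rated limit as primary is also an assumption.

```python
def classify_fault(load_at_failure, rated_limit):
    """Label a failure as primary or secondary from load versus rating."""
    if load_at_failure <= rated_limit:
        return "primary"    # failed while within its design limits
    return "secondary"      # failed after being pushed past its design limits

print(classify_fault(30, 50))
print(classify_fault(90, 50))
```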

14

Command Faults

Instances where a component functions correctly but is triggered at the wrong time or place, such as database queries firing with no user requests.

15

OR Gate (Probability)

A logic gate where probability grows, defined as $P(A \text{ or } B) = P(A) + P(B) - P(A \wedge B)$, making the system more dangerous.

16

AND Gate (Probability)

A logic gate where probability shrinks, defined as $P(A \text{ and } B) = P(A) \times P(B \mid A)$, making the system more secure.

17

XOR Gate (Probability)

A logic gate representing exclusive scenarios, defined as $P(A \text{ xor } B) = P(A) + P(B) - 2P(A \wedge B)$, with no predictable safety direction.
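
The three gate formulas can be compared side by side on the same inputs. This is a sketch under stated assumptions: the function names are hypothetical, the fault probabilities 0.1 and 0.2 are made up, and the two faults are assumed independent so that P(B|A) = P(B).

```python
def p_and(p_a, p_b_given_a):
    """AND gate: both faults occur, P(A) * P(B|A)."""
    return p_a * p_b_given_a

def p_or(p_a, p_b, p_both):
    """OR gate: at least one fault occurs, P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

def p_xor(p_a, p_b, p_both):
    """XOR gate: exactly one fault occurs, P(A) + P(B) - 2 P(A and B)."""
    return p_a + p_b - 2 * p_both

# Made-up probabilities; independence is assumed, so P(B|A) = P(B).
p_a, p_b = 0.1, 0.2
p_both = p_and(p_a, p_b)
print("AND:", round(p_both, 4))
print("OR: ", round(p_or(p_a, p_b, p_both), 4))
print("XOR:", round(p_xor(p_a, p_b, p_both), 4))
```

For these inputs the AND result is smaller than either input probability and the OR result is larger than either, matching the "shrinks" and "grows" descriptions on the cards.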

18

Cutset

The minimal set of basic events required to trigger the top-level fault in fault tree analysis.
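
Finding minimal cutsets for a small fault tree can be sketched as follows. The nested-tuple tree encoding, the function names, and the example events are assumptions for illustration; real fault tree tools use more efficient algorithms than this brute-force expansion.

```python
from itertools import product

def cut_sets(node):
    """Return all cut sets of a fault tree as a list of frozensets."""
    if isinstance(node, str):          # basic event
        return [frozenset([node])]
    gate, *children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":                   # any single child triggers the gate
        return [s for sets in child_sets for s in sets]
    if gate == "AND":                  # need one cut set from every child
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate: {gate}")

def minimal_cut_sets(node):
    """Drop duplicates and any cut set that strictly contains another."""
    sets = set(cut_sets(node))
    return {s for s in sets if not any(other < s for other in sets)}

# Hypothetical tree: outage if power is lost, or if both disks fail.
tree = ("OR", "power_loss", ("AND", "disk_a_fails", "disk_b_fails"))
for cs in minimal_cut_sets(tree):
    print(sorted(cs))
```

In this example the single-event cutset `power_loss` marks a single point of failure, while the disk failures only trigger the top-level fault together.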

19

Change Management LOCK/UNLOCK

A procedure used during system updates to prevent partial reconfiguration hazards by restricting access while changes are made.

20

Principle 49 (Weakest Link)

The principle that system performance is limited by its weakest component, requiring optimization at the primary constraint first.
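
For a chain of stages that every request must pass through in turn, the weakest link is simply the stage with the lowest capacity. A minimal sketch, assuming a strictly serial pipeline; the stage names and capacities are made up.

```python
def bottleneck(stage_capacity):
    """Return the (name, capacity) of the stage that limits throughput."""
    name = min(stage_capacity, key=stage_capacity.get)
    return name, stage_capacity[name]

# Made-up capacities in transactions per second for a serial pipeline.
stages = {"load_balancer": 500, "app_server": 120, "database": 80}
print(bottleneck(stages))
```

Raising the capacity of any other stage leaves overall throughput unchanged until the limiting stage is improved, which is why the principle says to optimize the primary constraint first.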

21

Principle 51 (Rapid Maintenance)

The assertion that speed of response is crucial because the environment changes constantly and quality is a journey rather than a destination.

22

SNMP Monitoring Tools

Software tools such as MRTG, RRDtool, and Cricket used for tracking machine performance and policy-level anomalies.