This set of vocabulary flashcards covers concepts from Chapter 8, including fault tolerance, network theory, causal analysis, probabilistic modeling, and change management principles.
Fault Tolerance
The system feature that ensures continuous operation despite hardware, software, or network failures through mechanisms like redundancy.
Principle 43 (Predictable Failure)
The design principle stating that systems should fail predictably so they can be recovered quickly using standardized protocols and QA procedures.
Distributed Mesh Network
A network topology consisting of many redundant paths that provides the highest fault tolerance but is complex to administer.
Small-World Networks
Networks that are locally clustered yet maintain globally short distances, often characterized by 'Six Degrees' of separation.
Weak Links
Long-range short-cuts in a network that provide vital connectivity between distant systems but may accelerate fault propagation and security breaches.
Preferential Attachment
The tendency in scale-free networks for new nodes to connect to already popular nodes, leading to the accumulation of highly connected hubs.
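A minimal sketch of preferential attachment, assuming each new node makes exactly one link and picks its target with probability proportional to the target's current degree. The names `grow_network` and `degrees` are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter

def grow_network(n_nodes, seed=0):
    """Grow a network one node at a time using degree-proportional attachment."""
    rng = random.Random(seed)
    edges = [(0, 1)]        # start from a single link between nodes 0 and 1
    endpoints = [0, 1]      # each node appears once per link it participates in
    for new_node in range(2, n_nodes):
        # Picking uniformly from `endpoints` favors nodes that already have many links.
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend([new_node, target])
    return edges

def degrees(edges):
    """Count how many links touch each node."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts

deg = degrees(grow_network(200))
print(deg.most_common(3))   # the best-connected nodes (the hubs)
```

In runs of this kind, a few early nodes typically collect far more links than the rest, which is the hub accumulation the card describes.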
Principle 44 (Causality)
The concept that every change or effect in a system happens in response to a cause that precedes it.
System Boundary
Not a physical box, but a probability sphere including all potential fault causes such as users, network, temperature, and third-party dependencies.
Principle 45 (Diagnostics)
The strategy to always eliminate the obvious first ('horses, not zebras'), focusing on simple causes like loose cables or wrong permissions.
Cause Tree
A visual Root Cause Analysis (RCA) tool that starts from an observed fault and branches into immediate causes and sub-causes.
Event Tree Analysis (ETA)
An analysis and documentation method that starts from an initiating event and maps out every possible true/false (success/failure) pathway the event could take.
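As a rough illustration, the true/false pathways can be enumerated mechanically. This sketch assumes every branch point is a simple yes/no question; the barrier names and the `event_tree_paths` helper are hypothetical.

```python
from itertools import product

def event_tree_paths(branch_points):
    """Return every true/false pathway through the given branch points."""
    return [
        dict(zip(branch_points, outcome))
        for outcome in product([True, False], repeat=len(branch_points))
    ]

# Hypothetical branch points following an initiating event such as a disk failure.
paths = event_tree_paths(["alarm_raised", "failover_worked", "backup_valid"])
for path in paths:
    print(path)
```

Three branch points give 2^3 = 8 pathways, so the tree grows quickly as branch points are added.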
Primary Faults
Failures that occur when a component fails while operating strictly within its design limits, such as a server rated for 50 tx/s failing at 30 tx/s.
Secondary Faults
Failures that occur when a component is pushed outside its design specifications, such as a server rated for 50 tx/s failing at 90 tx/s.
Command Faults
Instances where a component functions correctly but is triggered at the wrong time or place, such as database queries firing with no user requests.
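The three fault classes (primary, secondary, command) can be told apart with a simple rule of thumb. The sketch below is an assumption-laden illustration, not a standard API: `classify_fault` and its parameters are hypothetical, and real classification usually needs more context than a load figure.

```python
def classify_fault(rated_tps, observed_tps, failed, requested):
    """Classify an incident as 'primary', 'secondary', 'command', or 'none'.

    rated_tps    -- design limit in transactions per second
    observed_tps -- load at the time of the incident
    failed       -- True if the component itself broke
    requested    -- True if the component's action was legitimately triggered
    """
    if failed:
        # Broke inside its design limits: primary. Pushed beyond them: secondary.
        return "primary" if observed_tps <= rated_tps else "secondary"
    if not requested:
        # Worked correctly, but was triggered at the wrong time or place.
        return "command"
    return "none"

print(classify_fault(50, 30, failed=True, requested=True))    # primary
print(classify_fault(50, 90, failed=True, requested=True))    # secondary
print(classify_fault(50, 10, failed=False, requested=False))  # command
```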
OR Gate (Probability)
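A logic gate where probability grows, defined as P(A or B)=P(A)+P(B)−P(A∧B). Because the fault occurs if either input occurs, each added input makes the top-level fault more likely and the system more dangerous.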
AND Gate (Probability)
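A logic gate where probability shrinks, defined as P(A and B)=P(A)×P(B∣A). Because the fault requires every input to occur, each added input makes the top-level fault less likely and the system more secure.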
XOR Gate (Probability)
A logic gate representing exclusive scenarios defined as P(A xor B)=P(A)+P(B)−2P(A∧B), with no predictable safety direction.
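A short worked example of the three gate formulas. The probabilities 0.1 and 0.2 are made-up inputs, and the two events are assumed independent so that P(B∣A)=P(B); the function names are hypothetical.

```python
def p_and(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B|A)."""
    return p_a * p_b_given_a

def p_or(p_a, p_b, p_both):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

def p_xor(p_a, p_b, p_both):
    """P(A xor B) = P(A) + P(B) - 2 * P(A and B)."""
    return p_a + p_b - 2 * p_both

p_a, p_b = 0.1, 0.2            # illustrative values only
p_both = p_and(p_a, p_b)       # independence assumed: P(B|A) = P(B)

print(round(p_both, 4))                  # 0.02  (AND shrinks)
print(round(p_or(p_a, p_b, p_both), 4))  # 0.28  (OR grows)
print(round(p_xor(p_a, p_b, p_both), 4)) # 0.26
```

With these inputs the AND result is smaller than either input and the OR result is larger than either, matching the "shrinks" and "grows" descriptions on the cards.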
Cutset
A set of basic events that, occurring together, triggers the top-level fault in fault tree analysis. A minimal cutset is one from which no event can be removed without losing that property.
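A minimal sketch of finding cutsets by expanding a small fault tree. The tree, the event names, and the helpers `cut_sets` and `minimal` are hypothetical; the tree is assumed to contain only AND and OR gates over named basic events.

```python
from itertools import product

def cut_sets(node):
    """Expand a fault tree node into cutsets (a set of frozensets of basic events).

    A node is either a basic-event name (str) or a tuple (gate, children).
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        # Any child's cutset is enough to trigger the gate.
        return set().union(*child_sets)
    if gate == "AND":
        # One cutset from every child must occur together.
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")

def minimal(sets):
    """Keep only cutsets that have no smaller cutset inside them."""
    return {s for s in sets if not any(other < s for other in sets)}

tree = ("OR", [
    ("AND", ["power_a", "power_b"]),   # both power feeds must fail
    "switch",                          # a single switch failure is enough
    ("AND", ["switch", "disk"]),       # redundant: already covered by "switch"
])

result = minimal(cut_sets(tree))
print(sorted(sorted(s) for s in result))  # [['power_a', 'power_b'], ['switch']]
```

The single-event cutset `switch` marks a single point of failure, which is usually the first thing a cutset listing is read for.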
Change Management LOCK/UNLOCK
A procedure used during system updates to prevent partial reconfiguration hazards by restricting access while changes are made.
Principle 49 (Weakest Link)
The principle that system performance is limited by its weakest component, requiring optimization at the primary constraint first.
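The weakest-link idea can be illustrated by locating the primary constraint in a chain of components. This sketch assumes the stages run strictly in series, so end-to-end throughput equals the smallest stage capacity; the stage names, numbers, and `bottleneck` helper are made up.

```python
def bottleneck(capacities):
    """Return (stage_name, capacity) for the lowest-capacity stage."""
    name = min(capacities, key=capacities.get)
    return name, capacities[name]

# Hypothetical capacities in transactions per second.
stages = {"load_balancer": 500, "app_server": 120, "database": 80}

print(bottleneck(stages))  # ('database', 80)

# Upgrading a non-constraint stage leaves overall throughput unchanged.
stages["app_server"] = 400
print(bottleneck(stages))  # ('database', 80)
```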
Principle 51 (Rapid Maintenance)
The assertion that speed of response is crucial because the environment changes constantly and quality is a journey rather than a destination.
SNMP Monitoring Tools
Software tools such as MRTG, RRDtool, and Cricket used for tracking machine performance and policy-level anomalies.