13-SW Dependability and Reliability

  • Dependability Importance

    • Critical property of computer-based systems.
    • Reflects the degree of trust users place in the system: that it will operate as expected and will not fail in normal use.
    • Attributes: reliability, availability, safety, security, resilience.
    • Failures can have widespread negative impacts.
    • Users may reject a system they find unreliable, unsafe, or insecure.
    • High costs associated with failures (economic loss, damage).
  • Dependability Properties

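    • Availability: Probability that the system is operational and able to deliver services when requested.
    • Reliability: Probability that, over a given period, the system delivers services correctly, as users expect.
    • Security: Ability to protect itself against deliberate or accidental intrusion.
    • Safety: Likelihood that the system will not cause damage to people or its environment.
    • Resilience: Ability to maintain continuity of critical services despite disruptive events (e.g., failures, attacks) and to recover from damage.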
  • Attribute Dependencies

    • Safe operation relies on availability and reliability.
    • Security affects the others: a system can become unreliable if an attack corrupts its data, or unavailable if an attack denies service.
  • Dependability Development

    • Avoid introducing errors during development, and apply effective verification and validation (V&V) to find those that remain.
    • Design for fault tolerance and protection mechanisms.
    • Configure systems appropriately for their environment.
  • Cost of Dependability

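    • Costs tend to increase exponentially as the required level of dependability rises.
    • The drivers are more expensive development techniques and hardware, plus the extensive testing needed to convince clients and regulators that the required level has been reached.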
  • Economics of Dependability

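    • A trade-off: when achieving dependability is very expensive, it may be more cost-effective to accept a less trustworthy system and pay the costs of failure.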
    • Fault tolerance essential in critical systems (e.g., aviation).
  • Fault-tolerant Architectures

    • Based on redundancy and diversity to ensure reliability.
    • Redundant components provide backups; diverse components deliver the same function in different ways, so they are less likely to fail in the same way (common-mode failures).
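
The combination of redundancy and diversity can be sketched as majority voting over several independently written versions of the same function. This is a minimal illustration, not a production architecture: the three `square_*` components and the injected fault are hypothetical, and real systems also need to handle timeouts and voter failure.

```python
from collections import Counter
from typing import Callable, Sequence


def run_redundant(components: Sequence[Callable[[int], int]], x: int) -> int:
    """Run every component on the same input and return the majority output."""
    outputs = []
    for component in components:
        try:
            outputs.append(component(x))
        except Exception:
            continue  # a crashed component contributes no vote
    if outputs:
        value, count = Counter(outputs).most_common(1)[0]
        # Require a strict majority of ALL components, not just the survivors.
        if 2 * count > len(components):
            return value
    raise RuntimeError("no majority among redundant components")


# Three diverse (hypothetical) implementations of the same specification.
def square_mul(x: int) -> int:
    return x * x


def square_pow(x: int) -> int:
    return x ** 2


def square_faulty(x: int) -> int:
    # Deliberately injected fault: wrong result for inputs above 10.
    return x * x + 1 if x > 10 else x * x


result = run_redundant([square_mul, square_pow, square_faulty], 12)
```

Here the faulty version is outvoted by the two correct ones, so the caller still receives the right answer. If all three versions disagreed, the voter would raise an error rather than guess.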
  • Complexity and Redundancy

    • Adding redundancy and diversity makes the system more complex, which in turn increases the chance of errors.
    • Simplicity advocated by some engineers for better dependability outcomes.
  • Verification and Validation Process

    • Use diverse V&V activities (e.g., testing and inspections) so that an error missed by one activity can be caught by another.
    • Explicit, repeatable, and auditable processes are crucial.
  • Dependable Programming Guidelines

    1. Limit information visibility.
    2. Validate inputs, including bounds.
    3. Provide exception handling.
    4. Avoid complex code structures.
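
The four guidelines above can be sketched in a few lines. The `Thermostat` class, its temperature limits, and the `apply_user_input` helper are hypothetical names chosen only for illustration.

```python
class Thermostat:
    """Hypothetical controller used only to illustrate the guidelines."""

    # Named constants for real-world limits instead of magic numbers.
    MIN_TARGET_C = 5.0
    MAX_TARGET_C = 30.0

    def __init__(self) -> None:
        # Guideline 1: state is internal (leading underscore) and exposed read-only.
        self._target_c = 20.0

    @property
    def target_c(self) -> float:
        return self._target_c

    def set_target(self, raw: object) -> None:
        # Guideline 2: validate format first, then bounds.
        try:
            value = float(raw)
        except (TypeError, ValueError) as exc:
            raise ValueError(f"target must be a number, got {raw!r}") from exc
        # NaN fails this comparison too, so it is rejected as out of range.
        if not (self.MIN_TARGET_C <= value <= self.MAX_TARGET_C):
            raise ValueError(f"target {value} outside allowed range")
        self._target_c = value


def apply_user_input(thermostat: Thermostat, raw: object) -> bool:
    """Guideline 3: handle the exception at the boundary and keep the last good state."""
    try:
        thermostat.set_target(raw)
    except ValueError:
        return False
    return True
```

Guideline 4 shows up as the flat control flow: each function has one simple path plus early rejection, with no deep nesting or shared mutable globals. A rejected input leaves `target_c` unchanged.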
  • Faults and Errors

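    • Four terms are distinguished: human error (a mistake that introduces a fault), system fault (a defect that can lead to an error), system error (an erroneous system state), and system failure (the system does not deliver the expected service).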
    • Not all faults result in errors; not all errors cause failures.
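
A small, contrived example may help separate the terms. The function `scaled_reading` and its deliberately injected fault are hypothetical.

```python
def scaled_reading(raw: int) -> int:
    """Convert a raw sensor count (0..1000) to a percentage (0..100)."""
    # Deliberate FAULT: the divisor should always be 10, but 9 is used for raw >= 500.
    percent = raw // 10 if raw < 500 else raw // 9
    # Protective clamp: it can mask some erroneous states before they become visible.
    return min(percent, 100)


no_error = scaled_reading(200)       # faulty branch not executed: state and output correct
masked_error = scaled_reading(1000)  # internal state is 111 (an ERROR), clamp restores 100
failure = scaled_reading(540)        # internal state is 60, output 60 instead of 54 (a FAILURE)
```

The same fault is present in all three calls, but only the last one produces a wrong result that a user could observe.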
  • Reliability

    • Defined as the probability of failure-free operation over a specified time, in a given environment, for a specific purpose.
    • Perceived reliability can diverge from actual reliability.
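
As a worked illustration of "probability of failure-free operation over time", one common simplifying model assumes a constant failure rate. That model and the numbers below are assumptions for illustration; they are not stated in these notes.

```latex
% Assumption: constant failure rate \lambda (failures per hour).
R(t) = e^{-\lambda t}, \qquad \mathrm{MTTF} = \frac{1}{\lambda}

% Example with an assumed \lambda = 0.001 failures per hour and t = 100 hours:
R(100) = e^{-0.001 \times 100} = e^{-0.1} \approx 0.905
```

Under these assumptions the system has roughly a 90% chance of running 100 hours without failure, and a mean time to failure of 1000 hours.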
  • Availability

    • Defined as the probability that the system is operational and able to deliver requested services at a given point in time.
    • Perception of availability varies based on impact and duration of service disruptions.
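
Availability is often estimated as the fraction of an observation period in which the system was up. The sketch below uses that estimate; the function names and the figures are illustrative assumptions.

```python
HOURS_PER_YEAR = 365 * 24


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Estimate availability as uptime divided by total observed time."""
    if uptime_hours < 0 or downtime_hours < 0:
        raise ValueError("durations must be non-negative")
    total = uptime_hours + downtime_hours
    if total == 0:
        raise ValueError("observation period must be positive")
    return uptime_hours / total


def downtime_per_year_hours(avail: float) -> float:
    """Translate an availability figure into expected downtime per year."""
    if not 0.0 <= avail <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - avail) * HOURS_PER_YEAR


observed = availability(uptime_hours=999.0, downtime_hours=1.0)
yearly_downtime = downtime_per_year_hours(observed)
```

A single figure hides how the downtime is distributed: the same yearly total could be one long outage or many short ones, which is one reason perceived availability can differ from the measured value.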
  • Reliability Measurement

    • Metrics are crucial for assessing system reliability.
    • Common metrics include probability of failure on demand (POFOD) and mean time to failure (MTTF).
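
Both metrics can be computed directly from observed data. This is a minimal sketch; the function names and sample figures are hypothetical.

```python
from statistics import mean
from typing import Sequence


def pofod(failed_demands: int, total_demands: int) -> float:
    """Probability of failure on demand: failed requests divided by all requests."""
    if total_demands <= 0:
        raise ValueError("total_demands must be positive")
    if not 0 <= failed_demands <= total_demands:
        raise ValueError("failed_demands must be between 0 and total_demands")
    return failed_demands / total_demands


def mttf(hours_between_failures: Sequence[float]) -> float:
    """Mean time to failure: average operating time between observed failures."""
    if not hours_between_failures:
        raise ValueError("at least one failure interval is required")
    return mean(hours_between_failures)


# Hypothetical observations.
demand_metric = pofod(failed_demands=2, total_demands=1000)
time_metric = mttf([120.0, 80.0, 100.0])
```

POFOD suits systems that are called on rarely but must work when they are, while MTTF suits systems that run continuously over long periods.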
  • Specification of Reliability

    • Necessary to clarify stakeholder needs and assess testing requirements.
    • Challenges in developing appropriate operational profiles and test data.
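
An operational profile assigns each class of input the probability with which it occurs in real use, and reliability tests are then drawn in those proportions. The profile below is invented for illustration; deriving a realistic one is the hard part noted above.

```python
import random
from collections import Counter
from typing import Dict, List

# Hypothetical operational profile: input class -> probability of occurrence.
PROFILE: Dict[str, float] = {
    "balance_query": 0.70,
    "transfer": 0.25,
    "close_account": 0.05,
}


def sample_test_classes(profile: Dict[str, float], n: int, seed: int = 0) -> List[str]:
    """Draw n test-case classes in proportion to the operational profile."""
    if abs(sum(profile.values()) - 1.0) > 1e-9:
        raise ValueError("profile probabilities must sum to 1")
    rng = random.Random(seed)  # fixed seed keeps the test set repeatable and auditable
    classes = list(profile)
    weights = [profile[c] for c in classes]
    return rng.choices(classes, weights=weights, k=n)


selected = sample_test_classes(PROFILE, n=1000)
counts = Counter(selected)
```

Rare classes such as `close_account` receive few tests under this scheme, so a profile-driven test set is usually complemented by targeted tests for rare but critical operations.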
  • Achieving Dependability

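    • Strategies for dependability and security include avoidance (do not introduce faults or vulnerabilities), detection (find faults before deployment, and problems at run time), resistance and recognition (withstand and identify attacks), and recovery (restore service after a failure or attack).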
  • Conclusion

    • Dependability has to be built in at every level: system architecture, programming practice, and V&V processes. Together these are what deliver high reliability and availability in critical systems.