CS6340 Exam Notes - Software Analysis and Testing

Chapter 1: Introduction to Software Analysis

  • Program Analysis Overview
    • Definition: Techniques and tools that automatically discover useful properties or facts about computer programs.
    • Classification:
      • Dynamic Program Analysis (Run-Time):
        • Analyzing a program during its execution.
        • Example tools:
          • Purify: Detects memory access errors (e.g., array bounds violations).
          • Eraser: Identifies data races in multithreaded programs.
          • Valgrind: Detects memory leaks and use-after-free bugs, etc.
          • Daikon: Dynamically infers program invariants.
      • Static Program Analysis (Compile-Time):
        • Examines code without execution.
        • Example tools:
          • Lint, FindBugs, Coverity: Detect suspicious code patterns and possible bugs.
          • Infer (Facebook): Identifies memory leaks and null pointer issues in Android and iOS apps.
          • SLAM (Microsoft Research): Verifies Windows device drivers correctly use the Windows kernel API.
          • ESC/Java: Checks correctness properties (invariants) of Java programs.
      • Hybrid Program Analysis:
        • Combines static and dynamic analysis.
        • Benefits from accuracy of dynamic analysis and completeness of static analysis.
        • Example: Symbolic execution + test input generation (e.g., KLEE).
  • Program Invariants
    • A property or fact that holds true at a particular program point during all possible executions.
    • Example: A loop invariant like “the index variable i is always less than or equal to the array length”.
  • Key Concepts in Static Analysis
    • Control Flow Graph (CFG):
      • Representation of all possible execution paths.
      • Nodes = program statements or basic blocks.
      • Edges = possible control transfers.
    • Concrete vs. Abstract States:
      • Concrete state: Exact variable values at runtime (dynamic analysis).
      • Abstract state: Approximate/symbolic representations of values (static analysis).
    • Termination:
      • The analysis process should eventually halt.
      • Some analyses may be non-terminating or very expensive due to loops/recursion.
    • Completeness:
      • An analysis is complete if it finds all possible issues.
      • Static analysis is rarely complete because it relies on approximation.
    • Soundness:
      • An analysis is sound if all reported facts are guaranteed to be true across all program executions.
      • Tradeoff: Striving for soundness can lead to false positives (warnings about things that are not actually errors).
  • Iterative Approximation in Static Analysis
    • Most static analyses (especially data-flow analysis) use iterative approximation.
    • They revisit program points repeatedly (e.g., in loops) until reaching a fixed point (no further changes).
    • Example: Constant propagation, reaching definitions, live variable analysis.
  • Trade-offs in Program Analysis
    • Due to undecidability of general program properties (e.g., Halting Problem), we must balance:
      • Soundness: Trusting reported errors
      • Completeness: Trusting that nothing is missed
      • Termination: Guaranteeing the analysis eventually stops
    • Example:
      • Asking whether “a program point is reachable” for all inputs may be undecidable in general.
      • Tool designers often make trade-offs based on consumer needs (developer vs. verifier vs. optimizer).
  • Consumers of Program Analysis
    • Compilers:
      • Use analysis for:
        • Code optimization (e.g., dead code elimination)
        • Code generation (e.g., register allocation)
    • Software Quality Tools:
      • Perform:
        • Bug detection
        • Invariant verification
        • Test case generation
        • Root cause localization
      • Help improve reliability, safety, and maintainability.
    • Integrated Development Environments (IDEs):
      • Use analysis for:
        • Code navigation and understanding
        • Automated refactoring (e.g., renaming, extracting methods)
        • Real-time feedback (e.g., error highlighting)
  • Soundness vs. Completeness Revisited
    • Property Description Tradeoff Example
    • Soundness: If analysis says a program is correct, it definitely is. May produce false positives. May warn about memory leak even if it doesn’t happen.
    • Completeness: If analysis says a program is incorrect, it definitely is. May produce false negatives. May miss subtle race condition to avoid over-reporting.
    • False Positive: Analysis wrongly flags correct code as buggy.
    • False Negative: Analysis misses an actual issue in the code.

Chapter 2: Introduction to Software Testing

  • Why Specifications Matter in Testing
    • Core Reasons:
      1. Testing = Implementation vs. Specification: Testing checks whether the actual behavior matches the intended behavior.
      2. Without a Specification, Testing has no Grounding: No meaningful way to judge correctness without a target behavior to compare to.
      3. Testing as Consistency Checking: A recurring theme: Verify consistency between implementation and specification.
      4. Resource Constraints: We can’t test every input or behavior. Specs help prioritize tests for important or critical functionality.
      5. Evolving Specifications: As requirements change, tests must evolve to stay aligned with updated specifications.
    • Even if the same developer writes both the spec and the implementation, a simpler spec is less likely to contain the same mistake as the complex implementation.
  • Classification of Testing Approaches
    • Axes:
      • X-Axis: Black-Box ←→ White-Box
      • Y-Axis: Manual ←→ Automated
    • This leads to four quadrants (e.g., manual black-box, automated white-box, etc.), each with trade-offs.
  • Manual vs. Automated Testing
    • Aspect Manual Testing Automated Testing
    • Control Precise and human-judged Fully automated, potentially high speed
    • Efficiency Targeted, fewer but smarter tests Finds many bugs quickly, less effort per bug
    • Maintenance Test cases must be updated with code changes Test generation adapts with code, less upkeep
    • Coverage May be better if tests are well designed May miss subtle logic unless guided
    • Semi-Automated Testing is ideal: Humans provide high-level specifications, and tools generate tests from them.
  • Black-Box vs. White-Box Testing
    • Black-Box Testing:
      • Tests the external behavior of the software.
      • Does not require code access or knowledge of internal structure.
      • Language/format independent – even binary or obfuscated code can be tested.
      • Example: Testing a login form only by providing inputs and checking outputs, without reading the code.
    • White-Box Testing:
      • Leverages knowledge of internal code structure to design tests.
      • Helps with path coverage, checking all branches, loops, conditions, etc.
      • May generate a smaller but more effective test suite.
      • Example: Writing tests to cover all if-else branches in a sorting algorithm.
  • Pre-Conditions, Post-Conditions, and Frame Conditions
    • Pre-condition: What must be true before a function executes.
    • Post-condition: What must be true after the function executes, if the pre-condition held.
    • Example: Pre: stack.size() == n Post: stack.size() == n + 1 after a push() operation.
    • Frame Conditions: Assumptions about unchanged parts of the program.
      • Example: Stack ordering remains unchanged aside from the top element.
    • Pre/postconditions are more valuable if they are executable, such as using:
      • Assertions
      • Contracts in languages like Eiffel (require/ensure)
      • Unit testing frameworks with expected results
  • Test Suite Quality
    • Problem with Too Few or Too Many Tests:
      • Too Few: Miss important bugs.
      • Too Many: Hard to maintain, slow to execute, redundant.
    • Two Measures for Test Suite Effectiveness:
      1. Code Coverage Metrics:
        • Measures how much of the code has been exercised.
        • Common Types:
          • Function Coverage: Which functions were executed?
          • Statement Coverage: Which lines were run?
          • Branch Coverage: Were all decision branches taken?
          • Basic Block Coverage: Execution of straight-line code sequences
        • Note: High coverage ≠ high quality, but low coverage is often a red flag.
      2. Mutation Testing (Mutation Analysis):
        • Based on the “Competent Programmer Hypothesis”:
          • Real programs are close to correct, so small changes (mutants) are meaningful tests.
        • How it works:
          • Introduce small changes (mutants) to code (e.g., change x > 0 to x < 0)
          • Run tests: Good test suites will fail for mutants (i.e., "kill the mutants").
        • Goal: Design tests that distinguish original code from its mutants.
        • Key Challenge: Equivalent Mutants
          • Sometimes, a mutant behaves identically to the original for all inputs.
          • These are called equivalent mutants, and no test can detect the difference.
          • Difficult to identify or prove automatically – often requires manual effort.

Chapter 4: Automated test generation

  • Overview
    • This chapter discusses automated testing, moving from random fuzzing to systematic test generation using pre/post conditions and feedback-directed testing. It introduces two tools: Korat (for structured data like trees) and Randoop (for OO programs/libraries).
  • Korat: Systematic Test Generation
    • Goal
      • Automatically generate small, valid, and diverse test inputs satisfying pre-conditions.
    • Approach
      • Use small test case hypothesis: if a bug exists, small inputs can likely trigger it.
      • Represent data structures (like binary trees) as vectors of field values.
      • Exhaustively enumerate all input configurations up to size k, guided by pre-conditions.
    • Optimization Techniques
      • Instrument pre-conditions: Track fields accessed to avoid generating irrelevant inputs.
      • Use backtracking and field-by-field expansion to efficiently explore valid shapes.
      • Eliminate isomorphic structures to reduce redundancy.
    • Algorithm Steps
      1. User selects input size k.
      2. Generate all combinations of field values (shapes).
      3. Discard inputs that violate pre-condition.
      4. Run program and check post-condition.
    • Strengths
      • Best for linked structures: lists, trees, graphs.
      • Efficient for unit testing with strong invariants.
    • Limitations
      • Poor at handling primitive types (e.g., int, float).
      • Depends heavily on well-specified pre/post conditions.
  • Randoop: Feedback-Directed Random Testing
    • Goal
      • Automatically generate valid method call sequences that explore new object states and violate contracts if bugs exist.
    • Key Idea
      • Use execution feedback to guide generation of new test sequences.
    • Inputs
      • Classes under test
      • Time limit
      • Contracts (e.g., o.equals(o) == true, o.hashCode() doesn't throw)
    • Outputs
      • Failing test case (violates contract or causes exception)
    • Algorithm
      1. Start with seed components (basic values).
      2. Build method sequences incrementally.
      3. Execute each new sequence immediately.
      4. Based on execution:
        • Illegal? → Discard
        • Violates contract? → Report as test
        • Legal & novel? → Add to pool
        • Legal & redundant? → Discard
    • Redundancy Check
      • Compare new objects with existing ones using equals() or custom equivalence.
  • Comparison to Korat
    • Feature Korat Randoop
    • Deterministic Yes No
    • Focus Structured data (e.g., trees) OO APIs, method sequences
    • Input guidance Pre-condition driven Feedback driven
    • Requirements Strong typing (e.g., Java, UML) preferred over weakly typed languages like C/Lisp.
  • Key Insights
    • Automating test generation reduces effort, improves coverage, and finds deep bugs.
    • Using pre-conditions or runtime feedback is essential to avoid useless inputs.
    • Test generation is only as good as the specification (contracts, invariants).

Chapter 6: Pointer Analysis

  • Introduction

    • Pointer analysis is a form of dataflow analysis for non-primitive datatypes such as objects, pointers, and references. Unlike primitive types, pointers can reference the same memory location in multiple ways, making analysis more complex.
  • Pointer Aliasing

    • Pointer aliasing occurs when the same memory address is referred to through different variables.
      • May-Alias Analysis (aka Pointer Analysis):
        • Assumes that two variables may point to the same object. Initially, all pairs are assumed to alias, and analysis removes false pairs as it gathers more information.
      • Must-Alias Analysis:
        • Assumes two variables must point to the same object and builds constraints based on this. 🔁 These are dual problems.
      • ✅ May-Alias is the focus here — more practical though less precise.
      • ⚠ Must-Alias is more advanced, but less commonly used due to practicality issues.
  • Challenges and Approximations

    • Pointer analysis is undecidable: no algorithm can always be both complete and sound and terminate.
      • We sacrifice completeness: → May produce false positives (over-approximation), but no false negatives.
    • Example:
    if (x may point to obj1 or obj2) {
     assume both obj1 and obj2 are possible targets → leads to false positives
    }
    
  • Key Dimensions of Approximate Pointer Analysis

    1. Heap Abstraction
      • How to model dynamically allocated memory (heap).
    2. Control Flow Abstraction
      • How to model the order of program statements.
  • Heap Abstraction Techniques

    • Allocation-Site Based: One abstract object per allocation site (new, malloc).
      • ✅ Precise ❌ Costly for large programs.
    • Type-Based: One abstract object per type.
      • ✅ Efficient ❌ Less precise than allocation-site.
    • Heap-Insensitive: One abstract object for entire heap.
      • ✅ Very efficient ❌ Very imprecise
      • ✅ Common in C; ❌ Not suitable for Java
  • Precision: Allocation-Site > Type-Based > Heap-Insensitive

  • Efficiency: Allocation-Site < Type-Based < Heap-Insensitive

  • Control Flow Abstraction

    • Flow Sensitivity
      • Flow-Insensitive:
        • Ignores the order of statements.
        • Performs weak updates (accumulates new facts without removing old ones).
        • Sufficient for may-alias analysis.
      • Flow-Sensitive:
        • Considers the exact flow of control.
        • Performs strong updates (can remove old facts).
        • Needed for must-alias analysis.
  • Context Sensitivity

    • Context-Insensitive:
      • Analyzes each procedure once.
      • ✅ Efficient
    • Context-Sensitive:
      • Analyzes each procedure per abstract calling context.
      • ✅ More precise
      • ❌ More expensive
  • Abstracting Control Flow: Flow Insensitivity

for(int i = 0; i < M; i++) {
 Floor f = new Floor();
 v.floors[i] = f;
}

Rewritten as:

v.floors[*] = f;

This collapses multiple iterations into one non-deterministic write to any index.

Even though information is lost, it's useful for sound over-approximations.

  • Chaotic Iteration Algorithm
    • Objective: Build a Points-To Graph using chaotic iteration.
    • Initialization:
      • Start with an empty graph.
    • Repeat:
      • For each statement s, apply corresponding rule.
      • Update graph until no change occurs (fixpoint reached).
    • Grammar of Statements
      • These statements are enough to cover pointer analysis:
        • v = new … → Object allocation
        • v = v2 → Object copy
        • v2 = v.f → Field read
        • v.f = v2 → Field write
        • v2 = v[*] → Non-deterministic field read
        • v[*] = v2 → Non-deterministic field write
    • Example:
v.events[*] = e

Is broken into:

tmp = v.events;  // field read
tmp[*] = e;  // non-deterministic field write
  • Rules
    • Allocation: Different new or malloc sites give different abstract objects. If a variable can point to two allocations, we keep both (weak update).
    • Copy: If v1 = v2, then v1 gets all objects v2 points to.
    • Field Write:
      • v1.f = v2 → add edge from each object v1 points to, field f, to the objects v2 points to.
      • Existing field values are not deleted in flow-insensitive analysis.
    • Field Read:
      • v2 = v1.fv2 gets objects pointed to by field f of each object v1 points to.
  • Aggregate Data Types
    • Arrays
      • Use a single field [*] to represent all elements.
      • Loses ability to distinguish between elements.
      • Used in flow-insensitive pointer analysis.
    • Records (Structs / Classes)
      • Three approaches:
        1. Field-Insensitive: All fields of a record merged.
        2. Field-Based: Same-named fields across records are merged.
        3. Field-Sensitive: Most precise — each field of each object is separate.
      • Pointer analysis typically uses field-sensitive representation.
  • Summary: Classification of Pointer Analysis Algorithms
    • Dimension Choices
    • Flow Flow-sensitive / Flow-insensitive
    • Context Context-sensitive / insensitive
    • Heap Abstraction Allocation-based / Type-based / Heap-insensitive
    • Aggregate Modeling Field-sensitive / Field-based / Field-insensitive