CS6340 Exam Notes - Software Analysis and Testing

Chapter 1: Introduction to Software Analysis

Program Analysis Overview
- Definition: Techniques and tools that automatically discover useful properties or facts about computer programs.
- Classification:
  - Dynamic Program Analysis (Run-Time):
    - Analyzing a program during its execution.
    - Example tools:
      - Purify: Detects memory access errors (e.g., array bounds violations).
      - Eraser: Identifies data races in multithreaded programs.
      - Valgrind: Detects memory leaks and use-after-free bugs, etc.
      - Daikon: Dynamically infers program invariants.
  - Static Program Analysis (Compile-Time):
    - Examines code without execution.
    - Example tools:
      - Lint, FindBugs, Coverity: Detect suspicious code patterns and possible bugs.
      - Infer (Facebook): Identifies memory leaks and null pointer issues in Android and iOS apps.
      - SLAM (Microsoft Research): Verifies Windows device drivers correctly use the Windows kernel API.
      - ESC/Java: Checks correctness properties (invariants) of Java programs.
  - Hybrid Program Analysis:
    - Combines static and dynamic analysis.
    - Benefits from accuracy of dynamic analysis and completeness of static analysis.
    - Example: Symbolic execution + test input generation (e.g., KLEE).
Program Invariants
- A property or fact that holds true at a particular program point during all possible executions.
- Example: A loop invariant like “the index variable i is always less than or equal to the array length”.
Key Concepts in Static Analysis
- Control Flow Graph (CFG):
  - Representation of all possible execution paths.
  - Nodes = program statements or basic blocks.
  - Edges = possible control transfers.
- Concrete vs. Abstract States:
  - Concrete state: Exact variable values at runtime (dynamic analysis).
  - Abstract state: Approximate/symbolic representations of values (static analysis).
- Termination:
  - The analysis process should eventually halt.
  - Some analyses may be non-terminating or very expensive due to loops/recursion.
- Completeness:
  - An analysis is complete if it finds all possible issues.
  - Static analysis is rarely complete because it relies on approximation.
- Soundness:
  - An analysis is sound if all reported facts are guaranteed to be true across all program executions.
  - Tradeoff: Striving for soundness can lead to false positives (warnings about things that are not actually errors).
Iterative Approximation in Static Analysis
- Most static analyses (especially data-flow analysis) use iterative approximation.
- They revisit program points repeatedly (e.g., in loops) until reaching a fixed point (no further changes).
- Example: Constant propagation, reaching definitions, live variable analysis.
Trade-offs in Program Analysis
- Due to undecidability of general program properties (e.g., Halting Problem), we must balance:
  - Soundness: Trusting reported errors
  - Completeness: Trusting that nothing is missed
  - Termination: Guaranteeing the analysis eventually stops
- Example:
  - Asking whether “a program point is reachable” for all inputs may be undecidable in general.
  - Tool designers often make trade-offs based on consumer needs (developer vs. verifier vs. optimizer).
Consumers of Program Analysis
- Compilers:
  - Use analysis for:
    - Code optimization (e.g., dead code elimination)
    - Code generation (e.g., register allocation)
- Software Quality Tools:
  - Perform:
    - Bug detection
    - Invariant verification
    - Test case generation
    - Root cause localization
  - Help improve reliability, safety, and maintainability.
- Integrated Development Environments (IDEs):
  - Use analysis for:
    - Code navigation and understanding
    - Automated refactoring (e.g., renaming, extracting methods)
    - Real-time feedback (e.g., error highlighting)
Soundness vs. Completeness Revisited
- Property Description Tradeoff Example
- Soundness: If analysis says a program is correct, it definitely is. May produce false positives. May warn about memory leak even if it doesn’t happen.
- Completeness: If analysis says a program is incorrect, it definitely is. May produce false negatives. May miss subtle race condition to avoid over-reporting.
- False Positive: Analysis wrongly flags correct code as buggy.
- False Negative: Analysis misses an actual issue in the code.

Chapter 2: Introduction to Software Testing

Why Specifications Matter in Testing
- Core Reasons:
  1. Testing = Implementation vs. Specification: Testing checks whether the actual behavior matches the intended behavior.
  2. Without a Specification, Testing has no Grounding: No meaningful way to judge correctness without a target behavior to compare to.
  3. Testing as Consistency Checking: A recurring theme: Verify consistency between implementation and specification.
  4. Resource Constraints: We can’t test every input or behavior. Specs help prioritize tests for important or critical functionality.
  5. Evolving Specifications: As requirements change, tests must evolve to stay aligned with updated specifications.
- Even if the same developer writes both the spec and the implementation, a simpler spec is less likely to contain the same mistake as the complex implementation.
Classification of Testing Approaches
- Axes:
  - X-Axis: Black-Box ←→ White-Box
  - Y-Axis: Manual ←→ Automated
- This leads to four quadrants (e.g., manual black-box, automated white-box, etc.), each with trade-offs.
Manual vs. Automated Testing
- Aspect Manual Testing Automated Testing
- Control Precise and human-judged Fully automated, potentially high speed
- Efficiency Targeted, fewer but smarter tests Finds many bugs quickly, less effort per bug
- Maintenance Test cases must be updated with code changes Test generation adapts with code, less upkeep
- Coverage May be better if tests are well designed May miss subtle logic unless guided
- Semi-Automated Testing is ideal: Humans provide high-level specifications, and tools generate tests from them.
Black-Box vs. White-Box Testing
- Black-Box Testing:
  - Tests the external behavior of the software.
  - Does not require code access or knowledge of internal structure.
  - Language/format independent – even binary or obfuscated code can be tested.
  - Example: Testing a login form only by providing inputs and checking outputs, without reading the code.
- White-Box Testing:
  - Leverages knowledge of internal code structure to design tests.
  - Helps with path coverage, checking all branches, loops, conditions, etc.
  - May generate a smaller but more effective test suite.
  - Example: Writing tests to cover all if-else branches in a sorting algorithm.
Pre-Conditions, Post-Conditions, and Frame Conditions
- Pre-condition: What must be true before a function executes.
- Post-condition: What must be true after the function executes, if the pre-condition held.
- Example: Pre: stack.size() == n Post: stack.size() == n + 1 after a push() operation.
- Frame Conditions: Assumptions about unchanged parts of the program.
  - Example: Stack ordering remains unchanged aside from the top element.
- Pre/postconditions are more valuable if they are executable, such as using:
  - Assertions
  - Contracts in languages like Eiffel (require/ensure)
  - Unit testing frameworks with expected results
Test Suite Quality
- Problem with Too Few or Too Many Tests:
  - Too Few: Miss important bugs.
  - Too Many: Hard to maintain, slow to execute, redundant.
- Two Measures for Test Suite Effectiveness:
  1. Code Coverage Metrics:
    - Measures how much of the code has been exercised.
    - Common Types:
      - Function Coverage: Which functions were executed?
      - Statement Coverage: Which lines were run?
      - Branch Coverage: Were all decision branches taken?
      - Basic Block Coverage: Execution of straight-line code sequences
    - Note: High coverage ≠ high quality, but low coverage is often a red flag.
  2. Mutation Testing (Mutation Analysis):
    - Based on the “Competent Programmer Hypothesis”:
      - Real programs are close to correct, so small changes (mutants) are meaningful tests.
    - How it works:
      - Introduce small changes (mutants) to code (e.g., change x > 0 to x < 0)
      - Run tests: Good test suites will fail for mutants (i.e., "kill the mutants").
    - Goal: Design tests that distinguish original code from its mutants.
    - Key Challenge: Equivalent Mutants
      - Sometimes, a mutant behaves identically to the original for all inputs.
      - These are called equivalent mutants, and no test can detect the difference.
      - Difficult to identify or prove automatically – often requires manual effort.

Chapter 4: Automated test generation

Overview
- This chapter discusses automated testing, moving from random fuzzing to systematic test generation using pre/post conditions and feedback-directed testing. It introduces two tools: Korat (for structured data like trees) and Randoop (for OO programs/libraries).
Korat: Systematic Test Generation
- Goal
  - Automatically generate small, valid, and diverse test inputs satisfying pre-conditions.
- Approach
  - Use small test case hypothesis: if a bug exists, small inputs can likely trigger it.
  - Represent data structures (like binary trees) as vectors of field values.
  - Exhaustively enumerate all input configurations up to size k, guided by pre-conditions.
- Optimization Techniques
  - Instrument pre-conditions: Track fields accessed to avoid generating irrelevant inputs.
  - Use backtracking and field-by-field expansion to efficiently explore valid shapes.
  - Eliminate isomorphic structures to reduce redundancy.
- Algorithm Steps
  1. User selects input size k.
  2. Generate all combinations of field values (shapes).
  3. Discard inputs that violate pre-condition.
  4. Run program and check post-condition.
- Strengths
  - Best for linked structures: lists, trees, graphs.
  - Efficient for unit testing with strong invariants.
- Limitations
  - Poor at handling primitive types (e.g., int, float).
  - Depends heavily on well-specified pre/post conditions.
Randoop: Feedback-Directed Random Testing
- Goal
  - Automatically generate valid method call sequences that explore new object states and violate contracts if bugs exist.
- Key Idea
  - Use execution feedback to guide generation of new test sequences.
- Inputs
  - Classes under test
  - Time limit
  - Contracts (e.g., o.equals(o) == true, o.hashCode() doesn't throw)
- Outputs
  - Failing test case (violates contract or causes exception)
- Algorithm
  1. Start with seed components (basic values).
  2. Build method sequences incrementally.
  3. Execute each new sequence immediately.
  4. Based on execution:
    - Illegal? → Discard
    - Violates contract? → Report as test
    - Legal & novel? → Add to pool
    - Legal & redundant? → Discard
- Redundancy Check
  - Compare new objects with existing ones using equals() or custom equivalence.
Comparison to Korat
- Feature Korat Randoop
- Deterministic Yes No
- Focus Structured data (e.g., trees) OO APIs, method sequences
- Input guidance Pre-condition driven Feedback driven
- Requirements Strong typing (e.g., Java, UML) preferred over weakly typed languages like C/Lisp.
Key Insights
- Automating test generation reduces effort, improves coverage, and finds deep bugs.
- Using pre-conditions or runtime feedback is essential to avoid useless inputs.
- Test generation is only as good as the specification (contracts, invariants).

Chapter 6: Pointer Analysis

Introduction
- Pointer analysis is a form of dataflow analysis for non-primitive datatypes such as objects, pointers, and references. Unlike primitive types, pointers can reference the same memory location in multiple ways, making analysis more complex.
Pointer Aliasing
- Pointer aliasing occurs when the same memory address is referred to through different variables.
  - May-Alias Analysis (aka Pointer Analysis):
    - Assumes that two variables may point to the same object. Initially, all pairs are assumed to alias, and analysis removes false pairs as it gathers more information.
  - Must-Alias Analysis:
    - Assumes two variables must point to the same object and builds constraints based on this. 🔁 These are dual problems.
  - ✅ May-Alias is the focus here — more practical though less precise.
  - ⚠ Must-Alias is more advanced, but less commonly used due to practicality issues.
Challenges and Approximations
- Pointer analysis is undecidable: no algorithm can always be both complete and sound and terminate.
  - We sacrifice completeness: → May produce false positives (over-approximation), but no false negatives.
- Example:
```
if (x may point to obj1 or obj2) {
 assume both obj1 and obj2 are possible targets → leads to false positives
}
```
Key Dimensions of Approximate Pointer Analysis
1. Heap Abstraction
  - How to model dynamically allocated memory (heap).
2. Control Flow Abstraction
  - How to model the order of program statements.
Heap Abstraction Techniques
- Allocation-Site Based: One abstract object per allocation site (new, malloc).
  - ✅ Precise ❌ Costly for large programs.
- Type-Based: One abstract object per type.
  - ✅ Efficient ❌ Less precise than allocation-site.
- Heap-Insensitive: One abstract object for entire heap.
  - ✅ Very efficient ❌ Very imprecise
  - ✅ Common in C; ❌ Not suitable for Java
Precision: Allocation-Site > Type-Based > Heap-Insensitive
Efficiency: Allocation-Site < Type-Based < Heap-Insensitive
Control Flow Abstraction
- Flow Sensitivity
  - Flow-Insensitive:
    - Ignores the order of statements.
    - Performs weak updates (accumulates new facts without removing old ones).
    - Sufficient for may-alias analysis.
  - Flow-Sensitive:
    - Considers the exact flow of control.
    - Performs strong updates (can remove old facts).
    - Needed for must-alias analysis.
Context Sensitivity
- Context-Insensitive:
  - Analyzes each procedure once.
  - ✅ Efficient
- Context-Sensitive:
  - Analyzes each procedure per abstract calling context.
  - ✅ More precise
  - ❌ More expensive
Abstracting Control Flow: Flow Insensitivity

for(int i = 0; i < M; i++) {
 Floor f = new Floor();
 v.floors[i] = f;
}

Rewritten as:

v.floors[*] = f;

This collapses multiple iterations into one non-deterministic write to any index.

Even though information is lost, it's useful for sound over-approximations.

Chaotic Iteration Algorithm
- Objective: Build a Points-To Graph using chaotic iteration.
- Initialization:
  - Start with an empty graph.
- Repeat:
  - For each statement s, apply corresponding rule.
  - Update graph until no change occurs (fixpoint reached).
- Grammar of Statements
  - These statements are enough to cover pointer analysis:
    - v = new … → Object allocation
    - v = v2 → Object copy
    - v2 = v.f → Field read
    - v.f = v2 → Field write
    - v2 = v[*] → Non-deterministic field read
    - v[*] = v2 → Non-deterministic field write
- Example:

v.events[*] = e

Is broken into:

tmp = v.events;  // field read
tmp[*] = e;  // non-deterministic field write

Rules
- Allocation: Different new or malloc sites give different abstract objects. If a variable can point to two allocations, we keep both (weak update).
- Copy: If v1 = v2, then v1 gets all objects v2 points to.
- Field Write:
  - v1.f = v2 → add edge from each object v1 points to, field f, to the objects v2 points to.
  - Existing field values are not deleted in flow-insensitive analysis.
- Field Read:
  - v2 = v1.f → v2 gets objects pointed to by field f of each object v1 points to.
Aggregate Data Types
- Arrays
  - Use a single field [*] to represent all elements.
  - Loses ability to distinguish between elements.
  - Used in flow-insensitive pointer analysis.
- Records (Structs / Classes)
  - Three approaches:
    1. Field-Insensitive: All fields of a record merged.
    2. Field-Based: Same-named fields across records are merged.
    3. Field-Sensitive: Most precise — each field of each object is separate.
  - Pointer analysis typically uses field-sensitive representation.
Summary: Classification of Pointer Analysis Algorithms
- Dimension Choices
- Flow Flow-sensitive / Flow-insensitive
- Context Context-sensitive / insensitive
- Heap Abstraction Allocation-based / Type-based / Heap-insensitive
- Aggregate Modeling Field-sensitive / Field-based / Field-insensitive