Refactoring & Program Comprehension – Comprehensive Study Notes

7.1 General Idea

  • Refactoring / restructuring = systematic improvement of internal & external software qualities.
  • Motivations for managers & developers
    • Better understandability ⇒ easier maintenance, onboarding, knowledge-transfer.
    • Ability to adopt new structures / paradigms as tech evolves.
    • Higher reliability & lower defect density.
    • Longer product life-span.
    • Enabler of automated analyses (e.g., static checkers work better on clean code).
  • Core characteristics
    • Goal: enhance quality without changing required behaviour.
    • External behaviour must be preserved (same I/O for same inputs).
    • No new functional requirements are added.
    • Output artefact remains in the same PL (e.g., C → C).

7.2 Activities in a Refactoring Process

  • Process = ordered set of six activities.
    1. Identify what to refactor.
    2. Decide which refactorings to apply.
    3. Ensure behaviour preservation.
    4. Apply chosen refactorings.
    5. Evaluate quality impacts.
    6. Maintain consistency across artefacts.
7.2.1 Identify What to Refactor
  • Initial artefact set that may contain smells
    • Source code.
    • Design docs.
    • Requirements docs.
  • Narrow down to specific entities (modules, classes, methods, data, etc.).
  • Code-smell concept = tangible symptom indicating deeper design problems.
    • Duplicate code.
    • Long parameter lists.
    • Long methods.
    • Large classes.
    • Message chains.
  • Design-level entities with potential smells
    • Software architecture, class/statechart/activity diagrams.
    • Global control-flow.
    • DB schemas.
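
Two of the smells listed above (long parameter lists, long methods) can be flagged mechanically. A minimal sketch using Python's standard ast module; the thresholds MAX_PARAMS and MAX_STATEMENTS and the SAMPLE code are assumptions made for illustration, not values from the notes:

```python
import ast

# Hypothetical thresholds; real projects tune these to their own code base.
MAX_PARAMS = 4
MAX_STATEMENTS = 10


def find_smells(source: str) -> list[tuple[str, str]]:
    """Return (function_name, smell) pairs for functions over a threshold."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
        if n_params > MAX_PARAMS:
            smells.append((node.name, "long parameter list"))
        # Count all statements nested in the function; subtract 1 because
        # ast.walk also yields the function definition itself.
        n_statements = sum(isinstance(c, ast.stmt) for c in ast.walk(node)) - 1
        if n_statements > MAX_STATEMENTS:
            smells.append((node.name, "long method"))
    return smells


# Illustrative input only.
SAMPLE = '''
def create_user(name, email, street, city, zip_code, country):
    return (name, email, street, city, zip_code, country)

def greet(name):
    return "hello " + name
'''

print(find_smells(SAMPLE))
```

Here only create_user is reported, for its six parameters. Smells such as duplicate code or message chains need more than simple counting and are left out of this sketch.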
7.2.2 Determine Which Refactorings to Apply
  • Tooling essential to compute feasible subset of candidate refactorings.
  • Two formal analyses
    • Critical Pair Analysis → detect conflicts; two refactorings form a conflict pair if they cannot coexist.
    • Sequential Dependency Analysis → identifies prerequisite chains; if refactoring A must precede B, then B depends on A.
  • Mutual exclusion: once one refactoring applied, an exclusive alternative becomes invalid.
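
A rough sketch of how the results of the two analyses could be used to pick a feasible, ordered subset. It does not perform critical pair or sequential dependency analysis itself; it assumes their outputs are already available as a conflict set and a dependency map. All refactoring names are hypothetical:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical candidates. depends_on[b] = {a} means a must be applied before b.
depends_on = {
    "extract_method_validate": set(),
    "move_method_validate": {"extract_method_validate"},
    "inline_method_check": set(),
    "rename_class_order": set(),
}

# Assumed conflict pairs: the two refactorings cannot both be applied.
conflicts = {frozenset({"extract_method_validate", "inline_method_check"})}


def plan(selected: set[str]) -> list[str]:
    """Return an application order for `selected`, or raise if infeasible."""
    for pair in conflicts:
        if pair <= selected:
            raise ValueError(f"conflicting refactorings: {sorted(pair)}")
    for name in selected:
        missing = depends_on[name] - selected
        if missing:
            raise ValueError(f"{name} requires {sorted(missing)}")
    # Prerequisites come first in a topological order of the dependency graph.
    graph = {name: depends_on[name] for name in selected}
    return list(TopologicalSorter(graph).static_order())


print(plan({"move_method_validate", "extract_method_validate"}))
```

Selecting both extract_method_validate and inline_method_check raises an error, which models the mutual exclusion described above.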
7.2.3 Ensure Behaviour Preservation
  • Ideal guarantee: identical I/O behaviour pre- and post-refactor.
  • Must also safeguard non-functional requirements (NFRs)
    • Temporal constraints (correct ordering in RT systems).
    • Resource constraints (no additional memory, energy, bandwidth, …).
    • Safety constraints (retain safety properties).
  • Pragmatic validation techniques
    • Exhaustive / regression testing before vs. after; compare outputs test-by-test.
    • Verification of preserved call sequences (ensure critical method-call orders stay intact).
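
Both validation techniques can be combined in one small harness: run the original and the refactored version on the same inputs and compare results as well as the recorded call order. A sketch under assumed names (total_before, total_after, and the log parameter are invented for illustration):

```python
def total_before(prices, log):
    """Original version: validates, then sums in an explicit loop."""
    log.append("validate")
    if any(p < 0 for p in prices):
        raise ValueError("negative price")
    log.append("sum")
    total = 0
    for p in prices:
        total += p
    return total


def _validate(prices):
    if any(p < 0 for p in prices):
        raise ValueError("negative price")


def total_after(prices, log):
    """Refactored version: validation extracted, loop replaced by sum()."""
    log.append("validate")
    _validate(prices)
    log.append("sum")
    return sum(prices)


def observe(func, prices):
    """Run func; return (result or exception type, recorded call sequence)."""
    log = []
    try:
        outcome = func(prices, log)
    except Exception as exc:
        outcome = type(exc)
    return outcome, log


# Illustrative regression inputs, including one that must fail validation.
TEST_INPUTS = [[], [1, 2, 3], [10], [5, -1]]


def behaviour_preserved():
    """True if both versions agree on output and call order for every input."""
    return all(
        observe(total_before, xs) == observe(total_after, xs)
        for xs in TEST_INPUTS
    )


print(behaviour_preserved())
```

Agreement on a finite test set is evidence, not proof, of behaviour preservation, and this harness says nothing about timing or resource constraints.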
7.2.4 Apply Refactorings (implicit in slides, included for completeness)
  • Perform atomic refactoring steps by IDE or scripts.
  • Continuously run automated test-suite.
7.2.5 Evaluate Impacts on Quality
  • Internal quality attributes: size, complexity, coupling, cohesion, testability.
  • External quality attributes: performance, reusability, maintainability, extensibility, robustness, scalability.
  • Each refactoring typically targets a small subset of attributes.
    • E.g., Extract Method → eliminates duplication; improves reuse.
    • Inline Temp → may boost performance.
  • Use metrics as proxies for external quality
    • ↓ Coupling, ↑ Cohesion, ↓ LOC ⇒ higher maintainability.
    • Compare metrics before vs. after to quantify benefit.

7.5 Initial Work on Software Restructuring

  • Origins in mid-1960s Fortran community.
  • Discussion topics
    1. Factors influencing software structure.
    2. Classification of restructuring approaches.
    3. Concrete techniques.
      • Elimination-of-goto.
      • Localization & information hiding.
      • System sandwich.
      • Clustering.
      • Program slicing.
7.5.1 Factors Influencing Software Structure
  • Software structure = attributes allowing developers to build mental model quickly.
  • Influence categories (Fig. 7.9)
    • Code ⇆ Documentation.
    • Tools ⇆ Programmers.
    • Managers & policies ⇆ Environment.
Code
  • Quality at all granularities (variables → modules) affects comprehension.
  • Coding standards & architectural styles enhance clarity.
Documentation
  • Internal (inline) vs. external.
    • Requirements, design docs, user manuals, test cases.
Tools (Programming Environment)
  • Support program understanding via:
    • Source tracing & run-time visualization.
    • Algorithm animation.
    • Global variable cross-reference.
    • Pretty printing, syntax coloring.
Programmers
  • Individual attributes: capability, education, experience, aptitude.
Managers & Policies
  • Provide resources & enforce standards.
    • E.g., performance review linked to code-quality adherence.
Environment
  • Physical facilities, resource availability, overall workplace conditions.

Chapter 8 Program Comprehension (Link to Refactoring)

8.1 General Idea
  • Sound comprehension is prerequisite for safe maintenance & evolution.
  • Poor mental models → degraded reliability/performance.
  • Five maintenance task categories (Table 8.1): Adaptive, Perfective, Corrective, Reuse, Code Leverage. Each starts with the activity "Understand system/problem".
8.2 Basic Terms
  • Code cognition models (Table 8.2) classify how maintainers understand code.
    • Control-flow, Functional, Top-down, Integrated, etc.
  • Key foundational terms
    • Goal of code cognition.
    • Knowledge (general vs. software-specific).
    • Mental model.
8.2.1 Goal of Code Cognition
  • Comprehension driven by concrete objective (debugging, adding feature, etc.).
  • Determines scope (whole program vs. sub-set).
  • Viewed as knowledge-acquisition process.
8.2.2 Knowledge
  • General Knowledge
    • Algorithms/data structures, OS, PLs, SW design, testing.
  • Software-Specific Knowledge
    • Architecture style (e.g., 3-tier), encryption choice, loops, variable semantics, etc.
  • Knowledge acquisition is iterative between the two (Fig. 8.1).
8.2.3 Mental Model
  • Programmer’s internal representation; varies among individuals.
  • Built from static & dynamic elements.

Static Elements

  • Text-structures: loops, sequences, conditionals, call hierarchies, variable definitions.
  • Chunks: contiguous related code segments enabling higher abstraction.
  • Schemas/Plans: generic knowledge structures (e.g., doubly linked list schema with slot-types & slot-fillers).
  • Hypotheses (Why / How / What) → continuous conjecture & verification cycle.

Dynamic Elements

  • Chunking: recursive grouping → higher-level labels representing functionality.
  • Cross-referencing: linking different abstraction levels (e.g., data-flow ↔ high-level functionality).
  • Strategies: planned sequences to reach comprehension goal; guide chunking & cross-referencing.
8.2.4 Understanding Code
  • Influencing factors
    1. Knowledge extraction from code (beacons + rules of programming discourse).
      • Beacon examples: swap(), sort(), startTimer().
      • Discourse rules: meaningful names, function does what name says.
    2. Reader expertise level.
      • Experts: organize by functionality/algorithms, breadth-first then details, possess design schemas.
      • Novices: focus on syntax.
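
To make beacons and discourse rules concrete, here is a small illustrative sketch (function names chosen for this example). A reader who spots swap() inside nested loops can hypothesize "exchange-based sort" before reading the details, and the function name confirms it:

```python
def swap(items, i, j):
    """Beacon: the name alone signals an exchange of two elements."""
    items[i], items[j] = items[j], items[i]


def sort_ascending(items):
    """Name and behaviour agree, following the discourse rule that a
    function does what its name says. Returns a new sorted list."""
    result = list(items)
    # Each pass moves the largest remaining element to position `end`.
    for end in range(len(result) - 1, 0, -1):
        for i in range(end):
            if result[i] > result[i + 1]:
                swap(result, i, i + 1)
    return result


print(sort_ascending([3, 1, 2]))
```

Renaming these to f() and g() would leave behaviour unchanged but remove the beacons, so the reader would have to fall back on line-by-line reading.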
8.3 Cognition Models for Program Understanding
  • Letovsky model.
  • Shneiderman & Mayer model.
  • Brooks model.
  • Soloway, Adelson, & Ehrlich top-down model.
  • Pennington bottom-up model.
  • Integrated metamodel (combines perspectives).

Cross-Connections & Practical Implications

  • Refactoring heavily depends on accurate program comprehension ⇒ Chapter 8 concepts underpin Chapter 7 process.
  • Code smells often detected via expert mental models, beacons, and schemas.
  • Behaviour preservation relies on mental models of control/data-flow & NFR understanding.
  • Quality metrics guide both initial smell detection and post-refactor evaluation.
  • Cognitive strategies inform tooling (e.g., IDEs highlight chunks, offer call-sequence views).