Refactoring & Program Comprehension – Comprehensive Study Notes
7.1 General Idea
- Refactoring / restructuring = systematic improvement of internal & external software qualities.
- Motivations for managers & developers
- Better understandability ⇒ easier maintenance, onboarding, knowledge-transfer.
- Ability to adopt new structures / paradigms as tech evolves.
- Higher reliability & lower defect density.
- Longer product life-span.
- Enabler of automated analyses (e.g., static checkers work better on clean code).
- Core characteristics
- Goal: enhance quality without changing required behaviour.
- External behaviour must be preserved (same outputs for the same inputs).
- No new functional requirements are added.
- Output artefact remains in the same programming language as the input.
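Illustration (a minimal sketch, not taken from the source material): an Extract Method step that improves structure while keeping the same outputs for the same inputs. The function names and the invoice scenario are hypothetical; amounts are integer cents so the comparison is exact.

```python
def invoice_total_before(prices_cents, tax_percent):
    """Original version: summing and tax logic live in one function."""
    subtotal = 0
    for p in prices_cents:
        subtotal += p
    return subtotal + subtotal * tax_percent // 100


def _subtotal(prices_cents):
    """Extracted helper (hypothetical name): sums the line prices."""
    return sum(prices_cents)


def invoice_total_after(prices_cents, tax_percent):
    """Refactored version: same inputs, same outputs, clearer structure."""
    subtotal = _subtotal(prices_cents)
    return subtotal + subtotal * tax_percent // 100


def behaviour_preserved(cases):
    """Return True if both versions agree on every (prices, tax) case."""
    return all(
        invoice_total_before(p, t) == invoice_total_after(p, t)
        for p, t in cases
    )
```

Both versions are still Python and add no new functionality, which matches the core characteristics listed above.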
7.2 Activities in a Refactoring Process
- Process = ordered set of six activities.
- Identify what to refactor.
- Decide which refactorings to apply.
- Ensure behaviour preservation.
- Apply chosen refactorings.
- Evaluate quality impacts.
- Maintain consistency across artefacts.
7.2.1 Identify What to Refactor
- Initial artefact set that may contain smells
- Source code.
- Design docs.
- Requirements docs.
- Narrow down to specific entities (modules, classes, methods, data, etc.).
- Code-smell concept = tangible symptom indicating deeper design problems.
- Duplicate code.
- Long parameter lists.
- Long methods.
- Large classes.
- Message chains.
- Design-level entities with potential smells
- Software architecture, class/statechart/activity diagrams.
- Global control-flow.
- DB schemas.
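Illustration (a rough sketch under stated assumptions): two of the code smells above, long parameter lists and long methods, can be flagged mechanically. The thresholds below are arbitrary assumptions, not values from the notes, and the detector only handles Python source via the standard `ast` module.

```python
import ast

MAX_PARAMS = 4   # assumed threshold, not from the notes
MAX_LINES = 20   # assumed threshold, not from the notes


def find_smells(source, max_params=MAX_PARAMS, max_lines=MAX_LINES):
    """Return (function_name, smell) pairs found in the given Python source."""
    smells = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            n_params = (
                len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            )
            if n_params > max_params:
                smells.append((node.name, "long parameter list"))
            # Line span of the whole definition, including the def line.
            n_lines = node.end_lineno - node.lineno + 1
            if n_lines > max_lines:
                smells.append((node.name, "long method"))
    return smells
```

A hit is only a symptom; whether it points to a deeper design problem still needs human judgement.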
7.2.2 Determine Which Refactorings to Apply
- Tooling essential to compute feasible subset of candidate refactorings.
- Two formal analyses
- Critical Pair Analysis → detect conflicts; two refactorings form a conflict pair if they cannot coexist.
- Sequential Dependency Analysis → identifies prerequisite chains; if refactoring r1 must precede r2, then r2 depends on r1.
- Mutual exclusion: once one refactoring applied, an exclusive alternative becomes invalid.
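Illustration (a simplified stand-in, not the formal analyses themselves): here the conflict pairs and dependencies are assumed to be given as input rather than computed from transformation rules. The sketch rejects a selection that contains a conflict pair and otherwise orders the selection so prerequisites come first. The function name and the refactoring labels in the usage are hypothetical.

```python
from graphlib import TopologicalSorter


def plan_refactorings(selected, depends_on, conflicts):
    """Order the selected refactorings, or raise ValueError if infeasible.

    depends_on: dict mapping a refactoring to the set that must precede it.
    conflicts:  iterable of (a, b) pairs that cannot both be applied.
    """
    selected = set(selected)
    # Conflict check: a conflict pair must not be selected together.
    for a, b in conflicts:
        if a in selected and b in selected:
            raise ValueError(f"conflicting pair selected: {a} / {b}")
    # Dependency check: every prerequisite must be part of the selection.
    for r in selected:
        missing = depends_on.get(r, set()) - selected
        if missing:
            raise ValueError(f"{r} needs {sorted(missing)} first")
    # Topological order puts prerequisites before their dependents.
    graph = {r: depends_on.get(r, set()) for r in sorted(selected)}
    return list(TopologicalSorter(graph).static_order())
```

Usage: with `depends_on = {"move_method": {"extract_method"}}`, any returned order places `extract_method` before `move_method`. A cyclic dependency surfaces as `graphlib.CycleError`, which is a subclass of `ValueError`.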
7.2.3 Ensure Behaviour Preservation
- Ideal guarantee: identical I/O behaviour pre- and post-refactor.
- Must also safeguard non-functional requirements (NFRs)
- Temporal constraints (correct ordering in RT systems).
- Resource constraints (no additional memory, energy, bandwidth, …).
- Safety constraints (retain safety properties).
- Pragmatic validation techniques
- Exhaustive / regression testing before vs. after; compare outputs test-by-test.
- Verification of preserved call sequences (ensure critical method-call orders stay intact).
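Illustration (a minimal sketch): both pragmatic techniques can be combined in one check that compares return values test by test and also compares the recorded order of calls on a collaborator. `RecordingDevice`, `save_before`, and `save_after` are hypothetical names invented for this example.

```python
class RecordingDevice:
    """Hypothetical collaborator that records the order of calls it receives."""

    def __init__(self):
        self.calls = []

    def open(self):
        self.calls.append("open")

    def write(self, data):
        self.calls.append("write")

    def close(self):
        self.calls.append("close")


def save_before(device, items):
    """Original version."""
    device.open()
    for item in items:
        device.write(item)
    device.close()
    return len(items)


def _write_all(device, items):
    """Helper extracted during refactoring."""
    for item in items:
        device.write(item)


def save_after(device, items):
    """Refactored version: must keep the open/write/close order."""
    device.open()
    _write_all(device, items)
    device.close()
    return len(items)


def same_behaviour(old, new, test_inputs):
    """Compare return values and recorded call sequences test by test."""
    for items in test_inputs:
        d_old, d_new = RecordingDevice(), RecordingDevice()
        if old(d_old, items) != new(d_new, items):
            return False
        if d_old.calls != d_new.calls:
            return False
    return True
```

A refactoring that returned the right value but closed the device before writing would pass a pure output comparison and fail the call-sequence check.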
7.2.4 Apply Refactorings (implicit in slides, included for completeness)
- Perform atomic refactoring steps using an IDE or scripts.
- Continuously run automated test-suite.
7.2.5 Evaluate Impacts on Quality
- Internal quality attributes: size, complexity, coupling, cohesion, testability.
- External quality attributes: performance, reusability, maintainability, extensibility, robustness, scalability.
- Each refactoring typically targets a small subset of attributes.
- E.g., Extract Method → eliminates duplication; improves reuse.
- Inline Temp → may boost performance.
- Use metrics as proxies for external quality
- ↓ Coupling, ↑ Cohesion, ↓ LOC ⇒ higher maintainability.
- Compare metrics before vs. after refactoring to quantify benefit.
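Illustration (a rough sketch under stated assumptions): a before/after metric comparison for Python source. It covers only two internal attributes, non-blank LOC and a crude branch count used as a complexity proxy; coupling and cohesion are not measured. The choice of branch node types is an assumption, not a standard definition.

```python
import ast

# Node types counted as decision points (an assumed, simplified selection).
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp,
)


def metrics(source):
    """Return rough size and complexity proxies for Python source text."""
    tree = ast.parse(source)
    loc = sum(1 for line in source.splitlines() if line.strip())
    branches = sum(isinstance(n, _BRANCH_NODES) for n in ast.walk(tree))
    return {"loc": loc, "complexity": 1 + branches}


def delta(before_src, after_src):
    """Return after-minus-before for each metric; negative means a reduction."""
    before, after = metrics(before_src), metrics(after_src)
    return {name: after[name] - before[name] for name in before}
```

A negative delta on both proxies suggests, but does not prove, that maintainability improved.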
7.5 Initial Work on Software Restructuring
- Origins in mid-1960s Fortran community.
- Discussion topics
- Factors influencing software structure.
- Classification of restructuring approaches.
- Concrete techniques.
- Elimination-of-goto.
- Localization & information hiding.
- System sandwich.
- Clustering.
- Program slicing.
7.5.1 Factors Influencing Software Structure
- Software structure = attributes allowing developers to build mental model quickly.
- Influence categories (Fig. 7.9)
- Code ⇆ Documentation.
- Tools ⇆ Programmers.
- Managers & policies ⇆ Environment.
Code
- Quality at all granularities (variables → modules) affects comprehension.
- Coding standards & architectural styles enhance clarity.
Documentation
- Internal (inline) vs. external.
- Requirements, design docs, user manuals, test cases.
Tools (Programming Environment)
- Support program understanding via:
- Source tracing & run-time visualization.
- Algorithm animation.
- Global variable cross-reference.
- Pretty printing, syntax coloring.
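Illustration (a simplified sketch, not a description of any particular tool): a global variable cross-reference for Python source, listing which top-level functions mention each module-level variable. It assumes simple `name = value` assignments and ignores local variables that shadow a global name.

```python
import ast


def global_cross_reference(source):
    """Map each module-level variable to the sorted functions that mention it."""
    tree = ast.parse(source)
    # Collect names assigned at module level.
    global_names = set()
    for stmt in tree.body:
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    global_names.add(target.id)
    # Record every top-level function whose body refers to one of them.
    xref = {name: set() for name in global_names}
    for stmt in tree.body:
        if isinstance(stmt, ast.FunctionDef):
            for node in ast.walk(stmt):
                if isinstance(node, ast.Name) and node.id in global_names:
                    xref[node.id].add(stmt.name)
    return {name: sorted(funcs) for name, funcs in xref.items()}
```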
Programmers
- Individual attributes: capability, education, experience, aptitude.
Managers & Policies
- Provide resources & enforce standards.
- E.g., performance review linked to code-quality adherence.
Environment
- Physical facilities, resource availability, overall workplace conditions.
Chapter 8 Program Comprehension (Link to Refactoring)
8.1 General Idea
- Sound comprehension is prerequisite for safe maintenance & evolution.
- Poor mental models → degraded reliability/performance.
- Five maintenance task categories (Table 8.1): Adaptive, Perfective, Corrective, Reuse, Code Leverage – all start with Understand system/problem.
8.2 Basic Terms
- Code cognition models (Table 8.2) classify how maintainers understand code.
- Control-flow, Functional, Top-down, Integrated, etc.
- Key foundational terms
- Goal of code cognition.
- Knowledge (general vs. software-specific).
- Mental model.
8.2.1 Goal of Code Cognition
- Comprehension driven by concrete objective (debugging, adding feature, etc.).
- Determines scope (whole program vs. sub-set).
- Viewed as knowledge-acquisition process.
8.2.2 Knowledge
- General Knowledge
- Algorithms/data structures, OS, PLs, SW design, testing.
- Software-Specific Knowledge
- Architecture style (e.g., 3-tier), encryption choice, loops, variable semantics, etc.
- Knowledge acquisition is iterative between the two (Fig. 8.1).
8.2.3 Mental Model
- Programmer’s internal representation; varies among individuals.
- Built from static & dynamic elements.
Static Elements
- Text-structures: loops, sequences, conditionals, call hierarchies, variable definitions.
- Chunks: contiguous related code segments enabling higher abstraction.
- Schemas/Plans: generic knowledge structures (e.g., doubly linked list schema with slot-types & slot-fillers).
- Hypotheses (Why / How / What) → continuous conjecture & verification cycle.
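Illustration (one possible reading of the schema idea, expressed in code): the class below plays the role of the doubly linked list schema, with its typed fields as slot-types, while `link` produces concrete slot-fillers. The names `Node` and `link` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # identity comparison avoids recursing through links
class Node:
    """Schema: the three typed slots every doubly linked list node has."""
    value: int
    prev: Optional["Node"] = None
    next: Optional["Node"] = None


def link(values):
    """Fill the slots for concrete values and return the head node (or None)."""
    head = tail = None
    for v in values:
        node = Node(v, prev=tail)
        if tail is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head
```

A reader who already holds this schema only needs to spot the `prev`/`next` pair to recognise the structure, without tracing every statement.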
Dynamic Elements
- Chunking: recursive grouping → higher-level labels representing functionality.
- Cross-referencing: linking different abstraction levels (e.g., data-flow ↔ high-level functionality).
- Strategies: planned sequences to reach comprehension goal; guide chunking & cross-referencing.
8.2.4 Understanding Code
- Influencing factors
- Knowledge extraction from code (beacons + rules of programming discourse).
- Beacon examples: swap(), sort(), startTimer().
- Discourse rules: meaningful names, function does what name says.
- Reader expertise level
- Experts: organize by functionality/algorithms, breadth-first then details, possess design schemas.
- Novices: focus on syntax.
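Illustration (a small sketch, using the `swap` beacon named above): a swap call inside nested loops is a typical cue that the surrounding code sorts. The function `sort_ascending` is a hypothetical example of a name that follows the discourse rules, since the body does exactly what the name says.

```python
def swap(items, i, j):
    """Beacon: a swap inside nested loops hints strongly at sorting."""
    items[i], items[j] = items[j], items[i]


def sort_ascending(items):
    """Return a new list sorted in ascending order (simple bubble sort)."""
    result = list(items)
    for end in range(len(result) - 1, 0, -1):
        for i in range(end):
            if result[i] > result[i + 1]:
                swap(result, i, i + 1)
    return result
```

An expert is likely to label the whole function "sorting" from the beacon and the name alone; a novice is more likely to read it line by line.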
8.3 Cognition Models for Program Understanding
- Letovsky model.
- Shneiderman & Mayer model.
- Brooks model.
- Soloway, Adelson, & Ehrlich top-down model.
- Pennington bottom-up model.
- Integrated metamodel (combines perspectives).
Cross-Connections & Practical Implications
- Refactoring heavily depends on accurate program comprehension ⇒ Chapter 8 concepts underpin Chapter 7 process.
- Code smells often detected via expert mental models, beacons, and schemas.
- Behaviour preservation relies on mental models of control/data-flow & NFR understanding.
- Quality metrics guide both initial smell detection and post-refactor evaluation.
- Cognitive strategies inform tooling (e.g., IDEs highlight chunks, offer call-sequence views).