Compiler Optimization Techniques: Dominators, Dead Code Elimination, Dataflow Analysis, SSA

Dominators

  • Dominator Concept

    • Definition: A node D is said to dominate node N if every path from the entry node to N must go through D.
    • Notation: D dominates N can be noted as: D \prec N.
    • Immediate Dominator (IDOM):
    • Definition: IDOM(N) is the closest dominator of N that is not N itself.
    • Example:
      • If the dominators are given as:
      • A B C D E F G H
      • Then the immediate dominators would be: Ø A B B D D D B C.
  • Dominator Tree

    • A representation of the relationship between nodes in terms of their dominators.
    • Structure: Each node points to its immediate dominator, forming a tree structure.
  • Dominance Frontier (DF)

    • Definition: The dominance frontier of a node D is the set of nodes that are not dominated by D, but are first reachable from D along some control flow path.
    • Significance: Indicates where the guarantee of dominance stops.
  • Post-Dominance

    • Node J post-dominates node I if every path that exits I must go through J.
    • Post-Dominance Frontier (PDF): Defines where post-dominance conditions apply.
    • Control Dependence: Node J is control dependent on node I if I EPDF(J).

Dead Code Elimination

  • Definition: The process of removing code that does not affect the program's outcome.

  • Types of Dead Code

    • Useless (dead): Unused operations within the code.
    • Unreachable: Parts of the code with no valid control flow path leading to them.
  • Identifying Useful Operations

    • Mark-Sweep Algorithm:
    • Used for identifying useful operations by considering branches in the flow of control.
    • Mark Phase:
      • For each operation op, if op is critical, it is marked as useful.
      • Add op to a worklist.
      • While the worklist is not empty:
      • Remove op from worklist.
      • If Def(y) is not marked, mark Def(y) as useful and add it to the worklist.
      • If Def(z) is not marked, mark Def(z) as useful and add it to the worklist.
      • For each block b in PDF(op), mark the ending branch in b and add to the worklist.
    • Sweep Phase:
      • For each operation op, if op is not marked and is a branch:
      • Replace op with a jump to its nearest useful post-dominator.
      • If not a branch, delete op.
  • Critical Operations:

    • Usually include store operations, jumps, branches.
  • Mark-Sweep Algorithm Simplified:

    • For simplifying operations within the Mark-Sweep process with example operations demonstrated.

Dataflow Analysis

  • Purpose: To gather information about the flow of data within programs.

  • Key Concepts:

    • Availability: To check if an expression has been computed already.
    • Liveliness: To determine if a variable will be needed in the future.
    • Reach Definition: Find out which definitions will reach the point of usage without being overwritten.
  • Data Flow Properties:

    • IN, OUT sets, GEN/Kill are essential structures in Dataflow Analysis.
    • Changes in direction based on Forward vs Backward analysis.
  • Forward Analysis

    • Entry-Exit paths are analyzed.
    • "May" analysis utilized for uninitialized data.
    • Union (U) used for combining sets of variables.
  • Backward Analysis

    • Exit-Entry paths evaluated.
    • "Must" analysis is conducted with intersection (∩).
  • Reach Definition Practice:

    • Define maintainable states to evaluate definitions and uses effectively.
    • Collecting definitions that occur within a specific point in the code (using examples for clarity).

SSA (Static Single Assignment) and Phi Functions

  • Simple SSA:

    • A transformation in which each variable is assigned exactly once, with the use of phi functions at control flow join points.
  • Dominance-Based SSA:

    • For a variable defined in block B, place the definition in blocks that are the dominance frontiers of B.
    • Example shown illustrating the reach definition analysis of a variable across control flow branches.
  • Example of SSA Transformation:

    • Walking through detailed step-by-step assignment and transformations of variables across blocks.