compiler

Introduction to Compilers

  • Compilers: Translators that convert programs from one language to another (e.g. high-level language to machine code).

  • Primary roles: Error detection, code optimization, and generating executable code.

Compiler Structure

Analysis and Synthesis Model

  • A compiler operates in two main phases:

    • Analysis Phase: Machine-independent phase consisting of:

      • Lexical Analysis: Tokenizing source code

      • Syntax Analysis: Checking grammatical structure

      • Semantic Analysis: Ensuring semantic correctness

    • Synthesis Phase: Machine-dependent phase that creates target code from the intermediate representation.

Compiler Basics

Key Concepts

  • Interpreter: Executes source code statement by statement instead of translating it ahead of time; execution is typically slower than running compiled code.

  • Preprocessor: Prepares source code for compilation (e.g. handling macros).

  • Symbols: Names declared in a program (e.g. variables, functions); tracked in a symbol table.

Phases of a Compiler

Analysis Phases

  • Lexical Analysis:

    • Grouping characters into tokens and rejecting invalid characters.

    • Example of tokens: identifiers, keywords, operators.

  • Syntax Analysis:

    • Building parse trees to validate statement structure based on grammar rules.

  • Semantic Analysis:

    • Checking variable types and scopes, and ensuring the program is semantically consistent (e.g. variables are declared before use).
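
The three analysis phases above can be sketched for a tiny arithmetic expression language. This is a minimal illustration, not a production front end: the token classes, the grammar, the tuple-shaped tree, and the function names tokenize, parse, and check are all assumptions made for this example. The tree it builds is a simplified (abstract) form of a parse tree.

```python
import re

# Hypothetical token classes for a tiny expression language (assumed for this sketch).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Lexical analysis: split source into (kind, text) tokens, rejecting invalid characters."""
    tokens, pos = [], 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"invalid character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":      # whitespace is recognized but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def parse(tokens):
    """Syntax analysis by recursive descent for the assumed grammar:
    expr -> term (('+'|'-') term)*
    term -> factor (('*'|'/') factor)*
    factor -> NUMBER | IDENT | '(' expr ')'
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("EOF", "")

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def factor():
        kind, text = advance()
        if kind == "NUMBER":
            return ("num", int(text))
        if kind == "IDENT":
            return ("var", text)
        if text == "(":
            node = expr()
            if advance()[1] != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def term():
        node = factor()
        while peek()[1] in ("*", "/"):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek()[1] in ("+", "-"):
            op = advance()[1]
            node = (op, node, term())
        return node

    tree = expr()
    if peek()[0] != "EOF":
        raise SyntaxError(f"unexpected token {peek()[1]!r}")
    return tree


def check(tree, declared):
    """Semantic analysis (one simple rule): every variable must be declared before use."""
    if tree[0] == "var":
        if tree[1] not in declared:
            raise NameError(f"undeclared variable {tree[1]!r}")
    elif tree[0] != "num":
        check(tree[1], declared)
        check(tree[2], declared)


tree = parse(tokenize("a + 2 * b"))
check(tree, declared={"a", "b"})   # passes; an undeclared name would raise NameError
```

Here tree is ("+", ("var", "a"), ("*", ("num", 2), ("var", "b"))), which reflects the usual precedence of multiplication over addition. A real semantic analyzer would also check types and scopes; only the declared-before-use rule is shown.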

Synthesis Phases

  • Code Generation: Translating intermediate code into machine code.

  • Code Optimization: Improving the efficiency of generated code without altering functionality.

Symbol Table Design

  • A data structure used to keep track of all identifiers and associated attributes (e.g. scope, type).

  • Operations on Symbol Table:

    • Insert: Adds new identifiers with attributes.

    • Lookup: Checks existence and retrieves attributes.

    • Modify: Updates information for existing identifiers.
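
The three operations above can be sketched with a stack of dictionaries, one per scope. This is one possible design under assumptions, not the only way to build a symbol table: the class name SymbolTable and the enter_scope and exit_scope helpers are hypothetical additions for this example, used to show how the scope attribute can be modeled.

```python
class SymbolTable:
    """A scoped symbol table: a stack of dictionaries, innermost scope last."""

    def __init__(self):
        self.scopes = [{}]  # start with a single global scope

    def enter_scope(self):
        """Open a nested scope (hypothetical helper, not one of the three core operations)."""
        self.scopes.append({})

    def exit_scope(self):
        """Close the innermost scope and discard its identifiers."""
        self.scopes.pop()

    def insert(self, name, **attributes):
        """Insert: add a new identifier with its attributes to the current scope."""
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"{name!r} is already declared in this scope")
        scope[name] = dict(attributes)

    def lookup(self, name):
        """Lookup: return the attributes of the innermost declaration, or None if absent."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

    def modify(self, name, **attributes):
        """Modify: update the attributes of an existing identifier."""
        entry = self.lookup(name)
        if entry is None:
            raise KeyError(f"{name!r} is not declared")
        entry.update(attributes)


table = SymbolTable()
table.insert("x", type="int")
table.enter_scope()
table.insert("x", type="float")        # shadows the outer x
table.modify("x", initialized=True)    # affects only the inner x
inner = table.lookup("x")
table.exit_scope()
outer = table.lookup("x")
```

After this runs, inner holds the attributes of the shadowing declaration and outer those of the global one, so the outer x is unaffected by the modification made in the nested scope.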

Intermediate Code Generation

  • Generates an intermediate representation from the source code for optimization and translation to machine code.

  • Forms of Intermediate Code:

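    • Three-Address Code: Breaks expressions into a sequence of instructions, each with at most one operator and three addresses (e.g. t1 = a + b).

    • Quadruples and Triples: Ways of storing three-address instructions. A quadruple records (operator, argument 1, argument 2, result); a triple omits the result field and refers to earlier results by instruction position.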
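
As an illustration of these forms, the sketch below flattens the expression a + b * c into quadruples and triples. The nested-tuple input format, the temporary names t1, t2, ..., and the function names are assumptions made for this example.

```python
def to_quadruples(tree):
    """Flatten a nested (op, left, right) tuple into quadruples (op, arg1, arg2, result)."""
    quads = []

    def visit(node):
        if not isinstance(node, tuple):    # leaf: a variable name or a constant
            return node
        op, left, right = node
        arg1, arg2 = visit(left), visit(right)
        result = f"t{len(quads) + 1}"      # fresh temporary for this operation
        quads.append((op, arg1, arg2, result))
        return result

    visit(tree)
    return quads


def to_triples(tree):
    """Same traversal, but a result is referred to by its instruction index, e.g. "(0)"."""
    triples = []

    def visit(node):
        if not isinstance(node, tuple):
            return node
        op, left, right = node
        arg1, arg2 = visit(left), visit(right)
        triples.append((op, arg1, arg2))
        return f"({len(triples) - 1})"

    visit(tree)
    return triples


def format_tac(quads):
    """Render quadruples as three-address code lines."""
    return [f"{result} = {arg1} {op} {arg2}" for op, arg1, arg2, result in quads]


expression = ("+", "a", ("*", "b", "c"))    # a + b * c
quads = to_quadruples(expression)           # [("*", "b", "c", "t1"), ("+", "a", "t1", "t2")]
triples = to_triples(expression)            # [("*", "b", "c"), ("+", "a", "(0)")]
code = format_tac(quads)                    # ["t1 = b * c", "t2 = a + t1"]
```

Quadruples name every result explicitly, which makes instructions easy to reorder during optimization; triples are more compact but tie each reference to an instruction's position.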

Code Optimization Techniques

Machine-Independent Optimizations

  • Constant Folding & Propagation: Evaluating constant expressions at compile time and substituting known constant values for the variables that hold them.

  • Dead Code Elimination: Removing code that doesn’t affect output.

  • Common Sub-expression Elimination: Identifying and reusing previously computed expressions to avoid redundancy.
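
The three optimizations above can be sketched as passes over a straight-line list of quadruples (op, arg1, arg2, result). This is a simplified illustration under stated assumptions: every result name is assigned exactly once, input variables are never reassigned, there is no control flow, and the function names are hypothetical.

```python
import operator

# Operators this sketch knows how to evaluate at compile time (an assumed subset).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_and_propagate(quads):
    """Constant folding and propagation. Returns (remaining code, known constants)."""
    constants, out = {}, []
    for op, a, b, res in quads:
        a = constants.get(a, a)              # propagation: substitute known constants
        b = constants.get(b, b)
        if op in OPS and isinstance(a, int) and isinstance(b, int):
            constants[res] = OPS[op](a, b)   # folding: compute the value now
        else:
            out.append((op, a, b, res))
    return out, constants


def eliminate_common_subexpressions(quads):
    """Reuse an earlier identical (op, arg1, arg2). Returns (code, alias map)."""
    seen, alias, out = {}, {}, []
    for op, a, b, res in quads:
        a, b = alias.get(a, a), alias.get(b, b)
        key = (op, a, b)
        if key in seen:
            alias[res] = seen[key]           # later uses of res read the earlier result
        else:
            seen[key] = res
            out.append((op, a, b, res))
    return out, alias


def eliminate_dead_code(quads, live):
    """Scan backwards, keeping only instructions whose result is needed."""
    needed, out = set(live), []
    for op, a, b, res in reversed(quads):
        if res in needed:
            out.append((op, a, b, res))
            needed.update(x for x in (a, b) if isinstance(x, str))
    out.reverse()
    return out


quads = [
    ("*", 2, 3, "t1"),        # constant expression
    ("+", "a", "t1", "t2"),
    ("+", "a", "t1", "t3"),   # same computation as t2
    ("*", "t3", "b", "t4"),
    ("-", "a", "b", "t5"),    # result never used
]
folded, constants = fold_and_propagate(quads)
reduced, alias = eliminate_common_subexpressions(folded)
optimized = eliminate_dead_code(reduced, live={"t4"})
# optimized == [("+", "a", 6, "t2"), ("*", "t2", "b", "t4")]
```

Five instructions shrink to two while t4 keeps the same value. Real compilers perform these passes over control-flow graphs with data-flow analysis; the single-assignment assumption is what makes this short version safe.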

Machine-Dependent Optimizations

  • Tailor the generated code to a specific target architecture, for example by optimizing register usage.

Code Generation

  • Translates intermediate representation into machine-specific code.

  • Factors affecting code generation include memory management and instruction selection.

  • Includes techniques for handling parameters in procedure calls and maintaining the activation record for each function call.
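
The points above can be illustrated with a naive code generator and a toy model of activation records. Everything target-specific here is an assumption: the instruction set (LOAD, ADD, SUB, MUL, DIV, STORE), the single register R0, and the fields of ActivationRecord belong to a hypothetical machine invented for this sketch, not to any real architecture.

```python
from dataclasses import dataclass, field

# Instruction selection table for the hypothetical target machine.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def operand(value):
    """Format an operand: immediates get a '#' prefix, names refer to memory locations."""
    return f"#{value}" if isinstance(value, int) else value


def generate(quads):
    """Naive code generation: load, operate, and store for each quadruple."""
    lines = []
    for op, arg1, arg2, result in quads:
        lines.append(f"LOAD  R0, {operand(arg1)}")
        lines.append(f"{MNEMONICS[op]:<5} R0, {operand(arg2)}")
        lines.append(f"STORE R0, {result}")
    return lines


@dataclass
class ActivationRecord:
    """One frame on the call stack (fields chosen for illustration)."""
    function: str
    parameters: dict
    return_address: int
    local_vars: dict = field(default_factory=dict)


call_stack = []


def call(function, parameters, return_address):
    """Push a new activation record holding a copy of the caller's arguments."""
    call_stack.append(ActivationRecord(function, dict(parameters), return_address))


def ret():
    """Pop the current activation record and return the address to resume at."""
    return call_stack.pop().return_address


assembly = generate([("*", "b", "c", "t1"), ("+", "a", "t1", "t2")])
# assembly == ["LOAD  R0, b", "MUL   R0, c", "STORE R0, t1",
#              "LOAD  R0, a", "ADD   R0, t1", "STORE R0, t2"]
```

The generated code stores t1 and immediately reloads data around it, which is the kind of redundancy that register allocation and other machine-dependent optimizations aim to remove. Calling call("f", {"n": 3}, 42) pushes a frame, and the matching ret() pops it and yields 42.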

Conclusion

  • Compilers play a crucial role in translating high-level programming languages into executable code.

  • Understanding compiler architecture helps in reasoning about how code is optimized and generated.