Compilers

CSE Compiler Execution


Compilation

  • Compilation is a process that translates a program in one language (the source language) into an equivalent program in another language (the object or target language).
  • Important part is detection and reporting of errors
  • Source language is usually a high-level programming language (e.g., an object-oriented language)
  • Target language is a machine language or assembly language (i.e., a machine-oriented language)
  • Compilation is the link between abstract world of application development and low-level world of application execution on machines

Types of Translators

  • Assembler is also a type of translator
  • Interpreter is closely related to compiler, but takes both source program and input data
    • Translation and execution of the source program are interleaved, not separate phases
    • Easier implementation of programs (run-time errors immediately displayed)
    • Slower execution
    • Often requires more space

Program Compilation Process

  • The process includes preprocessing, compiling, assembling, linking/loading to generate absolute machine code.
  • Preprocessor takes a skeletal source program and generates a source program.
  • Compiler translates the source program into an assembly program.
  • Assembler converts the assembly program into relocatable machine code.
  • Link/load editor combines relocatable machine code to produce absolute machine code.

C Preprocessor

  • Preprocessing occurs before a program is compiled.
  • Includes:
    • Inclusion of other files
    • Definition of symbolic constants and macros
    • Conditional compilation of program code
    • Conditional execution of preprocessor directives
  • Format of preprocessor directives:
    • Lines begin with #
    • Only whitespace characters before directives on a line
  • Example: gcc -E helloworld.c
    • Stops after preprocessing and sends output to stdout

#include Preprocessor Directive

  • Copy of a specified file included in place of directive
  • #include <filename>
    • Searches standard library for file
    • Use for standard library files
  • #include "filename"
    • Searches current directory, then standard library
    • Use for user-defined files
  • Used for:
    • Programs with multiple source files compiled together
    • Header file has common declarations and definitions (classes, structures, function prototypes)
    • #include statement in each file

#define Preprocessor Directive

  • Symbolic constants
    • When program compiled, all occurrences of symbolic constant replaced with replacement text
    • Format #define identifier replacement-text
    • Example: #define PI 3.14159
      • Everything to the right of the identifier becomes the replacement text
      • #define PI = 3.14159
        • Replaces "PI" with "= 3.14159"
    • Cannot redefine symbolic constants once created

#define Preprocessor Directive (Macros)

  • Macro
    • Macro without arguments treated like a symbolic constant
    • A macro with arguments has its arguments substituted for replacement text when the macro is expanded
    • Performs a text substitution – no data type checking
    • The macro #define CIRCLE_AREA( r ) ( PI * ( r ) * ( r ) ) would cause area = CIRCLE_AREA( 4 ); to become area = ( 3.14159 * ( 4 ) * ( 4 ) );

#define Preprocessor Directive (Parentheses)

  • Use parentheses
    • Without them the macro #define CIRCLE_AREA( x ) PI * x * x would cause area = CIRCLE_AREA( c + 2 ); to become area = 3.14159 * c + 2 * c + 2;
  • Multiple arguments
    • #define RECTANGLE_AREA( x, y ) ( ( x ) * ( y ) ) would cause rectArea = RECTANGLE_AREA( a + 4, b + 7 ); to become rectArea = ( ( a + 4 ) * ( b + 7 ) );

#undef Preprocessor Directive

  • Undefines a symbolic constant or macro
  • If a symbolic constant or macro has been undefined, it can later be redefined
  • Example
    ```c
    #define WIDTH 80
    #define ADD( X, Y ) ((X) + (Y))
    ...
    #undef WIDTH
    #undef ADD
    ```

Conditional Compilation

  • Control preprocessor directives and compilation
  • Cast expressions, sizeof, enumeration constants cannot be evaluated in preprocessor directives
  • Structure similar to if:
    ```c
    #if !defined( NULL )
    #define NULL 0
    #endif
    ```
    • Determines if symbolic constant NULL has been defined
    • If NULL is defined, defined( NULL ) evaluates to 1
    • If NULL is not defined, this directive defines NULL to be 0
    • Every #if must end with #endif
    • #ifdef short for #if defined( name )
    • #ifndef short for #if !defined( name )

Conditional Compilation (cont.)

  • Other statements
    • #elif – equivalent of else if in an if structure
    • #else – equivalent of else in an if structure
  • "Comment out" code
    • Cannot use /* ... */
    • Use
      ```c
      #if 0
      code commented out
      #endif
      ```
    • To enable code, change 0 to 1

Compiler

  • GNU Compiler Collection (gcc) is standard Linux C compiler (see man gcc for details)
  • Compiler translates C program to assembler code
  • Example: gcc -S helloworld.c
    • Stops after compilation proper (i.e., output is not passed to the assembler) and writes assembler code to helloworld.s
  • Compiler actually consists of many parts (more later)
    • Parser
    • Analyzer
    • Code optimizer
    • Code generation

Assembler

  • Assembles, but does not link, source code to generate relocatable object code
    • Object code may contain metadata such as label definitions that refer to locations within sections of code
    • May include holes (i.e., relocation entries) that are to be filled with the values of labels defined elsewhere
  • Example: gcc -c helloworld.c
    • Assemble source code and save object file as helloworld.o
  • Use nm or objdump to look at object
    • nm example.o
    • objdump -t example.o

Linker

  • Combines previously compiled and assembled object code along with standard library functions to make executable file
  • Resolves references in each object file to external variables and procedures declared in other files
  • Example: gcc helloworld.o
    • Build default executable a.out
    • Use gcc helloworld.o -lm to link the math library
  • Use nm to look at executable

Loader

  • Compilers, assemblers, and linkers usually produce code whose memory references are made relative to an undetermined starting location that can be anywhere in memory (relocatable machine code)
  • The loader ensures that the object programs are placed in memory in an executable form
    • Loader calculates appropriate absolute addresses for these memory locations and amends the code to use these addresses
  • Gets an address from the OS to place the program in memory
  • Changes necessary addresses (if any)
  • Places code into memory to run

Compiling and Linking

  • Only compiling (creating hello.o)
    • gcc -c hello.c
  • Only linking
    • gcc hello.o -o hello
  • Compiling and linking
    • gcc hello.c -o hello
  • To run your program
    • ./hello
    • Hello World!
  • Additional common flags to gcc
    • -g allows debugging
    • -l<library_name> linking with external libraries
  • If you leave off the -o, executable goes into the a.out file

Compiler Construction


What is a Compiler?

  • All computers only understand machine language
  • Therefore, high-level language instructions must be translated into machine language prior to execution

What is a Compiler? (cont.)

  • Consists of a piece of system software that translates high-level languages into machine language

Why Build Compilers?

  • Compilers provide an essential interface between applications and architectures
  • High level programming languages:
    • Increase programmer productivity
    • Better maintenance
    • Portable
  • Low level machine details:
    • Instruction selection
    • Addressing modes
    • Pipelines
    • Registers and cache
  • Compilers efficiently bridge the gap and shield the application developers from low level machine details

Assembler

  • A kind of compiler
    • One-to-one translation
    • Assembly --> Machine Language

Compiler (High-Level Language Translator)

  • A high-level language translator
    • One-to-many translation

Compiler Properties

  • Compiler must generate a correct executable
    • The input program and the output program must be equivalent; the compiler should preserve the meaning of the input program
  • Output program should run fast
    • For optimizing compilers, we expect the output program to be more efficient than the input program
  • Compiler itself should be fast
  • Compiler should provide good diagnostics for programming errors
  • Compiler should support separate compilation
  • Compiler should work well with debuggers
  • Optimizations should be consistent and predictable
  • Compile time should be proportional to code size

Compiler Goals

  • Code produced must be correct
  • Correctness is paramount; an incorrect translation defeats the purpose.
  • The compiler must ensure that the generated code accurately reflects the semantics of the source program. For example:
    • Incorrect: A = (B + C) - (D + E);
      • Where STORE B and STORE D change the values of variables B and D, which the high-level language does not intend

Compiler Goals (Efficiency)

  • Code produced should be reasonably efficient and concise
  • The generated code should execute quickly and utilize memory effectively.
  • Example: compute the sum 2x_1 + 2x_2 + 2x_3 + ... + 2x_{50000}
    • A naive translation emits one multiply instruction per term
    • An optimizing compiler instead computes 2(x_1 + x_2 + ... + x_{50000}), which results in 49,999 fewer multiply instructions

Why Study Compilers

  • Compilers embody a wide range of theoretical techniques and their application to practice
    • DFAs, DPDAs, formal languages, formal grammars, lattice theory
  • Compiler construction teaches programming and software engineering skills
  • Compiler construction involves a variety of areas
    • Theory, algorithms, systems, architecture
  • The techniques used in various parts of compiler construction are useful in a wide variety of applications
    • Many practical applications have embedded languages, commands, macros, etc.
  • Is compiler construction a solved problem?
    • No! New developments in programming languages and machine architectures (multicore machines) present new challenges

Syntax & Semantics of Language

  • A programming language must include the specification of syntax (structure) and semantics (meaning)
  • Syntax typically means the context-free syntax because of the almost universal use of context-free grammars (CFGs)
    • Example
      • a = b + c is syntactically legal
      • b + c = a is illegal
  • The semantics of a programming language are commonly divided into two classes
    • Static semantics
      • Semantics rules that can be checked at compile time
      • E.g., the type and number of a function’s argument(s)
    • Runtime semantics
      • Semantics rules that can be checked only at run time

Compilation

  • Translate high-level program (source language) into machine code (machine language)
    • Slow translation, fast execution
  • Compilation process has several phases:
    • Lexical Analysis
      • Converts characters in source program into lexical units
    • Syntax Analysis
      • Transforms lexical units into parse trees that represent the syntactic structure of program
    • Semantics Analysis
      • Generate intermediate code
    • Code Generation
      • Machine code is generated

Compilation Stages

  • For many compilers, the result is assembly, which then has to be run through an assembler
  • Lexical analysis (scanner)
  • Syntax analysis (parser)
  • Semantic analysis
  • Intermediate code generation
  • Machine-independent code improvement (optional)
  • Target code generation
  • Machine-specific code improvement (optional)

Lexical Analysis

  • Lexical analyzer or scanner
    • Recognizes the “tokens” of a program
    • The compiler scans the source code from left-to-right, character-by-character, and groups these characters into lexemes (sequence of characters that match a pattern) and outputs a sequence of tokens to the syntax analyzer
    • Each token represents a logically cohesive sequence of characters
      • Keywords if, while, …
      • Operators +, *, <=, ||, …
      • Identifiers (names of variables, arrays, procedures, classes) i, i1, j1, count, sum, …
      • Numbers 12, 3.14, 7.2E-2, …

Lexical Analysis (cont.)

  • The main functions of this phase are
    • Identify the lexical units in a source statement
    • Classify units into different lexical classes (e.g., constants, reserved words, etc.) and enter them in different tables
    • Build a descriptor (called a token) for each lexical unit (group of input characters)
    • Ignore comments in the source program
    • Detect tokens that are not a part of the language

Lexical Analysis Example

  • Program statement sum = sum + a[i];
  • Character stream: tab,s,u,m,blank,=,blank,s,u,m,blank,+,blank,a,[,i,],;
  • Tokenized: sum,=,sum,+,a[i],;

Lexical Analysis Process

  1. Discard blanks, tabs, etc. – look for beginning of token
  2. Put characters together
  3. Repeat step 2 until end of token
  4. Classify and save token
  5. Repeat steps 1 – 4 until end of statement
  6. Repeat steps 1 – 5 until end of source code

How Do We Specify Lexical Patterns?

  • Some patterns are easy
    • Keywords and operators
      • Specified as literal patterns: if, then, else, while, =, +, …
  • Some patterns are more complex
    • Identifiers
      • Letter followed by letters and digits
    • Numbers
      • Integer: 0 or a digit between 1 and 9 followed by digits between 0 and 9
      • Decimal: an optional sign ("+" or "-"), followed by either "0" or a nonzero digit and an arbitrary number of further digits, followed by a decimal point and an arbitrary number of digits

Language Definitions

  • An alphabet \Sigma is a finite non-empty set (of symbols)
  • A string or word over an alphabet \Sigma is a finite concatenation of symbols from \Sigma
  • The length of a string w (i.e., number of characters comprising the string) is denoted |w|
  • The empty or null string is denoted \epsilon (i.e., |\epsilon|=0)
  • The set of all strings over \Sigma is denoted \Sigma^*
  • For each n \geq 0, we define \Sigma^n = \{ w \in \Sigma^* \mid |w| = n \}
  • For a symbol or word x, x^n denotes x concatenated with itself n times, with the convention that x^0 denotes \epsilon
  • A language over \Sigma is a set L \subseteq \Sigma^*
  • Two languages L1 and L2 over common alphabet \Sigma are equal if they are equal as sets

Lexical Patterns: Regular Expressions

  • Regular expressions (regex) describe regular languages
  • Regular Expression (over alphabet \Sigma)
    • \epsilon (empty string) is a regex denoting the set {\epsilon}
    • If a is in \Sigma, then a is a regex denoting {a}
    • If x and y are regexes denoting languages L(x) and L(y), then
      • ( x ) is a regex denoting L(x)
      • x | y is a regex denoting L(x) \cup L(y)
      • xy is a regex denoting L(x)L(y)
      • x^* is a regex denoting L(x)^*

Regular Expression Examples

  • All strings of 1s and 0s
    • ( 0 | 1 )^*
  • All strings of 1s and 0s beginning with a 1
    • 1 ( 0 | 1 )^*
  • All strings of 0s and 1s containing at least two consecutive 1s
    • ( 0 | 1 )^* 1 1( 0 | 1 )^*
  • All strings of alternating 0s and 1s
    • ( \epsilon | 1 ) ( 0 1 )^* ( \epsilon | 0 )

Regular Expression Extensions (Lex)

  • x+ = x x^* denotes L(x)^+
  • x? = x | ε denotes L(x) ∪ {ε}
  • [abc] = a | b | c matches one character in the square brackets
  • [a-z] = a | b | c | … | z range
  • [0-9a-z] = 0 | 1 | 2 | … | 9 | a | b | c | … | z
  • [^abc] matches any character except a, b, or c (inside brackets, ^ means negation)
  • . (dot) matches any character except the newline, i.e., . = [^\n]
  • "[" matches a literal left square bracket: metacharacters in double quotes become plain characters
  • \[ matches a literal left square bracket: a metacharacter after a backslash becomes a plain character

Regular Definitions

  • We can define macros using regular expressions and use them in other regular expressions
    • Letter → (a|b|c| … |z|A|B|C| … |Z)
    • Digit → (0|1|2| … |9)
    • Identifier → Letter ( Letter | Digit )*
  • Important: We should be able to order these definitions so that every definition uses only the definitions defined before it (i.e., no recursion)
  • Regular definitions can be converted to basic regular expressions with macro expansion
  • In Lex, enclose definitions using curly braces
    • Identifier → {Letter} ( {Letter} | {Digit} )*

Regular Expression Examples

  • Examples:
    • Integer → (+|-)? (0| (1|2|3| … |9)(Digit *))
    • Decimal → Integer "." Digit *
    • Real → ( Integer | Decimal ) E (+|-)?Digit *
    • Complex → "(" Real "," Real ")"

Regular Expression to Scanners

  • Regular expressions are useful for specifying patterns that correspond to tokens
  • However, we also want to construct programs that recognize these patterns
  • How do we do it?
    • Use finite automata!

RegEx to DFA Example

  • Consider the problem of recognizing register names in an assembler
    • Register → R (0|1|2| … |9) (0|1|2| … |9)*
      • Allows registers of arbitrary number
      • Requires at least one digit
  • REGEX corresponds to a recognizer (or DFA)

Deterministic Finite Automata (DFA)

  • A set of states S

    • S = { s0 , s1 , s2 , se}
  • A set of input symbols (an alphabet) \Sigma

    • \Sigma = { R , 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 }
  • A transition function δ : S × Σ → S

    • Maps (state, symbol) pairs to states
  • A start state

    • s0
  • A set of final (or accepting) states

    • Final = { s2 }

DFA Simulation

  • Start in state s0 and follow transitions on each input character
  • DFA accepts a word x iff x leaves it in a final state (s2 )
  • So,
    • “R17” takes it through s0 , s1 , s2 and accepts
    • “R” takes it through s0 , s1 and rejects (s1 is not a final state)
    • “A” takes it straight to se
    • “R17R” takes it through s0 , s1 , s2 , se and rejects

DFA Simulation (cont.)

  • The recognizer translates directly into code
  • To change DFAs, just change the arrays
  • Takes O(|x|) time for input string x

Recognize Longest Accepted Prefix

Given an input string, this simulation algorithm returns the longest accepted prefix. For example, given the input “R17R”, it returns “R17”.

Syntax Analysis

  • Syntax analyzer, or parser, groups sequences of tokens from the lexical analysis phase into phrases, each with an associated phrase type (i.e., a logical unit with respect to the rules of the source language)
  • The following operations are performed in this phase
    • Obtain tokens from lexical analyzer
    • Check whether the expression is syntactically correct
    • Report syntax errors, if any
    • Determine the statement class
      • An assignment statement, a condition (e.g., if) statement, etc.
    • Group tokens into statements
    • Construct hierarchical structures called parse trees
      • Parse trees represent the syntactic structure of the program

Grammars and Parse Trees

  • The grammar S → aSbS | bSaS | ε generates all strings of a’s and b’s with the same number of a’s as b’s
  • This grammar is ambiguous: abab has two parse trees

Grammars and Parse Trees (Unambiguous)

  • Natural languages are inherently ambiguous
    • But programming languages should not be!
  • This grammar G generates the same language
    • S → aAbS | bBaS | ε
    • A → aAbA | ε
    • B → bBaB | ε
  • G is unambiguous and has only one parse tree for every sentence in L(G)

Parse Trees

  • The statement b = a + 7 can be represented by the following parse tree
    • Parse tree illustrates the grouping of tokens into phrases
  • Successful construction of a parse tree is proof that the statement is correctly formed

Parse Trees (cont.)

  • Every construct (statement, expression, etc.) in a programming language can be represented as a parse tree
  • Terminals
    • The actual tokens of the language
  • Non-Terminals
    • Intermediate grammatical categories used to help explain and organize the language

Recursive Descent Parser

  • Simplest idea for creating a parser for a grammar is to construct a recursive descent parser
    • Built from a context-free grammar
  • Simple rules used to build the recursive program
    • Implement a function for each non-terminal
    • Generate code for each rule
      • “Call” non-terminals
      • “Match” terminals
      • Use if statement to choose between multiple rules with the same (non-terminal) symbol

Recursive Descent Parser (Pros & Cons)

  • Advantages
    • They are exceptionally simple
    • They can be constructed from recognizers simply by doing some extra work – specifically, building a parse tree
  • Disadvantages
    • They are not as fast as some other methods
    • It is difficult to provide really good error messages
    • They cannot do parses that require arbitrarily long look-aheads

Recursive Descent Parser Stack

  • One easy way to do recursive descent parsing is to have each parse method take the tokens it needs, build a parse tree, and put the parse tree on a global stack
    • Write a parse method for each non-terminal in the grammar
      • Each parse method should get the tokens it needs, and only those tokens
      • Those tokens (usually) go on the stack
      • Each parse method may call other parse methods, and expect those methods to leave their results on the stack
      • Each (successful) parse method should leave one result on the stack

Example: while Statement

  • ::= "while"
  • The parse method for a while statement:
    • Calls the Tokenizer, which returns a "while" token
    • Makes "while" into a tree node, which it puts on the stack
    • Calls the parser for <condition>, which parses a condition and puts it on the stack
      • Stack contains: "while" <condition> ("top" is on right)
    • Calls the parser for <block>, which parses a block and puts it on the stack
      • Stack now contains: "while" <condition> <block>
    • Pops the top three things from the stack, assembles them into a tree, and pushes this tree onto the stack

Recognizing a while Statement

  • This code assumes that keyword(String), condition(), and block() all leave their results on a stack
  • On the stack, "while" is at depth 3, <condition> at depth 2, <block> at depth 1 (the top)

Parsing a while Statement

  • After recognizing a while loop, the stack looks like this:

Compiler Component Generators

  • Scanner and parser components can be generated from specifications
    • Flex specification (scanner generator)
    • Yacc specification (parser generator)

Desk Calculator Example

  • Flex specification for a desk calculator:
    • desk.l
    • $ flex desk.l
  • yacc specification for a desk calculator:
    • desk.y
    • $ yacc desk.y
      • This generates a file called y.tab.c, but before compiling this resulting file, you must add the following function prototypes near the top of the file:
        ```c
        int yylex(void);
        void yyerror(const char *);
        ```
  • Now compile this file with the flex and yacc libraries
    • $ gcc y.tab.c -ly -lfl

Desk Calculator Example (cont’d)

  • Example output

Intermediate Code Generation

  • The intermediate code produces a program in a different language, at an intermediate level between the source code and the machine code
    • This allows part of the compiler to be machine independent!
    • Intermediate languages are sometimes assembly languages
  • Generation of intermediate code offers following advantages
    • Flexibility
      • A single lexical analyzer/parser can be used to generate code for several different machines by providing separate back-ends that translate a common intermediate language to a machine-specific assembly language
    • Intermediate code is used in interpretation
      • Executed directly rather than translating it into binary code and storing it

Semantic Analysis

  • A semantic analyzer takes its input from the syntax analysis phase in the form of a parse tree and a symbol table
  • Its purpose is to determine if the input has a well-defined meaning
    • In practice, semantic analyzers are mainly concerned with type checking and type coercion based on type rules

Semantic Analysis (cont.)

  • The semantic phase has the following functions:
    • Check phrases for semantic errors (e.g., type checking)
      • In a C program, int x = 10.5; should be detected as a semantic error
    • Keeps track of types of identifiers and expressions to verify their consistent usage
    • Maintains the symbol table
      • Symbol table contains information about each identifier in a program, such as identifier type, scope of identifier, etc.

Semantic Analysis (cont. 2)

  • The semantic phase has the following functions:
    • Enforces a large number of rules using the symbol table
      • Every identifier is declared before it is used
      • No identifier is used in an inappropriate context (e.g., adding a string to an integer)
      • Function calls have a correct number and type of arguments
    • Checks certain semantic rules, called dynamic semantics, at run time
      • Array subscript expression lies within the bounds of the array
      • Variables are never used in an expression unless they have been given a value

Symbol Table

  • Built and maintained by the semantic analyzer
  • Maps each identifier to information known about it
    • Identifier’s type (e.g., int, char, float, etc.)
    • Internal structure (if any)
    • Scope (the portion of the program in which it is valid)
  • Using the symbol table, the semantic analyzer enforces a large variety of rules
  • Purpose of symbol table is to provide quick and uniform access to identifier attributes throughout compilation process

Semantics Analysis Example

  • Syntactically correct, but semantically incorrect
  • Semantic records

Code Generation

  • The code generated depends upon the architecture of the target machine
    • Knowledge of the instructions and addressing modes in a target computer is necessary for the code generation process
  • One of the important aspects of code generation is the efficient initialization of machine resources using assumptions such as:
    • Instruction types in target machines
    • Commutative properties of operators in an expression
    • Proper usage of syntax for syntax directed translation

Error Handling

  • In each phase, a compiler can encounter errors
  • On detecting an error, the compiler must
    • Report the error in a helpful way
    • Correct the error, if possible
    • Continue processing (if possible) after the error to look for further errors

Error Types

  • Syntax errors are errors in the program text
    • A lexical error is a mistake in lexeme (e.g., typing “esle” instead of “else” or missing off one of the quotes in a literal string)
    • A grammatical error is one that violates the rules of the language
  • Semantic errors are mistakes concerning the meaning of a program construct
    • Type errors occur when an operator is applied to an argument of the wrong type or to the wrong number of arguments
    • Logical errors occur when a badly conceived program is executed
    • Run-time errors are errors that can be detected only when the program is executed

Compiler Optimization


Inside a Basic Compiler Front End

  • High-level language (C, C++, Java) to Low-level language (IA-64)
  • Inside an Optimizing Compiler
    • An optimizer sits between the front end and the code generator, repeatedly transforming the IR into improved IR

Control Flow Graph

  • Basic Block: A group of consecutive instructions with a single entry point and a single exit point

Code Optimization Requirements

  1. Preserve correctness
    • The speed of an incorrect program is irrelevant
  2. On average, improve performance
    • Optimized code may be worse than original if unlucky
  3. Be “worth the effort”
    • Is this example worth it?

Code Optimization

  • Optimization improves programs by making them smaller or faster or both
    • They can be optimized for speed, memory usage, or program footprint
  • Goal of code optimization is to translate a program into a new version that computes the same result more efficiently
    • by taking less time, memory, and other system resources
  • Code optimization is achieved in two ways
    • Rearranging computations in a program to make them execute more efficiently
    • Eliminating redundancies in a program

Code Optimization (cont.)

  • Should not change the meaning of the program
    • Tries to improve the program, but the underlying algorithm is not affected
  • Cannot replace an inefficient algorithm with an algorithm that is more efficient
  • Also cannot fully utilize the instruction set of a particular architecture
    • Therefore, this kind of optimization is independent of the target machine and the programming language

Optimizing Transformations

  • Two types of optimizing transformations are
    • Local transformations
      • Applied over small segments of a program
      • Provides limited benefits at a low cost
      • Scope is a basic block which is a sequential set of instructions
    • Global transformations
      • Applied over larger segments consisting of loops or function bodies