Lec 2 Compiler

Compiler Design Lecture Notes

Page 2: Phases of Compilers

  • Lexical Analysis (Scanner)

  • Syntax Analysis (Parser)

  • Global Optimization

  • Code Generation

  • Local Optimization

Page 3: Outline of Finite State Machine

  • Finite State Machine

  • Regular Expression

  • Implementation with Finite State Machines

Page 5: The Role of a Lexical Analyzer

  • Goal: Partition input string into substrings where the substrings are tokens

  • Example: if (i == j) Z = 0; else Z = 1;

  • The input is just a string of characters; the lexical analyzer must find where each token begins and ends
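The partitioning described above can be sketched with Python's re module. This is only an illustration: the token class names and the patterns in TOKEN_SPEC are assumptions chosen to cover the example statement, not the lecture's own scanner.

```python
import re

# Hypothetical token classes for this sketch; a real scanner has many more.
TOKEN_SPEC = [
    ("KEYWORD",    r"\b(?:if|else)\b"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER",     r"\d+"),
    ("OPERATOR",   r"==|="),
    ("SPECIAL",    r"[();]"),
    ("SKIP",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Partition text into (class, substring) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


tokens = tokenize("if (i == j) Z = 0; else Z = 1;")
```

For the example statement, the list starts with ("KEYWORD", "if"), ("SPECIAL", "("), ("IDENTIFIER", "i"), ("OPERATOR", "==").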

Page 6: Finite State Machine and Tokens

  • Before studying lexical analysis, we first cover two underlying concepts: finite state machines and regular expressions

  • Tokens are substrings of the input string

  • Token classes in English: noun, verb, adjective

  • Token classes in a programming language: identifier, keyword, operator, special character

Page 8: Symbols and Finite State Machine

  • A symbol is an abstract entity with no meaning by itself

  • Examples of symbols: Letters, Digits, Special Characters

Page 9: Alphabet and Finite State Machine

  • An alphabet is a non-empty finite set of symbols

  • Examples of alphabets: {a, b, c}, {a, b, ..., z}, {0, 1}

Page 12: Language and Finite State Machine

  • A language is a set of strings of symbols from an alphabet

  • Examples of languages: Set of palindromes, Set of strings with a pattern

Page 13: Definition of Finite State Machine

  • A Finite State Machine is a mathematical model of computation

  • It is defined as a 5-tuple M = (Q, Σ, δ, q0, F)

  • Q: Finite, non-empty set of states

  • Σ: Input alphabet

  • δ: Transition function δ: Q × Σ → Q, mapping the current state and an input symbol to the next state

  • q0: Initial state, q0 ∈ Q

  • F: Set of final (accepting) states, F ⊆ Q
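As a minimal sketch, the 5-tuple can be written down directly as a data structure. The class name FiniteStateMachine, the state names "even" and "odd", and the example machine (strings over {0, 1} with an even number of ones) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FiniteStateMachine:
    states: set      # Q
    alphabet: set    # Σ
    delta: dict      # δ, keyed by (state, symbol)
    start: str       # q0
    finals: set      # F

    def accepts(self, string):
        """Run the machine over string and report whether it ends in a final state."""
        state = self.start
        for symbol in string:
            if symbol not in self.alphabet:
                return False
            state = self.delta[(state, symbol)]
        return state in self.finals


# Example machine: accepts strings with an even number of ones.
EVEN_ONES = FiniteStateMachine(
    states={"even", "odd"},
    alphabet={"0", "1"},
    delta={
        ("even", "0"): "even", ("even", "1"): "odd",
        ("odd", "0"): "odd",   ("odd", "1"): "even",
    },
    start="even",
    finals={"even"},
)
```

Here EVEN_ONES.accepts("0101") is True and EVEN_ONES.accepts("010011") is False, since the second string has three ones.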

Page 14: Representation of Finite State Machine

  • Finite State Machine can be represented by a transition diagram or transition table

  • Transition diagram: States represented by circles, Transition function represented by arcs

  • Transition table: Rows indicate states, columns indicate input symbols; each entry is the next state

Page 21: Regular Expression

  • A regular expression is another way to specify a language, by describing the pattern its strings follow

  • It is built from symbols of the alphabet using three operations: union, concatenation, and Kleene star

Page 22: Operations in Regular Expressions

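  • Union (+): The set of strings that belong to either of two languages

  • Concatenation: Every string of the first language followed by every string of the second

  • Kleene star (*): Zero or more concatenations of strings from a language; the result always includes the empty string ε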
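The three operations can be tried out on small finite languages represented as Python sets of strings. This is a sketch only: the function names are hypothetical, and because the Kleene star of a non-empty language is infinite, kleene_star takes an assumed max_repeats bound and returns a finite slice of it.

```python
def union(l1, l2):
    """Strings that are in either language."""
    return l1 | l2


def concatenation(l1, l2):
    """Every string of l1 followed by every string of l2."""
    return {x + y for x in l1 for y in l2}


def kleene_star(language, max_repeats):
    """Strings built from 0..max_repeats members of language (a finite slice of L*)."""
    result = {""}          # zero repetitions give the empty string
    current = {""}
    for _ in range(max_repeats):
        current = concatenation(current, language)
        result |= current
    return result
```

For example, concatenation({"a", "b"}, {"c", "d"}) gives {"ac", "ad", "bc", "bd"}, and kleene_star({"ab"}, 2) gives {"", "ab", "abab"}.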

Page 23: Examples of Regular Expressions

  • Examples of regular expressions and their corresponding languages

Page 24: More Examples of Regular Expressions

  • Examples of regular expressions and their corresponding languages

Page 25: Examples of Strings in Regular Expressions

  • Examples of strings that belong to the languages defined by regular expressions

Page 26: Regular Expressions Example

  • Language 1: Strings containing an odd number of zeros.

    • Regular expression: 1*01*(01*01*)*

  • Language 2: Strings containing three sequential ones.

    • Regular expression: (0+1)*111(0+1)*

  • Language 3: Strings containing exactly three zeros.

    • Regular expression: 1*01*01*01*

  • Language 4: Strings that begin with 1 and end with zero.

    • Regular expression: 1(0+1)*0
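One way to check expressions like these is Python's re module. The sketch below assumes the four expressions are 1*01*(01*01*)*, (0+1)*111(0+1)*, 1*01*01*01* and 1(0+1)*0; the union operator written "+" in the notes becomes "|" in Python syntax. The dictionary keys and the in_language helper are hypothetical names.

```python
import re

# Union is "+" in the notes and "|" in Python's regular expression syntax.
LANGUAGES = {
    "odd number of zeros":        r"1*01*(01*01*)*",
    "three sequential ones":      r"(0|1)*111(0|1)*",
    "exactly three zeros":        r"1*01*01*01*",
    "begins with 1, ends with 0": r"1(0|1)*0",
}


def in_language(name, string):
    """True if the whole string matches the expression for the named language."""
    return re.fullmatch(LANGUAGES[name], string) is not None
```

re.fullmatch is used rather than re.search because a string belongs to the language only if the entire string matches the pattern.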

Page 27: Regular Expressions Example

  • Language L1 represents strings with an even number of ones (even parity).

  • Which of these strings belong to L1? a) 0101 b) 110211 c) 000 d) 010011 e) ε

    • Answer: a) 0101 c) 000 e) ε (b contains 2, which is not a binary symbol; d has an odd number of ones)

Page 28: Regular Expressions Example

  • Language L2 represents strings with an equal number of a's, b's, and c's.

  • Which of these strings belong to L2? a) bca b) accbab c) ε d) aaa e) aabbcc

    • Answer: a) bca b) accbab c) ε e) aabbcc (d has three a's but no b's or c's)
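Membership in L2 is easy to test by counting, as in the sketch below (the function name in_l2 is hypothetical). The check needs counters that can grow without bound, which is why this language cannot be recognized by a finite state machine or described by a regular expression.

```python
def in_l2(string):
    """True if string uses only a, b, c and has equally many of each."""
    if any(ch not in "abc" for ch in string):
        return False
    return string.count("a") == string.count("b") == string.count("c")


candidates = ["bca", "accbab", "", "aaa", "aabbcc"]
members = [s for s in candidates if in_l2(s)]
```

Of the five candidate strings, only "aaa" is rejected.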

Page 29: Regular Expressions Example

  • Which of these strings are in the language specified by the finite state machine?

    • Strings: a) abab b) bbb c) aaab d) aaa e) ε

    • Answer: a) abab b) bbb c) aaab e) ε

Page 30: Regular Expressions Example

  • Construct finite state machines for the following regular expressions:

    1. (a+b)*c

    2. (aa)*(bb)*c
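One possible pair of machines for these two expressions is sketched below as transition dictionaries. The state names are assumptions, and a missing entry stands for a transition into a dead (rejecting) state.

```python
def run(delta, start, finals, string):
    """Simulate a machine; a missing transition means the string is rejected."""
    state = start
    for symbol in string:
        state = delta.get((state, symbol))
        if state is None:
            return False
    return state in finals


# 1. (a+b)*c : loop on a and b, then a single c ends the string.
M1 = {("s0", "a"): "s0", ("s0", "b"): "s0", ("s0", "c"): "done"}

# 2. (aa)*(bb)*c : a's in pairs, then b's in pairs, then a single c.
M2 = {
    ("a_even", "a"): "a_odd",  ("a_odd", "a"): "a_even",
    ("a_even", "b"): "b_odd",  ("b_odd", "b"): "b_even",
    ("b_even", "b"): "b_odd",
    ("a_even", "c"): "done",   ("b_even", "c"): "done",
}


def accepts_1(string):
    return run(M1, "s0", {"done"}, string)


def accepts_2(string):
    return run(M2, "a_even", {"done"}, string)
```

In the second machine, an a is not allowed once a b has been read, so "bbaac" is rejected while "aabbbbc" is accepted.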

Page 31: Lexical Analysis Example

  • Java source input example with word boundaries and types.

  • Output of the lexical analysis phase is a stream of tokens.

  • Each token consists of two parts: a class indicating the kind of token, and a value indicating which member of that class it is.

Page 32: Lexical Analysis Example

  • Show word boundaries and token classes for Java input strings.

  • Lexical analysis phase does not check for proper syntax.

Page 33: Lexical Analysis Examples of Finite State Machines

  • Finite state machines for lexical analysis.

  • Machine accepts keywords: if, int, import, for, float.
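A machine for a fixed keyword list can be built mechanically by giving each distinct prefix its own state, so that keywords with a common prefix (such as if, int, import) share their first states. The sketch below assumes this prefix-tree construction; the slide's own diagram may number or merge states differently.

```python
KEYWORDS = ["if", "int", "import", "for", "float"]


def build_keyword_machine(words):
    """Return (delta, finals). State 0 is the start state; each prefix gets one state."""
    delta = {}
    finals = set()
    next_state = 1
    for word in words:
        state = 0
        for ch in word:
            if (state, ch) not in delta:
                delta[(state, ch)] = next_state
                next_state += 1
            state = delta[(state, ch)]
        finals.add(state)   # the state reached by a full keyword is accepting
    return delta, finals


def accepts(delta, finals, string):
    """Follow transitions from state 0; a missing transition rejects the string."""
    state = 0
    for ch in string:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in finals


DELTA, FINALS = build_keyword_machine(KEYWORDS)
```

With this construction the five keywords need 17 states in total, including the start state.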

Page 35: Implementation with Finite State Machines

  • Finite state machine can be implemented using an array.

  • Array has a row for each state and a column for each input symbol.
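A minimal sketch of the array implementation follows, using a machine for strings that begin with 1 and end with 0. The state numbering, including state 3 as a dead state, is an assumption made for this example.

```python
# Rows are states, columns are input symbols; each entry is the next state.
#   state 0: start
#   state 1: began with 1, last symbol was 1
#   state 2: began with 1, last symbol was 0 (accepting)
#   state 3: dead state (string began with 0)
TABLE = [
    # '0' '1'
    [3, 1],   # state 0
    [2, 1],   # state 1
    [2, 1],   # state 2
    [3, 3],   # state 3
]
COLUMN = {"0": 0, "1": 1}
ACCEPTING = {2}


def accepts(string):
    """Drive the machine with one array lookup per input symbol."""
    state = 0
    for symbol in string:
        if symbol not in COLUMN:
            return False
        state = TABLE[state][COLUMN[symbol]]
    return state in ACCEPTING
```

The driver loop is the same for every machine; only the table, the column mapping, and the set of accepting states change.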

Page 36: Actions for Finite State Machines

  • Finite state machines can be used for more than recognizing words.

  • Actions can be associated with each state transition.

Page 37: Example of FSM with Action

  • Design a finite state machine to read numeric strings and convert them to an appropriate format.

  • Include method calls for transitions: digits(), decimals(), minus(), and expDigits().

Page 38: Example of FSM with Action

  • Finite state machine with actions for reading numeric strings.

  • Includes methods: digits(), decimals(), minus(), and expDigits().
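The notes name the four methods but do not give their bodies or the machine's states. The sketch below is one plausible reading, with assumptions throughout: it accepts input of the form digits, optional fraction, optional exponent (for example 12.5e-3), the state numbers are invented, and the method bodies are guesses at what each action should do.

```python
class NumericReader:
    """Finite state machine with actions that converts a numeric string to a float."""

    ACCEPTING = {1, 3, 6}
    # (state, input class) -> (next state, action to call or None)
    TRANSITIONS = {
        (0, "digit"): (1, "digits"),
        (1, "digit"): (1, "digits"),
        (1, "."):     (2, None),
        (1, "e"):     (4, None),
        (2, "digit"): (3, "decimals"),
        (3, "digit"): (3, "decimals"),
        (3, "e"):     (4, None),
        (4, "+"):     (5, None),
        (4, "-"):     (5, "minus"),
        (4, "digit"): (6, "expDigits"),
        (5, "digit"): (6, "expDigits"),
        (6, "digit"): (6, "expDigits"),
    }

    def __init__(self):
        self.n = 0        # all mantissa digits read so far, as one integer
        self.places = 0   # number of digits after the decimal point
        self.sign = 1     # sign of the exponent
        self.exp = 0      # magnitude of the exponent
        self.d = 0        # value of the digit just read

    def digits(self):
        self.n = self.n * 10 + self.d

    def decimals(self):
        self.digits()
        self.places += 1

    def minus(self):
        self.sign = -1

    def expDigits(self):
        self.exp = self.exp * 10 + self.d

    def read(self, text):
        state = 0
        for ch in text:
            kind = "digit" if ch in "0123456789" else ch.lower()
            if kind == "digit":
                self.d = int(ch)
            step = self.TRANSITIONS.get((state, kind))
            if step is None:
                raise ValueError(f"unexpected {ch!r} in state {state}")
            state, action = step
            if action is not None:
                getattr(self, action)()   # perform the action on this transition
        if state not in self.ACCEPTING:
            raise ValueError("incomplete numeric string")
        return self.n * 10.0 ** (self.sign * self.exp - self.places)


def convert(text):
    """Convert one numeric string using a fresh machine."""
    return NumericReader().read(text)
```

Each call to convert uses a new NumericReader because the actions accumulate their results in the instance's fields.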

Page 39: How to implement Lexical Tables?

  • Creation of tables in the lexical analysis phase is important for the compiler.

  • Tables can include a symbol table, table of numeric constants, string constants, and statement labels.

  • Implementation techniques: Sequential Search, Binary Search Tree, Hash Table.

Page 40: Sequential Search

  • Table organized as an array or linked list.

  • Time complexity to build a table of n words is O(n^2), because each insertion first searches the entries already in the table.
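The quadratic cost can be seen by counting comparisons in a small sketch. The class name SequentialTable and its comparisons counter are hypothetical, added only to make the cost visible.

```python
class SequentialTable:
    """Table stored as a list and searched from the front on every insertion."""

    def __init__(self):
        self.words = []
        self.comparisons = 0

    def install(self, word):
        """Return the index of word, appending it if it is not yet present."""
        for index, existing in enumerate(self.words):
            self.comparisons += 1
            if existing == word:
                return index
        self.words.append(word)
        return len(self.words) - 1


table = SequentialTable()
for i in range(100):
    table.install(f"word{i}")
```

Installing n distinct words costs 0 + 1 + ... + (n - 1) = n(n - 1)/2 comparisons, which is 4950 for the 100 words above.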

Page 41: Binary Search Tree

  • Table organized as a binary tree.

  • Time complexity to build a table of n words is O(n log n) in the best case, when the tree stays balanced, and O(n^2) in the worst case, for example when the words arrive in sorted order.
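The gap between the best and worst case shows up even on seven words. The sketch below uses an unbalanced binary search tree with a hypothetical comparisons counter that counts the nodes visited during insertions.

```python
class Node:
    def __init__(self, word):
        self.word = word
        self.left = None
        self.right = None


class TreeTable:
    """Table stored as an unbalanced binary search tree."""

    def __init__(self):
        self.root = None
        self.comparisons = 0

    def install(self, word):
        """Insert word if absent and return its node."""
        if self.root is None:
            self.root = Node(word)
            return self.root
        node = self.root
        while True:
            self.comparisons += 1
            if word == node.word:
                return node
            branch = "left" if word < node.word else "right"
            child = getattr(node, branch)
            if child is None:
                child = Node(word)
                setattr(node, branch, child)
                return child
            node = child


balanced = TreeTable()
for w in ["m", "f", "t", "c", "h", "p", "w"]:   # order that keeps the tree balanced
    balanced.install(w)

degenerate = TreeTable()
for w in ["c", "f", "h", "m", "p", "t", "w"]:   # sorted order: tree becomes a chain
    degenerate.install(w)
```

The balanced insertion order visits 10 nodes in total, while the sorted order visits 21, the same count a sequential search would need.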

Page 43: Hash Table

  • Selection of a good hash function is critical for the efficiency of this method.
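The effect of the hash function can be illustrated with a small table that resolves collisions by chaining. Everything here is an assumption for illustration: the table size of 11, the chaining strategy, and the two example hash functions sum_of_codes and first_char_only.

```python
def sum_of_codes(word):
    """Hash on every character of the word."""
    return sum(ord(ch) for ch in word)


def first_char_only(word):
    """A deliberately poor hash: words with the same first letter always collide."""
    return ord(word[0]) if word else 0


class HashTable:
    """Hash table with one list (chain) of words per bucket."""

    def __init__(self, hash_function, size=11):
        self.hash_function = hash_function
        self.buckets = [[] for _ in range(size)]

    def _bucket(self, word):
        return self.buckets[self.hash_function(word) % len(self.buckets)]

    def install(self, word):
        bucket = self._bucket(word)
        if word not in bucket:
            bucket.append(word)

    def contains(self, word):
        return word in self._bucket(word)

    def longest_chain(self):
        return max(len(bucket) for bucket in self.buckets)


words = ["if", "int", "import", "for", "float"]
good = HashTable(sum_of_codes)
poor = HashTable(first_char_only)
for w in words:
    good.install(w)
    poor.install(w)
```

For these five words, sum_of_codes places each word in its own bucket, while first_char_only puts if, int, and import into one chain, so a lookup in that bucket degrades toward a sequential search.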
