CPSC411 Midterm 1

studied byStudied by 2 people
0.0(0)
learn
LearnA personalized and smart learning plan
exam
Practice TestTake a test on your terms and definitions
spaced repetition
Spaced RepetitionScientifically backed study method
heart puzzle
Matching GameHow quick can you match all your cards?
flashcards
FlashcardsStudy terms and definitions
Get a hint
Hint

The Compiler Toolchain

1 / 35

encourage image

There's no tags or description

Looks like no one added any tags here yet for you.

36 Terms

1

The Compiler Toolchain

knowt flashcard image
New cards
2

Compiler Sequence

knowt flashcard image
New cards
3

Lexical Analysis

Takes in a stream of characters as input and splits it into tokens/lexemes

New cards
4

Tokens

The “logical units” of the source code

  • Typically keywords, operators, identifiers, numbers, etc

New cards
5

Parsing/Syntax Analysis

Checks if the code follows a given grammar

  • E.g “school goes a boy” is lexically correct, but not syntactically correct

  • Match input to predefined grammar

  • Outputs an abstract syntax tree

New cards
6

Semantic Analysis

Checks if the code follows the rules of the language and creates meaning

  • Scope resolution

  • Type Checking

  • Array bound checking

Completely traverses the Abstract syntax tree to do this

Outputs intermediate representation

New cards
7

Three steps of Semantic Analysis

  1. Type Checking - inspects type match in assignment, operations, functions, and method calls

  2. Flow Control Checking - checks flow control structures and are used correctly and if classes and objects are correctly assessed

  3. Label Checking - validates the use of labels and identifiers

New cards
8

Optimization

Enhances code efficiency

New cards
9

3 rules of optimization

  1. Can’t change the original programs meaning

  2. Focus on consuming fewer resources and speeding up software operation

  3. Shouldn’t significantly impact compilation time

New cards
10

4 optimization techniques

  1. Function inlining = replace function call with its body

  2. Dead code elimation = delete code if its never executed or if its result is never used

  3. Loop fusion = If several loops have the same iteration conditions, combine them

  4. Instruction combining = combining similar operation into one

New cards
11

Code generation

The compiler converts the optimized intermediate code into the machine code for the target machine

New cards
12

Byte code

Intermediate code between source and machine code

  • Result of compilation of high level source code

  • Low level

  • Not directly understandable by CPU without the use of a virtual machine first

  • Platform independent

  • Must be translated into machine code

  • Binary, hexadecimal, and macros

New cards
13

Intermediate Code

Machine independent code (so no need for unique compilers for all machines)

  • Easier to use with optimization

Formed by performing a post-order transversal of the AST to generate an IR instruction for each node (this is where most optimization occurs!)

New cards
14

2 forms of intermediate code

  1. High level: Similar to the source language. Easy to boost source code performance

  2. Low level: Close to machine code. Preferred for making machine related optimizations

New cards
15

Chompsky’s Heirarchy

  1. Regular languages - Finite automata

  2. Context free languages - Pushdown automata

  3. Context sensitive languages - Linear bounded automata

  4. Recursively enumerable languages - Turing machine

New cards
16

Context Free Grammar

A formal definition of a language defined by some rules. Each rule specifies how a single symbol can be replaced by one of a selection of sequences of atoms and symbols

New cards
17

LL Grammar

A CFG that can be evaluated by only considering the current rule and the next token in the input stream

New cards
18

3 conditions for an LL grammar

  1. No ambiguity

  2. No left recursion

  3. No common left prefix

New cards
19

Direct Left Recursion

knowt flashcard image
New cards
20

Regular Expression Precedence:

Alteration < Concatenation < Kleene’s Closure

New cards
21

Preprocessor

Prepares source code for the compiler

  • Substitute macros

  • Deal with directives starting in # if you’re in C

New cards
22

Compiler

Cleans up code, checks for errors, produces assembly

New cards
23

Assembler

Consumes assembly, produces object code

  • Object code = almost executable, does not know final memory addresses. The gaps are filled in by the linker

New cards
24

Linker

Consumes object code and libraries and compiles them into a complete, executable program

  • Selects memory locations where each piece of code and data will be loaded, then links them together by writing in missing address info

New cards
25

Retargetable computer

Contains multiple code generators, so the same IR can be emitted for a variety of microprocessors

New cards
26

Language lexicon

A “dictionary”, a complete set of available terms

New cards
27

Language Syntax

The possible ways we can put words from the syntax together

  • Syntax is well defined rules to create sentences in a given language

New cards
28

Language semantics

Indicates whether a valid sentence actually makes sense

New cards
29

Common semantic errors

  • Using an uninitialized variable

  • Added an operation immediately after a return operation in a function

  • Type mismatch

  • Accessing an index in an array outside the bounds

  • Referencing a variable out of scope

New cards
30

Why use regex?

  • Regex provides concise notation for string patterns

  • Use in lexical analysis provides small extensions to resolve ambiguities and handle errors

  • Good algorithms are known, requiring only a single pass over input and few operations per character

New cards
31

Sentence

A valid sequence of terminals in a language

New cards
32

Sentential form

A valid sequence of terminals and non terminals

New cards
33

Non-terminals

A structure that can occur in a language, but is not a literal symbol

  • Declarations, statements, and expressions

New cards
34

Ambiguity

A grammar is ambiguous if there exists 2 possible derivations for any sentence

New cards
35

virtual machine

New cards
36

interpreter

An interpreter is a program that reads and executes code written in a high-level programming language, line by line, without the need for compilation. It translates the code into machine language on the fly and executes it immediately

  • you can modify it as it runs

  • slower than compiled

New cards
robot