1/29
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
Briefly compare Compilation v.s. Interpretation in the following aspects:
(a) Execution
(b) Speed
(c) Examples of PLs that use it
Compilation | Interpretation | |
Execution | Converts entire program into machine code (before execution) | Converts and executes program line-by-line |
Speed | Faster | Slower Dynamic |
Examples of PLs that use it: | C C++ | Python Haskell |
(a) Describe how the hybrid compilation-interpretation model works (generally)
(b) Why is this approach used?
(a) Describe how the hybrid compilation-interpretation model works.
Combines compilation & interpretation into a single execution process, which has 2 stages:
1) Compilation: Source code is translated into an intermediate form (not directly into machine code)
2) Interpretation/Execution: Intermediate form is then interpreted or Just-In-Time (JIT) compiled at runtime.
(b) Why is this approach used?
To benefit from the advantages of both forms of execution:
Compilation
Makes code more efficient to run (better performance)
IP protection
Interpretation
More flexibility
Easy of debugging
Enables platform independency (cross-platform ability)
Often optimizes intermediate code (to run more efficiently)
Portability v.s. Platform Independence
Portability:
How easily a program can run on a different systems with little or no modification, so long as the system has the appropiate interpreter/runtime environment stalled.
→ Example: C code can be compiled on Windows, Linux, etc., with changes.
Platform Independence:
The ability of a program to run on any system without being changed.
→ Example: Java bytecode runs on any system with a JVM (Java Virtual Machine).
Explain how the hybrid compilation-interpretation model specifically operates with each of following PLs:
Java
Python
C#
JavaScript
(1) Java
Source code compiled to bytecode
Bytecode executed by JVM (Java Virtual Machine)
Either interpreted line-by-line
Or further optimized using JIT (Just-In-Time) compilation
(2) Python
Source code compiled to .pyc byte code
Byte code interpreted (executed) by PVM (Python Virtual Machine)
(3) C#
Source code compiled to CIL (Common Intermediate Language)
CIL executed by the .NET CLR
Either interpreted line-by-line
Or further optimized using JIT compilation
(4) JavaScript
Source code is parsed & JIT compiled directly by browser engines (e.g. V8)
No separate bytecode step like Java or C#
Token
Smallest meaningful units of a language.
In a PL, tokens are:
Keywords
Identifiers
Numbers
Symbols
Operators
Etc.
Describe the Phases of Compilation
(1) LEXICAL ANALYSIS
Purpose: To convert source code into tokens — the basic building blocks for parsing.
Each token type (e.g. identifier, number, operator) has a “Regex” (regular expression) rule
The lexer matches input (part of source) to regex rules to generate tokens
(2) SYNTAX ANALYSIS (aka PARSING)
Purpose: Checks if tokens follow grammar rules of PL
Parser uses CFG (Context-Free Grammar) to building a parse tree
CFG = rules defining valid syntax of the specific PL
Parse tree represents structure of program’s code based on its syntax
Parse tree is checked/assessed to ensure code is syntactically correct before translating it further
(3) SEMANTIC ANALYSIS — TYPE CHECKING, SCOPE VALIDATION
Purpose: Checks if code makes logical sense beyond just syntax.
Type Checking
Verifies operations use compatible data types
Example: int x = "hello";
→ ❌ (string to int is invalid)
Scope Validation
Ensures variables and functions are declared before being used.
Also ensures they are used in the correct scope.
Example:
{
int x = 5;
}
x = 10; // ❌ x is out of scope
(4) INTERMEDIATE CODE GENERATION – Produces IR/AST/Bytecode
Purpose: Convert code into a lower-level form that is (a) easier to translate and optimize into machine code (b) independent of hardware.
Produces IR, AST, and Bytecode:
IR (Intermediate Representation)
Abstract, platform-independent code
Easier for optimization
Example: Three-address code → t1 = a + b
AST (Abstract Syntax Tree)
Logical structure of code
Simplifies parse tree by removing unnecessary details
Bytecode
Compact, executable IR (e.g., Java bytecode)
Designed to run on a virtual machine
(5) OPTIMIZATION – Refines IR for performance
Purpose: To make final code more efficient before machine code is generated.
Improves intermediate representation of program by making it faster and use less memory (without changing what program does):
Constant folding
Dead code elimination
Loop optimization
(6) CODE GENERATION — Ouputs target (machine) code
Tokenize the following: int x = 5;
[“int”, “x”, “=”, “5”, “;”]
Equivalent to grammar in programming languages is _________.
Equivalent to grammar in programming languages is syntax.
Compiler
Translates high-level source code to machine code (normal compilation) or bytecode (hybrid compilation- interpretation model) before executing program.
Interpreter
Translates and executes code line-by-line at runtime.
Bytecode
Intermediate portable format of a program
Misspelled Identifier
Which phase of compilation would this type of error be detected?
Lexical
Invalid Token
Which phase of compilation would this type of error be detected?
Lexical
Infinite Loop and Unreachable Code are logical errors.
Which phase of compilation would these type of errors be detected?
Phase 6: Code Generation (Runtime/Logical Error)
Missing semicolon
Which phase of compilation would this type of error be detected?
Syntax
Type Mismatch
Which phase of compilation would this type of error be detected?
Semantic
Undeclared Variable
Which phase of compilation would this type of error be detected?
Semantic
Unbalanced Variable
(a) Describe what this is.
(b) Which phase of compilation would this type of error be detected?
Unbalanced Variable
(a) Describe what this is.
Syntax error where parentheses are either missing or mismatched in a program.
//Example
int x = (2 + 3 * (4 - 1;
❌ This code has unbalanced parentheses because there's one opening ( without a corresponding closing ).
(b) Which phase of compilation would this type of error be detected?
Phase 2: Syntax Analysis
Name the 5 Token Types/Categories. Provide an example of each.
Keyword (e.g. int
, if
)
Identifiers (e.g. x, myVar)
Operators (e.g. +
, =
, *
)
Consultant/Literal (e.g. 42
, '‘a'
)
Symbol (e.g. ;
, (
, )
Parse Tree v.s. Abstract Syntax Tree
Feature | Parse Tree | AST (Abstract Syntax Tree) |
---|---|---|
Grammar Rules | ✅ Includes every detail | ❌ Omits unnecessary syntax tokens (e.g. parentheses) |
Size | Larger (more nodes) | Smaller (simplified structure) |
Focus | Syntax (based on grammar) | Semantics (actual meaning of code) |
Use in compilation | Built during parsing | Used in later stages (semantic analysis, code gen) |
Nodes | Grammar rules / Component of code | Rep. mathematical & logical operators |
Children (of nodes) | Parts that make up grammar rule/component of code (e.g. operands, grammar symbols) | Operands |
Error Checking: Front-end v.s Back-end
Error Checking
Front-end: Lexical, syntax, semantic errors
Back-end: Optimization, code generation errors
Describe or Sketch: Compilation Flow
Source Program
↓
Lexical Analyzer → Tokens
↓
Syntax Analyzer → Parse Tree
↓
Semantic Analyzer → AST / IR
↓
Optimizer → Optimized IR
↓
Code Generator → Target Code (Assembly/Machine)
\d+
What does this Regex mean?
number w/ 1+ digits
[A-Za-z]
What does this Regex mean?
single letter (uppercase or lowercase)
^Hello
What does this Regex mean?
Line starts with “Hello”
.*
What does this Regex mean?
Any characters (0 or more) except for the newline character
‘(cat
What does this Regex mean?
dog)’
[0-9] {5}
What does this Regex mean?
Exactly 5 digits
a?
What does this Regex mean?
One character
[^\d]
What does this Regex mean?
Any character that is not a digit