LEXICAL AND SYNTAX ANALYSIS

42 Terms

1
New cards

American Standard Code for Information Interchange

ASCII

2
New cards
  1. Assembly

  2. Interpreter

  3. Compiler

Language Translator

3
New cards
  1. Preprocessor

  2. Compiler

  3. Assembler

  4. Linker

Language Translator - Internal Architecture

4
New cards
  1. Lexical Analysis

  2. Syntax Analysis

  3. Semantic Analysis

  4. Intermediate Code Generator

  5. Code Optimization

  6. Target Code Generation

Compiler - Internal Architecture

5
New cards
  1. Compilation

  2. Pure Interpretation

  3. Hybrid Implementation

Three Primary Approaches

6
New cards

involves translating high-level code into machine code using a compiler.

Compilation

7
New cards

Pure Interpretation

executes the original source code directly using an interpreter.

8
New cards

Hybrid Implementation

combines both methods. It translates high-level code into an intermediate representation, which is then interpreted.

9
New cards

Lexical Analysis

also known as scanning, is the first phase of processing source code. It converts the raw source code into tokens.

10
New cards

Syntax Analyzer

takes the generated tokens and verifies whether they follow the grammatical rules of the programming language.

11
New cards

Simplicity

Lexical analysis is less complex than syntax analysis. Separating them makes both processes easier to manage and maintain.

12
New cards

Efficiency

Since lexical analysis consumes a significant portion of compilation time, optimizing it separately improves overall performance without affecting syntax analysis.

13
New cards

Portability

Lexical analyzers handle platform-dependent elements, while syntax analyzers remain platform independent, making compilers more adaptable across different systems.

14
New cards

Lexical Analyzer

is essentially a pattern matcher that identifies tokens in a given string.

  • Serves as front end of the Syntax Analyzer

15
New cards

Lexemes

are the actual sequence of characters in the source code.

16
New cards

Tokens

are the classification or category assigned to a lexeme.
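The lexeme/token distinction above can be made concrete with a small sketch. This is an illustrative scanner, not from the cards: the token names and patterns are assumptions chosen for the example.

```python
import re

# Illustrative token categories and their patterns (assumed for this sketch).
TOKEN_SPEC = [
    ("INT_LIT", r"\d+"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("ASSIGN",  r"="),
    ("PLUS",    r"\+"),
    ("SKIP",    r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(source):
    """Return (token, lexeme) pairs: the lexeme is the matched text,
    the token is the category assigned to it."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(source)
            if m.lastgroup != "SKIP"]

print(tokenize("count = count + 1"))
# [('IDENT', 'count'), ('ASSIGN', '='), ('IDENT', 'count'),
#  ('PLUS', '+'), ('INT_LIT', '1')]
```

Note how `count` appears twice as a lexeme but both occurrences carry the same token, `IDENT`.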

17
New cards

Finite Automata

Is used to recognize patterns in a regular language, determining which token category the sequence belongs to. (e.g. identifier, integer, or operator)

18
New cards

State Transition Diagram

A directed graph where each state represents a step in token recognition
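A state transition diagram can be encoded directly as a lookup table, one entry per edge. This sketch recognizes identifiers (a letter followed by letters or digits); the state names and character classes are assumptions for the example.

```python
# Each (state, input-class) pair maps to the next state; missing
# entries are the diagram's "no edge" cases (illustrative states).
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"):  "in_id",
}
ACCEPTING = {"in_id"}

def classify(ch):
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

def is_identifier(s):
    state = "start"
    for ch in s:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:          # fell off the diagram: reject
            return False
    return state in ACCEPTING

print(is_identifier("x1"), is_identifier("1x"))  # True False
```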

19
New cards

parsing

is the act of determining whether a string of words or symbols conforms to a set of grammar rules.

20
New cards

sentential form

is a string of terminal and/or non-terminal symbols that appears in a derivation.

21
New cards

leftmost derivation

expands the leftmost non-terminal in each step.
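A leftmost derivation can be shown step by step; each intermediate string is a sentential form. The tiny grammar here is an illustrative assumption, not from the cards.

```python
# Illustrative grammar with one alternative per non-terminal:
#   S -> A B,  A -> a,  B -> b
GRAMMAR = {"S": ["A", "B"], "A": ["a"], "B": ["b"]}

def leftmost_derivation(start):
    form = [start]
    steps = [" ".join(form)]
    while any(sym in GRAMMAR for sym in form):
        # Expand the LEFTMOST non-terminal in the sentential form.
        i = next(i for i, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]]
        steps.append(" ".join(form))
    return steps

print(" => ".join(leftmost_derivation("S")))  # S => A B => a B => a b
```

A rightmost derivation of the same string would expand B first, giving S => A B => A b => a b.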

22
New cards

LL Parsers

are a family of top-down parsers.

23
New cards

1. Left-to-right scan of the input.

2. Leftmost derivation.

The two L's stand for:

24
New cards

rightmost derivation

expands the rightmost non-terminal in each step.

25
New cards

Bottom-up parsing

reverses top-down parsing: it traces a rightmost derivation in reverse, reducing handles back to non-terminals.

26
New cards

RECURSIVE-DESCENT PARSING

consists of a collection of subprograms, many of which are recursive, and produces a parse tree in top-down order.
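A minimal sketch of recursive-descent parsing: one subprogram per non-terminal, each calling the others recursively. The grammar is an illustrative assumption (single-digit arithmetic), and this version evaluates as it parses rather than building an explicit tree.

```python
# Grammar assumed for this sketch, in EBNF:
#   expr   -> term { "+" term }
#   term   -> factor { "*" factor }
#   factor -> DIGIT | "(" expr ")"
class Parser:
    def __init__(self, text):
        self.text = text.replace(" ", "")
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else None

    def eat(self, ch):
        if self.peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def expr(self):                      # expr -> term { "+" term }
        value = self.term()
        while self.peek() == "+":
            self.eat("+")
            value += self.term()
        return value

    def term(self):                      # term -> factor { "*" factor }
        value = self.factor()
        while self.peek() == "*":
            self.eat("*")
            value *= self.factor()
        return value

    def factor(self):                    # factor -> DIGIT | "(" expr ")"
        if self.peek() == "(":
            self.eat("(")
            value = self.expr()
            self.eat(")")
            return value
        ch = self.peek()
        if ch is None or not ch.isdigit():
            raise SyntaxError(f"expected digit at position {self.pos}")
        self.pos += 1
        return int(ch)

print(Parser("2+3*4").expr())  # 14
```

Because `term` is called from inside `expr`, multiplication binds tighter than addition without any explicit precedence table.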

27
New cards

EBNF

is a set of simple extensions to BNF that make expressing grammars more convenient.

28
New cards

PAIRWISE DISJOINTNESS TEST

requires that the FIRST sets of the alternatives for each non-terminal are pairwise disjoint, so a top-down parser can choose the correct rule by looking only at the next token.
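A sketch of the test, under the simplifying assumption that the grammar has no epsilon-rules (so FIRST of an alternative is just FIRST of its first symbol). The example grammar is illustrative.

```python
# Illustrative grammar: non-terminal -> list of alternatives.
GRAMMAR = {
    "A": [["a", "B"], ["b", "A", "b"]],  # FIRST sets {a} and {b}: disjoint
    "B": [["a", "B"], ["a", "b"]],       # FIRST sets {a} and {a}: overlap
}

def first(symbol, grammar):
    if symbol not in grammar:            # terminal: FIRST is the symbol itself
        return {symbol}
    result = set()
    for alternative in grammar[symbol]:  # no epsilon-rules assumed
        result |= first(alternative[0], grammar)
    return result

def passes_test(nonterminal, grammar):
    firsts = [first(alt[0], grammar) for alt in grammar[nonterminal]]
    return all(f1.isdisjoint(f2)
               for i, f1 in enumerate(firsts)
               for f2 in firsts[i + 1:])

print(passes_test("A", GRAMMAR), passes_test("B", GRAMMAR))  # True False
```

B fails because both alternatives start with `a`; this is exactly the situation left factoring repairs.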

29
New cards

Left factoring

is a technique that transforms a grammar whose alternatives share a common prefix (and so fail the pairwise disjointness test) into an equivalent grammar that passes it.

30
New cards

BOTTOM-UP PARSING

is a technique used in syntax analysis of programming languages, where the parser starts with the input and works backward to reconstruct the parse tree

31
New cards

Rightmost Derivation

means we always expand (replace) the rightmost non-terminal at each step.

32
New cards

Handle

is the specific part of the input that should be reduced next in bottom-up parsing.

33
New cards

Phrase

is any valid sequence of symbols in the input that follows the grammar rules.

35
New cards

Shift

The next input token is moved onto a stack.

36
New cards

Reduce

The top elements of the stack are replaced using a grammar rule.

37
New cards

Accept

Parsing has completed successfully; the input is syntactically valid.

38
New cards

Error

A syntax error has been discovered in the input.
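The four actions above can be seen in one trace. This is a deliberately naive shift-reduce sketch for an illustrative toy grammar: it reduces greedily whenever the top of the stack matches a right-hand side, with no lookahead, which works only for very simple grammars.

```python
# Toy grammar (illustrative):  E -> E + E | id
RULES = [("E", ["E", "+", "E"]), ("E", ["id"])]

def parse(tokens):
    stack, trace = [], list()
    tokens = list(tokens)
    while True:
        reduced = True
        while reduced:                           # Reduce while a handle is on top
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                    reduced = True
        if not tokens:
            break
        stack.append(tokens.pop(0))              # Shift the next input token
        trace.append(f"shift {stack[-1]}")
    trace.append("accept" if stack == ["E"] else "error")
    return trace

for step in parse(["id", "+", "id"]):
    print(step)
```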

39
New cards

PDA(pushdown automaton)

a theoretical computing model used to recognize context-free languages.

40
New cards

canonical LR

The most well-known bottom-up parsing algorithm is the ____________, introduced by Donald Knuth.

41
New cards

GOTO Table

Determines the next state after a reduction.

42
New cards

ACTION Table

Defines whether to shift, reduce, accept, or report an error based on the current state and input symbol.
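The two tables can be sketched as dictionaries driving a standard LR loop. The grammar and the hand-built table entries below are assumptions for illustration (an SLR table for a three-rule grammar), not taken from the cards.

```python
# Toy grammar (illustrative):
#   0: S' -> S     1: S -> ( S )     2: S -> x
RULES = {1: ("S", 3), 2: ("S", 1)}   # rule number -> (LHS, RHS length)

# ACTION: (state, terminal) -> shift/reduce/accept; missing entry = error.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
# GOTO: (state, non-terminal) -> next state after a reduction.
GOTO = {(0, "S"): 1, (2, "S"): 4}

def lr_parse(tokens):
    stack = [0]                                  # stack of states
    tokens = list(tokens) + ["$"]                # end-of-input marker
    while True:
        action = ACTION.get((stack[-1], tokens[0]))
        if action is None:
            return False                         # error entry in ACTION
        kind, arg = action
        if kind == "shift":
            tokens.pop(0)
            stack.append(arg)
        elif kind == "reduce":
            lhs, size = RULES[arg]
            del stack[-size:]                    # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)]) # GOTO picks the next state
        else:
            return True                          # accept

print(lr_parse(["(", "x", ")"]), lr_parse(["(", ")"]))  # True False
```

ACTION consults the current terminal; GOTO is consulted only after a reduce, once the reduced non-terminal is known.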