Lexical analysis
It involves reading the source code character by character, from left to right, and organizing the characters into tokens.
Tokens
Lexical analysis aims to read the input code and break it into meaningful elements called _________ for a computer to understand easily.
comments and whitespace
Lexical analysis eliminates ___________ and ____________ within the source code.
Lexical analyzer
This collects characters into logical groupings and assigns internal codes to the groupings based on their structure. These logical groupings are called lexemes, and the internal codes for categories of these groupings are the tokens.
Lexemes and Tokens
A lexical analyzer collects characters into logical groupings and assigns internal codes to the groupings based on their structure. These logical groupings are called _______, and the internal codes for categories of these groupings are the _________.
regular expressions
In programming languages, tokens can be described using __________ ____________.
Deterministic Finite Automaton (DFA)
A lexical analyzer uses a __________ _____________ ____________ to recognize these tokens, as they can identify regular languages.
token type
Each final state of the DFA corresponds to a specific ________ ______, allowing the analyzer to classify the input.
Automated
The process of creating a DFA from regular expressions can be ____________ to make handling token recognition easier.
Deterministic Finite Automaton (DFA)
They can identify regular languages.
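As a rough illustration of the DFA cards above, the sketch below hand-codes a tiny DFA whose single final state maps to the token type IDENT. The state names, the char_class helper, and the identifier rule (a letter followed by letters, digits, or underscores) are assumptions made for this example.

```python
# Hypothetical DFA: one start state, one final state, and an implicit reject state.
START, IN_IDENT, REJECT = "start", "in_ident", "reject"

# Each final state corresponds to a token type.
ACCEPTING = {IN_IDENT: "IDENT"}

TRANSITIONS = {
    (START, "letter"): IN_IDENT,
    (IN_IDENT, "letter"): IN_IDENT,
    (IN_IDENT, "digit"): IN_IDENT,
    (IN_IDENT, "underscore"): IN_IDENT,
}


def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch == "_":
        return "underscore"
    return "other"


def classify(text):
    """Run the DFA over text; return the token type, or None if rejected."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)), REJECT)
        if state == REJECT:
            return None
    return ACCEPTING.get(state)
```

In practice, a table like TRANSITIONS is usually generated from regular expressions by a tool rather than written by hand.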
Input Preprocessing
Involves cleaning up the input text and preparing it for lexical analysis.
comments, whitespaces
Input Preprocessing covers the removal of ___________, ____________, and other non-essential characters from the input text.
Tokenization
Involves the process of breaking the input text into a sequence of tokens.
regular expressions
Tokenization is done by matching the characters in the input text against a set of patterns or ___________ ___________ that define the different types of tokens.
Token Classification
The analyzer determines the type of each token. For instance, the analyzer might classify the keywords, identifiers, operators, and punctuation symbols as separate token types.
Token Validation
The analyzer checks if each token is valid based on the rules of the programming language. For instance, the analyzer might check that a variable name is a valid identifier or that an operator has the correct syntax.
Output Generation
The analyzer generates the output of the lexical analysis process, typically a list or sequence of tokens (token stream).
Syntax analysis
In Output Generation, the list of tokens is passed to the next stage of compilation or interpretation: it is sent to the parser for _________ __________.
Token stream
This is the sequence of tokens generated by the lexical analyzer.
Input Preprocessing, Tokenization, Token Classification, Token Validation, Output Generation
Lexical Analyzer process; HINT: IP T TC TV OG
Tokens
It can be individual words or symbols in a sentence, such as keywords, variable names, numbers, and punctuation.
Alphabets
The digits 0-9 together with the letters A-F are considered a set of hexadecimal alphabets.
hexadecimal alphabets
Alphabets: The digits 0-9 together with the letters A-F are considered a set of __________ ____________.
Strings
A finite collection of alphabet symbols occurring consecutively.
String length
It is defined by the number of characters or alphabets occurring together. For example, the length of the string STIisthebest is |STIisthebest| = 12.
Alphabets, Symbols, Non-token
Tokens can be specified in different sets: HINT: A S NT
Punctuation
Name: ___________ Symbols: Comma (,), Semicolon (;)
Assignment
Name: ___________ Symbols: =
Special Assignment
Name: ___________ Symbols: +=, -=, *=, /=
Comparison
Name: ___________ Symbols: ==, !=
Preprocessor
Name: ___________ Symbols: #
Location Specifier
Name: ___________ Symbols: &
Logical
Name: ___________ Symbols: &&, |, ||, !
Shift Operator
Name: ___________ Symbols: >>, <<
Lexemes
These are the sequence of characters matched by a pattern to form the token or a sequence of input characters that comprises a single token.
string patterns
Lexemes are recognized by matching the input character string against character __________ _________, while tokens are represented as integer values.
Integer values
Tokens are represented as __________ ________.
IDENT
Example: result = oldsum - value / 50;
Lexemes: result Token: ________
ASSIGN_OP
Example: result = oldsum - value / 50;
Lexemes: = Token: ________
IDENT
Example: result = oldsum - value / 50;
Lexemes: oldsum Token: _________
SUB_OP
Example: result = oldsum - value / 50;
Lexemes: - Token: ________
IDENT
Example: result = oldsum - value / 50;
Lexemes: value Token: _______
DIV_OP
Example: result = oldsum - value / 50;
Lexemes: / Token: _________
INT_LIT
Example: result = oldsum - value / 50;
Lexemes: 50 Token: ________
SEMICOLON
Example: result = oldsum - value / 50;
Lexemes: ; Token: ________
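The lexeme/token pairs in the example cards above can be reproduced with a small regular-expression tokenizer. This is a minimal sketch, not a full lexer: the token names follow the cards (with the assignment token spelled ASSIGN_OP), while the patterns, the SKIP and MISMATCH entries, and the tokenize function are assumptions made for illustration.

```python
import re

# (token name, pattern) pairs; earlier entries win when several could match.
TOKEN_SPEC = [
    ("INT_LIT", r"\d+"),
    ("IDENT", r"[A-Za-z][A-Za-z0-9_]*"),
    ("ASSIGN_OP", r"="),
    ("SUB_OP", r"-"),
    ("DIV_OP", r"/"),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),    # whitespace is discarded, not turned into a token
    ("MISMATCH", r"."),  # any other character is reported as an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return the token stream as a list of (lexeme, token) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "MISMATCH":
            raise ValueError(f"unexpected character {match.group()!r}")
        tokens.append((match.group(), kind))
    return tokens


stream = tokenize("result = oldsum - value / 50;")
```

Here stream holds the eight pairs listed on the cards, in source order, starting with ("result", "IDENT") and ending with (";", "SEMICOLON").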
single token
Everything inside double quotes ("") in a print() statement is counted as a _______ ________.
Token Stream
This output is generated by lexical analysis and is sent to the syntax analyzer for syntax analysis.
Syntax analysis or parsing
It is the process of analyzing a string of symbols according to the rules of a formal grammar.
Syntax
Syntax analysis or parsing checks the source code to ensure that it follows the correct ________ of the programming language in which it is written.
Syntax errors
These errors are identified and flagged in the syntax analysis phase and must be corrected before the program can be successfully compiled.
Syntax analysis or parsing
Syntax errors are identified and flagged in this phase and must be corrected before the program can be successfully compiled.
Syntax analysis or parsing
It is the phase after the lexical analysis in the compiling process.
Syntax analyzer or parser
This takes the token streams from a lexical analyzer and analyzes them against production rules to detect errors in the code.
production rules
A syntax analyzer or parser takes the token streams from a lexical analyzer and analyzes them against ________ ______ to detect errors in the code.
Parse tree or Abstract Syntax Tree (AST)
It is the output of the syntax analysis phase, representing the program’s structure.
Parse tree or Abstract Syntax Tree (AST)
It represents the program’s structure.
Parenthesis
A lexical analyzer can identify tokens using regular expressions and pattern rules. Still, it cannot check the syntax of a given sentence since regular expressions cannot check balancing tokens such as __________.
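To see why balancing needs more than pattern matching, note that the check below has to remember how many parentheses are currently open, and that count can grow without bound. This is a minimal sketch; the function name and the use of a simple counter (standing in for the stack a parser would keep) are assumptions for illustration.

```python
def parentheses_balanced(tokens):
    """Return True if every "(" in tokens is closed by a later ")"."""
    depth = 0  # number of currently open parentheses
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared with nothing left to close
                return False
    return depth == 0


ok = parentheses_balanced(list("(a+(b*c))"))
bad = parentheses_balanced(list("(a+b))("))
```

Here ok is True and bad is False, even though both inputs contain only valid tokens.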
Context-Free Grammar (CFG)
Syntax analysis uses __________ ____ ___________ to define the syntax rules of a programming language.
production rules
Context-Free Grammar (CFG) includes __________ _________ that describe how valid strings (token streams) are formed.
Grammar
CFGs also specify the ___________ of a language to ensure that the source code adheres to the language’s syntax.
Parsing
The tokens are analyzed based on the grammar rules of the programming language. A parse tree or AST is constructed to represent the hierarchical structure of the program.
Error Handling
If the input program contains syntax errors, the syntax analyzer detects and flags them to the user, indicating where the error occurred.
Symbol Table Creation
The syntax analyzer creates this, a data structure that stores information about the identifiers used in the program, such as type, scope, and location.
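A symbol table can be pictured as a mapping from identifiers to their recorded attributes. The sketch below is one possible layout, not the structure of any particular compiler: the class names, the (scope, name) key, and the fall-back to a scope called "global" are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class SymbolInfo:
    data_type: str  # e.g. "int"
    scope: str      # e.g. "global" or a function name
    location: int   # source line of the declaration (hypothetical choice)


class SymbolTable:
    def __init__(self):
        self._entries = {}

    def declare(self, name, info):
        """Record an identifier; reject a second declaration in the same scope."""
        key = (info.scope, name)
        if key in self._entries:
            raise KeyError(f"{name!r} already declared in scope {info.scope!r}")
        self._entries[key] = info

    def lookup(self, name, scope):
        """Look in the given scope first, then fall back to the global scope."""
        info = self._entries.get((scope, name))
        if info is None:
            info = self._entries.get(("global", name))
        return info


table = SymbolTable()
table.declare("total", SymbolInfo("int", "global", 1))
table.declare("i", SymbolInfo("int", "main", 4))
```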
Parsing, Error Handling, Symbol Table Creation
The parser accomplishes the following steps: HINT: P EH STC
Derivation
It is the process of applying the rules of Context-Free Grammar to generate a sequence of tokens to form a valid structure.
Derivation
Simply put, it is the sequence of production rules applied to obtain the input string for the parser.
Non-terminal
There are two (2) decisions for some sentential form of input during parsing: Deciding on the __________ to be replaced
production rule
There are two (2) decisions for some sentential form of input during parsing: Deciding the ___________ _________ by which the non-terminal will be replaced
Left-most derivation
It is called ________ ________ _____________ if the sentential form of an input is scanned and replaced from left to right.
Left-sentential form
The left-most derived sentential form is called the _____ __________ ____.
Right-most derivation
It is called the ________ ____________ _________ if the sentential form of an input is scanned and replaced with production rules from right to left.
Right-sentential form
The right-most derived sentential form is called the _______ ___________ _______.
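The two derivation orders can be compared on a small example. The sketch below assumes a hypothetical grammar E -> E + E | E * E | id and derives the string id + id * id both ways; the derive helper and the production names are made up for illustration.

```python
NONTERMINALS = {"E"}

# Right-hand sides of the assumed grammar: E -> E + E | E * E | id
E_PLUS_E = ["E", "+", "E"]
E_TIMES_E = ["E", "*", "E"]
E_ID = ["id"]


def derive(start, productions, leftmost=True):
    """Apply each production to the left-most (or right-most) non-terminal.

    Returns every sentential form produced along the way.
    """
    form = [start]
    steps = [" ".join(form)]
    for body in productions:
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions:
            raise ValueError("no non-terminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = body  # replace the chosen non-terminal with the rule body
        steps.append(" ".join(form))
    return steps


left = derive("E", [E_PLUS_E, E_ID, E_TIMES_E, E_ID, E_ID], leftmost=True)
right = derive("E", [E_PLUS_E, E_TIMES_E, E_ID, E_ID, E_ID], leftmost=False)
```

Both lists end in id + id * id, but the intermediate forms differ: left passes through "id + E" (a left-sentential form), while right passes through "E + E * E" (a right-sentential form).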
Parse Tree
It is the graphical representation of a derivation. It is convenient to see how strings are derived from the start symbol, which becomes the root of the parse tree.
Terminals
In a parse tree, all leaf nodes are ____________.
Non-terminals
In a parse tree, all interior nodes are _____________.
In-order traversal
This traversal gives the original input string in a parse tree. A parse tree represents associativity and precedence of operators.
Deepest sub-tree
This is traversed first, allowing the operator in that sub-tree to get precedence over the operator in the parent nodes.
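The parse-tree cards above can be checked on a small tree. The sketch below assumes the hypothetical grammar E -> E + E | E * E | id and builds one parse tree for id + id * id as nested (label, children) tuples; the helper names are made up for this example.

```python
def E(*children):
    """Interior node: a non-terminal with its children."""
    return ("E", list(children))


def leaf(symbol):
    """Leaf node: a terminal."""
    return (symbol, [])


def leaves(node):
    """Collect the leaves from left to right, which spells the input string."""
    label, children = node
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def depth_of(node, symbol, depth=0):
    """Return the depth of the first node labelled symbol, or None if absent."""
    label, children = node
    if label == symbol:
        return depth
    for child in children:
        found = depth_of(child, symbol, depth + 1)
        if found is not None:
            return found
    return None


# The * sub-tree sits below the + node, so * is applied before +.
tree = E(E(leaf("id")), leaf("+"), E(E(leaf("id")), leaf("*"), E(leaf("id"))))
sentence = " ".join(leaves(tree))
```

Here sentence is "id + id * id", and depth_of(tree, "*") is greater than depth_of(tree, "+"), which is how the tree encodes the higher precedence of *.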
Ambiguity
A grammar is ___________ if it has more than one parse tree (that is, more than one left-most or right-most derivation) for at least one (1) string.
Associativity and Precedence
No method can detect and remove ambiguity automatically. Still, it can be removed by either re-writing the whole grammar without ambiguity or by setting and following _____________ and __________________ constraints. These methods decrease the chances of ambiguity in a language or its grammar.
Associativity
When an operand has operators on both sides, the side on which the operator takes this operand is decided by the associativity of those operators. The operand will be taken by the left operator if the operation is left-associative, and the right operator will take the operand if the operation is right-associative.
Left-associative
Its operations include addition, multiplication, subtraction, and division. For example: id op id op id will be evaluated as (id op id) op id. Simply, 2 + 3 + 4 will be evaluated as (2 + 3) + 4.
Right-associative
Its operations, such as exponentiation, group from the right. For example: id op id op id will be evaluated as id op (id op id). Simply, 2 ^ 3 ^ 4 will be evaluated as 2 ^ (3 ^ 4).
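Both kinds of associativity can be observed directly in Python. In this short sketch, subtraction stands in for the left-associative operators and ** for exponentiation (in Python, ^ is bitwise XOR, not power); the particular numbers are chosen only to make the two groupings give different results.

```python
# Left-associative: 10 - 4 - 3 groups as (10 - 4) - 3.
left = 10 - 4 - 3
left_grouping = (10 - 4) - 3    # 3
other_grouping = 10 - (4 - 3)   # 9, a different result

# Right-associative: 2 ** 3 ** 2 groups as 2 ** (3 ** 2).
right = 2 ** 3 ** 2
right_grouping = 2 ** (3 ** 2)  # 512
wrong_grouping = (2 ** 3) ** 2  # 64, a different result

assert left == left_grouping
assert right == right_grouping
```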
Precedence
When two (2) different operators share a common operand, the precedence of operators decides which will take the operand.
Hierarchy of priorities
In Python, some operators are performed before others. It is called the ____________ __ ______________.
1
Priority: _____ Operator: ** or Exponentiation Operator
2
Priority: _____ Operator: Unary + and -. A unary operator immediately to the right of the power operator binds more strongly; for example, 4 ** -1 equals 0.25.
3
Priority: _____ Operator: *, /, //, %
4
Priority: _____ Operator: Binary + (Addition) and - (Subtraction)
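The four priority levels above can be confirmed with a few one-line expressions. This is only a quick check in Python; the variable names are made up for the example.

```python
power_before_unary = -2 ** 2    # parsed as -(2 ** 2), so -4
unary_right_of_power = 4 ** -1  # the unary minus binds first here, so 0.25
times_before_plus = 2 + 3 * 4   # parsed as 2 + (3 * 4), so 14
same_level = 7 // 2 * 3         # same priority, left to right: (7 // 2) * 3 = 9
```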
Name
It is a string of characters used to identify some entity in a program.
Letters, digits, and underscore (_)
Names in most programming languages have the same form: a letter followed by a string consisting of ________, ________, and ______________ characters.
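The name form described above translates directly into a regular expression. The sketch below checks only that general form (a letter, then letters, digits, or underscores); the pattern and the is_valid_name function are assumptions for illustration, and individual languages add their own rules, such as allowing a leading underscore or reserving keywords.

```python
import re

# A letter followed by any number of letters, digits, or underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(text):
    """Return True if the whole of text has the general name form."""
    return NAME_PATTERN.fullmatch(text) is not None


checks = {name: is_valid_name(name) for name in ["myFirstCode", "old_sum2", "2fast", ""]}
```

Here checks maps "myFirstCode" and "old_sum2" to True, and "2fast" and the empty string to False.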
Identifier
The term “__________” can be used interchangeably with “name”.
Underscore
The use of ____________ characters to form names was common in the 1970s and 1980s but is now far less popular.
Camel notation or Camel case
In the C-based languages, the use of underscore has been replaced by __________ __________ or _________ __________, in which all of the words of a multiple-word name except the first are capitalized, as in myFirstCode. It is called “camel” as words written as names often have embedded uppercase letters, making it look like a camel’s hump.
Java and C#
Names in these programming languages have no length limit, and all characters are significant.
C++
This programming language does not specify a length limit for names, but implementors sometimes do.
PHP
All variable names in this programming language must begin with a dollar ($) sign.
dollar ($)
All variable names in PHP must begin with a ________ sign.
Perl
The special character at the beginning of a variable’s name in this programming language, such as $, @, or %, specifies its type.
$, @, or %
In Perl, these special characters at the beginning of a variable’s name specify its type.
Ruby
The special characters at the beginning of a variable’s name in this programming language, such as @ or @@, specify that the variable is an instance or a class variable, respectively.