1/121
Syntax and Semantics - Number, Bindings and Scope - Data Types
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
Syntax
The form and structure of program code
Semantics
The meaning of program code (expressions, statements, control structures)
Sentence
A string of characters over some alphabet
Language
A set of sentences
Lexeme
The lowest level syntactic unit of a language
Token
A category of lexemes like identifiers
Generators (Grammars)
Devices that generate sentences of a language
Recognizers (Parsers)
Devices that read input strings and decide if they belong to a language
Context-Free Grammars (CFG)
Developed by Noam Chomsky in the 50s to describe syntax
A context-free grammar G is defined by the 4-tuple G = (V, Σ, R, S)
Backus-Naur Form (BNF)
Invented by John Backus in 1959 to describe Algol 58
Nonterminal Symbols
Abstractions representing classes of syntactic structures (often in angle brackets)
Terminal Symbols
Lexemes or tokens in language
V
Finite set of non-terminal characters or variable
Σ
Finite set of terminals (alphabet of the language)
R
Finite relation from V to (V ∪ Σ)* (the rewrite rules or productions)
S
Start variable/symbol (element of V)
Left-Hand Side (LHS)
The nonterminal being defined
Right-Hand Side (RHS)
String of terminals and/or nonterminals
Alternation ( | )
Indicates multiple possible right-hand sides for a nonterminal
Derivation
Repeated application of rules, starting with start symbol and ending with a sentence
Sentential form
Any string of symbols in a derivation
Sentence
A sentential form with only terminal symbols
Leftmost Derivation
Expands the leftmost nonterminal in each step
Rightmost Derivation
Expands the rightmost nonterminal in each step
Parse tree
Hierarchical representation of a derivationA
Ambiguous Grammar
A grammar that generates at least one sentential form with tow or more distinct Parse Trees
Operator Precedence
Can be expressed by proper structuring of grammar rules
Operator associativity
Can be indicated by grammar structure
Optional Parts (EBNF)
Placed in brackets [ ]
Alternative Parts (EBNF)
Placed in parentheses and separated by vertical barsR
Repetition (0 or more) (EBNF)
Placed in curly braces { }
Static Semantics
Deals with context-sensitive aspects of language that CFGs cannot describe
Examples are type checking, variable declaration before use
Attribute Grammar
A CFG with additions to carry semantic information on parse tree nodes
Attributes
Values associated with grammar symbols
Synthesized Attributes
Defined by the functions of the form S(X₀) = f(A(X₁), ..., A(Xₙ))
Inherited Attributes
Passed information down the parse tree
Intrinsic Attributes
Initial attributes on the leaves
Variable
An abstraction of a memory cell, characterized by six attributes: name, address, type, value, lifetime, and scope
Name Characteristics
Case sensitivity, length restrictions, special characters, reserved words vs keywords
Reserved word
A special word that cannot be used as a user-defined name (e.g., COBOL has 300 reserved words)
Keyword
A word that is special only in certain contexts (e.g., “real” in Fortran)
Aliases
When two variable names can be used to access the same memory location (created via pointers, reference variables, unions)
Name
The identifier used to reference the variable
Address (l-value)
The memory address with which the variable is associated
Type
Determines the range of values and set of operations defined for the variable
Value (r-value)
The contents of the location with which the variable is associated
Lifetime
The time during which a variable is bound to a particular memory cell
Scope
The range of statements over which the variable is visible
Binding
An association between an entity and an attribute
Binding time
The time at which a binding takes place
Static Binding
Occurs before runtime and remains unchanged throughout program execution
Dynamic Binding
Occurs during execution or can change during execution
Language Design Time
Binding operator symbols to operation
Language Implementation Time
Binding float type to a representation
Compile Time
Binding a variable to a type in C or Java
Runtime
Binding a non-static local variable to a memory cell
Explicit declaration
A program statement used for declaring types of variables
Implicit declaration
A default mechanism for specifying types through conventions
Type Inferencing
Determining types of variables from context (C#, Visual BASIC 9.0, ML, Haskell, F#, GO)
Static
Bound to a memory cells before execution begins and remains bound to the same memory cell
Stack-dynamic
Storage bindings created when declaration statements are elaborated
Explicit heap-dynamic
Allocated and deallocated by explicit directives during execution
Implicit heap-dynamic
Allocation and deallocation caused by assignment statements
Local Variables
Variables declared in a program unit
Nonlocal Variables
Variables visible in a unit but not declared there
Global Variables
Special category of nonlocal variables
Scope Rules
Determine how references to names are associated with variables
Static Ancestors
Enclosing static scopes to a specific scope
Static Parent
The nearest static ancestor
Blocks
Method of creating static scopes inside program units (from ALGOL 60)
Declaration Order
When and where variables must be declared (varies by language)
Global Scope
Variables declared outside function definitions (C, C++, PHP, Python)
Dynamic Scope
Based on calling sequences of program units (temporal vs spatial)
References connected by searching through the chain of subprogram calls
Data Type
A collection of data objects and a set of predefined operations on those objects
Primitive Data Type
Data types not defined in terms of other data types; often reflections of Hardware
Primitive Data Types
Integer, Floating Point, Complex, Decimal, Boolean, Charact
Integer
Represents numbers without fractions
Many languages provide multiple int types of different sizes
Floating Point
This primitive data type models real numbers
IEEE Floating-Point Standard 754
The standard for floating Points
Complex
This primitive data type consists of two floating point values: real and imaginary part
Supported by Fortran and Python
Decimal
This primitive data type stores numbers using decimal digits (4 bits per digit)
Its advantage is accuracy for financial calculations
Its disadvantage is that it wastes memory and has limited range
Boolean
Simplest primitive data type with values true and false
Often implemented as bytes than bits
Character
This primitive data type is stored as numeric codings
Common encodings are ASCII (8-bit) and Unicode (16-bit)
Unicode
This character encoding includes all characters from most natural languages
Character String Types
Values are sequences of characters
Character String Types Typical Operations
Assignment and Copying
Comparison
Catenation (Concatenation ??)
Substring Reference
Pattern Matching
Ordinal Types
Types where possible values can be associated with positive integers
Primitive Ordinal Type Examples
Integer, character, boolean types
User-Defined Ordinal Types
Enumeration Types
Subrange Types
Enumeration Types (Ordinal Type)
This ordinal type’s possible values are named constants
Subrange Types (Ordinal Type)
This ordinal type is an ordered contiguous subsequence
Array Type
Aggregate of homogeneous data elements identified by position
Array Indexing
Mapping from indices to elements
Integer Only Index Types
Java, C++, Fortran has one type of Index Type
Ordinal Types in General
Ada has one type of Index Type
Default Range Checking
Range checking of Java, C#
No Range Checking
Range checking of C, Fortran
On Demand Range Checking
Range checking of C++, Ada
Location Options
Static, stack, and heap allocations
Size Management
Static array sizes (known during compile time)
Dynamic management during runtime