Lecture 5

Expression Trees

Expressions as Lists

  • Previously, arithmetic expressions were implemented as lists.
  • Example: 12 * a3 + (27 - x)
  • Represented as the token list: 12 (Num), * (Sym), a3 (Id), + (Sym), ( (Sym), 27 (Num), - (Sym), x (Id), ) (Sym)
  • This list representation does not reflect the hierarchical structure of the expression.
  • It can become inconvenient, especially when brackets are involved.

Expressions as Trees

  • Expressions can also be represented as trees, which provide a more appropriate hierarchical structure.
  • Example: 12 * a3 + (27 - x)
  • Represented as a tree with operator nodes Add (the root), Mult, and Sub: Add(Mult(12, a3), Sub(27, x))
  • The tree structure encodes the grouping itself, so parentheses no longer need to be stored explicitly.

Expression Grammar

  • A redefinition of expressions using a grammar:
    • <expression> ::= <term> { ‘+’ <term> | ‘-’ <term> }
    • <term> ::= <factor> { ‘*’ <factor> | ‘/’ <factor> }
    • <factor> ::= <number> | <identifier> | ‘(’ <expression> ‘)’
  • The question is posed: Can we eliminate the need for parentheses in our expression definition?
  • The suggestion is to move each operator in front of its two operands. The grouping then follows from the position of the operators alone, so neither parentheses nor precedence rules are needed.

Prefix Expressions

  • Redefinition of expressions using prefix notation:

    • <prefixExp> ::= <number> | <identifier> | ‘+’ <prefixExp> <prefixExp> | ‘-’ <prefixExp> <prefixExp> | ‘*’ <prefixExp> <prefixExp> | ‘/’ <prefixExp> <prefixExp>
  • Prefix expressions move the operator to the front of the expression.

  • Each operator is followed by exactly two operands.

  • This notation is also known as Łukasiewicz or Polish notation.

  • Examples:

    • + 3 * t 7 is equivalent to (3 + (t * 7))
    • * + 3 t 7 is equivalent to ((3 + t) * 7)
  • More complex example and step-by-step evaluation:

    • / + * 4 9 / 6 2 + 7 * 2 - 8 5 ?
    • / + 36 3 + 7 * 2 3
    • / + 36 3 + 7 6
    • / 39 + 7 6
    • / 39 13
    • 3
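
The step-by-step reduction above can be reproduced with a short recursive evaluator. This is a minimal sketch and not the lecture's code: the function name eval_prefix and the assumption that tokens are separated by whitespace are made up for illustration, and only numbers (no identifiers) are supported.

```python
def eval_prefix(tokens: list[str], pos: int = 0) -> tuple[float, int]:
    """Evaluate the prefix expression that starts at tokens[pos].

    Returns the value and the position just after the expression.
    """
    token = tokens[pos]
    if token in ("+", "-", "*", "/"):
        # An operator is followed by exactly two operand expressions.
        left, pos = eval_prefix(tokens, pos + 1)
        right, pos = eval_prefix(tokens, pos)
        if token == "+":
            return left + right, pos
        if token == "-":
            return left - right, pos
        if token == "*":
            return left * right, pos
        return left / right, pos
    # Anything else is assumed to be a number.
    return float(token), pos + 1


value, _ = eval_prefix("/ + * 4 9 / 6 2 + 7 * 2 - 8 5".split())
print(value)  # 3.0
```

The recursion mirrors the grammar rule for <prefixExp>: the innermost operators are reduced first, and the leading operator is applied last.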

Implementation of Expression Trees

  • The aim is to build a tree from a prefix expression.

  • This involves:

    • Starting from an existing list expression.
    • Reading in prefix expressions.
    • Repurposing the token list scanner from week 2.
    • Constructing a token tree.
    • Evaluating numerical expression trees.
  • Existing list expression code includes:

    • TokenType enum with NUMBER, IDENTIFIER, and SYMBOL.
    • TokenList class with value, type, and next attributes.
  • Originally, a token list was constructed based on the list's structure.

  • New tree expression code includes:

    • TokenType enum (imported from scanner.py) with NUMBER, IDENTIFIER, and SYMBOL.
    • TreeNode class (imported from tree.py) with item, left, and right attributes.
    • Token class with tokentype and value attributes.
  • Each TreeNode stores a Token as its item, so the tree keeps track of the token information.
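
The classes listed above might look roughly as follows. The class and attribute names are taken from the notes; everything else (the use of dataclasses, the enum values, the default of None for missing children) is an assumption, since the actual scanner.py and tree.py are not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum, auto
from typing import Any


class TokenType(Enum):
    NUMBER = auto()
    IDENTIFIER = auto()
    SYMBOL = auto()


@dataclass
class Token:
    tokentype: TokenType
    value: Any  # assumed: a number for NUMBER, a string otherwise


@dataclass
class TreeNode:
    item: Any
    left: TreeNode | None = None   # assumed default: no child
    right: TreeNode | None = None


# The tree for the prefix expression "+ 3 2", built by hand.
tree = TreeNode(
    Token(TokenType.SYMBOL, "+"),
    TreeNode(Token(TokenType.NUMBER, 3)),
    TreeNode(Token(TokenType.NUMBER, 2)),
)
print(tree.item.value, tree.left.item.value, tree.right.item.value)  # + 3 2
```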

From Prefix to Tree

  • Goal: Transform a string into a tree.

  • Only symbols +, -, *, / are accepted as operators.

  • Each symbol should be associated with exactly two children.

  • Code snippets:

    • is_operator(char: str) -> bool: checks if a character is an operator (+, -, *, /).
    • treenode(text: str, pos: int) -> tuple[TreeNode | None, int]: recursively constructs the expression tree.
      • It skips whitespace.
      • Matches numbers, identifiers, and symbols.
      • Creates a TreeNode with the appropriate Token.
      • If the token is an operator, it recursively calls treenode to create left and right children.
    • generate_expression_tree(text: str) -> TreeNode | None: initiates the tree generation process.
    • infix_expression_tree(tree: TreeNode) -> str: converts the expression tree back into infix notation.
  • Example:

    • Expression + 3 2 in infix notation becomes (3 + 2).
    • Expression +-*/2 3 4 5 6 in infix notation becomes ((((2 / 3) * 4) - 5) + 6).
    • Expression /* 4 2 2 in infix notation becomes ((4 * 2) / 2).
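
The functions described above could be sketched as follows. The function names and signatures follow the notes, but the bodies are a reconstruction under assumptions: the regular expressions for numbers and identifiers, the error handling (returning None for malformed input), and the local class definitions standing in for scanner.py and tree.py are all illustrative.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any


class TokenType(Enum):
    NUMBER = auto()
    IDENTIFIER = auto()
    SYMBOL = auto()


@dataclass
class Token:
    tokentype: TokenType
    value: Any


@dataclass
class TreeNode:
    item: Any
    left: TreeNode | None = None
    right: TreeNode | None = None


NUMBER_RE = re.compile(r"\d+(\.\d+)?")       # assumed number syntax
IDENTIFIER_RE = re.compile(r"[A-Za-z_]\w*")  # assumed identifier syntax


def is_operator(char: str) -> bool:
    return char in ("+", "-", "*", "/")


def treenode(text: str, pos: int) -> tuple[TreeNode | None, int]:
    # Skip whitespace.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        return None, pos
    if match := NUMBER_RE.match(text, pos):
        raw = match.group()
        value = float(raw) if "." in raw else int(raw)
        return TreeNode(Token(TokenType.NUMBER, value)), match.end()
    if match := IDENTIFIER_RE.match(text, pos):
        return TreeNode(Token(TokenType.IDENTIFIER, match.group())), match.end()
    if is_operator(text[pos]):
        # An operator gets exactly two children, parsed recursively.
        node = TreeNode(Token(TokenType.SYMBOL, text[pos]))
        node.left, pos = treenode(text, pos + 1)
        node.right, pos = treenode(text, pos)
        if node.left is None or node.right is None:
            return None, pos
        return node, pos
    return None, pos  # unexpected character


def generate_expression_tree(text: str) -> TreeNode | None:
    tree, pos = treenode(text, 0)
    # Reject leftover input after a complete expression.
    if text[pos:].strip():
        return None
    return tree


def infix_expression_tree(tree: TreeNode) -> str:
    token = tree.item
    if token.tokentype is TokenType.SYMBOL:
        left = infix_expression_tree(tree.left)
        right = infix_expression_tree(tree.right)
        return f"({left} {token.value} {right})"
    return str(token.value)


for text in ("+ 3 2", "+-*/2 3 4 5 6", "/* 4 2 2"):
    print(infix_expression_tree(generate_expression_tree(text)))
```

In this sketch every operator node contributes one pair of parentheses, so an expression with four operators produces four pairs.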

Evaluating Numerical Expression Trees

  • The aim is to evaluate the expression trees.

  • Only well-formed trees without identifiers are evaluated.

  • Code snippets:

    • is_numerical_expression_tree(tree: TreeNode) -> bool: checks if the expression tree is purely numerical (no identifiers).
    • evaluate_expression_tree(tree: TreeNode) -> float: evaluates the numerical expression tree.
      • If the node is a number, it returns the number's value.
      • If the node is an operator, it recursively evaluates the left and right operands and applies the corresponding operation.
  • Examples:

    • Expression + 3 2 evaluates to 5.
    • Expression +-*/2 3 4 5 6 evaluates to 3.6666666666666665.
    • Expression /* 4 2 2 evaluates to 4.0.
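
A possible shape for the two functions above is sketched below. The function names come from the notes; the OPERATIONS table, the helper constructors num and op, and the local class definitions are assumptions added so that the example runs on its own. The trees are built by hand instead of being parsed.

```python
from __future__ import annotations

import operator
from dataclasses import dataclass
from enum import Enum, auto
from typing import Any


class TokenType(Enum):
    NUMBER = auto()
    IDENTIFIER = auto()
    SYMBOL = auto()


@dataclass
class Token:
    tokentype: TokenType
    value: Any


@dataclass
class TreeNode:
    item: Any
    left: TreeNode | None = None
    right: TreeNode | None = None


OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def is_numerical_expression_tree(tree: TreeNode | None) -> bool:
    if tree is None:
        return False
    token = tree.item
    if token.tokentype is TokenType.NUMBER:
        return tree.left is None and tree.right is None
    if token.tokentype is TokenType.SYMBOL and token.value in OPERATIONS:
        return (is_numerical_expression_tree(tree.left)
                and is_numerical_expression_tree(tree.right))
    return False  # identifiers and unknown symbols


def evaluate_expression_tree(tree: TreeNode) -> float:
    token = tree.item
    if token.tokentype is TokenType.NUMBER:
        return token.value  # may be an int if the token stores an int
    left = evaluate_expression_tree(tree.left)
    right = evaluate_expression_tree(tree.right)
    return OPERATIONS[token.value](left, right)


# Hypothetical helpers to keep the hand-built trees readable.
def num(value: float) -> TreeNode:
    return TreeNode(Token(TokenType.NUMBER, value))


def op(symbol: str, left: TreeNode, right: TreeNode) -> TreeNode:
    return TreeNode(Token(TokenType.SYMBOL, symbol), left, right)


tree = op("/", op("*", num(4), num(2)), num(2))  # prefix: /* 4 2 2
if is_numerical_expression_tree(tree):
    print(evaluate_expression_tree(tree))  # 4.0
```

Calling is_numerical_expression_tree first keeps evaluate_expression_tree simple: it never has to deal with identifiers or missing children. Division by zero is not handled in this sketch and raises ZeroDivisionError.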

Graph Theory

What is a Graph?

  • Graphs are a generalization of trees.

  • A graph $G = (V, E)$ is a collection of nodes/vertices $V$ and edges $E$.

  • An edge $e = (v, w)$ connects two nodes.

  • Edge $e$ is incident to both $v$ and $w$.

  • Edges may be directed or undirected.

  • Nodes $v$ and $w$ joined by an edge are neighbors; they are said to be adjacent.

  • A path is a sequence $(v_0, v_1, \ldots, v_n)$ in which each pair $(v_i, v_{i+1})$ are neighbors.

  • A path is simple if every node in it is different.

  • A cycle is a path with $v_0 = v_n$.

  • A cycle is simple if it contains at least three nodes that are all (except the first and last) different.

  • A graph is connected if for every pair $v, w \in V$, there is a path from $v$ to $w$.

  • A graph is simple if:

    • It contains no loops $e = (v, v)$.
    • It contains no duplicate edges.
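
The definitions of "simple" and "connected" translate fairly directly into code. The sketch below assumes an undirected graph given as a set of nodes and a list of edges (pairs of nodes); the function names is_simple and is_connected are chosen here for illustration.

```python
def is_simple(edges: list[tuple[int, int]]) -> bool:
    """True if an undirected graph has no loops and no duplicate edges."""
    seen = set()
    for v, w in edges:
        if v == w:
            return False  # loop (v, v)
        key = frozenset((v, w))  # (v, w) and (w, v) are the same edge
        if key in seen:
            return False  # duplicate edge
        seen.add(key)
    return True


def is_connected(nodes: set[int], edges: list[tuple[int, int]]) -> bool:
    """True if every node can be reached from every other node."""
    if not nodes:
        return True
    neighbours = {v: set() for v in nodes}
    for v, w in edges:
        neighbours[v].add(w)
        neighbours[w].add(v)
    # Explore everything reachable from an arbitrary start node.
    start = next(iter(nodes))
    visited = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in neighbours[v]:
            if w not in visited:
                visited.add(w)
                stack.append(w)
    return visited == nodes


print(is_simple([(0, 1), (1, 2), (1, 0)]))          # False
print(is_connected({0, 1, 2, 3}, [(0, 1), (2, 3)]))  # False
```

In an undirected graph it is enough to search from one start node: if all nodes are reachable from it, a path exists between every pair.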

Directed and Weighted Graphs

  • Edges in a graph may be weighted.
  • The weight of an edge $(v, w)$ may differ from the weight of edge $(w, v)$.

Seven Bridges of Königsberg

  • Leonhard Euler (1707 - 1783) is regarded as the "inventor" of graph theory.

  • The Seven Bridges of Königsberg (1735) problem: There are seven bridges crossing the river Pregel in Königsberg.

  • The question: Is there a walk that crosses each bridge exactly once?

  • Euler invented graph theory to solve this problem.

  • Pieces of land become nodes, and bridges become edges.

  • An Euler path is a path in which every edge occurs exactly once.

  • Problem analysis:

    • Suppose an Euler path exists.
    • In any node we pass through, we arrive and leave the same number of times; such a node has an even number of edges coming together.
    • The path can have at most two nodes with an odd number of edges: the starting point and the endpoint.
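
The degree argument above can be checked for the Königsberg layout with a few lines of code. In this sketch the four pieces of land are given the arbitrary labels A to D (C is the island with five bridges), and odd_degree_nodes is a name chosen for illustration. The code only tests the necessary condition from the analysis: at most two nodes with an odd number of edges.

```python
from collections import Counter

# Each tuple is one bridge between two pieces of land.
bridges = [
    ("A", "C"), ("A", "C"),
    ("B", "C"), ("B", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def odd_degree_nodes(edges: list[tuple[str, str]]) -> list[str]:
    """Return the nodes at which an odd number of edges come together."""
    degree = Counter()
    for v, w in edges:
        degree[v] += 1
        degree[w] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)


odd = odd_degree_nodes(bridges)
print(odd)            # ['A', 'B', 'C', 'D']
print(len(odd) <= 2)  # False
```

All four nodes have an odd number of edges, which is more than the two allowed, so no walk crosses each of the seven bridges exactly once.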

Graphs and Trees

  • Trees are graphs.
  • An undirected tree is a connected simple graph without cycles.
  • An undirected tree with $n$ nodes has $n - 1$ edges.
  • Adding another edge will create a cycle.
  • Removing any edge will make the graph unconnected.
  • Any node in the undirected tree can act as its root.
  • A spanning tree of graph $G = (V, E)$ is a tree that contains all nodes in $V$ and no edges that are not in $E$.
  • A graph typically has multiple possible spanning trees.
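
One way to obtain a spanning tree is to explore the graph from some node and keep only the edges that lead to a node not seen before. The sketch below assumes a connected undirected graph with nodes numbered 0 to n - 1 (n at least 1); the function name spanning_tree and the depth-first strategy are choices made here, not something prescribed by the notes.

```python
def spanning_tree(n: int, edges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the edges of one spanning tree of a connected undirected graph."""
    neighbours = [[] for _ in range(n)]
    for v, w in edges:
        neighbours[v].append(w)
        neighbours[w].append(v)

    visited = [False] * n
    visited[0] = True  # node 0 acts as the root
    stack = [0]
    tree_edges = []
    while stack:
        v = stack.pop()
        for w in neighbours[v]:
            if not visited[w]:
                # First time w is reached: edge (v, w) joins the tree.
                visited[w] = True
                tree_edges.append((v, w))
                stack.append(w)
    return tree_edges


edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
tree = spanning_tree(4, edges)
print(len(tree))  # 3
```

The result has n - 1 edges, as expected for a tree with n nodes. Starting from a different node or visiting neighbours in a different order generally yields a different spanning tree.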

Graphs as Nested Lists

  • Graphs can be represented as an adjacency matrix, stored as a nested list.
  • Nodes are numbered 0, 1, …
  • edges[i][j] represents whether edge $(i, j)$ exists.
  • This requires $O(n^2)$ space, where $n$ is the number of nodes.
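
A small sketch of the matrix representation follows. The name edges matches the notes; the helper adjacency_matrix and the choice to store booleans for an undirected graph are assumptions.

```python
def adjacency_matrix(n: int, edge_list: list[tuple[int, int]]) -> list[list[bool]]:
    """Build an n x n nested list with edges[i][j] == True if (i, j) exists."""
    edges = [[False] * n for _ in range(n)]
    for i, j in edge_list:
        edges[i][j] = True
        edges[j][i] = True  # undirected: the matrix is symmetric
    return edges


edges = adjacency_matrix(4, [(0, 1), (1, 2), (2, 3)])
print(edges[1][2])  # True
print(edges[0][3])  # False
```

Looking up a single edge is one index operation, but the matrix always stores n * n entries regardless of how few edges the graph has. For a directed graph, the line that sets edges[j][i] would be dropped.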

Graphs as Adjacency List

  • Graphs can be represented as an adjacency list.
  • Nodes are numbered 0, 1, …
  • neighbours[i] is a list of edges incident to i.
  • This requires $O(n + m)$ space, where $m$ is the number of edges.
  • Checking whether two nodes are adjacent takes up to $O(n)$ time.
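
The adjacency list can be sketched in the same style. The name neighbours matches the notes; as a simplification, this sketch stores for each node i the numbers of its neighbouring nodes instead of edge objects, and the helper names adjacency_list and are_adjacent are made up for illustration.

```python
def adjacency_list(n: int, edge_list: list[tuple[int, int]]) -> list[list[int]]:
    """Build neighbours, where neighbours[i] lists the nodes adjacent to i."""
    neighbours = [[] for _ in range(n)]
    for i, j in edge_list:
        neighbours[i].append(j)
        neighbours[j].append(i)  # undirected: store the edge at both ends
    return neighbours


def are_adjacent(neighbours: list[list[int]], i: int, j: int) -> bool:
    # Linear scan through the neighbours of i.
    return j in neighbours[i]


neighbours = adjacency_list(4, [(0, 1), (1, 2), (2, 3)])
print(neighbours[1])                   # [0, 2]
print(are_adjacent(neighbours, 0, 3))  # False
```

Each undirected edge appears in two lists, so the total size grows with the number of nodes plus the number of edges. The adjacency test has to scan a list that can hold up to n - 1 entries in a simple graph, which is where the linear worst case comes from.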

Homework

  • Read sections 4.1-4.6 of the reader.
  • Tutorial assignments.
  • Reader exercises 4.1 - 4.6.
  • Lab assignment: Expressions Part 1 and 2.