CMSC 141: Automata and Language Theory

Chapter 0: Review Preliminaries

Formal Definitions:

Proof: A sequence/series of formal statements containing givens and deductions. Givens are a known or established fact or situation.

Sets: A set is a collection (unordered) of objects. The objects comprising the set are called its elements or members. A set is infinite if it is not finite. The set of reals is an infinite set. (An example of an infinite set is the set of natural numbers).

PowerSet: The powerset of the set S is the set of all subsets of s. The power set is denoted by P(S).

Partition: The partition of a non-empty set A is a subset of 2^A such that the empty set is not an element and such that each element of A is in one and only one set of the partition.

Relations: A relation is a connection between the elements of two or more sets. 

  • A less than relation is a type of relation in which it is the set of all pairs of numbers such that the first number is less than the second. 

  • In these, order matters, and need not be distinct. Example: (a,b) != (b,a) and (b,b) is valid as long as each comes from a different set. 

  • Another example of a relation is the cartesian product of two sets A and B. This would be the set of all ordered pairs (a,b), with a being an element of A and b being an element of B. 

  • A binary relation on two sets A and B is a subset of A × B.

Functions: An association of each object of one kind with a unique object of another kind. 

  • Typically being able to map an element from one set to another set. 

  • A function from a set A to a set B is a binary relation R on A and B with the following property: 

    • For each a, an element of A, there is exactly one ordered pair in R with the first component a. Example: {(1,2), (2,3), (4,5)} having multiple ordered pairs that hold either 1, 2, or 4 in the left side would be invalid.

One-to-one: If for any two distinct elements a, a’ is an element of A, f(a) != f(a’).

Onto: If each element of B is the image under f of some element of A.

Bijection: If both one-to-one and onto.

Types of Proof:

Direct Proof: A sequence of statements which are either givens or deductions from previous statements. Examples are established facts, axioms, lemmas, and theorems.

Proof by CounterExample: Prove/disprove by giving one example or instance that disproves the case.

Proof by Contradiction: Assume that the opposite proposition is true then show or arrive at a contradiction.

Proof by Mathematical Induction: Contains a Base Step, Inductive Hypothesis, and Inductive Step.

Set Theory:

Proving two sets are equal: To prove that two sets are equal we have to prove that both sets are subsets of each other.


Chapter 1: Alphabets and Languages

Formal Definitions:

Symbol: An atomic unit, such as a digit, character, lower-case letter, etc. Sometimes a word or string.

Alphabet (Σ): A finite set of symbols denoted by Σ.

String over an alphabet: A finite length sequence of symbols. A string may have no symbols, denoted by ε (an empty string). U, v, w, x, y, z, and Greek letters denote strings.

Σ*: The set of all strings, including the empty string, over an alphabet Σ.

Length: The length of a string is its length as a sequence. The length of a string w is denoted by |w|. |abc| = 3 |abcd| = 4

Concatenation: The concatenation of strings x and y is the string x followed by y. Concatenation is associative. (wx)y = w(xy).

  • w = x○y if and only if |w| = |x| +|y|, w(j) = x(j) for j = 1, …, |x|, and w(|x|+j) = y(j) for j=1, …, |y|


Substring: A string v is a substring of a string w if and only if there are strings x and y such that w = xyv. Both x and y could be ε, so every string is a substring of itself.

Suffix: A substring v of some string w where if w = xv, for some x, v is a suffix.

Prefix: A substring v of some string w where if w = vy, for some y, v is a prefix.

Wi: is defined as:

w⁰ = ε, the empty string wⁱ + 1 = wⁱ ∘ w, for each i ≥ 0. w¹ = w do² = dodo

Reversal: The reversal of a string w, denoted by wᴿ, is the string “spelled backwards.” Formally, given w ∈ Σ* and |w| = 0, then wᴿ = w = ε. Formally, given w ∈ Σ* and |w| = n + 1 > 0, then w = ua for some a ∈ Σ, and wᴿ = a uᴿ.

Language: Any set of strings over Σ, that is, any subset of Σ*. Σ*, empty set, and Σ are languages in and of themselves.

Countably Infinite Set: A set is said to be countably infinite if it is equinumerous with the set of Natural numbers. We need to construct a bijection f: N → Σ*.

Denumerable Set: A set is denumerable if it has the same cardinality as the set of Natural Numbers.

Countable Set: A set is countable if it is either denumerable or finite.

Uncountable Set: A set is uncountable if it is not countable.

Kleene Star L*: is the set of all strings obtained by concatenating zero or more strings from L.

Kleene Plus (L+): L+ is the language LL* - Kleene star without the empty string.

Finite Representation of Languages: Natural for finite languages, must be in itself a string. It is the finite sequence of symbols over some alphabet Σ.

Expressions: A string of symbols that describe how languages can be built up using the operations described in the previous topics.

Regular Expressions: Describes a language exclusively by means of single symbols and empty set, combined perhaps with the symbols ∪ and *, possibly with the aid of parentheses.

Regular expressions over an alphabet Σ are all strings over the alphabet (Σ) ∪ {(,), ∅, ∪, *}, that can be obtained as follows:

  • ∅ and each member of Σ is a regular expression.

  • If α and Β are regular expressions, then so is (αΒ).

  • If α and Β are regular expressions, then so is (α ∪ Β).

  • If α is a regular expression, then so is α*.

  • Nothing is a regular expression unless it follows from (1) to (4).

Let L be a function such that if α is any regular expression, then L(α) is the language represented by α. 

  • L(∅) = ∅ and L(α) = {α} for each α ∈ Σ. 

  • If α and Β are regular expressions, then L((αΒ)) = L(α)L(Β). 

  • If α and Β are regular expressions, then L((α ∪ Β)) = L(α) ∪ L(Β). 

  • If α is a regular expression, then L(α*) = L(α)*.

Language Recognition Device: An algorithm that is specifically designed, for some language L, to answer questions of the form “Is string w a member of L?”.



robot