Theoretical Computer Science & Machine Learning – Exam Notes

Algorithms and Data Structures

  • IPO-Model: Input → Process → Output; data structures supply input format, algorithms implement process.

- Digital vs. analog computers: discrete vs. continuous encoding; free (stored-program) programmability is characteristic of digital machines.

Algorithms

  • Formal definition (Webster 2021): finite, unambiguous, step-by-step procedure mapping finite inputs to outputs in finite time.
  • Correctness (specification / verification) and efficiency (time, space).
  • Euclidean GCD shown in iterative and recursive pseudocode.

- Key questions: Why? (specification) – Does it work? (verification) – How efficient? (analysis).
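The Euclidean GCD mentioned above can be sketched in both styles; a minimal Python version of the standard algorithm:

```python
def gcd_iterative(a: int, b: int) -> int:
    """Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b)."""
    while b != 0:
        a, b = b, a % b
    return a

def gcd_recursive(a: int, b: int) -> int:
    """Same invariant expressed recursively: gcd(a, b) = gcd(b, a mod b)."""
    return a if b == 0 else gcd_recursive(b, a % b)
```

Both terminate because the second argument strictly decreases, illustrating the finiteness requirement from the definition above.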

Data Structures

  • Arrays (1-D & 2-D), Linked Lists, Queues (FIFO), Stacks (LIFO), Heaps (min-/max-binary tree).

- Choose structure by access pattern & resource constraints.
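The FIFO/LIFO access patterns can be shown with Python's built-in list and `collections.deque` (a common, not the only, way to realize these structures):

```python
from collections import deque

stack = []                 # LIFO: push and pop at the same (right) end
stack.append(1)
stack.append(2)
top = stack.pop()          # last in, first out

queue = deque()            # FIFO: append right, remove left in O(1)
queue.append(1)
queue.append(2)
front = queue.popleft()    # first in, first out
```

A plain list also works as a queue via `pop(0)`, but that costs O(n) per removal, which is exactly the kind of resource constraint the note above refers to.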

Graphs & Trees

  • Directed graph $G=(V,R,\alpha,\omega)$ with out-degree $g^+(v)$ and in-degree $g^-(v)$.
  • Undirected graph $G=(V,E,\gamma)$.
  • Representations (space):
    • Incidence matrix $\Theta(|V|\cdot|R|)$
    • Adjacency matrix $\Theta(|V|^2)$
    • Adjacency list $\Theta(|V|+|R|)$ (directed) / $\Theta(|V|+2|E|)$ (undirected).
  • Trees: directed acyclic graph with a single root (equivalently, undirected: a unique path between any two vertices). Binary tree: $g^+(v)\le 2$.

- Use cases: Euler’s Königsberg bridges, network attack graphs, BloodHound AD analysis.
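The space-efficient adjacency-list representation above can be sketched with a plain dict (vertex names are illustrative):

```python
# Directed graph as adjacency list: Θ(|V| + |R|) space.
graph = {
    "a": ["b", "c"],   # out-degree g+(a) = 2
    "b": ["c"],
    "c": [],           # no outgoing arcs
}

# Out-degree is the length of each list; in-degree needs one pass over all arcs.
out_degree = {v: len(nbrs) for v, nbrs in graph.items()}
in_degree = {v: 0 for v in graph}
for nbrs in graph.values():
    for w in nbrs:
        in_degree[w] += 1
```

The asymmetry (out-degree free, in-degree requires a scan) is one reason an adjacency matrix is sometimes preferred despite its $\Theta(|V|^2)$ space.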

Sorting & Searching

  • Selection, Insertion, Bubble Sort $\mathcal O(n^2)$; Quick Sort average $\mathcal O(n\log n)$.

- Searching: linear $\mathcal O(n)$, binary $\mathcal O(\log n)$ (requires sorted input); string matching: naïve, KMP $\mathcal O(m+n)$, Boyer-Moore.
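Binary search, the $\mathcal O(\log n)$ case above, halves the candidate range each step; a minimal sketch:

```python
def binary_search(xs, target):
    """O(log n) search in a sorted ascending list; returns index or -1."""
    lo, hi = 0, len(xs) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if xs[mid] == target:
            return mid
        if xs[mid] < target:
            lo = mid + 1       # target can only be in the right half
        else:
            hi = mid - 1       # target can only be in the left half
    return -1
```

The precondition (sorted input) is essential: on unsorted data the halving argument, and hence correctness, breaks down.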

Algorithm Analysis

  • Primitive operations → cost function $f(n)$; asymptotic growth classes: $\log n < n < n\log n < n^2 < b^n < n!$
  • Big-$\mathcal O$ upper bound, $\Omega$ lower bound, $\Theta$ tight bound.
  • Example loops: constants ignored; nested loops multiply.
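The "nested loops multiply" rule can be made concrete by literally counting primitive operations (a toy counter, not a profiling tool):

```python
def count_ops(n: int) -> int:
    """Count executions of the inner body of a doubly nested loop: Θ(n^2)."""
    ops = 0
    for i in range(n):          # outer loop: n iterations
        for j in range(n):      # inner loop: n iterations per outer pass
            ops += 1            # one primitive operation per inner iteration
    return ops
```

Doubling $n$ quadruples the count, which is the behavioral signature of $\Theta(n^2)$.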

Formal Languages & Automata

  • Grammar $G=(V,\Sigma,P,S)$ where $P$ is a set of productions $l\to r$.
  • Chomsky hierarchy: $L_0 \supset L_1 \supset L_2 \supset L_3$
    1. Type-3 Regular (right-linear) – DFA/NFA.
    2. Type-2 Context-free – PDA.
    3. Type-1 Context-sensitive – LBA.
    4. Type-0 Recursively enumerable – Turing machine.
  • Regular language example $(ab)^n$; conversion grammar ↔ DFA; Rabin-Scott: NFA ≡ DFA.
  • Context-free: $a^n b^n$, CNF conversion, CYK parsing; prefix function for KMP.
  • Context-sensitive: $a^n b^n c^n$; productions are non-contracting (never shorten the sentential form); recognized by linear-bounded automata.
  • Pushdown automaton $A=(S,\Sigma,\Gamma,\delta,s_0)$ with initial stack symbol $\bot$.
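The regular example $(ab)^n$ above is recognized by a two-state DFA (plus an implicit trap state); a minimal simulation sketch:

```python
def accepts_ab_power_n(w: str) -> bool:
    """DFA for (ab)^n, n >= 0: state q0 is both start and accepting state."""
    delta = {("q0", "a"): "q1", ("q1", "b"): "q0"}  # all other moves -> trap
    state = "q0"
    for ch in w:
        state = delta.get((state, ch))
        if state is None:      # implicit trap state: reject immediately
            return False
    return state == "q0"
```

The finite transition table is the whole automaton: no stack, no tape, which is exactly why $a^n b^n$ (needing unbounded counting) is out of reach for this machine class.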

Computability, Decidability & Complexity

  • Halting problem undecidable (Turing 1936).
  • Church-Turing Thesis: every effectively calculable function computable by Turing machine.
  • Decision-problem status by language class:

    | Problem     | Regular | CFG | CSG | Type-0 |
    |-------------|---------|-----|-----|--------|
    | Word        | ✓       | ✓   | ✓   | ✗      |
    | Emptiness   | ✓       | ✓   | ✗   | ✗      |
    | Finiteness  | ✓       | ✓   | ✗   | ✗      |
    | Equivalence | ✓       | ✗   | ✗   | ✗      |
  • Complexity classes (relations): $P \subseteq NP \subseteq PSPACE \subseteq EXPTIME \subseteq EXPSPACE$; NP-hard; NP-complete = NP ∩ NP-hard.
  • Quantum
    • Qubit state $|\psi\rangle = \alpha|0\rangle + \beta|1\rangle$; Bloch sphere.
    • Algorithms: Shor (factorization) $\mathcal O((\log N)^3)$ on a quantum computer; Grover search $\mathcal O(\sqrt N)$.
    • Complexity class BQP; $P \subseteq BQP$ is known, its exact relation to NP is open.

Logic

Propositional Logic

  • Connectives tables; tautology vs. contradiction vs. contingency.
  • Equivalences: De Morgan; implication elimination $P \Rightarrow Q \equiv \lnot P \lor Q$.
  • Normal forms: CNF = AND of OR-clauses; DNF = OR of ANDs.
  • Inference rules (MP, MT, HS, DS, CD).
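With only two variables, the equivalences above can be verified exhaustively over all four truth assignments, which is the truth-table method in code form:

```python
from itertools import product

# Truth table of => written out explicitly, then checked against ¬P ∨ Q:
IMPLIES = {(False, False): True, (False, True): True,
           (True, False): False, (True, True): True}

# Implication elimination: P => Q  ≡  ¬P ∨ Q
elim_ok = all(IMPLIES[(p, q)] == ((not p) or q)
              for p, q in product([False, True], repeat=2))

# De Morgan: ¬(P ∧ Q)  ≡  ¬P ∨ ¬Q
de_morgan_ok = all((not (p and q)) == ((not p) or (not q))
                   for p, q in product([False, True], repeat=2))
```

An equivalence holds iff both sides agree on every row, and a tautology is a formula that is `True` on every row.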

Predicate Logic

  • Predicates, variables, quantifiers $\forall, \exists$; translating natural language.
  • Prenex & Skolem normal forms for resolution.

Proof Calculi

  • Resolution: convert to clausal CNF, derive the empty clause ◻; refutation-complete (◻ is derivable iff the clause set is unsatisfiable).
  • Tableau: alpha/beta expansion; closed tableau ⇒ unsatisfiable, open saturated branch ⇒ a model exists.
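A tiny worked refutation of {P}, {¬P ∨ Q}, {¬Q}; the encoding of clauses as frozensets of (name, polarity) literals is an illustrative choice, not a fixed convention:

```python
def resolve(c1, c2):
    """Return all resolvents of two clauses (sets of literals)."""
    out = []
    for (name, pol) in c1:
        if (name, not pol) in c2:          # complementary pair found
            out.append((c1 - {(name, pol)}) | (c2 - {(name, not pol)}))
    return out

P, Q = ("P", True), ("Q", True)
notP, notQ = ("P", False), ("Q", False)

c1 = frozenset({P})            # P
c2 = frozenset({notP, Q})      # ¬P ∨ Q
c3 = frozenset({notQ})         # ¬Q

step1 = resolve(c1, c2)[0]     # resolving on P yields {Q}
step2 = resolve(step1, c3)[0]  # resolving on Q yields the empty clause ◻
```

Deriving the empty set (no literal left to satisfy) proves the original clause set unsatisfiable, i.e. P, P ⇒ Q, ¬Q cannot all hold.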

Program Analysis & Verification

  • Static vs. Dynamic analysis; symbolic execution; data-flow; model checking.
  • Hoare triples $\{P\}\;S\;\{Q\}$, axiomatic semantics; inference rules for assignment, sequence, if, while.
  • Semantics descriptions
    • Algebraic: equational reasoning.
    • Operational: small-/big-step transitions.
    • Denotational: mapping syntax to mathematical functions.
  • Abstract Interpretation
    • Abstract domain (e.g., intervals); sound over-approximation.
    • Framework: abstraction function, Galois connection, widening.
    • Example: interval analysis of simple program to bound variable ranges.
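A minimal sketch of the interval domain (the transfer functions and example program are illustrative, not a full analyzer):

```python
# Interval abstract domain: each variable is over-approximated by (lo, hi).
def add(a, b):
    """Abstract addition: [a0,a1] + [b0,b1] = [a0+b0, a1+b1]."""
    return (a[0] + b[0], a[1] + b[1])

def join(a, b):
    """Least upper bound: smallest interval covering both (used at merges)."""
    return (min(a[0], b[0]), max(a[1], b[1]))

# Analyzing:  x = input in [0, 10]; if (...) x = x + 1; else x = x + 5;
x = (0, 10)
x_then = add(x, (1, 1))          # (1, 11)
x_else = add(x, (5, 5))          # (5, 15)
x_after = join(x_then, x_else)   # (1, 15): sound bound over all runs
```

The result is a sound over-approximation: every concrete run lands inside (1, 15), though some values in that interval (e.g. 2, 3, 4) are unreachable, which is the precision cost of abstraction.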

Artificial Intelligence & Machine Learning

Paradigms

  • Supervised (labeled) vs. Unsupervised (unlabeled clustering).
  • Regression: predict continuous $y \in \mathbb R$.
  • Classification: discrete labels; logistic function $g(z) = \frac{1}{1+e^{-z}}$.

Linear & Multiple Regression

  • Hypothesis $h_\theta(x) = \theta^T x$, cost $J(\theta) = \frac{1}{2m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})^2$.
  • Gradient descent: $\theta_j := \theta_j - \alpha \frac{\partial J}{\partial \theta_j}$.
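A minimal sketch of batch gradient descent for one-feature linear regression; the data, iteration count, and learning rate $\alpha = 0.1$ are made up for illustration:

```python
# Training data generated by y = 1 + 2x, so descent should recover t0≈1, t1≈2.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m = len(xs)
t0, t1, alpha = 0.0, 0.0, 0.1

for _ in range(2000):
    h = [t0 + t1 * x for x in xs]                       # current predictions
    # Partial derivatives of J(θ) = (1/2m) Σ (h - y)^2:
    g0 = sum(h[i] - ys[i] for i in range(m)) / m
    g1 = sum((h[i] - ys[i]) * xs[i] for i in range(m)) / m
    t0, t1 = t0 - alpha * g0, t1 - alpha * g1           # simultaneous update
```

The simultaneous update of both parameters matters: updating $\theta_0$ before computing $\theta_1$'s gradient would descend on a different (inconsistent) cost surface.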

Logistic Regression / Softmax

  • Binary cost $J = -\frac{1}{m}\sum\,[\,y \log h + (1-y)\log(1-h)\,]$.
  • One-vs-all for $K$ classes; predict $\operatorname{argmax}_k h^{(k)}(x)$.
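The one-vs-all prediction rule can be sketched as follows; the parameter values in `thetas` are hypothetical stand-ins for $K = 3$ already-trained binary classifiers, not real trained weights:

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function g(z) = 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x: float, thetas) -> int:
    """argmax_k h^(k)(x) over K binary classifiers, θ given as (bias, weight)."""
    scores = [sigmoid(t0 + t1 * x) for (t0, t1) in thetas]
    return max(range(len(scores)), key=scores.__getitem__)

# Three hypothetical one-feature classifiers, one per class:
thetas = [(-2.0, -1.0), (0.0, 0.0), (-2.0, 1.0)]
```

Each $h^{(k)}$ scores "class $k$ vs. the rest"; the prediction is simply the classifier that is most confident.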

Neural Networks

  • Neuron output $a = g(\theta^T x)$ with bias unit $x_0 = 1$.
  • Layers: input → hidden(s) → output; weight matrices $\Theta^{(l)}$.
  • Forward propagation yields $h_\Theta(x)$; backpropagation + gradient descent minimize $J(\Theta) = -\frac{1}{m}\sum_{i,k} y_k^{(i)} \log h_k^{(i)} + \frac{\lambda}{2m}\sum (\Theta)^2$.
  • Applications: malware e-mail multi-class detection, captcha OCR, intrusion detection.
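Forward propagation layer by layer can be sketched without any library; the 2-2-1 architecture and the weights below are made up for illustration (not trained values):

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, thetas):
    """Each layer l computes a = g(Θ^(l) · [1; a_prev]) with bias unit 1."""
    a = list(x)
    for theta in thetas:               # theta: one row of weights per unit
        a = [1.0] + a                  # prepend bias unit x0 = 1
        a = [sigmoid(sum(w * v for w, v in zip(row, a))) for row in theta]
    return a

# Illustrative 2-2-1 network:
thetas = [
    [[0.1, 0.4, -0.3], [-0.2, 0.2, 0.5]],  # Θ^(1): 2 hidden units, 2 inputs + bias
    [[0.3, 0.7, -0.6]],                    # Θ^(2): 1 output unit
]
output = forward([1.0, 0.5], thetas)
```

The output lies in (0, 1) because every unit applies the logistic $g$; for the multi-class applications listed above, the last layer would have one unit per class and the prediction would be its argmax.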