Database Management Systems – Lecture 1 Notes: Transactions & Concurrency

Course Logistics

  • Lecture structure: Lecture 2 + Seminar 1 + Laboratory 1 every week.

  • Grading breakdown (must separately obtain 5\ge 5 in written & practical):

    • Written exam (W) – 50 %

    • Practical exam (P) – 25 %

    • Laboratory activity (L) – 25 %

  • Attendance requirements to be admitted to exam:

    • 6\ge 6 laboratory attendances

    • 5\ge 5 seminar attendances

    • Regulation source: https://www.cs.ubbcluj.ro/wp-content/uploads/Hotarare-CDI-29.04.2020.pdf

  • Instructor: Sabina S., e-mail : sabina@cs.ubbcluj.ro – course page : https://www.cs.ubbcluj.ro/~sabina

DBMS High-Level Architecture

  • External components:

    • SQL Command interface → Parser → Optimizer → Plan Executor (Operator Evaluator).

  • Storage & access layer:

    • Buffer Manager, Files & Access Methods, Disk-Space Manager.

  • Reliability / concurrency layer:

    • Recovery Manager, Transaction Manager, Lock Manager.

  • Persistent data structures:

    • Data files, Index files, System catalog.

  • Altogether form the Query Evaluation Engine + Concurrency-Control & Recovery subsystems of the DBMS.

Concurrency Motivation & Scenarios

Airline-Reservation scenario

  • Situation

    • Flight 10 has exactly 1 seat left: tuple flight:10, available<br>seats:1\langle\text{flight}:10,\ \text{available<br>seats}:1\rangle.

    • TA₁ serves customer C₁; TA₂ serves C₂.

  • Sequence of events (interleaved):

    1. TA₁ reads tuple (still 1 seat).

    2. TA₂ reads the same tuple (still 1 seat).

    3. C₂ instantly buys ⇒ TA₂ updates DB to 0\langle 0\rangle seats.

    4. C₁ later decides to buy, using obsolete copy ⇒ DB overwritten again with 0\langle 0\rangle, seat effectively “sold twice”.

Bank-Balance scenario

  • Accounts A<em>1,A</em>2,,A1000A<em>1, A</em>2, \ldots , A_{1000}.

  • Program P₁ computes total: <em>i=11000Balance(A</em>i)\sum<em>{i=1}^{1000} Balance(A</em>i).

  • Program P₂ transfers 500 lei from A<em>1A<em>1 to A</em>1000A</em>{1000}.

  • If P₂ executes after P₁ has already read A<em>1A<em>1 but before it reads A</em>1000A</em>{1000} ⇒ reported total becomes <em>i=11000Balance(A</em>i)+500\sum<em>{i=1}^{1000} Balance(A</em>i)+500 (inconsistent).

Why NOT simply forbid concurrency?

  • Correctness: serial run eliminates anomalies, e.g., TA₂ would see 0 seats after TA₁ commits.

  • Performance issues if we disallow interleaving:

    • While C₁ ponders, unrelated request for Flight 11 (TA₃/C₃) must wait.

    • Disk I/O is slow; overlapping CPU with I/O is crucial.

  • Goal → permit parallelism yet prevent anomalies produced by uncontrolled interleaving.

Transactions – Basic Model

  • Database is split into items (tuples, attributes, blocks, etc.).

  • Transaction T = one execution of a user program consisting of ordered operations:

    • read(T,I)read(T,I), write(T,I)write(T,I).

    • Terminates with commit(T) (make changes permanent) or abort(T) (rollback).

ACID Properties

Atomicity

  • "All-or-nothing": every operation executes or none executes.

  • DBMS keeps a log of all writes ⇒ can undo partial work after crash/explicit abort.

  • Example transfer:

AccountA ← AccountA – 100
AccountB ← AccountB + 100
  • Failure between statements ⇒ log used to restore 100 lei to AccountA.

Consistency

  • Assuming the program is correct, it moves DB from one consistent state to another.

  • Integrity constraints (ICs) defined via DDL are enforced by DBMS, but higher-level semantics (e.g., total money unchanged) is the programmer’s responsibility.

Isolation

  • Concurrent transactions behave as if executed alone; users are shielded from other sessions’ intermediate effects.

Durability

  • After DBMS reports commit, effects persist even if system crashes.

  • Enforced by Write-Ahead Logging (WAL):

    • Log record forced to disk before actual data page.

    • Guarantees both atomicity & durability.

Interleaved Execution Example

Transactions submitted together:

T1: BEGIN
     A ← A - 100
     B ← B + 100
     END
T2: BEGIN
     A ← 1.05 * A
     B ← 1.05 * B
     END
  • Acceptable results = those of serial orders: State((T1T2),DB)State((T1\,T2),DB) or State((T2T1),DB)State((T2\,T1),DB).

  • Good interleaving (equivalent to serial):

T1: A ← A-100
T2: A ← 1.05*A
T1: B ← B+100
T2: B ← 1.05*B   ✓
  • Problematic interleaving:

T1: A ← A-100
T2: A ← 1.05*A
T2: B ← 1.05*B
T1: B ← B+100   ✗
  • Final balances differ from any serial outcome ⇒ non-serializable.

Conflict Types & Classical Anomalies

  • No conflict if:

    • Two reads of same object; or

    • Reads/Writes of different objects.

  • Conflict exists when both touch same object and at least one is a write:

    • WR (read-after-write), RW (write-after-read), WW (write-after-write).

Anomaly

Conflict pattern

Illustration

Dirty read ("reading uncommitted data")

WR

T₂ reads object written by still-uncommitted T₁.

Unrepeatable read

RW

T₂ rewrites object that T₁ already read; T₁ sees different values on repeated read.

Overwriting uncommitted data

WW

T₂ overwrites value written by uncommitted T₁, possibly losing T₁’s update.

  • Example dirty-read schedule (R = read, W = write, C = commit, A = abort):

T1: R(A) W(A) .....  A
T2:           R(A) W(A) C

Scheduling Terminology

  • Schedule S: ordered list of operations (Read/Write/Commit/Abort) from a set of transactions preserving each transaction’s order internally.

  • Example schedule for transactions T₁ & T₂ manipulating variables V and sum:

read1(V) read1(sum) read2(V) write2(V) commit2
read1(V) write1(sum) commit1

Serial vs Non-Serial

  • Serial: no interleaving, e.g., run entire T₁ then entire T₂.

  • Non-Serial: interleaved actions from multiple transactions.

Serializability – Correctness Criterion

  • For a set of transactions C and schedule S ∈ Sch(C):

    • S is serializable ⇔ its outcome on a consistent DB equals the outcome of some serial schedule S₀ ∈ Sch(C).

  • Rationale:

    • If each Ti preserves consistency, and S behaves like a serial order of them, then S also preserves consistency.

  • Core objective of concurrency-control algorithms → allow high-parallelism schedules yet guarantee serializability.

Practical Implications & System Support

  • Concurrency-control managers (e.g., 2-Phase Locking, Timestamp Ordering, MVCC) enforce isolation/serializability.

  • Recovery manager + WAL enforce atomicity & durability.

  • Interplay ensures ACID despite crashes & concurrent workloads.

References (for deeper study)

  • [Ra02] Ramakrishnan & Gehrke, Database Management Systems, 3rd ed., McGraw-Hill (2002).

  • [Le99] Levene & Loizou, A Guided Tour of Relational Databases and Beyond, Springer (1999).

  • [Ga09] Garcia-Molina, Ullman & Widom, Database Systems: The Complete Book, 2nd ed. Pearson (2009).

  • [Ra02S] Slides for Ramakrishnan & Gehrke 3e – http://pages.cs.wisc.edu/~dbbook/openAccess/thirdEdition/slides/slides3ed.html

  • [Si19] Silberschatz, Korth & Sudarshan, Database System Concepts, 7th ed., McGraw-Hill (2019).

  • [Si19S] Slides for Database System Concepts 7e – http://codex.cs.yale.edu/avi/db-book/

  • [Ul11] Ullman & Widom, A First Course in Database Systems – http://infolab.stanford.edu/~ullman/fcdb.html