Transaction processing
Department of Information Systems and Statistics
Focus on harnessing technology, data, and quantitative analysis.
Emphasis on Transaction Processing.
Presented by Richard Holowczak from Baruch College.
Additional resources available at https://holowczak.com/database-transaction-processing/.
Database Applications
Historical Limitations in SQL Coding:
Previously, SQL coding limitations included the ability to:
Run only one SQL statement at a time.
Support only one user at a time.
Requirements of Database Applications:
Modern applications require:
Execution of multiple SQL statements simultaneously as a group.
Allowing multiple users to share the same database.
Database Application Examples
Registering for a Course:
Steps involved include:
Check prerequisites (SQL:
SELECT).Check other restrictions such as holds (SQL:
SELECT).Check seat availability (SQL:
SELECT).Create a registration record (SQL:
INSERT).Decrement seat availability (SQL:
UPDATE).
Purchasing a Product:
Steps involved include:
Check inventory level (SQL:
SELECT).Create customer record (SQL:
INSERT).Create order record (SQL:
INSERT).Create order items record (SQL:
INSERT).Check payment method (SQL:
SELECT).Create payment record (SQL:
INSERT).Decrement product inventory (SQL:
UPDATE).
Making a Charitable Donation:
Steps involved include:
Check the fundraising campaign (SQL:
SELECT).Create donation record (SQL:
INSERT).Check payment method (SQL:
SELECT).Create payment record (SQL:
INSERT).Increment fundraising total (SQL:
UPDATE).
Purchasing a Concert Ticket:
Steps involved include:
Check seating availability (SQL:
SELECT).Create customer record (SQL:
INSERT).Create order record (SQL:
INSERT).Check payment method (SQL:
SELECT).Create payment record (SQL:
INSERT).Mark seats as unavailable (SQL:
UPDATE).
What is a Database Transaction?
A database transaction is defined as a set of read and write operations that must either fully commit or abort under the principle of "All or Nothing."
Example SQL operations for a transaction include:
INSERT INTO customer VALUES (101, 'Joe', 'Smith', …);INSERT INTO order_header VALUES(7645, '03-APR-2023',…);INSERT INTO order_items VALUES(7645, 1, 'P643', 100, 2.33);INSERT INTO order_items VALUES(7645, 2, 'P862', 50, 6.22);UPDATE inventory SET quantity = quantity – 100 WHERE partnumber='P643';UPDATE inventory SET quantity = quantity – 50 WHERE partnumber='P862';INSERT INTO payments VALUES (101, '03-APR-2023', 544.00);
These operations must take into account that more than one user may attempt to run transactions simultaneously, which raises questions about potential conflicts (WCGW: What Could Go Wrong?).
ACID Properties of Transactions
Atomic:
Every transaction is considered a logical unit of work.
Transactions must either fully commit all operations or completely abort (rollback).
Consistency Preserving:
A transaction moves the database from one consistent state to another consistent state.
Isolated:
A transaction remains unaware of any other transactions until it is committed, relating to the topic of concurrency control.
Durable:
Once committed, a transaction’s work cannot be lost even in the case of future failures; this relates to database recovery.
Consistent Database State
Database State:
Represents the set of all values in a database at a specific point in time.
Consistent State:
Refers to a state where no database constraints are violated.
Transactions must ensure that moving from State A to State B maintains consistent data integrity.
Transaction Scheduling
Definition and Importance:
Refers to two or more transactions running concurrently.
Requires careful scheduling of their individual read and write operations to avoid corrupting the database (resulting in inconsistent states).
Problems from Incorrect Scheduling:
Includes issues such as:
Lost Update
Dirty Read
Incorrect Analysis
Non-Repeatable Read (the last is not the student's responsibility).
General representation of a transaction schedule:
T1: Rx Wx RyT2: Ra Wa Rb WbSchedule: T1Rx T2Ra T1Wx T3Ri…
Example Transactions
Assumption:
Assume some inventory level defined as 'i'.
Transaction A and B Interleaved Operations:
Transaction A:
BEGINREAD iREDUCE by 2WRITE iCOMMITTransaction B:
BEGINREAD iREDUCE by 3WRITE iCOMMIT
Problems in Transaction Processing
Lost Update Problem:
Illustrated through shared inventory for Product P200, initially having 10 units.
Dirty Read Problem:
Also demonstrated with initial inventory levels for Product P200.
Incorrect Analysis Problem:
Encompasses situations where Transaction A attempts to calculate a sum of inventory while Transaction B simultaneously updates it.
Serial vs. Non-Serial Schedules
Serial Schedule:
One transaction completes all operations before another begins.
Advantages:
No transaction processing issues (completely safe).
Disadvantages:
Can be slow due to lack of concurrency.
Non-Serial Schedule:
Two or more transactions interleave their operations.
Advantages:
Increases processing speed due to effective resource usage.
Disadvantages:
Leverages complexity and leads to transaction problems such as lost updates.
Serializable Schedule:
Interleaves operations while ensuring the result remains consistent with a serial schedule.
Creating Serializable Schedules
Key Concept:
The DBMS must implement mechanisms through concurrency control to create serializable schedules.
Next Steps:
Further discussion in the next lecture focusing on concurrency control techniques.
(c) 2019-2025 R. Holowczak