1/35
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced | Call with Kai |
|---|
No study sessions yet.
transaction
a unit of program execution that accesses and possibly updates various data items
3 qualities of transactions
a transaction must see a consistent (stable) database before its execution → wait for finished transaction before starting/accepting another
during transaction execution, the database may be inconsistent
when the transaction is committed, the database must be consistent
For transactions, what are the two main issues to deal with?
failures of various kinds, such as hardware failures and system crashes (behind-the-scene workers deal w/this)
concurrent execution of multiple transactions
ACID Properties
to preserve integrity of data, the db system must ensure → atomicity, consistency, isolation, and durability
Atomicity
either all operations of the transaction are properly reflected in the db, or non are
i.e., all of it or none of it → especially after failure!
consistency
execution of a transaction in isolation preserves the consistency of the db
isolation
Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden form other concurrently executed transactions
What is another way to think of Isolation?
for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started, or Tj started execution after Ti finished
durability
after a transaction completes successfully, the changes it has made to the db persist, even if there are system failures
Example of a transaction and the ACID properties:
given the following fund transfer transaction → transaction to transfer $50 from account A to account B
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)
how does it fulfill the atomicity requirement?
if the transaction fails after step 3 and before step 6, the system should ensure that its updates are not reflected in the db, else an inconsistency will result
Example of a transaction and the ACID properties:
given the following fund transfer transaction → transaction to transfer $50 from account A to account B
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)
how does it fulfill the consistency requirement?
the sum of A and B is unchanged by the execution of the transaction
Example of a transaction and the ACID properties:
given the following fund transfer transaction → transaction to transfer $50 from account A to account B
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)
how does it fulfill the isolation requirement?
if between steps 3 and 6, another transaction is allowed to access the partially updated db, it will see an inconsistent db (the sum A + B will be less than it should be) → can be ensured trivially by running transactions serially (one after the other)
however, important to note that executing multiple transactions concurrently has significant benefits
Example of a transaction and the ACID properties:
given the following fund transfer transaction → transaction to transfer $50 from account A to account B
read(A)
A := A - 50
write(A)
read(B)
B := B + 50
write(B)
how does it fulfill the durability requirement?
once the user has been notified that the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the db by the transaction must persist despite failures
What are the different transaction states (TS)?
active, partially committed, failed, aborted, committed

active TS
the initial state; the transaction stays in this state while it is executing
partially committed TS
after the final statement has been executed
failed TS
after the discovery that normal execution can no longer proceed (i.e., flight full, bad math, etc.)
aborted TS
After the transaction has been rolled back and the db restored to its state prior to the start of the transaction. Two options after it has been aborted:
restart the transaction - only if no internal logical error
kill the transaction
committed TS
after successful completion
serial execution
transactions are run one after the other → leads to CPU sitting idle
concurrent execution
transactions can run at the same time → faster and no idle BUT need to make sure there are no conflict
conflicts occurs from working on the same item or data
Advantages of running multiple transactions concurrently
increased processor and disk utilization, leading to better transaction throughput
throughput = one transaction can be using the CPU while another is reading from or writing to the disk
reduced average response time for transactions
short transactions need not wait behind long ones
Concurrency control schemes
mechanisms to achieve isolation
i.e., to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the db
Schedules
sequences that indicate the chronological order in which instructions of concurrent transactions are executed
Properties of schedules
a schedule for a set of transactions must consist of all instructions of those transactions
must preserve the order in which the instruction appear in each individual transactions
What is one advantage of serial schedule over concurrent schedule?
serial schedule ensure that no problem/inconsistencies occur
tracing
how we can determine if a concurrent schedule is good or not BUT this method is inefficient
serializability
basic assumption - each transaction preserves db consistency
serial execution of a set of transactions preserves db consistency
a concurrent schedule is serializable if it is equivalent to any serial schedule
*NOTE: serializable = good b/c no conflicts!
What do we remove when checking serializability?
we ignore operations other than read or write instructions
our simplified schedules should consist of only read and write instructions?
Conflict serializability
Instructions Ii and Ij of transaction Ti and Tj respectively, conflict if and only if there exists some item Q accessed by both Ii and Ij, and at least one of these instructions wrote Q
Ii = read(Q), Ij = read(Q) → they don’t conflict
Ii = read(Q), Ij = write(Q) → they conflict
Ii = write(Q), Ij = read(Q) → they conflict
Ii = write(Q), Ij = write(Q) → they conflict
a conflict between Ii and Ij forces a (logical) temporal order between them → NOT CONFLICT SERIALIZABLE = BAD SCHEDULE
on the other hand, if not conflict, results would remain the same even if they had been interchanged in the schedule → conflict serializable = good schedule
Conflict equivalent
If a schedule S can be transformed into a schedule S’ by a series of swaps of non-conflicting instructions
schedule S is conflict serializable if it is conflict equivalent to a serial schedule S’
precedence graph
a direct graph where the vertices are the transactions (names)
draw an arc from Ti to Tj if the two transactions conflict and Ti access the data item on which the conflict arose earlier
label the arc by the item that was accessed
Testing for serializability w/a precedence graph
a schedule is conflict serializable if and only if its precedence graph is acyclic
i.e., if the graph has a cycle, the schedule is BAD and can’t be converted into a serial schedule (not conflict serializable)
Cycle creates a conflict!!!
serializability order?
can be obtained by a topological sorting of the precedence graph (if it is acyclic)
this is a linear order consistent w/the partial order of the graph
*Bottom line: use topological sort to get serial schedule
Concurrency Control vs Serializability Tests
tests for serializability help understand why a concurrency control protocol is correct
testing a schedule for serializability after it has been executed is a little too late
Goal - to develop concurrency control protocols that will assure (before) serializability.
They will generally not examine the precedence graph as it is being created; instead, a protocol will impose a discipline that avoids non-serializable schedules
Locking
For concurrency control → Each item has a lock; if item is unlocked transaction can use it. During use, item is locked & another transaction has to wait until the item is unlocked (i.e., previous transaction is finished)