Rigorous 2PL
no locks (read or write) are released until commit time
- avoids cascading aborts
- Simplifies recovery by holding all locks until commit
Strict 2PL
a transaction doesn't release any write locks until commit time
- Read locks can be released before commit
- Prevents cascading aborts by holding write locks until commit
Conservative 2PL
a transaction must acquire locks on all the items it needs before it begins executing; if any lock cannot be granted, none are acquired (avoids deadlocks)
Basic 2PL
follow two phases for each transaction
Guarantees serializability but may cause deadlocks
cascading aborts
one abort forces another abort, then another...
wait-die
if the requesting transaction is older, it waits; if it is younger, it dies (aborts itself) and restarts with the same timestamp
wound-wait
if the requesting transaction is older, it wounds (kills) the younger holder, which restarts with the same timestamp; if the requester is younger, it waits
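A minimal Python sketch contrasting the two rules (function names and timestamps are illustrative; a smaller timestamp means an older transaction):

```python
# Sketch of the two timestamp-based deadlock-prevention rules.
# Smaller timestamp = older transaction. Names are illustrative only.

def wait_die(requester_ts: int, holder_ts: int) -> str:
    """Older requesters wait; younger requesters die (abort, restart with same timestamp)."""
    if requester_ts < holder_ts:      # requester is older
        return "wait"
    return "die (abort, restart with same timestamp)"

def wound_wait(requester_ts: int, holder_ts: int) -> str:
    """Older requesters wound (abort) the holder; younger requesters wait."""
    if requester_ts < holder_ts:      # requester is older
        return "wound holder (holder aborts, restarts with same timestamp)"
    return "wait"

# Example: T5 (older, ts=5) requests an item held by T9 (younger, ts=9)
print(wait_die(5, 9))    # wait
print(wound_wait(5, 9))  # wound holder (holder aborts, restarts with same timestamp)
```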
fan out
max possible child nodes a node can have
schedule
shows operations of various transactions as submitted for execution (i.e. chronological order of operations from multiple transactions.)
wait-for-graph
check which transaction is waiting for which other transactions
- if there is a cycle, there is a deadlock; abort one transaction in the cycle to eliminate the deadlock
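A small Python sketch of the cycle check on a wait-for graph (the graph shape and function name are made up for illustration):

```python
# Sketch: deadlock detection with a wait-for graph.
# Edge Ti -> Tj means "Ti is waiting for a lock held by Tj".

def has_cycle(wait_for: dict[str, set[str]]) -> bool:
    """DFS-based cycle check; a cycle means deadlock."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in wait_for}

    def dfs(t: str) -> bool:
        color[t] = GRAY
        for u in wait_for.get(t, set()):
            if color.get(u, WHITE) == GRAY:   # back edge -> cycle
                return True
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and dfs(t) for t in wait_for)

# T1 waits for T2, T2 waits for T3, T3 waits for T1 -> deadlock
print(has_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))  # True
```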
file : 5,012 records
page size : 1024 bytes
record size : 50 bytes
key field size : 10 bytes
pointer size : 10 bytes
use primary index to search
# of entries in the index = # of pages in the table (a primary index is sparse: one entry per data page)
BF : ⌊1024/50⌋ = 20 records per page
total pages in table : ⌈5012 / 20⌉ = 251 pages
index has 251 entries; entry size = key (10) + pointer (10) = 20 bytes
BF = ⌊1024/20⌋ = 51 entries per page
index pages = ⌈251/51⌉ = 5
searching the index (binary search) = ⌈log2(5)⌉ = 3 pages read
whether the record is in the table or not: 3 index pages + 1 data page = 4 pages read
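A quick Python re-check of the arithmetic above, reusing the card's numbers:

```python
import math

# Re-doing the primary-index arithmetic from the card above.
PAGE_SIZE, RECORD_SIZE = 1024, 50
KEY_SIZE, POINTER_SIZE = 10, 10
NUM_RECORDS = 5012

bf_data = PAGE_SIZE // RECORD_SIZE                # 20 records per data page
data_pages = math.ceil(NUM_RECORDS / bf_data)     # 251 data pages

index_entry = KEY_SIZE + POINTER_SIZE             # 20 bytes per index entry
bf_index = PAGE_SIZE // index_entry               # 51 entries per index page
index_pages = math.ceil(data_pages / bf_index)    # 5 index pages

index_reads = math.ceil(math.log2(index_pages))   # binary search: 3 page reads
total_reads = index_reads + 1                     # + 1 data page read = 4
print(data_pages, index_pages, index_reads, total_reads)  # 251 5 3 4
```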
how many indexes are possible on a table
1 primary index (on the ordering key field) or 1 clustering index (on a non-key ordering field) - at most one of these, since the file can be physically sorted only one way
as many secondary indexes as there are remaining non-ordering attributes
timeout
after x amount of time, kill the transaction
two-phase locking (2PL)
a locking method that ensures transaction schedules are always serializable
1st phase
lock growing/acquire phase: no lock can be released
2nd phase
lock release phase: no new lock can be obtained
read lock/shared lock
allows one or more transactions to read an item but not write it
write lock/exclusive
allows a transaction (only one) to both read and write an item
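A minimal Python sketch of the read/write lock compatibility rules described above (the table and function are illustrative, not from any particular DBMS):

```python
# Sketch of shared/exclusive lock compatibility: shared locks coexist,
# exclusive locks conflict with everything.

COMPATIBLE = {
    ("read", "read"): True,     # many readers may share an item
    ("read", "write"): False,   # a writer excludes readers
    ("write", "read"): False,
    ("write", "write"): False,  # only one writer at a time
}

def can_grant(requested: str, held: list[str]) -> bool:
    """A new lock is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(h, requested)] for h in held)

print(can_grant("read", ["read", "read"]))   # True
print(can_grant("write", ["read"]))          # False
```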
dirty read
a transaction reads data that has been modified by another transaction, but not yet committed.
a table has 120,000 records. key field size = 4 bytes, record pointer size = 6 bytes, node pointer size = 6 bytes, page size = 512 bytes. a b-tree is built on the key field. How many pages will be read if the search uses the key value? Assume a node with k data values has k record pointers and k + 1 node pointers.
k(key size + record pointer size) + (k + 1)(node pointer size) ≤ page size
k(4 + 6) + (k + 1)(6) = 16k + 6 ≤ 512 → k ≤ 31.6, so k = 31 data values per node
fan out = 32 because k + 1 = 31 + 1 node pointers
# of nodes | # of data values | # of node pointers
level 1: 1 | 31 | 32
level 2: 32 | 32 × 31 = 992 | 32 × 32 = 1,024
go down levels until the cumulative # of data values reaches 120,000
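A short Python sketch that repeats the level-by-level count, assuming the 31-value / 32-pointer node derived above; the printed counts follow from the card's own numbers:

```python
# Sketch: how many B-tree levels are needed to hold 120,000 keys,
# using the node capacity derived above (31 data values, fan-out 32).
DATA_VALUES_PER_NODE = 31
FAN_OUT = 32
TOTAL_KEYS = 120_000

levels, nodes, cumulative = 0, 1, 0
while cumulative < TOTAL_KEYS:
    levels += 1
    cumulative += nodes * DATA_VALUES_PER_NODE  # keys held through this level
    nodes *= FAN_OUT                            # nodes on the next level

print(levels)      # 4 levels are enough to hold 120,000 keys
print(levels + 1)  # worst case: 4 node pages + 1 data page = 5 page reads
```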
concurrency control
controlling execution of concurrent transactions
atomicity
a transaction is an atomic unit -> either it is executed completely or not at all
imagine transferring money between accounts: if the system debits your account but crashes before crediting the recipient's account, atomicity ensures the entire transaction is rolled back. It's like a digital version of "the check is either fully written or torn up."
consistency
a transaction (Xn), when started on a consistent state of the DB, must leave the DB in a consistent state; it must preserve all constraints
For example, if a rule says "account balance cannot be negative," a transaction that would create a negative balance either isn't allowed or must include other changes to maintain consistency. The transaction transforms the database from one valid state to another valid state.
consistent state of a DB
the DB satisfies all constraints
durability
after a transaction commits (successfully completes execution), its updates become permanent
isolation
each transaction (Xn) executes without interfering with other transactions' activities.
Each transaction should operate as if it's the only one running, even when many are executing concurrently
Transaction (Xn)
program under execution, that accesses a DB
- a program stored on disk is not a transaction until it runs
- some programs will generate multiple transactions
schedules are equivalent if what?
Same transactions - if one schedule contains operations of T4 but the other doesn't, this breaks equivalence
Same operations - if a transaction has a write in one schedule but not in the other, this breaks equivalence
Same order for every pair of conflicting operations - must have matching R-W, W-R, W-W order
Go through each data item and make sure the order of conflicting reads and writes is the same
operations conflict if what?
they are by different transactions
they access the same data item
at least 1 of them is a write operation
Ex: R1[a] and W3[a] conflict (different transactions, same item 'a', one is a write)
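A tiny Python sketch of the three conflict conditions, using an assumed (transaction, action, item) encoding for operations like R1[a]:

```python
# Sketch: two operations conflict iff they come from different transactions,
# touch the same item, and at least one of them is a write.

def conflicts(op1, op2) -> bool:
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)

print(conflicts(("T1", "R", "a"), ("T3", "W", "a")))  # True  (R1[a] vs W3[a])
print(conflicts(("T1", "R", "a"), ("T3", "R", "a")))  # False (both are reads)
```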
serializable
a schedule is serializable if its precedence (serialization) graph has NO cycle; a cycle means it is not serializable
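A sketch of that test: build the precedence graph from conflicting pairs, then check it for a cycle (here with Kahn's topological sort). The schedule encoding is the same assumed (transaction, action, item) form as above:

```python
# Sketch: conflict-serializability test via a precedence (serialization) graph.

def precedence_edges(schedule):
    """Edge Ti -> Tj whenever an operation of Ti conflicts with a LATER operation of Tj."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            if ti != tj and xi == xj and "W" in (ai, aj):
                edges.add((ti, tj))
    return edges

def is_conflict_serializable(schedule) -> bool:
    edges = precedence_edges(schedule)
    nodes = {t for t, _, _ in schedule}
    indeg = {n: 0 for n in nodes}
    for _, v in edges:
        indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    removed = 0
    while ready:                       # Kahn's algorithm: peel off zero-indegree nodes
        u = ready.pop()
        removed += 1
        for a, b in edges:
            if a == u:
                indeg[b] -= 1
                if indeg[b] == 0:
                    ready.append(b)
    return removed == len(nodes)       # all nodes removed => acyclic => serializable

# The lost-update schedule R1[x] R2[x] W1[x] W2[x] is NOT serializable:
s = [("T1", "R", "x"), ("T2", "R", "x"), ("T1", "W", "x"), ("T2", "W", "x")]
print(is_conflict_serializable(s))  # False
```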
Schedule
the chronological order of operations from multiple transactions
Serial Schedule
Transactions run one after another with no interleaving. While this avoids all concurrency problems, it's inefficient
Serializable Schedule
Not necessarily a serial schedule - the system can interleave operations from different transactions, but the final result must be identical to what would have occurred if they had run sequentially in some order
Lost Update
Scenario: R1[x] R2[x] W1[x] W2[x] C1 C2 Explanation:
Both T1 and T2 read the same initial value of x (say, 10,000)
T1 adds 600 and writes back 10,600
T2 subtracts 300 and writes back 9,700 (overwriting T1's update)
Result: x = 9,700, but should be 10,300 (10,000 + 600 - 300)
T1's update is completely lost because T2 worked with stale data
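Replaying the interleaving above in Python with the card's numbers (a sketch, not a real DBMS):

```python
# Lost update: both transactions read x before either writes it back.
x = 10_000

t1_read = x          # R1[x]
t2_read = x          # R2[x]  (T2 also sees 10,000 -- stale once T1 writes)
x = t1_read + 600    # W1[x]  -> 10,600
x = t2_read - 300    # W2[x]  -> 9,700, overwriting T1's update

print(x)             # 9,700 instead of the correct 10,300
```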
Dirty Read
Scenario: W1[x] R2[x] ... A1 C2 Explanation:
T1 writes a value to x but hasn't committed yet
T2 reads this uncommitted ("dirty") value
T1 aborts, so its changes are rolled back
T2 has read data that never officially existed
T2 might make decisions based on phantom data
Inconsistent Analysis
T1 reads accounts x1, x2, x3 to compute an average
T2 transfers money from x2 to x4 during T1's reading
T1 then reads x4 and x5
T1's average mixes old values (x2) with new values (x4) Explanation:
T1 sees a snapshot that doesn't represent a consistent point in time
The database changed during T1's analysis
Result: T1's calculation is meaningless because it combines inconsistent states
Cascading Abort
Scenario: W1[a] R2[a] W2[b] R3[b] W3[c] A1 Explanation:
T1 writes to 'a' but hasn't committed
T2 reads 'a' from T1 (dirty read)
T2 writes to 'b' based on that read
T3 reads 'b' from T2
If T1 aborts, T2 must abort (it read dirty data)
If T2 aborts, T3 must abort (it read data from T2 that's now invalid)
One abort causes a domino effect of aborts
A.C.I.D.
Atomicity, Consistency, Isolation, and Durability - the four pillars that ensure database transactions are reliable and predictable
Transaction
Xn
Transaction Operations
READ (R1[x]): Transaction 1 reads data item x. WRITE (W1[y]): Transaction 1 writes to data item y. COMMIT (C1): Transaction 1 completes successfully, making all its changes permanent. ABORT (A1): Transaction 1 is terminated unsuccessfully, and all its changes are undone.
Two-Phase Locking (2PL)
Lock growing/acquire phase → NO lock can be released. Lock release phase → no new lock can be obtained. Elaboration: Two-Phase Locking is a concurrency control method that guarantees serializability. Transactions have two distinct phases:
Growing Phase: The transaction acquires all needed locks but cannot release any locks.
Shrinking Phase: The transaction releases locks but cannot acquire any new locks.
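A minimal Python sketch of the two-phase rule itself (not a full lock manager; the class and method names are illustrative):

```python
# Sketch: once a transaction releases any lock, it may not acquire new ones.

class TwoPhaseTransaction:
    def __init__(self, name: str):
        self.name = name
        self.locks: set[str] = set()
        self.shrinking = False          # flips to True at the first release

    def acquire(self, item: str) -> None:
        if self.shrinking:
            raise RuntimeError(f"{self.name}: growing phase is over, cannot lock {item}")
        self.locks.add(item)            # growing phase: locks only accumulate

    def release(self, item: str) -> None:
        self.shrinking = True           # entering the shrinking phase
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.acquire("x")
t.acquire("y")
t.release("x")
# t.acquire("z")  # would raise: violates two-phase locking
```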