We are mainly concerned with transactions for _____ parallel executions
task
What does ACID stand for?
atomicity, consistency, isolation, durability
what are the two types of parallel execution we talked about?
task parallel: every thread executing their own query independently. Needs concurrency control and isolation
data parallel: multiple threads working on the same query at the same time
what is isolation? what kinds of schedules maintain isolation?
each transaction executing as if it is the only transaction running on the DBMS
serial execution, serializable schedule
define “schedule”
any interleaved sequence of operations for 2+ transactions
define “serializable schedule”
a schedule whose outcome is equivalent to the outcome of one of the many possible serial executions
maintains serializability
what does “conflict serializable” mean?
a schedule is conflict-serializable if we can obtain a serial execution by swapping adjacent non-conflicting operations without changing the results
create a precedence graph to detect whether a schedule is conflict-serializable or not
conflict-serializable is an easier-to-detect subset of serializable
what is the relation between serializable, conflict-serializable, and serial?
[serializable [conflict-serializable [serial]]]
![serializable ⊇ conflict-serializable ⊇ serial](https://assets.knowt.com/user-attachments/0b8ba192-92a2-47b5-98d1-5cabec45913f.jpg)
how do you draw a precedence graph? what is a precedence graph used for?
each transaction is a node
if there is a conflict (any write+write or read+write from 2 threads on the same data) between t1 and t2 and t1’s operation comes before t2’s, add an edge from t1 to t2
if there is a cycle, the schedule is not conflict-serializable
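The steps above can be sketched in code. This is a minimal sketch, assuming a schedule is a list of (transaction_id, operation, data_item) tuples in execution order; the helper names are mine, not from the course.

```python
# Build a precedence graph from a schedule and test conflict-serializability
# via cycle detection. Assumed schedule format: (tid, 'r'|'w', item) tuples.

def conflict_serializable(schedule):
    edges = set()
    nodes = set()
    for i, (t1, op1, x1) in enumerate(schedule):
        nodes.add(t1)
        for t2, op2, x2 in schedule[i + 1:]:
            # conflict: same item, different transactions, at least one write
            if x1 == x2 and t1 != t2 and 'w' in (op1, op2):
                edges.add((t1, t2))  # t1's operation comes before t2's
    # DFS-based cycle detection on the precedence graph
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}
    def has_cycle(n):
        color[n] = GRAY
        for a, b in edges:
            if a == n:
                if color[b] == GRAY or (color[b] == WHITE and has_cycle(b)):
                    return True
        color[n] = BLACK
        return False
    return not any(color[n] == WHITE and has_cycle(n) for n in nodes)

# classic non-serializable interleaving: T1 and T2 both read-then-write x
bad = [(1, 'r', 'x'), (2, 'r', 'x'), (1, 'w', 'x'), (2, 'w', 'x')]
good = [(1, 'r', 'x'), (1, 'w', 'x'), (2, 'r', 'x'), (2, 'w', 'x')]
```

In `bad`, the two read/write conflicts produce edges 1→2 and 2→1, so the graph has a cycle and the schedule is not conflict-serializable.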
how do you know if a schedule is recoverable or not?
non-recoverable:
may or may not be conflict-serializable
when a failure is introduced, e.g. a transaction commits a value it read from another transaction's write, and that other transaction later aborts — so the committed read was of data that never validly existed
recoverable:
if t2 depends on the results of t1, do not commit t2 until after t1 has committed/aborted
may or may not have cascading abort problem
what is the cascading abort problem? how do you prevent it?
cascading abort problem
when 1 abort aborts many dependent transactions, since a transaction has read an intermediate value of the aborted transaction
cascadeless schedule
rule: if t2 reads the write of t1, only begin t2’s read after t1 commits/aborts
what is a lock manager?
hash table of < RID of item we want to read/write, list of transactions who’ve acquired the lock/are waiting >
enforces s/x-lock compatibility matrix
useful for deadlock detection
What is the purpose of 2PL? What are the requirements for 2PL?
purpose: to produce conflict-serializable schedules
lock before operation using s-lock for reads, x-lock for writes
2 phases: grow phase (only request or upgrade), followed by shrink phase (only release or downgrade)
What is the purpose of strong strict 2PL? What are the requirements?
purpose: to produce conflict-serializable, recoverable schedules that avoid cascading aborts by ensuring every transaction only reads committed data
requirement: only release locks after we have aborted/committed
What are the 2 categories of ways to deal with deadlock?
use wait-for graph to detect deadlock, then abort one of them
transactions = nodes; use lock manager to figure out wait-for edges between transactions
cycle → deadlock
abort a transaction (typically newer) to break the cycle
avoid deadlocks by assigning a timestamp to each transaction and allowing 1 directional wait
“wait-die”: old waits for young. If I’m older, wait. If I’m younger, abort myself.
“wound-wait”: young waits for old. If I’m younger, wait. If I’m older, abort the lock holder.
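The two policies above can be sketched as decision functions. A minimal sketch, assuming a lower timestamp means an older transaction; the function names and return strings are illustrative, not from the course.

```python
# Timestamp-based deadlock prevention: what does the lock *requester* do
# when the lock it wants is held by another transaction?

def wait_die(requester_ts, holder_ts):
    # "wait-die": old waits for young. Older requester waits;
    # younger requester aborts (dies).
    return "wait" if requester_ts < holder_ts else "abort self"

def wound_wait(requester_ts, holder_ts):
    # "wound-wait": young waits for old. Older requester aborts (wounds)
    # the holder; younger requester waits.
    return "abort holder" if requester_ts < holder_ts else "wait"
```

In both policies waiting is only ever allowed in one direction of age, so a cycle of waits (deadlock) can never form.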
T/F: Using 2PL, schedules may or may not be recoverable.
true. No guarantee that schedules are recoverable. Need strong strict 2PL.
T/F: for wait-die or wound-wait deadlock prevention, after we have aborted a young transaction and it retries, it should be assigned a new timestamp.
False; it should maintain the same timestamp so it’s guaranteed to make progress eventually
How might atomicity be violated in a system without logging?
System failure occurs after a dirty page of an uncommitted transaction is pushed to disk. There is no way to undo the partial change, since only the on-disk value survives the crash.
How might durability be violated in a system without logging?
System failure occurs before the dirty pages of a committed transaction are pushed to disk, so the committed writes are lost.
Logging can handle what types of failures?
transaction failure (e.g. abort)
UNDO (rollback)
system crash
UNDO uncommitted transactions
REDO committed transactions
For the purposes of logging, what are the two policies we assume the BP uses?
no force policy: writes of an uncommitted transaction can be kept in memory
steal policy: writes of an uncommitted transaction can be pushed to disk
What is a CLR?
a log record of the form <ti, data identifier, before value, undoNextLSN=prevLSN field of corresponding log>
means “this is an undo operation I want to do, as a result of an abort or during the undo phase of recovery”
applied (redone) during the redo phase of recovery; never itself undone during the undo phase
How is a blocking checkpoint performed?
pause execution of all transactions
flush all log records to disk
flush all dirty pages to disk
append the log record <checkpoint, L>, where L is the list of transactions that were active (and are now paused) when the checkpoint started
resume
Why is a blocking checkpoint better than no checkpoint at all? What does a blocking checkpoint signify?
all log records before the checkpoint have been (persistently) applied to the pages
start recovery process at this checkpoint, rather than the beginning of the log
What is the algorithm for transaction rollback in case of abort (non-ARIES, just normal)? Assume the transaction we want to roll back is t2.
Scan the log backward. For each log record in WAL:
if <t2, x, v1, v2>
// perform undo and append corresponding CLR
update x = v1
append CLR <t2, x, v1>
if <t2, start>
// backward scan is stopped; all changes made by t2 have been reverted
append <t2, abort>
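The rollback steps above can be sketched as runnable code. A minimal sketch, assuming the log is a Python list of tuples — ('start', tid), ('update', tid, x, before, after), ('clr', tid, x, restored), ('abort', tid) — and a dict stands in for the data pages; this format is my assumption, not the course's.

```python
# Plain (non-ARIES) transaction rollback: scan the log backward, undo each
# of the transaction's updates, append a CLR per undo, stop at <start>.

def rollback(log, db, tid):
    for rec in reversed(log[:]):        # iterate over a copy; append to log
        if rec[0] == 'update' and rec[1] == tid:
            _, _, x, before, _after = rec
            db[x] = before              # undo: restore the before-value
            log.append(('clr', tid, x, before))
        elif rec == ('start', tid):
            break                       # all of tid's changes reverted
    log.append(('abort', tid))

# demo: t2 wrote x = 5 (before-value 0), then aborts
db = {'x': 5}
log = [('start', 2), ('update', 2, 'x', 0, 5)]
rollback(log, db, 2)
```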
For recovery, what is the goal of the redo phase? What is the algorithm (non-ARIES)?
replay the log history from checkpoint and redo updates of all transactions (even if the transaction ends up rolled back or uncommitted)
figure out which transactions are uncommitted (active), for the purpose of the undo phase
init L = { uncommitted transactions at this point, from the checkpoint log record }
Scan the log forward from checkpoint. For each log record:
if <ti, x, v1, v2> (normal update)
update x = v2
if <ti, x, v1> (CLR)
update x = v1
if <ti start>
add ti to L
if <ti commit> or <ti abort>
// this transaction is finished. Either committed or all effects have been reverted.
remove ti from L
// L is now the set of uncommitted transactions at the point of the crash, i.e. transactions that were either still in the middle of operation or in the middle of an incomplete abort
What is the algorithm for the undo phase of recovery (non-ARIES)?
Scan the log backward. For each log record:
if <ti, x, v1, v2> and ti is in L
// undo
update x = v1
append CLR <ti, x, v1>
if <ti start> and ti is in L
// we’ve finished the undo for ti (we’re scanning backwards)
remove ti from L
append <ti abort>
if L is empty
// there are no transactions to be undone
stop
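The non-ARIES redo and undo phases can be sketched together in one recovery routine. A minimal sketch, assuming a log of tuples — ('checkpoint', active_tids), ('start', tid), ('update', tid, x, before, after), ('clr', tid, x, restored), ('commit', tid), ('abort', tid) — with a dict standing in for the data pages; this format is my assumption.

```python
# Non-ARIES crash recovery: redo everything forward from the checkpoint
# (even transactions that later abort), then undo the transactions in L.

def recover(log, db):
    cp = max(i for i, r in enumerate(log) if r[0] == 'checkpoint')
    L = set(log[cp][1])                    # active transactions at checkpoint
    # redo phase: forward scan, replay all updates and CLRs
    for rec in log[cp:]:
        if rec[0] == 'update':
            db[rec[2]] = rec[4]            # redo to the after-value
        elif rec[0] == 'clr':
            db[rec[2]] = rec[3]            # redo the undo
        elif rec[0] == 'start':
            L.add(rec[1])
        elif rec[0] in ('commit', 'abort'):
            L.discard(rec[1])
    # undo phase: backward scan, revert updates of transactions still in L
    for rec in reversed(log[:]):
        if not L:
            break                          # no transactions left to undo
        if rec[0] == 'update' and rec[1] in L:
            db[rec[2]] = rec[3]            # restore the before-value
            log.append(('clr', rec[1], rec[2], rec[3]))
        elif rec[0] == 'start' and rec[1] in L:
            L.remove(rec[1])
            log.append(('abort', rec[1]))

# demo: T1 committed before the crash, T2 did not
db = {}
log = [('checkpoint', []), ('start', 1), ('update', 1, 'x', 0, 5),
       ('start', 2), ('update', 2, 'y', 1, 9), ('commit', 1)]
recover(log, db)
```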
For ARIES, what are all the new data structures introduced?
LSN
PageLSN
PrevLSN
UndoNextLSN
ATT
DPT
checkpoint <DPT, ATT>
For ARIES, what is LSN?
log sequence number
each log record is given a unique, monotonically increasing LSN
For ARIES, what is PageLSN? What is the purpose?
purpose: skip applying a redo record if it’s already on disk
embedded in the header of each page in disk
whenever we apply an update to a page, also assign the page’s PageLSN = LSN that represents this update
all log records up to PageLSN have been applied persistently to this page already
during redo phase of recovery, any log records <= PageLSN will be skipped
For ARIES, what is PrevLSN? What is the purpose?
purpose: function as a linked-list pointer during undo for the relevant transaction
each log record (except CLR) contains PrevLSN field that points to the previous log record of the same transaction
when performing an undo, we can follow the pointers like a linked list instead of scanning all records
For ARIES, what is UndoNextLSN? What is the purpose?
purpose: ensure that if a system crashes during a rollback/recovery (undo phase), it skips over previously processed undo operations
in CLRs only
when appending a CLR, UndoNextLSN := the prevLSN field of the log the CLR corresponds to
after encountering a CLR, we can skip to the prevLSN of the CLR’s corresponding log, because we know the corresponding log (this CLR) has already been applied earlier in the redo phase
For ARIES, what is ATT? What is the purpose?
active transaction table
table of < tid of active transaction, LastLSN of that transaction >
LastLSN = the most recent LSN involving this transaction; the entry point into the backward linked list of this transaction's log records
purpose: for undo, where we start from LastLSN and follow prevLSN/undoNextLSN to find all the log records we need to undo
entry is deleted when a transaction is aborted or committed
For ARIES, what is DPT? What is the purpose?
records the starting point when a page becomes dirty for the first time
hash table of < pageID of dirty page, RecentLSN >
RecentLSN = first log record that made this page dirty
entry is deleted when page is written to disk
What is a fuzzy checkpoint?
<checkpoint, ATT, DPT>
used for ARIES
What is the transaction rollback (for aborts) algorithm for ARIES? Assume the transaction we want to rollback is t2.
Look up ATT.lastLSN for t2
Undo from lastLSN {
use prevLSN pointers to skip the log records that are irrelevant to t2
after each undo, append a CLR: <t2, x, v1, undoNextLSN=(current log’s prevLSN)>
follow prevLSN back until reaching <t2 start>
}
Append <t2 abort>
What is the recovery (for system crashes) algorithm for ARIES?
Phase1: Analysis
Goal: read DPT and ATT from checkpoint. Go through all following log records and update DPT and ATT.
if log record for ti not in ATT
add ti to ATT
if <ti commit or abort>
remove ti from ATT
if a page is dirtied for the first time
add to DPT
(we never remove any page from DPT)
keep updating ATT.lastLSN for ti
RedoLSN = MIN of (recentLSN in DPT)
the LSN where we will start the redo phase
or RedoLSN = LSN of the checkpoint record if DPT is empty
Phase2: Redo
Goal: replay every action that is not already in the page on disk.
Scan the log starting at RedoLSN:
For each log record Li (Li is an LSN)=<Ti, …> that updates page Pi
if Pi is not in DPT || Li < DPT.recentLSN(Pi)
skip Li
// bc it’s not dirtied, or the log record is earlier than the first time the page was dirtied
else
read Pi from disk
if Pi.PageLSN < Li
// the last time Pi was written to disk was a time before Li
redo the log record
update Pi.PageLSN
// Note: skipping any log record means the effects of the log record have already appeared on the page
Phase3: Undo
for each <ti, lastLSN> in ATT, invoke the rollback procedure:
Look up ATT.lastLSN for ti
Undo from lastLSN {
use prevLSN pointer to skip the log records that are irrelevant to ti
after each undo, append a CLR: <ti, x, v1, undoNextLSN=(current log’s prevLSN)>
follow prevLSN back until reaching <ti start>
}
Append <ti abort>
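The rollback procedure, and in particular how prevLSN and undoNextLSN are followed, can be sketched in code. A minimal sketch, assuming log records are dicts keyed by LSN; the record fields are illustrative, not ARIES's exact layout.

```python
# ARIES-style rollback: start from ATT.lastLSN and walk the transaction's
# backward chain. On an update, undo it and append a CLR whose undoNextLSN
# is the update's prevLSN. On a CLR (left by a crashed earlier rollback),
# jump straight to its undoNextLSN so nothing is undone twice.

def aries_rollback(log, db, last_lsn):
    lsn = last_lsn
    while lsn is not None:
        rec = log[lsn]
        if rec['type'] == 'clr':
            lsn = rec['undoNextLSN']       # skip already-undone work
        elif rec['type'] == 'update':
            db[rec['x']] = rec['before']   # undo the write
            new_lsn = max(log) + 1         # next monotonically increasing LSN
            log[new_lsn] = {'type': 'clr', 'tid': rec['tid'],
                            'x': rec['x'], 'value': rec['before'],
                            'undoNextLSN': rec['prevLSN']}
            lsn = rec['prevLSN']
        else:                              # 'start': rollback complete
            break

# demo: t2's only update (LSN 2) set x from 0 to 5
log = {1: {'type': 'start', 'tid': 2, 'prevLSN': None},
       2: {'type': 'update', 'tid': 2, 'x': 'x', 'before': 0,
           'after': 5, 'prevLSN': 1}}
db = {'x': 5}
aries_rollback(log, db, 2)
```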
What are 3 data partitioning methods for distributed databases?
range-based
partition vector
serves point and range queries efficiently
hash-based
hash function
serves point queries efficiently; must send range queries to all nodes
round robin
randomly distributed
must send point and range queries to all nodes
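The three routing behaviors above can be contrasted in a small sketch. The node count, partition vector, and use of Python's built-in `hash` are all illustrative assumptions.

```python
# Which nodes must a query touch under each partitioning scheme?

N_NODES = 4
PARTITION_VECTOR = [25, 50, 75]       # range partitioning: key < 25 -> node 0, etc.

def range_node_for_point(key):
    for node, bound in enumerate(PARTITION_VECTOR):
        if key < bound:
            return node
    return N_NODES - 1                # keys >= last bound go to the last node

def range_nodes_for_range(lo, hi):
    # only the contiguous run of partitions covering [lo, hi]
    return list(range(range_node_for_point(lo), range_node_for_point(hi) + 1))

def hash_nodes_for_point(key):
    return [hash(key) % N_NODES]      # exactly one node

def hash_nodes_for_range(lo, hi):
    return list(range(N_NODES))       # range query must fan out to all nodes

def round_robin_nodes_for_any_query():
    return list(range(N_NODES))       # point and range both fan out
```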
How is a global primary index built?
Partitioning function (hash or range) assigns different nodes to different partition keys
Each node builds a local index of tuples based on the partition key
How is a global secondary index built?
Partitioning function (hash or range) assigns different nodes to different values of a column A
Each node builds a table of
A value → [list of global primary index keys with this A value, out of all tuples in the entire database]
global secondary index is consulted to answer queries like get(dept=cs): use global secondary index to find all the IDs of tuples with dept=cs, then use global primary index to find the tuples with those IDs
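The two-step lookup above can be sketched with toy in-memory tables; the example departments, IDs, and tuples are made up, and the partitioning of each index across nodes is elided.

```python
# get(dept='cs') via a global secondary index: secondary index maps the
# column value to primary keys; the global primary index fetches tuples.

secondary_index = {'cs': [101, 203], 'math': [150]}     # dept -> primary keys
primary_index = {101: ('alice', 'cs'),                  # primary key -> tuple
                 150: ('bob', 'math'),
                 203: ('carol', 'cs')}

def get_by_dept(dept):
    # step 1: secondary index gives the primary keys of matching tuples
    ids = secondary_index.get(dept, [])
    # step 2: primary index resolves each key to the actual tuple
    return [primary_index[i] for i in ids]
```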
what are the two types of skewness in distributed databases?
data skewness: some node has more data
execution skewness: data is evenly distributed, but access hotness is different
What are the 2 ways to reconfigure distributed partitioning?
Dynamic repartitioning (for range partitioning)
keep a partition table that maps ranges of a continuous keyspace to different “tablets” assigned to different nodes. The idea is to split into more tablets than there are nodes, and reassign tablets to different nodes; a tablet/partition range can also be split further.
Consistent hashing (for hash partitioning)
organize large continuous keyspace into a ring. Assign keys to the next node in the clockwise direction on the ring.
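The ring described above can be sketched briefly. A minimal sketch, assuming MD5 positions nodes and keys on a ring of size 2^32; real systems typically also place multiple virtual nodes per physical node, which is omitted here.

```python
import bisect
import hashlib

# Consistent hashing: nodes and keys live on the same ring; a key is
# assigned to the first node clockwise from its position.

RING_SIZE = 2 ** 32

def ring_pos(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % RING_SIZE

class ConsistentHashRing:
    def __init__(self, nodes):
        self.ring = sorted((ring_pos(n), n) for n in nodes)

    def node_for(self, key):
        positions = [p for p, _ in self.ring]
        # first node clockwise; wrap around past the end of the ring
        i = bisect.bisect_right(positions, ring_pos(key)) % len(self.ring)
        return self.ring[i][1]

ring = ConsistentHashRing(['node-a', 'node-b', 'node-c'])
```

The benefit over `hash(key) % n`: adding or removing one node only moves the keys in that node's arc, rather than reshuffling nearly everything.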
In distributed transactions, every node has 2 components. What are these and what are their roles?
TC (transaction coordinator)
where transactions come in
looks at partitioning table to determine which node’s TM to send each operation to; sends operations to correct TM to execute, collects results, returns to end user
TM (transaction manager)
executes transaction operations
How is 2PL and deadlock dealt with in distributed transactions?
2PL: each node has its own lock manager that manages locks for tuples stored locally on that node. When a transaction is done, a remote request to unlock the tuple will be sent to the lock manager on that node.
Deadlock: assign a timestamp to each transaction with node name as a tiebreaker. Do old waits for young or young waits for old.
What is the 2PC algorithm in distributed transactions?
Phase1:
TC logs <prepare T>
TC sends “prepare T” to all nodes in N = {nodes that have received a write request and have log records for this transaction}
see if T can be committed, according to each Ni’s local log records
for each Ni in N
receive the “prepare T” request
locally decide if T can commit
if T can commit,
log <ready T>
return “ready”
else
log <no T>
reply “abort”
Phase2:
TC collects all “ready T” messages
if TC receives all “ready T”
TC logs <T commit>
send “commit T” to all nodes
else
TC logs <T abort>
sends “abort T” to all nodes
for each node Ni in N
if TC said “commit T”
commit
else
abort
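The coordinator's phase-2 decision rule above can be sketched in a few lines; the vote strings and dict format are illustrative assumptions.

```python
# 2PC phase 2: commit only if *every* participant voted "ready" in
# phase 1; a single "abort" vote (or missing vote) aborts globally.

def coordinator_decide(votes):
    # votes: {node_id: 'ready' | 'abort'} collected during phase 1
    if votes and all(v == 'ready' for v in votes.values()):
        return 'commit T'     # TC logs <T commit> before sending
    return 'abort T'          # TC logs <T abort> before sending
```

Note the logging order: the TC writes its decision record before sending it, which is why <T commit> in the TC's log is the single ground truth during recovery.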
T/F: For 2PC in distributed transactions, in case of system crash, if TC’s log has <T commit>, the transaction will be committed no matter what. If it does not have <T commit> log record, the transaction is aborted and we go into undo phase. <T commit> is the single ground truth that tells us whether we should abort or not.
T
What is the protocol if a node crashes in 2PC in distributed transactions?
Let’s say TC detects that N1 failed. TC does this:
if TC receives <ready T> from N1
continue the commit process
else
assume N1 will send “abort T” in phase1 of 2PC
abort T
After recovering, N1 does this:
look at my own local log
local <commit T>
commit
local <abort T>
abort and undo transaction
only local <ready T>
ask if TC has commit or abort log. If commit, then commit. If abort, then abort.
else if TC is unreachable, ask if N2 has commit or abort log. If commit, then commit. If abort, then abort.
else if N2 has only “ready”, wait for TC to give us the final answer, commit or abort.
no <ready T>
abort, since TC will abort too
What is the protocol if TC crashes in 2PC in distributed transactions?
for each Ni, look at my own log record
if <commit T>
commit
elif <abort T>
abort
elif nothing, no <ready T>
abort
elif only <ready T>
ask all other nodes
if all <ready T>
wait for TC to commit or not
else
abort
After recovering, TC does this:
if <commit T>
send “commit T” to all nodes
else
send “abort T” to all nodes
What is the difference between OLTP and OLAP?
OLTP: online transaction processing
short-lived transactions
only touches a couple of records
involves many writes
perform db operations on the current snapshot of the data
OLAP: online analytical processing
touch many rows
read-heavy
focused on historical data
What is column store?
store each column as separate files
can scan specific columns of all tuples
can find matching tuples for each col value using either
fixed offset math, if attributes are fixed length, or
embed the tuple ID into each “column” value
this method also allows row reordering, which is helpful for column compression
What is PAX hybrid row/column store?
divide rows into row groups
within each row group, store as column store
has benefit of col store: can scan by column, but are not able to skip columns as aggressively as pure col store
has benefit of row store: cols belonging to the same row are close to each other, so you only have to read 2 pages into memory if you’re touching 2 tuples
What are the 5 column compression techniques? Explain each one.
Run-length encoding (+ sorting and embedded tuple ID)
represent repeated values as a triple (value being repeated, start offset, run length), e.g. (y, 0, 3)
Bit packing
store the common in-range values with a smaller datatype; use a special escape value to mark outliers, which are stored at full width in a separate place
Bit map
for a column with n distinct values, keep one bitmap per value, with one bit per row. Useful for union (“or”) queries.
Delta encoding (+ RLE)
store the starting point and the delta
can store delta entries with RLE
Dictionary
deal with columns whose values are specific variable-length strings
store the dictionary (essentially an enum) elsewhere; map strings to small integer codes, e.g. int8
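Three of the schemes above can be sketched directly; these are minimal illustrations, and the exact encodings a real column store uses will differ.

```python
# Column-compression sketches: run-length, delta, and dictionary encoding.

def rle_encode(col):
    # run-length: (value, start offset, run length) per run
    runs, i = [], 0
    while i < len(col):
        j = i
        while j < len(col) and col[j] == col[i]:
            j += 1
        runs.append((col[i], i, j - i))
        i = j
    return runs

def delta_encode(col):
    # store the starting value plus successive deltas
    return col[0], [b - a for a, b in zip(col, col[1:])]

def dict_encode(col):
    # map each distinct string to a small integer code; the mapping
    # ("dictionary") is stored separately from the encoded column
    mapping = {v: i for i, v in enumerate(sorted(set(col)))}
    return mapping, [mapping[v] for v in col]
```

RLE is why sorting (plus embedded tuple IDs to allow reordering) matters: sorted columns have long runs, so the (value, start, length) triples are few.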