Database Systems Flashcards

0.0(0)

Studied by 0 people

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Card Sorting

1/41

Earn XP

Description and Tags

Flashcards based on lecture notes covering database hardware, RAID, data stores, indexing, query optimization, transactions, and normalization.

Study Analytics

Name	Mastery	Learn	Test	Matching	Spaced

No study sessions yet.

42 Terms

New cards

Data independence

Changes can be made at one level of abstraction without impacting the others.

New cards

Mirroring

Duplicate disks used to protect against failure, requiring 2 writes to update.

New cards

Striping

Storing data across multiple disks for parallel access to speed up reads.

New cards

RAID

Reduces risk of disk failure; RAID-5 is optimal for high reads, low writes, RAID-1 is best for high writes.

New cards

Row store

Storing all data entries in one row in a block, ideal for easy updates and insertions.

New cards

Column store

Storing one column in each block, ideal for obtaining information about one attribute.

New cards

Buffer pool

A RAM cache for recently accessed data.

New cards

Log

A structure storing historical data (old/new values) used when a system fails; stores before-values for rollbacks.

New cards

Ordered index

Data sorted based on some value.

New cards

Clustered index

Sequentially ordered based on the search key (indexed value).

New cards

Unclustered index

Randomly ordered with no relation to search key. Requires seeks between each key.

New cards

B+ Tree

Leaf nodes ordered by search key (index) and chained, improving efficiency of range queries; access requires few I/Os.

New cards

Natural Join (R⋈S on R.A = S.A) - Case: one table (S) fits in RAM

Hash table with all records in S; for each record r in R, lookup r.A in table. Cost: |R| + |S|

New cards

Natural Join (R⋈S on R.A = S.A) - Case: neither table fits in RAM

Partition R and S into smaller pieces by hashing on A, matching records must be in same partition. Cost: 3|R| + 3|S|

New cards

Sort-merge join

Sort R and S on A, scan for A and merge upon match.

New cards

Block nested loops join (R⋈θS)

Compares all records, good for non-equijoins; divide outer table R into blocks, compare all records in S to each block. Cost: B * |S| + |R|

New cards

Index nested loops join (R⋈S where S has an index on A)

For all records in the outer table R, use the index on A in S to lookup r.A; ideal when indexed table is bigger than outer table

New cards

Left-deep trees

All joins occur on the left table; benefits: write output to disk, pipelining.

New cards

Query Optimization Process

Process of looking for opportunities to push selections, considering join orders, exploring left-deep plans, using dynamic programming, and comparing plans by estimating I/O cost.

New cards

Materialized views

Improves query performance for those that use view; cons: requires additional memory, must be initialized, needs maintenance.

New cards

Incremental view maintenance

Do work that’s proportional to the size of the UPDATE, not the entire table; only update what needs to be updated.

New cards

Transaction Properties

Atomicity, Consistency, Isolation, Durability.

New cards

Serial schedule

Transactions occur sequentially.

New cards

Serializable schedule

A schedule that is equivalent to when transactions occur sequentially, but may be interleaved.

New cards

Conflict equivalent schedule

When the order of conflicting operations in two schedules is the same.

New cards

Conflict serializable schedule

A conflict equivalent schedule that can be converted into a serial schedule.

New cards

Log (Transactions)

A persistent structure that stores all before/after values from transactions; important for database recovery.

New cards

Dependency graphs

Draw edge between nodes (transactions) to represent each conflict; all acyclic-directed graphs have a conflict-equivalent serializable schedule.

New cards

Locking

Shared lock requested for reads, exclusive lock requested for writes

New cards

Deadlock

Occurs when transactions are unable to be completed since they are infinitely waiting for each other’s resources; can be identified by a cycle.

New cards

2PL (Two-Phase Locking)

Characterized by gaining locks in a growing phase and releasing locks in a shrinking phase; avoids cascading aborts.

New cards

Timestamp-based Concurrency Control

Each action has a representative timestamp; goal: execute conflicting actions in timestamp order

New cards

Optimistic Concurrency Control

Assumes transactions will be able to finish execution and be validated at the end.

New cards

Recovery (ARIES) Phases

Analysis, Redo, Undo.

New cards

Update Anomaly

Causes inconsistencies across the database by updating only one instance of an attribute rather than all times it appears.

New cards

Insertion Anomaly

Missing/incorrect primary key or other identifier.

New cards

Functional dependency

A relationship between attributes that shows when one determines another.

New cards

Superkey

Attribute(s) that functionally determine all other columns.

New cards

Candidate key

Attribute(s) that can be used to identify other columns through functional dependencies.

New cards

Lossy/Lossless Decomposition

When information is lost after re-joining a decomposition; true if A ∩ B → A or A ∩ B → B.

New cards

BCNF (Boyce-Codd Normal Form)

A schema is in BCNF if for any A → B in F+ and for every table R: A → B is trivial or A is a superkey of R.

New cards

3NF (Third Normal Form)

A schema is in 3NF if for any A → B in F+ and every table R: A → B is trivial, A is a superkey of R, or each attribute of B - A is a part of some key for R.