Flashcards covering the principles, architectures, fragmentation strategies, query processing stages, replication models, and CAP theorem trade-offs of distributed databases.
Distributed Database
A collection of multiple logically interrelated databases distributed across multiple physical locations.
Distributed Database Management System (DDBMS)
A software system that manages a distributed database while making the distribution transparent to the user.
Local Autonomy
The principle that each site in a distributed system should have independent control of its own security, locking, logging, integrity, and recovery.
No reliance on a central site
The requirement that a distributed database system should not depend on a single central site, which could become a bottleneck or a single point of failure.
Continuous operation
The goal that a distributed system should never require downtime and should provide rapid online backup and recovery facilities.
Location independence
A property where applications behave as if all data were stored locally, allowing data to be migrated between sites without modification to the application.
Fragmentation independence
A property where applications are unaware that relations are divided into fragments and stored at different physical sites.
Replication independence
A property where applications are unaware that multiple copies of data are being maintained and synchronized automatically.
Distributed Query Processing
The process of decomposing a query into subqueries that execute at the sites holding the relevant data, with the partial results combined into a single answer.
Distributed Transaction Management
The management of atomic transactions across a distributed system, handling concurrency, deadlocks, and recovery to maintain integrity.
Hardware Independence
The ability of a DDBMS to operate across a wide variety of hardware platforms and architectures.
Operating System Independence
The requirement that a distributed database system be able to run on a variety of operating systems.
Network Independence
The design of a distributed database to run regardless of the communication protocols or network topology used.
DBMS Independence
The ability of a distributed database to support interoperability between heterogeneous DBMSs running at different nodes.
Horizontal Fragmentation
The division of a relation row-wise, where each fragment contains a subset of the tuples based on a defined predicate.
Primary Horizontal Fragmentation
Horizontal fragmentation performed using a predicate defined on the relation being partitioned, such as $\mathit{Employee\_London} = \sigma_{\mathit{Location} = \text{'London'}}(\mathit{Employee})$.
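As a rough illustration of primary horizontal fragmentation, the sketch below models a relation as a list of Python dicts and splits it with a selection predicate on its own Location attribute. The sample rows, the `select` helper, and the fragment names are assumptions made up for this example, not part of any particular DDBMS.

```python
# Hypothetical Employee relation, modelled as a list of dicts.
employee = [
    {"emp_id": 1, "name": "Asha", "location": "London"},
    {"emp_id": 2, "name": "Ben", "location": "Paris"},
    {"emp_id": 3, "name": "Chen", "location": "London"},
]

def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

# Each fragment is defined by a predicate on Employee itself.
employee_london = select(employee, lambda r: r["location"] == "London")
employee_other = select(employee, lambda r: r["location"] != "London")

# The fragments are disjoint and complete, so their union rebuilds the relation.
reconstructed = sorted(employee_london + employee_other, key=lambda r: r["emp_id"])
```

Because the two predicates are complementary, every tuple lands in exactly one fragment, which is what makes reconstruction by union possible.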
Derived Horizontal Fragmentation
Horizontal fragmentation performed using a predicate defined on another relation, often involving a join operation.
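A minimal sketch of derived horizontal fragmentation, assuming a hypothetical Department relation that is fragmented by city and an Employee relation that follows it through the shared `dept_id` attribute. The relations, the `semijoin` helper, and all names are illustrative assumptions.

```python
# Hypothetical owner relation (Department) and member relation (Employee).
department = [
    {"dept_id": 10, "city": "London"},
    {"dept_id": 20, "city": "Paris"},
]
employee = [
    {"emp_id": 1, "dept_id": 10},
    {"emp_id": 2, "dept_id": 20},
    {"emp_id": 3, "dept_id": 10},
]

def semijoin(left, right, key):
    """Keep the tuples of `left` that have a matching `key` value in `right`."""
    keys = {row[key] for row in right}
    return [row for row in left if row[key] in keys]

# Department is fragmented by a predicate on its own attribute (primary).
dept_london = [d for d in department if d["city"] == "London"]
dept_paris = [d for d in department if d["city"] == "Paris"]

# Employee has no city attribute; its fragments are derived from Department's.
employee_london = semijoin(employee, dept_london, "dept_id")
employee_paris = semijoin(employee, dept_paris, "dept_id")
```

Placing each Employee fragment at the same site as its Department fragment would let the join between the two relations run locally.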
Vertical Fragmentation
The division of a relation column-wise, where each fragment contains a subset of attributes and includes the primary key for reconstruction.
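To show why each vertical fragment carries the primary key, the sketch below projects a hypothetical Employee relation into two fragments and rebuilds it with a join on `emp_id`. The relation, the attribute split, and the helper names are assumptions chosen for illustration.

```python
# Hypothetical Employee relation with primary key emp_id.
employee = [
    {"emp_id": 1, "name": "Asha", "salary": 50000},
    {"emp_id": 2, "name": "Ben", "salary": 42000},
]

def project(relation, attributes):
    """Relational projection: keep only the listed attributes of each tuple."""
    return [{a: row[a] for a in attributes} for row in relation]

# Both fragments repeat the primary key so tuples can be matched up again.
emp_public = project(employee, ["emp_id", "name"])
emp_payroll = project(employee, ["emp_id", "salary"])

def join_on_key(left, right, key):
    """Join two fragments on the shared key attribute."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]}
            for row in left if row[key] in right_by_key]

reconstructed = join_on_key(emp_public, emp_payroll, "emp_id")
```

Without `emp_id` in both fragments there would be no reliable way to tell which name belongs to which salary.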
Hybrid (Mixed) Fragmentation
A combination of horizontal and vertical fragmentation techniques.
Query Mapping
The first stage of query processing that converts a high-level query into an algebraic query on global relations.
Localization
The stage of query processing that transforms a global query into fragment queries based on actual data distribution.
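One way to picture localization is as a lookup in a fragment catalog: the global query is rewritten to touch only the fragments whose defining predicate can match. The catalog layout, site names, and functions below are hypothetical and greatly simplified; real systems work on algebraic query trees rather than a single equality filter.

```python
# Hypothetical catalog: site name -> (location value defining the fragment, stored tuples).
fragments = {
    "site_london": ("London", [{"emp_id": 1, "location": "London"}]),
    "site_paris": ("Paris", [{"emp_id": 2, "location": "Paris"}]),
}

def localize(location):
    """Return the sites whose fragment can contain tuples with this location."""
    return [site for site, (frag_loc, _) in fragments.items()
            if frag_loc == location]

def run_global_query(location):
    """Evaluate 'select employees at this location' using only relevant fragments."""
    rows = []
    for site in localize(location):
        _, tuples = fragments[site]
        rows.extend(t for t in tuples if t["location"] == location)
    return rows
```

Here a query for London employees is reduced to a single fragment query at `site_london`, and the Paris fragment is never contacted.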
Global Query Optimization
The process of minimizing execution cost by selecting the best overall query execution plan for a distributed system.
Local Query Optimization
Optimization performed at each individual site to execute fragment queries efficiently.
Semijoin Reduction
A technique to reduce communication costs by only moving the part of a relation that will actually be used in a join.
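The steps of a semijoin reduction can be sketched as follows, assuming a hypothetical Orders relation at site 1 and a Customers relation at site 2 that are joined on `cust_id`. The data and variable names are invented for illustration, and network transfer is only implied by the comments.

```python
# Site 1 holds Orders; site 2 holds Customers (both hypothetical).
orders = [
    {"order_id": 1, "cust_id": 7},
    {"order_id": 2, "cust_id": 9},
]
customers = [
    {"cust_id": 7, "name": "Asha"},
    {"cust_id": 8, "name": "Ben"},
    {"cust_id": 9, "name": "Chen"},
    {"cust_id": 10, "name": "Dee"},
]

# Step 1 (site 1 -> site 2): ship only the distinct join-column values.
join_keys = {o["cust_id"] for o in orders}

# Step 2 (at site 2): semijoin, keeping only customers that will match.
reduced_customers = [c for c in customers if c["cust_id"] in join_keys]

# Step 3 (site 2 -> site 1): ship the reduced relation and finish the join.
by_id = {c["cust_id"]: c for c in reduced_customers}
result = [{**o, **by_id[o["cust_id"]]} for o in orders if o["cust_id"] in by_id]
```

In this toy case two key values and two customer tuples cross the network instead of all four customer tuples; the saving grows as fewer tuples of the remote relation take part in the join.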
Mutually Consistent State
A state where all copies of a replicated data item have identical values.
Strong Mutual Consistency
A state where all copies of a data item have the same value immediately at the end of an update transaction.
Weak Mutual Consistency
Also known as eventual consistency, where all copies of a data item will eventually have the same value.
Full Replication
A scenario where every site in the distributed database stores a complete copy of the entire database.
Partial Replication
A scenario where only selected parts of the database, such as specific tables or frequently accessed rows, are replicated at different sites.
Eager Propagation
A method where changes are propagated to all replicas during the lifetime of the global transaction, aiming for strong consistency.
Lazy Propagation
A method where changes are propagated in refresh transactions after the global transaction has committed, aiming for eventual consistency.
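The difference between eager and lazy propagation can be illustrated with a toy in-memory model. The `ReplicatedItem` class and its methods are hypothetical; the sketch ignores failures, concurrency, and commit protocols and only shows when the replicas are mutually consistent.

```python
class ReplicatedItem:
    """Toy model of one data item stored as several replicas."""

    def __init__(self, n_replicas, value=0):
        self.replicas = [value] * n_replicas
        self.pending = []  # updates waiting for a refresh (lazy mode only)

    def write_eager(self, value):
        # Eager: every replica is updated before the write returns.
        self.replicas = [value] * len(self.replicas)

    def write_lazy(self, value):
        # Lazy: update one replica now and queue the change for the others.
        self.replicas[0] = value
        self.pending.append(value)

    def refresh(self):
        # Refresh step: apply queued updates, in order, to the other replicas.
        for value in self.pending:
            for i in range(1, len(self.replicas)):
                self.replicas[i] = value
        self.pending.clear()

    def mutually_consistent(self):
        return len(set(self.replicas)) == 1


item = ReplicatedItem(3)
item.write_eager(5)
after_eager = item.mutually_consistent()    # consistent as soon as the write ends
item.write_lazy(9)
before_refresh = item.mutually_consistent() # replicas diverge until the refresh
item.refresh()
after_refresh = item.mutually_consistent()  # replicas have converged
```

The window between `write_lazy` and `refresh` is where a read from a secondary replica could return a stale value, which is the trade-off lazy propagation accepts.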
Consistency (CAP Theorem)
The guarantee that every request receives the correct response, meaning one consistent with the most recent completed write, as if there were a single up-to-date copy of the data.
Availability (CAP Theorem)
The principle that each request eventually receives a response.
Partition Tolerance (CAP Theorem)
The ability of a system to continue functioning even if communication is unreliable and servers are partitioned into groups.
Google Spanner
An example of a CP system that uses synchronous replication (eager propagation) for strong consistency.
Amazon DynamoDB
An example commonly classified as an AP system: its reads are eventually consistent by default (lazy propagation), favoring high availability.