Relational Operator
Evaluated using algorithms that read/write pages to/from disk
Selection (σ) Methods
Table scan, index scan, or binary search on a sorted file
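A minimal sketch contrasting a table scan with binary search on a sorted file. In-memory lists stand in for disk pages, and the names `select_scan` and `select_sorted` are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left, bisect_right

def select_scan(rows, value):
    """Table scan: examine every row, O(n)."""
    return [r for r in rows if r[0] == value]

def select_sorted(rows, keys, value):
    """Binary search: `rows` is sorted on field 0 and `keys` holds those key values."""
    lo = bisect_left(keys, value)    # first position with key >= value
    hi = bisect_right(keys, value)   # first position with key > value
    return rows[lo:hi]

rows = [(1, "a"), (3, "b"), (3, "c"), (7, "d")]   # already sorted on field 0
keys = [r[0] for r in rows]
matches = select_sorted(rows, keys, 3)
```

Both functions return the same tuples; the sorted variant only works because the file is ordered on the selection attribute.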
Selection with Index
Index access beats a full scan when the condition is selective (few tuples match); when many tuples match, a scan can be cheaper
Projection (π) Methods
Eliminate unwanted attributes; may use sorting or hashing to remove duplicates
Duplicate Elimination (π)
Use sorting or hashing to identify and remove duplicates
Join (⨝) Purpose
Combines tuples from two relations based on a join condition
Nested Loop Join
For each tuple in the outer relation, scan the entire inner relation; simple but slow
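The tuple-at-a-time loop can be sketched as follows. Tuples are plain Python tuples and the key arguments are field positions; the function name and sample data are assumptions for illustration.

```python
def nested_loop_join(outer, inner, outer_key, inner_key):
    """For each outer tuple, scan the whole inner relation."""
    result = []
    for o in outer:
        for i in inner:                      # full inner scan per outer tuple
            if o[outer_key] == i[inner_key]:
                result.append(o + i)         # concatenate matching tuples
    return result

emp = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]
dept = [(10, "Sales"), (20, "Ops")]
joined = nested_loop_join(emp, dept, 2, 0)
```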
Block Nested Loop Join
Loads a block of the outer relation into the buffers and scans the inner relation once per outer block instead of once per outer tuple; reduces I/O
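To see why blocking helps, here is a rough page-I/O estimate under the usual textbook cost model. The model is an assumption, not part of this deck: the outer relation has M pages, the inner has N pages, and B buffer pages are available, with one page reserved for the inner input and one for output.

```python
from math import ceil

def simple_nlj_cost(m_pages, n_pages, tuples_per_page):
    """Tuple-at-a-time: read outer once, scan inner once per outer tuple."""
    return m_pages + tuples_per_page * m_pages * n_pages

def block_nlj_cost(m_pages, n_pages, buffers):
    """Block variant: scan inner once per outer block of (buffers - 2) pages."""
    outer_blocks = ceil(m_pages / (buffers - 2))
    return m_pages + outer_blocks * n_pages

simple = simple_nlj_cost(1000, 500, 100)   # 50,001,000 page I/Os
blocked = block_nlj_cost(1000, 500, 102)   # 6,000 page I/Os
```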
Index Nested Loop Join
Uses an index on the inner relation's join key to look up matches for each outer tuple instead of scanning the inner relation
Sort-Merge Join
Sort both inputs on the join key, then merge; especially good when the inputs are already sorted
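A minimal in-memory sketch of the sort-then-merge idea. A real system would sort externally in runs; here `sorted` stands in for that step, and the function name is hypothetical.

```python
def sort_merge_join(left, right, lkey, rkey):
    left = sorted(left, key=lambda t: t[lkey])
    right = sorted(right, key=lambda t: t[rkey])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lv, rv = left[i][lkey], right[j][rkey]
        if lv < rv:
            i += 1
        elif lv > rv:
            j += 1
        else:
            # Find the end of the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][rkey] == lv:
                j_end += 1
            # Pair every left tuple in the run with every right tuple in the run.
            while i < len(left) and left[i][lkey] == lv:
                for k in range(j, j_end):
                    out.append(left[i] + right[k])
                i += 1
            j = j_end
    return out

merged = sort_merge_join([(2, "b"), (1, "a"), (2, "c")],
                         [(2, "x"), (3, "y"), (2, "z")], 0, 0)
```

The inner run handling matters: without it, duplicate join keys on both sides would produce missing matches.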
Hash Join
Builds a hash table on one relation (usually the smaller), probes it with the other; efficient for large equality joins
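The build and probe phases can be sketched like this, assuming the build side fits in memory. The function name and sample relations are illustrative only.

```python
from collections import defaultdict

def hash_join(build, probe, build_key, probe_key):
    # Build phase: hash the (ideally smaller) relation on its join key.
    table = defaultdict(list)
    for b in build:
        table[b[build_key]].append(b)
    # Probe phase: look up each probe tuple's key in the hash table.
    out = []
    for p in probe:
        for b in table.get(p[probe_key], []):
            out.append(b + p)
    return out

dept = [(10, "Sales"), (20, "Ops")]
emp = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]
hashed = hash_join(dept, emp, 0, 2)
```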
Join Output Size
Depends on join selectivity and number of matching tuples
Set Operations (∪, ∩, −)
Evaluated using sorting or hashing
Aggregation (SUM, COUNT, AVG)
Grouped using sorting or hashing on GROUP BY attributes
Group By with Sorting
Sort data by group attributes, then compute aggregates
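As a small illustration, sorting brings each group's rows together so one pass can aggregate them. The helper name and the `sales` data are assumptions.

```python
from itertools import groupby
from operator import itemgetter

def group_sum_sorted(rows, group_idx, value_idx):
    ordered = sorted(rows, key=itemgetter(group_idx))   # sort on GROUP BY attribute
    return [
        (g, sum(r[value_idx] for r in run))              # aggregate each consecutive run
        for g, run in groupby(ordered, key=itemgetter(group_idx))
    ]

sales = [("east", 5), ("west", 7), ("east", 3)]
totals = group_sum_sorted(sales, 0, 1)
```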
Group By with Hashing
Use hash table keyed on group attributes; compute aggregates on the fly
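A sketch of the hashing variant, assuming all groups fit in memory. Each hash entry keeps a running sum and count, so AVG is available after a single pass with no sort; the function name is hypothetical.

```python
def group_avg_hashed(rows, group_idx, value_idx):
    state = {}                                   # group -> [running sum, running count]
    for r in rows:
        acc = state.setdefault(r[group_idx], [0, 0])
        acc[0] += r[value_idx]
        acc[1] += 1
    return {g: s / c for g, (s, c) in state.items()}

sales = [("east", 5), ("west", 7), ("east", 3)]
averages = group_avg_hashed(sales, 0, 1)
```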
Duplicate Elimination with Sorting
Sort input and scan to eliminate consecutive duplicates
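For illustration, once the input is sorted, duplicates are adjacent, so comparing each tuple with the last one kept is enough. The function name is an assumption.

```python
def distinct_sorted(rows):
    out = []
    for r in sorted(rows):
        if not out or out[-1] != r:   # keep a tuple only if it differs from the previous one
            out.append(r)
    return out

unique = distinct_sorted([(2, "b"), (1, "a"), (2, "b")])
```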
Duplicate Elimination with Hashing
Hash input into buckets; eliminate duplicates within buckets
Evaluation Strategy
Goal is to minimize disk I/Os and use available memory efficiently
Materialization
Evaluate one operator fully, store results, pass to next operator
Pipelining
Pass intermediate results directly to next operator without writing to disk
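Pipelining can be sketched with Python generators, where each operator pulls one tuple at a time from its child and nothing is stored in between. The operator names `scan`, `select`, and `project` are illustrative, not a real engine's API.

```python
def scan(rows):
    for r in rows:
        yield r

def select(pred, child):
    for r in child:
        if pred(r):
            yield r

def project(cols, child):
    for r in child:
        yield tuple(r[c] for c in cols)

emp = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]
plan = project([1], select(lambda r: r[2] == 10, scan(emp)))
first = next(plan)    # only as many input tuples are read as needed for one output
rest = list(plan)     # drain the remaining results
```

A materializing plan would instead build the full list at each step before the next operator starts.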
Buffer Size Impact
More buffer pages mean more efficient joins and fewer passes for sorting
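The effect on sorting can be estimated with the standard external merge sort model, which is an assumption here and not stated in this deck: pass 0 creates runs of B pages each, and every later pass merges B - 1 runs at a time.

```python
from math import ceil

def external_sort_passes(n_pages, buffers):
    if buffers < 3:
        raise ValueError("need at least 3 buffer pages to merge")
    runs = ceil(n_pages / buffers)            # pass 0: sorted runs of `buffers` pages
    passes = 1
    while runs > 1:
        runs = ceil(runs / (buffers - 1))     # each merge pass combines buffers - 1 runs
        passes += 1
    return passes

few_buffers = external_sort_passes(108, 5)    # 4 passes
more_buffers = external_sort_passes(108, 17)  # 2 passes
```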
Cost Factors
Disk I/O (dominant), CPU, memory usage
Selection Pushdown
Evaluate selections as early as possible to reduce intermediate results
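A toy comparison of two equivalent plans, using made-up relations. Both return the same answer, but filtering before the join shrinks the join input.

```python
def join(left, right, lkey, rkey):
    return [l + r for l in left for r in right if l[lkey] == r[rkey]]

emp = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10), (4, "Di", 30)]
dept = [(10, "Sales"), (20, "Ops"), (30, "HR")]

# Plan A: join everything, then select on department name (field 4 of the joined tuple).
joined = join(emp, dept, 2, 0)
plan_a = [t for t in joined if t[4] == "Ops"]

# Plan B: select first, so the join sees only one department tuple.
ops = [d for d in dept if d[1] == "Ops"]
plan_b = join(emp, ops, 2, 0)
```

Plan A builds four intermediate tuples before filtering; Plan B joins against a single department tuple.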
Join Order Matters
Smaller intermediate results can drastically reduce query cost