Multiple-Table Queries & Relational Power
- Relational DBMSs show their real strength when queries touch several tables connected by primary-foreign key pairs.
- Ex: In Pine Valley Furniture (PVF) database, Order_T.CustomerID (FK) references Customer_T.CustomerID (PK) ⇒ lets us list all orders for each customer.
- SQL makes this possible through:
- Joins (INNER, OUTER, FULL, LEFT, RIGHT, CROSS, UNION, NATURAL).
- Subqueries (nested, correlated, in WHERE/HAVING, or even in FROM as derived tables).
- Set-operations (UNION, INTERSECT, MINUS) that merge whole result sets.
- SELECT’s WHERE or FROM … JOIN clauses host the join predicates.
- One ON / WHERE condition per table-pair; joining A,B,C requires two conditions A!\text{–}B and B!\text{–}C.
Join Types & Canonical Syntax
- INNER JOIN ⇒ returns rows that satisfy the join predicate in all tables.
- ANSI form:
… FROM A INNER JOIN B ON A.pk = B.fk
. - Older style: list tables in FROM, place predicates in WHERE.
- USING (if column names identical) shortens syntax:
… JOIN B USING (CustomerID)
.
- OUTER JOINs guarantee preservation of non-matching rows:
- LEFT: all rows of 1st table + matches in 2nd; unmatched 2nd-side columns become NULL.
- RIGHT: opposite of LEFT.
- FULL: union of LEFT + RIGHT; unmatched columns on either side become NULL.
- CROSS JOIN (Cartesian) ⇒ explicit |A| \times |B| combinations; implicit when WHERE clause is missing.
- Ex: 10 orders × 15 customers ⇒ 150 rows, rarely useful.
- UNION JOIN ⇒ concatenates vertically the attributes of two tables (less widely supported).
- NATURAL keyword (optional) triggers an equi-join on all identically named columns and removes duplicates from output.
- DBMS chooses join order; optimizer weighs indexes, 1:M distribution, etc.
Equi-Join & Natural Join Deep-Dive
- Equi-join predicate is equality:
A.col = B.col
.- Example (PVF):
sql
SELECT Customer_T.CustomerID,
Order_T.CustomerID,
CustomerName,
OrderID
FROM Customer_T, Order_T
WHERE Customer_T.CustomerID = Order_T.CustomerID
ORDER BY OrderID;
- Includes both matching columns; useful for troubleshooting.
- Without WHERE ⇒ Cartesian product.
- Natural join version (duplicates removed):
SELECT Customer_T.CustomerID,
CustomerName,
OrderID
FROM Customer_T NATURAL JOIN Order_T
ON Customer_T.CustomerID = Order_T.CustomerID;
- Always qualify ambiguous names (e.g.,
CustomerID
) when both tables own the column.
Outer Joins & NULL Handling
- Illustrate missing matches: some customers placed no orders; some orders may reference defunct customers.
- LEFT OUTER example:
SELECT C.CustomerID, CustomerName, OrderID
FROM Customer_T C LEFT OUTER JOIN Order_T O
ON C.CustomerID = O.CustomerID;
- Customers without orders show OrderID = NULL.
- RIGHT OUTER mirrors above; FULL OUTER merges both row sets.
- DBMS differences: vendor proprietary syntax historically (Oracle’s (+), SQL Server’s
*=
) → ANSI syntax now preferred. - Joining >2 tables with OUTER JOINs: test carefully—results may differ among vendors due to join-reordering rules.
Complex Joins: Four-Table Invoice & Self-Join
- Business invoice (Order 1006) needs data from CustomerT, OrderT, OrderLineT, ProductT.
SELECT C.CustomerID, CustomerName, CustomerAddress, CustomerCity,
CustomerState, CustomerPostalCode,
O.OrderID, OrderDate,
OrderedQuantity,
ProductDescription,
StandardPrice,
(OrderedQuantity * ProductStandardPrice) AS LineTotal
FROM Customer_T C,
Order_T O,
OrderLine_T L,
Product_T P
WHERE O.CustomerID = C.CustomerID
AND O.OrderID = L.OrderID
AND L.ProductID = P.ProductID
AND O.OrderID = 1006;
- Self-join solves unary relationships.
- PVF
Employee_T
where EmployeeSupervisor
FK points back to EmployeeID
.
SELECT E.EmployeeID,
E.EmployeeName,
M.EmployeeName AS Manager
FROM Employee_T E, Employee_T M
WHERE E.EmployeeSupervisor = M.EmployeeID;
- Also answers “employees earning more than their manager”, “married to manager”, etc.
- Ensure columns share domain semantics.
Subqueries & Derived Tables
- Subquery lives in WHERE/HAVING and feeds outer SELECT.
- Non-correlated computed once.
- Correlated recomputed per outer row (uses outer columns).
- Basic join vs. subquery illustration (order 1008):
- Join:
sql
… FROM Customer_T, Order_T
WHERE Customer_T.CustomerID = Order_T.CustomerID
AND OrderID = 1008;
- Equivalent subquery:
sql
… FROM Customer_T
WHERE CustomerID = (SELECT CustomerID
FROM Order_T
WHERE OrderID = 1008);
- Operators for list-returning subqueries: IN, NOT IN, EXISTS, NOT EXISTS.
- Ex: customers who bought at least one product ⇒
sql
SELECT CustomerName
FROM Customer_T
WHERE CustomerID IN (SELECT DISTINCT CustomerID FROM Order_T);
- Derived Table in FROM creates a temporary result you can join/filter again.
- Average price example:
sql
SELECT ProductDescription, ProductStandardPrice, AvgPrice
FROM (SELECT AVG(ProductStandardPrice) AS AvgPrice FROM Product_T) AS A,
Product_T
WHERE ProductStandardPrice > AvgPrice;
- Allows both detail and aggregate values in same row.
Combining Result Sets
- UNION vertically appends compatible SELECT outputs (same column count & data types).
- CAST helps align types:
CAST(OrderDate AS CHAR)
. - Sample: customers with largest vs. smallest
OrderedQuantity
using two queries + UNION.
- INTERSECT ⇒ common rows; MINUS (or EXCEPT) ⇒ rows in 1st not in 2nd.
- Final ordering is done after UNION with ORDER BY across whole union.
Conditional Expressions (CASE, NULLIF, COALESCE)
- CASE provides IF-THEN-ELSE inside SQL.
- Simple example hides descriptions when
ProductLine ≠ 1
:
sql
SELECT CASE WHEN ProductLine = 1
THEN ProductDescription
ELSE '####' END AS ProductDescription
FROM Product_T;
- Predicate-based CASE supports three-valued logic.
- NULLIF(a,b) ⇒ returns NULL if a = b else a (good for divide-by-zero avoidance).
- COALESCE(a,b,c,…) ⇒ first non-NULL argument.
Crafting & Debugging Complex Queries
- Break huge statements into views or CTEs; chain them.
- Example view
TSales
totals quantity by salesperson/product; later correlated subquery picks best-seller for each salesperson.
- TOP / LIMIT / FETCH FIRST n PERCENT syntax returns first n rows after ORDER BY (SQL:2003).
- Typical error checks: Cartesian joins, wrong row counts, duplicates, bad grouping.
- Development workflow:
- Identify target attributes.
- Locate tables in ERD.
- Draft SELECT & FROM list.
- Add join predicates.
- Layer filters (WHERE, GROUP BY, HAVING, DISTINCT).
- Test with small sample data; compare to manual totals.
- Efficiency hints:
- Do not
SELECT *
; pull only necessary columns. - Minimize correlated subqueries; prefer joins or derived tables.
- Combine multiple report needs into one query to reuse scans.
- Index usage: Most DBMS engines exploit one index per table per query; equality predicates usually best.
- Keep optimizer statistics refreshed (e.g.,
ANALYZE
, UPDATE STATISTICS
). - Match literal data types to column data types to avert conversions.
- Break extremely complex predicates into several simple queries united by UNION.
- Avoid self-joins by copying table to temp when feasible; delete temp after use.
- Build temporary tables for intermediate result reuse across many queries.
- Avoid DBMS sort without index—export and sort externally if necessary.
- Use
EXPLAIN
to study cost plans; record timing baselines. - Run heavy ad-hoc reports during off-peak hours.
Views (Dynamic, Materialized)
- Base table = physically stored.
- View = virtual SELECT stored in catalog, rebuilt on each reference; uses no storage, always current.
- Materialized view = result physically stored, refreshed (time-based or event-based); speeds distributed or aggregate queries.
- Benefits of views:
- Hide complexity; enforce security by projecting/renaming columns.
- Provide backward-compatible logical schemas (physical independence).
- Improve programmer productivity.
- CREATE VIEW example – invoice projection:
CREATE VIEW Invoice_V AS
SELECT C.CustomerID, CustomerAddress,
O.OrderID, P.ProductID,
ProductStandardPrice, OrderedQuantity
FROM Customer_T C, Order_T O, OrderLine_T L, Product_T P
WHERE C.CustomerID = O.CustomerID
AND O.OrderID = L.OrderID
AND P.ProductID = L.ProductID;
- Querying the view same as table; can layer further views (e.g.,
OrderTotals_V
). - WITH CHECK OPTION prevents updates through the view that would violate its WHERE condition.
- System tables (Oracle
DBA_VIEWS
, SQL Server sys.views
, Information_Schema views) store definition text.
Triggers 101
- Stored procedural objects that fire automatically on DB events (DML or DDL).
- Event + Condition + Action pattern.
- Use-cases: enforce referential / business rules, audit trails, replicate data, custom alerts.
- BEFORE vs. AFTER vs. INSTEAD OF timing.
- Row-level vs. statement-level (Oracle allows both; SQL Server only statement-level).
- DDL trigger example (prevents accidental schema change):
CREATE TRIGGER safety
ON DATABASE
FOR DROP_TABLE, ALTER_TABLE
AS
PRINT 'Disable "safety" trigger first.';
ROLLBACK;
- Caution: cascading chains, infinite loops, silent privilege escalations (security holes).
- Constraints are faster & simpler; prefer them when rule fits standard uniqueness / FK / CHECK semantics.
Routines & SQL/PSM (Persistent Stored Modules)
- Procedure: callable module; may have IN, OUT, INOUT params; returns 0–many result sets.
- Function: returns single scalar/aggregate; only IN params; can appear in SELECT list.
- Flow control constructs: CASE, IF-THEN/ELSEIF, LOOP/WHILE/REPEAT, FOR-each-row, LEAVE, ITERATE.
- Advantages: shared business logic, server-side execution, reduced client traffic.
- Vendor dialects:
- Oracle – PL/SQL
- SQL Server – T-SQL
- MySQL / MariaDB – stored routines
- PostgreSQL – PL/pgSQL, PL/v8, etc.
- Example (Oracle PL/SQL) – mark sale price:
ALTER TABLE Product_T ADD (SalePrice DECIMAL(6,2));
CREATE OR REPLACE PROCEDURE ProductLineSale AS
BEGIN
UPDATE Product_T
SET SalePrice = 0.90 * ProductStandardPrice
WHERE ProductStandardPrice >= 400;
UPDATE Product_T
SET SalePrice = 0.85 * ProductStandardPrice
WHERE ProductStandardPrice < 400;
END;
EXEC ProductLineSale;
- Demonstrates set-oriented bulk UPDATE vs. row-oriented host languages.
- Every RDBMS stores schema info in system tables (catalog), accessible via SELECT.
- Oracle 12c:
DBA_TABLES
, DBA_TAB_COLUMNS
, DBA_CONSTRAINTS
, DBA_USERS
, V$ performance views, USER/ALL/DBA prefix views. - SQL Server 2016:
sys.objects
, sys.columns
, dynamic management views (DMVs), information_schema, compatibility views.
- Example (Oracle):
SELECT OWNER, TABLE_NAME
FROM DBA_TABLES
WHERE TABLE_NAME = 'PRODUCT_T';
- Example (SQL Server): objects modified in last 10 days:
SELECT name, SCHEMA_NAME(schema_id) AS schema_name,
type_desc, create_date, modify_date
FROM sys.objects
WHERE modify_date > GETDATE() - 10;
SQL Standard Evolution Highlights
- SQL:1999 – added triggers, PSM, UDTs, flow control.
- SQL:2008 – standardized OLAP/statistical functions, new data types, preliminary SQL/XML.
- SQL:2011 – major temporal support: application-time, system-time, bi-temporal tables; predicates: CONTAINS, OVERLAPS, PRECEDES, SUCCEEDS, IMMEDIATELY … .
- Application-time table sample:
sql
CREATE TABLE Employee_T(
EmpNbr NUMBER(11,0),
EmpStart DATE,
EmpEnd DATE,
EmpDept NUMBER(11,0),
PERIOD FOR EmpPeriod (EmpStart, EmpEnd),
PRIMARY KEY (EmpNbr, EmpStart, EmpEnd)
);
- SQL:2016 – enhancements:
- Row Pattern Recognition (
MATCH_RECOGNIZE
) for time-series. - Trig/log functions, broader JSON support.
- Polymorphic Table Functions (PTFs) – table-valued functions whose output row-type is determined at runtime.
Other Noteworthy Additions
- MERGE (SQL:2008) – UPSERT in single statement.
MERGE INTO Customer_T AS Cust
USING (SELECT … FROM CustTrans_T) AS CT
ON (Cust.CustomerID = CT.CustomerID)
WHEN MATCHED THEN UPDATE …
WHEN NOT MATCHED THEN INSERT …;
- CREATE TABLE LIKE … INCLUDING COLUMN DEFAULTS – copy structure fast.
Analytical & OLAP Functions
- Part of SQL:2008 – built-in linear regression, correlation, moving averages, window aggregates.
- Calculations stay inside DB ⇒ minimizes ETL to analytical tools.
- Vendors converging on common syntax.
Quick Reference Cheat-Sheet
- Equi-join duplicates cols; Natural join drops them.
- LEFT / RIGHT / FULL OUTER keep unmatched rows with NULL placeholders.
- CROSS JOIN = Cartesian product.
- Subquery returns scalar / list / EXISTS boolean; correlated ⇒ inner depends on outer row.
- Derived table = subquery in FROM, can include aggregates.
- UNION / INTERSECT / MINUS combine sets; require column compatibility.
- CASE / NULLIF / COALESCE enable conditional or default logic.
- Views virtualize queries; materialized views cache them.
- Trigger = auto-fired; Routine = called; both written in SQL/PSM or vendor PL.
- SQL standards continually add analytics, temporal, JSON, pattern matching, PTF.
- Performance basics: match data types, refresh stats, fetch only needed columns, watch for Cartesian joins & correlated subqueries.