Subqueries and Joins - Quick Reference
Subqueries
- Subquery: an inner SELECT inside parentheses that runs first; its result is used by the outer query.
- You cannot put an aggregate function directly in a WHERE clause; use a subquery to compute the aggregate and compare to its result.
- Example 1 (scalar subquery in WHERE):
SELECT * FROM inventory
WHERE quantity_on_hand < (SELECT AVG(quantity_on_hand) FROM inventory);
- Inner query computes the average; outer query selects rows based on that scalar result.
- Example 2 (NOT IN with subquery):
-- Conceptual: list of courses never visited
SELECT * FROM your_records
WHERE course_id NOT IN (SELECT course_id FROM catalog);
- Example 3 (using subquery to generate a list for IN):
-- Best practice: generate list of SKUs via subquery
SELECT SUM(extended_price)
FROM sales
WHERE sku IN (SELECT sku FROM sku_data WHERE department = 'water sports');
- The inner query produces a list; the outer query aggregates over that list.
- Subqueries run inside-out; the inner data is produced first and then fed into the outer query.
- There are more complex subquery types coming later (e.g., in chapter 8).
Joins
- Implicit join vs explicit join:
- Implicit joins are discouraged; use explicit JOIN … ON syntax.
- INNER JOIN (the common type):
- Combines two tables by matching related columns; you must specify how they relate with ON.
- Example 1 (inventory + warehouse):
SELECT *
FROM inventory
INNER JOIN warehouse
ON inventory.warehouse_id = warehouse.warehouse_id;
- Use table qualifiers to remove ambiguity: inventory.warehouseid vs warehouse.warehouseid.
- Result typically shows columns from the left table first, then from the right table.
- Example 2 (retailorder + orderitem):
SELECT *
FROM retail_order
INNER JOIN order_item
ON retail_order.order_number = order_item.item_order_number;
- Using * can duplicate columns with the same name; consider selecting specific columns instead.
- You can spread the SELECT across multiple lines for readability.
- If both tables have a column name common to both, qualify with table.column to remove ambiguity.
- Use the JOIN keyword to combine two tables.
- Use the ON keyword to describe how they relate.
- Prefer table.column format to remove ambiguity; required when the same column name exists in both tables.
- Equi-join: a common form where the join condition uses equality (e.g., primary key = foreign key).
- Foreign keys/primary keys: the typical relationship in joins.
- Table aliases (optional but useful):
- For long table names or many tables, use aliases to shorten references:
FROM retail_order AS ro
JOIN order_item AS oi
- Then reference as ro.storenumber, oi.somecolumn, etc.
- You can use table aliases in ORDER BY (e.g., ORDER BY ro.store_number).
- When using *, duplicates across joined tables will appear; specify needed columns to control output.
- Explicit joins with ON and proper aliases are the standard approach and more maintainable for complex queries.