AH

Subqueries and Joins - Quick Reference

Subqueries

  • Subquery: an inner SELECT inside parentheses that runs first; its result is used by the outer query.
  • You cannot put an aggregate function directly in a WHERE clause; use a subquery to compute the aggregate and compare to its result.
  • Example 1 (scalar subquery in WHERE):
  SELECT * FROM inventory
  WHERE quantity_on_hand < (SELECT AVG(quantity_on_hand) FROM inventory);
  • Inner query computes the average; outer query selects rows based on that scalar result.
    • Example 2 (NOT IN with subquery):
  -- Conceptual: list of courses never visited
  SELECT * FROM your_records
  WHERE course_id NOT IN (SELECT course_id FROM catalog);
  • Example 3 (using subquery to generate a list for IN):
  -- Best practice: generate list of SKUs via subquery
  SELECT SUM(extended_price)
  FROM sales
  WHERE sku IN (SELECT sku FROM sku_data WHERE department = 'water sports');
  • The inner query produces a list; the outer query aggregates over that list.
    • Note:
  • Subqueries run inside-out; the inner data is produced first and then fed into the outer query.
  • There are more complex subquery types coming later (e.g., in chapter 8).

Joins

  • Implicit join vs explicit join:
    • Implicit joins are discouraged; use explicit JOIN … ON syntax.
  • INNER JOIN (the common type):
    • Combines two tables by matching related columns; you must specify how they relate with ON.
  • Example 1 (inventory + warehouse):
  SELECT *
  FROM inventory
  INNER JOIN warehouse
  ON inventory.warehouse_id = warehouse.warehouse_id;
  • Use table qualifiers to remove ambiguity: inventory.warehouseid vs warehouse.warehouseid.
  • Result typically shows columns from the left table first, then from the right table.
    • Example 2 (retailorder + orderitem):
  SELECT *
  FROM retail_order
  INNER JOIN order_item
  ON retail_order.order_number = order_item.item_order_number;
  • Using * can duplicate columns with the same name; consider selecting specific columns instead.
    • Practical tips:
  • You can spread the SELECT across multiple lines for readability.
  • If both tables have a column name common to both, qualify with table.column to remove ambiguity.
    • Rules (quick reference):
  • Use the JOIN keyword to combine two tables.
  • Use the ON keyword to describe how they relate.
  • Prefer table.column format to remove ambiguity; required when the same column name exists in both tables.
  • Equi-join: a common form where the join condition uses equality (e.g., primary key = foreign key).
  • Foreign keys/primary keys: the typical relationship in joins.
    • Table aliases (optional but useful):
  • For long table names or many tables, use aliases to shorten references:
  FROM retail_order AS ro
  JOIN order_item AS oi
  • Then reference as ro.storenumber, oi.somecolumn, etc.
    • Aliases in ORDER BY:
  • You can use table aliases in ORDER BY (e.g., ORDER BY ro.store_number).
    • Output considerations:
  • When using *, duplicates across joined tables will appear; specify needed columns to control output.
    • Final note:
  • Explicit joins with ON and proper aliases are the standard approach and more maintainable for complex queries.