SQL Joins, GROUP BY, and Derived Tables Notes

SQL Joins

SQL Joins are a fundamental component of relational databases, allowing for the combination of rows from two or more tables based on a related column. The general syntax for a SQL JOIN operation is:

SELECT {column_list} FROM tableName1 {joinType} tableName2 ON {joinCondition} WHERE conditions;

There are different types of SQL JOINS:

INNER JOIN

This join returns rows when there is a match in both tables. For example, to find all employees working in at least one department, you could use:

SELECT e.eid FROM employee e JOIN works_in w ON e.eid = w.eid;

Here, e is an alias for the employee table, and w is for the works_in table.

OUTER JOIN (LEFT/RIGHT)

OUTER JOIN retrieves all rows from one table and the matching rows from the other. If there’s no match, NULL values are included. An example query to find all employees who are not working in any department would be:

SELECT e.eid FROM employee e LEFT JOIN works_in w ON e.eid = w.eid WHERE w.did IS NULL;

This query effectively filters out the employees without a corresponding entry in the works_in table.

SELF JOIN [wont come on exam]

A query can also join a table to itself. This is useful for comparing rows within the same table. For instance, to find each employee’s manager:

SELECT e.ename AS 'Employee Name', m.ename AS 'Manager Name' FROM employee e JOIN employee m ON e.mgrid = m.eid;
/*this wont come on the exam*/

UPDATE/DELETE with JOIN

SQL allows for JOINs in DELETE or UPDATE statements, enabling modifications across multiple tables. For a DELETE operation, the syntax is:

DELETE t1, t2 FROM table1 t1 JOIN table2 t2 ON t1.tid = t2.tid WHERE {condition};

For example, to delete all records for an employee with eid 1:

DELETE e, w FROM employee e JOIN works_in w ON e.eid = w.eid WHERE e.eid = 1;

When updating records across tables, the syntax is:

UPDATE Table1 t1 JOIN Table2 t2 ON t1.C1 = t2.C1 SET t2.C1 = value WHERE {condition};

An example would be updating a department’s ID:

UPDATE department d JOIN works_in w ON d.did = w.did SET d.did = 105 WHERE d.did = 103;

GROUP BY

The GROUP BY clause groups rows that have the same values in specified columns, facilitating the use of aggregate functions (like COUNT, AVG, SUM). The general syntax is:

SELECT column, aggregate_function(column) FROM table GROUP BY column;

For example, to count employees by gender:

SELECT gender, COUNT(eid) FROM employee GROUP BY (gender);

An additional example to show the average salary per gender, excluding managers, would be:

SELECT gender, AVG(salary) FROM employee WHERE mgrid IS NOT NULL GROUP BY (gender);
GROUP BY with HAVING

The HAVING clause is used to filter group results. It comes after GROUP BY. For instance, to show departments with more than two employees:

SELECT dname, COUNT(eid) FROM department d JOIN works_in w ON d.did = w.did GROUP BY (dname) HAVING COUNT(eid) > 2;

Derived Tables

Derived tables are subqueries defined within the FROM clause of a query. They allow for temporary tables to be created within the SQL context, which can be referenced similarly to standard tables. The syntax is:

SELECT ... FROM (SELECT ... FROM ... WHERE ...) AS derived_table_name WHERE ...;

For example, to find employee names and their departments for those in departments with more than one employee, one could use:

SELECT e.ename, w.did FROM employee e, works_in w, (SELECT d.did AS did FROM department d JOIN works_in w ON d.did = w.did GROUP BY (d.did) HAVING COUNT(eid) > 1) AS dep WHERE e.eid = w.eid AND dep.did = w.did;