Chapter 05 slides
Chapter 5: How to Code Summary Queries
Objectives
Applied Objectives:
Code summary queries using aggregate functions: AVG, SUM, MIN, MAX, and COUNT.
Group and summarize data with aggregate functions using GROUP BY and HAVING clauses.
Utilize ROLLUP, CUBE, GROUPING SETS operators, and the OVER clause for data summarization.
Knowledge Objectives:
Define summary queries.
Explain differences between HAVING and WHERE clauses.
Aggregate Functions Syntax
Common Aggregate Functions:
AVG:
AVG([ALL|DISTINCT] expression)
SUM:
SUM([ALL|DISTINCT] expression)
MIN:
MIN([ALL|DISTINCT] expression)
MAX:
MAX([ALL|DISTINCT] expression)
COUNT:
COUNT([ALL|DISTINCT] expression)
orCOUNT(*)
Summary Queries
Counting Unpaid Invoices:
SELECT COUNT(*) AS NumberOfInvoices, SUM(InvoiceTotal - PaymentTotal - CreditTotal) AS TotalDue FROM Invoices WHERE InvoiceTotal - PaymentTotal - CreditTotal > 0;
Summary with COUNT(*), AVG, and SUM:
SELECT 'After 7/1/2022' AS SelectionDate, COUNT(*) AS NumberOfInvoices, AVG(InvoiceTotal) AS AverageInvoiceAmount, SUM(InvoiceTotal) AS TotalInvoiceAmount FROM Invoices WHERE InvoiceDate > '2022-07-01';
Summary with MIN and MAX:
SELECT 'After 7/1/2022' AS SelectionDate, COUNT(*) AS NumberOfInvoices, MAX(InvoiceTotal) AS HighestInvoiceTotal, MIN(InvoiceTotal) AS LowestInvoiceTotal FROM Invoices WHERE InvoiceDate > '2022-07-01';
Summary for Non-Numeric Columns:
SELECT MIN(VendorName) AS FirstVendor, MAX(VendorName) AS LastVendor, COUNT(VendorName) AS NumberOfVendors FROM Vendors;
Summary with DISTINCT Keyword:
SELECT COUNT(DISTINCT VendorID) AS NumberOfVendors, COUNT(VendorID) AS NumberOfInvoices, AVG(InvoiceTotal) AS AverageInvoiceAmount, SUM(InvoiceTotal) AS TotalInvoiceAmount FROM Invoices WHERE InvoiceDate > '2022-07-01';
GROUP BY and HAVING Clauses
SELECT Statement Syntax with GROUP BY and HAVING:
SELECT select_list [FROM table_source] [WHERE search_condition] [GROUP BY group_by_list] [HAVING search_condition] [ORDER BY order_by_list];
Included in SELECT Clause:
Aggregate functions
Columns used for grouping
Constant value expressions
Examples of GROUP BY Queries
Average Invoice Amount by Vendor:
SELECT VendorID, AVG(InvoiceTotal) AS AverageInvoiceAmount FROM Invoices GROUP BY VendorID HAVING AVG(InvoiceTotal) > 2000 ORDER BY AverageInvoiceAmount DESC;
Count of Invoices by Vendor:
SELECT VendorID, COUNT(*) AS InvoiceQty FROM Invoices GROUP BY VendorID;
Grouping by Two Columns:
SELECT VendorState, VendorCity, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Invoices i JOIN Vendors v ON i.VendorID = v.VendorID GROUP BY VendorState, VendorCity ORDER BY VendorState, VendorCity;
HAVING Clause with Limitations:
SELECT VendorState, VendorCity, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Invoices i JOIN Vendors v ON i.VendorID = v.VendorID GROUP BY VendorState, VendorCity HAVING COUNT(*) >= 2 ORDER BY VendorState, VendorCity;
HAVING Clause with Search Condition:
SELECT VendorName, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Vendors JOIN Invoices ON Vendors.VendorID = Invoices.VendorID GROUP BY VendorName HAVING AVG(InvoiceTotal) > 500 ORDER BY InvoiceQty DESC;
WHERE Clause in Summary Query:
SELECT VendorName, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg FROM Vendors JOIN Invoices ON Vendors.VendorID = Invoices.VendorID WHERE InvoiceTotal > 500 GROUP BY VendorName ORDER BY InvoiceQty DESC;
Compound Condition in HAVING Clause:
SELECT InvoiceDate, COUNT(*) AS InvoiceQty, SUM(InvoiceTotal) AS InvoiceSum FROM Invoices GROUP BY InvoiceDate HAVING InvoiceDate BETWEEN '2023-01-01' AND '2023-01-31' AND COUNT(*) > 1 AND SUM(InvoiceTotal) > 100 ORDER BY InvoiceDate DESC;
ROLLUP, CUBE, and GROUPING SETS Operators
Using ROLLUP Operator:
SELECT VendorID, COUNT(*) AS InvoiceCount, SUM(InvoiceTotal) AS InvoiceTotal FROM Invoices GROUP BY ROLLUP(VendorID);
Summary Row for Each Grouping Level:
SELECT VendorState, VendorCity, COUNT(*) AS QtyVendors FROM Vendors WHERE VendorState IN ('IA', 'NJ') GROUP BY ROLLUP(VendorState, VendorCity) ORDER BY VendorState DESC, VendorCity DESC;
Using CUBE Operator:
SELECT VendorID, COUNT(*) AS InvoiceCount, SUM(InvoiceTotal) AS InvoiceTotal FROM Invoices GROUP BY CUBE(VendorID);
Summary Row for Each Set of Groups:
SELECT VendorState, VendorCity, COUNT(*) AS QtyVendors FROM Vendors WHERE VendorState IN ('IA', 'NJ') GROUP BY CUBE(VendorState, VendorCity) ORDER BY VendorState DESC, VendorCity DESC;
Grouping with GROUPING SETS:
SELECT VendorState, VendorCity, COUNT(*) AS QtyVendors FROM Vendors WHERE VendorState IN ('IA', 'NJ') GROUP BY GROUPING SETS(VendorState, VendorCity) ORDER BY VendorState DESC, VendorCity DESC;
OVER Clause Syntax
OVER Clause Syntax for Summary Queries:
aggregate_function OVER ([partition_by_clause] [order_by_clause])
Examples Using OVER Clause
Grouping Summary Data by Date:
SELECT InvoiceNumber, InvoiceDate, InvoiceTotal, SUM(InvoiceTotal) OVER (PARTITION BY InvoiceDate) AS DateTotal, COUNT(InvoiceTotal) OVER (PARTITION BY InvoiceDate) AS DateCount, AVG(InvoiceTotal) OVER (PARTITION BY InvoiceDate) AS DateAvg FROM Invoices;
Cumulative Total and Moving Average:
SELECT InvoiceNumber, InvoiceDate, InvoiceTotal, SUM(InvoiceTotal) OVER (ORDER BY InvoiceDate) AS CumTotal, COUNT(InvoiceTotal) OVER (ORDER BY InvoiceDate) AS Count, AVG(InvoiceTotal) OVER (ORDER BY InvoiceDate) AS MovingAvg FROM Invoices;
Grouping by TermsID:
SELECT InvoiceNumber, TermsID, InvoiceDate, InvoiceTotal, SUM(InvoiceTotal) OVER (PARTITION BY TermsID ORDER BY InvoiceDate) AS CumTotal, COUNT(InvoiceTotal) OVER (PARTITION BY TermsID ORDER BY InvoiceDate) AS Count, AVG(InvoiceTotal) OVER (PARTITION BY TermsID ORDER BY InvoiceDate) AS MovingAvg FROM Invoices;