J

Chapter 05 slides

Chapter 5: How to Code Summary Queries

Objectives

  • Applied Objectives:

    • Code summary queries using aggregate functions: AVG, SUM, MIN, MAX, and COUNT.

    • Group and summarize data with aggregate functions using GROUP BY and HAVING clauses.

    • Utilize ROLLUP, CUBE, GROUPING SETS operators, and the OVER clause for data summarization.

  • Knowledge Objectives:

    • Define summary queries.

    • Explain differences between HAVING and WHERE clauses.

Aggregate Functions Syntax

  • Common Aggregate Functions:

    • AVG: AVG([ALL|DISTINCT] expression)

    • SUM: SUM([ALL|DISTINCT] expression)

    • MIN: MIN([ALL|DISTINCT] expression)

    • MAX: MAX([ALL|DISTINCT] expression)

    • COUNT: COUNT([ALL|DISTINCT] expression) or COUNT(*)

Summary Queries

  • Counting Unpaid Invoices:

    SELECT COUNT(*) AS NumberOfInvoices, SUM(InvoiceTotal - PaymentTotal - CreditTotal) AS TotalDue 
    FROM Invoices 
    WHERE InvoiceTotal - PaymentTotal - CreditTotal > 0;
  • Summary with COUNT(*), AVG, and SUM:

    SELECT 'After 7/1/2022' AS SelectionDate, 
    COUNT(*) AS NumberOfInvoices, 
    AVG(InvoiceTotal) AS AverageInvoiceAmount, 
    SUM(InvoiceTotal) AS TotalInvoiceAmount 
    FROM Invoices 
    WHERE InvoiceDate > '2022-07-01';
  • Summary with MIN and MAX:

    SELECT 'After 7/1/2022' AS SelectionDate, 
    COUNT(*) AS NumberOfInvoices, 
    MAX(InvoiceTotal) AS HighestInvoiceTotal, 
    MIN(InvoiceTotal) AS LowestInvoiceTotal 
    FROM Invoices 
    WHERE InvoiceDate > '2022-07-01';
  • Summary for Non-Numeric Columns:

    SELECT MIN(VendorName) AS FirstVendor, 
    MAX(VendorName) AS LastVendor, 
    COUNT(VendorName) AS NumberOfVendors 
    FROM Vendors;
  • Summary with DISTINCT Keyword:

    SELECT COUNT(DISTINCT VendorID) AS NumberOfVendors, 
    COUNT(VendorID) AS NumberOfInvoices, 
    AVG(InvoiceTotal) AS AverageInvoiceAmount, 
    SUM(InvoiceTotal) AS TotalInvoiceAmount 
    FROM Invoices 
    WHERE InvoiceDate > '2022-07-01';

GROUP BY and HAVING Clauses

  • SELECT Statement Syntax with GROUP BY and HAVING:

    SELECT select_list 
    [FROM table_source] 
    [WHERE search_condition] 
    [GROUP BY group_by_list] 
    [HAVING search_condition] 
    [ORDER BY order_by_list];
  • Included in SELECT Clause:

    • Aggregate functions

    • Columns used for grouping

    • Constant value expressions

Examples of GROUP BY Queries

  • Average Invoice Amount by Vendor:

    SELECT VendorID, AVG(InvoiceTotal) AS AverageInvoiceAmount 
    FROM Invoices 
    GROUP BY VendorID 
    HAVING AVG(InvoiceTotal) > 2000 
    ORDER BY AverageInvoiceAmount DESC;
  • Count of Invoices by Vendor:

    SELECT VendorID, COUNT(*) AS InvoiceQty 
    FROM Invoices 
    GROUP BY VendorID;
  • Grouping by Two Columns:

    SELECT VendorState, VendorCity, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg 
    FROM Invoices i 
    JOIN Vendors v ON i.VendorID = v.VendorID 
    GROUP BY VendorState, VendorCity 
    ORDER BY VendorState, VendorCity;
  • HAVING Clause with Limitations:

    SELECT VendorState, VendorCity, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg 
    FROM Invoices i 
    JOIN Vendors v ON i.VendorID = v.VendorID 
    GROUP BY VendorState, VendorCity 
    HAVING COUNT(*) >= 2 
    ORDER BY VendorState, VendorCity;
  • HAVING Clause with Search Condition:

    SELECT VendorName, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg 
    FROM Vendors 
    JOIN Invoices ON Vendors.VendorID = Invoices.VendorID 
    GROUP BY VendorName 
    HAVING AVG(InvoiceTotal) > 500 
    ORDER BY InvoiceQty DESC;
  • WHERE Clause in Summary Query:

    SELECT VendorName, COUNT(*) AS InvoiceQty, AVG(InvoiceTotal) AS InvoiceAvg 
    FROM Vendors 
    JOIN Invoices ON Vendors.VendorID = Invoices.VendorID 
    WHERE InvoiceTotal > 500 
    GROUP BY VendorName 
    ORDER BY InvoiceQty DESC;
  • Compound Condition in HAVING Clause:

    SELECT InvoiceDate, COUNT(*) AS InvoiceQty, SUM(InvoiceTotal) AS InvoiceSum 
    FROM Invoices 
    GROUP BY InvoiceDate 
    HAVING InvoiceDate BETWEEN '2023-01-01' AND '2023-01-31' 
    AND COUNT(*) > 1 
    AND SUM(InvoiceTotal) > 100 
    ORDER BY InvoiceDate DESC;

ROLLUP, CUBE, and GROUPING SETS Operators

  • Using ROLLUP Operator:

    SELECT VendorID, COUNT(*) AS InvoiceCount, SUM(InvoiceTotal) AS InvoiceTotal 
    FROM Invoices 
    GROUP BY ROLLUP(VendorID);
  • Summary Row for Each Grouping Level:

    SELECT VendorState, VendorCity, COUNT(*) AS QtyVendors 
    FROM Vendors 
    WHERE VendorState IN ('IA', 'NJ') 
    GROUP BY ROLLUP(VendorState, VendorCity) 
    ORDER BY VendorState DESC, VendorCity DESC;
  • Using CUBE Operator:

    SELECT VendorID, COUNT(*) AS InvoiceCount, SUM(InvoiceTotal) AS InvoiceTotal 
    FROM Invoices 
    GROUP BY CUBE(VendorID);
  • Summary Row for Each Set of Groups:

    SELECT VendorState, VendorCity, COUNT(*) AS QtyVendors 
    FROM Vendors 
    WHERE VendorState IN ('IA', 'NJ') 
    GROUP BY CUBE(VendorState, VendorCity) 
    ORDER BY VendorState DESC, VendorCity DESC;
  • Grouping with GROUPING SETS:

    SELECT VendorState, VendorCity, COUNT(*) AS QtyVendors 
    FROM Vendors 
    WHERE VendorState IN ('IA', 'NJ') 
    GROUP BY GROUPING SETS(VendorState, VendorCity) 
    ORDER BY VendorState DESC, VendorCity DESC;

OVER Clause Syntax

  • OVER Clause Syntax for Summary Queries: aggregate_function OVER ([partition_by_clause] [order_by_clause])

Examples Using OVER Clause

  • Grouping Summary Data by Date:

    SELECT InvoiceNumber, InvoiceDate, InvoiceTotal, 
    SUM(InvoiceTotal) OVER (PARTITION BY InvoiceDate) AS DateTotal, 
    COUNT(InvoiceTotal) OVER (PARTITION BY InvoiceDate) AS DateCount, 
    AVG(InvoiceTotal) OVER (PARTITION BY InvoiceDate) AS DateAvg 
    FROM Invoices;
  • Cumulative Total and Moving Average:

    SELECT InvoiceNumber, InvoiceDate, InvoiceTotal, 
    SUM(InvoiceTotal) OVER (ORDER BY InvoiceDate) AS CumTotal, 
    COUNT(InvoiceTotal) OVER (ORDER BY InvoiceDate) AS Count, 
    AVG(InvoiceTotal) OVER (ORDER BY InvoiceDate) AS MovingAvg 
    FROM Invoices;
  • Grouping by TermsID:

    SELECT InvoiceNumber, TermsID, InvoiceDate, InvoiceTotal, 
    SUM(InvoiceTotal) OVER (PARTITION BY TermsID ORDER BY InvoiceDate) AS CumTotal, 
    COUNT(InvoiceTotal) OVER (PARTITION BY TermsID ORDER BY InvoiceDate) AS Count, 
    AVG(InvoiceTotal) OVER (PARTITION BY TermsID ORDER BY InvoiceDate) AS MovingAvg 
    FROM Invoices;