Module 2 – Tables, Sorting & Filtering (Comprehensive Notes)

Module Scope & Learning Objectives

  • Covers content from Evan’s “Business Analytics: Methods, Models, and Decisions,” 3rd Ed., Ch. 2, Module 2: “Tables, Sorting & Filtering.”
  • By the end of the module you should be able to:
    • Create and name Excel data Tables.
    • Recognize different business‐dataset structures.
    • Manipulate data via range names, Table formulas, sorting, basic & advanced filtering.
    • Perform exploratory analysis on large datasets.
    • Conduct Pareto (80–20) Analysis to locate the few items that explain most of an outcome.
    • Appreciate real-world impact through two Analytics-in-Practice cases (Town of Cary; Allders International).

Fundamental Concepts

  • Dataset
    • A finite collection of related data values.
    • Common examples: marketing survey answers, historical stock price table, production-line measurement log.
  • Database
    • Highly structured set of related files containing records (rows) and fields / attributes (columns).
    • Physically resembles a two-dimensional table: each row = 1 record about a “thing,” each column = 1 variable describing that “thing.”
  • Record vs. Field
    • Record = entire row (e.g., a single purchase order).
    • Field = individual data element (e.g., Order No., Item Cost).
  • Range Names
    • Human-readable aliases for rectangular blocks of cells.
    • Useful for quick formulas, improved readability, and error reduction.
    • Example set-up:
    • Name each row by month: January, February, …
    • Name each column by product: Product A, Product B, …
    • Retrieve annual sales for Product A: =SUM(ProductA)\text{=SUM}(Product_A) (sums the entire named column regardless of future row additions).

Big-Data Illustration – Town of Cary, NC

  • 2010 installation of Aquastar wireless water meters for 60,00060{,}000 customers.
  • Hourly reads → 8,7608{,}760 observations / customer / year vs. 12 monthly reads.
  • Aggregate ≈ 5×1085\times10^8 (half-billion) data points annually.
  • SAS visual analytics used to:
    • Notify households of leaks within days / hours.
    • Allow self-service usage dashboards & custom alerts.
    • Improve conservation programs & infrastructure planning (capacity expansions or drought response).
  • Financial impact: >$10 million\$10\text{ million} in post-project savings (elimination of manual reads) plus additional avoided costs via leak detection.

Creating an Excel Table

  1. Select entire data block (include header row).
  2. Ribbon: Insert → Table
  3. Ensure “My Table Has Headers” is checked → Enter.
  4. Default name sequence: Table1, Table2, …
  5. Rename: Table Design → Table Name → type new descriptive name → Enter.
Advantages
  • Structured references (no $A$1 notation) in formulas, e.g., =SUM(SalesTbl[Cost])\text{=SUM}(Sales_Tbl[Cost]).
  • Auto-expands formulas & formatting when rows/columns are appended.
  • Filters & slicers attach instantly.

Table-Based Calculations (Example 2.4)

  • Sum an entire numeric column: \text{=SUM}(CreditRisk[#All][Loan\ Amount]).
  • Count categorical condition (e.g., number of homeowners): =COUNTIF(CreditRisk[Homeowner], "Yes")\text{=COUNTIF}(CreditRisk[Homeowner],\ "Yes").
  • Benefits: live-update when new records are appended.

Sorting Data

  • Simple sort: click any cell in target column (not the header) then Data → Sort A→Z (ascending) or Sort Z→A (descending).
  • Advanced multi-key sort: Data → Sort opens dialog; stack sort levels (e.g., Supplier ascending, then Order Date descending).
  • Example 2.5 sorts Purchase-Orders by Supplier to cluster related vendors.

Pareto Analysis

  • Origin: Vilfredo Pareto (1906) – ~80 % of Italy’s wealth held by ~20 % of citizens.
  • Business parallel: small percentage of customers/products drive majority of sales/profits.
  • Steps (Example 2.6 – Bicycle Inventory):
    1. Compute each item’s Inventory Value =(Quantity on Hand)×(Purchase Cost)=(\text{Quantity on Hand})\times(\text{Purchase Cost}).
    2. Sort items in descending Inventory Value.
    3. Compute % of total for each row: Pct<em>i=Value</em>i<em>jValue</em>j\text{Pct}<em>i = \dfrac{\text{Value}</em>i}{\sum\limits<em>{j} \text{Value}</em>j}.
    4. Compute Cumulative %: CumPct<em>i=</em>k=1iPctk\text{CumPct}<em>i = \sum\limits</em>{k=1}^{i} \text{Pct}_k.
    5. Mark cut-off where CumPct ≈ 80%80\% to isolate the “vital few.”
Allders International Case
  • 82 duty-free outlets (airports, seaports, ferries).
  • Massive POS data moved into a data warehouse → rapid queries.
  • Pareto on product lines: ≈20 % of lines generated 80 % of profit.
  • Action: drop under-performers, free shelf space, cut inventory & supplier costs.

Filtering Data

AutoFilter (simple criteria)
  1. Place cursor inside Table/database.
  2. Data → Filter toggles drop-down arrows on headers.
  3. Text filters, number filters, date filters accessible per column.
Example 2.7 – Item Description filter
  • Goal: Show only Bolt-nut package orders.
  • Steps: Column D drop-down → uncheck all → tick “Bolt-nut package” → OK.
  • Restore: same menu → “Clear filter from ‘Item Description’.”
Example 2.8 – Numeric threshold filter
  • Objective: Item Cost ≥ $200\$200.
  • Column “Item Cost” drop-down → Number Filters → Greater Than… → enter 200 → OK.
  • Uses Custom AutoFilter dialog with AND/OR logic (Windows); Mac lacks visible operators but logic still applied implicitly.
Advanced Filter (complex multi-field, multi-condition logic)
  • Layout requirements:
    1. Copy the exact header row to a spare part of worksheet (criteria range).
    2. Under copied headers, enter criteria values.
    • Same row ⇒ AND logic.
    • Different rows ⇒ OR logic.
  • Execution: Data → Advanced → specify List Range (DB), Criteria Range (headers + criteria rows).
Example 2.9 Criteria Table
SupplierOrder QtyOrder Date
Hulkey Fasteners>5000<(date1)
Steelpin Inc.<5000<(date1)
  • After Advanced Filter only matching records display.

Caution When Filtering

  • AutoFilter merely hides non-matching rows; hidden rows are still part of worksheet calculations.
    • Summing a visible block via =SUM(B2:B30)\text{=SUM}(B2:B30) will include hidden values.
  • Best practice: copy filtered rows to a new sheet or range before aggregating.
    • Shortcut within Advanced Filter dialog: “Copy to another location.”

Practical & Ethical Connections

  • Data quality & timeliness can influence public policy (water conservation) and profitability (retail mix management).
  • Pareto insights can lead to downsizing of SKUs or focused marketing, but may neglect niche customer needs—balance required.
  • Responsible analytics must consider privacy (e.g., hourly water data could reveal occupancy patterns).

Quick Reference Formula Library

  • Total Column (Table): =SUM(TableName[Field])\text{=SUM}(TableName[Field])
  • Conditional Count: =COUNTIF(Range, Criterion)\text{=COUNTIF}(Range,\ Criterion)
  • Conditional Sum: =SUMIF(Range, Criterion, SumRange)\text{=SUMIF}(Range,\ Criterion,\ Sum_Range)
  • Structured reference arithmetic example: =Inventory[Qty On Hand]*Inventory[Purchase Cost]\text{=Inventory[Qty\ On\ Hand]*Inventory[Purchase\ Cost]} (entered in calculated column – auto-fills).

Summary Checklist

  • [ ] Created Excel Table & renamed it.
  • [ ] Applied structured formulas; verified auto-extend.
  • [ ] Performed ascending/descending sorts; used multi-level Sort dialog.
  • [ ] Executed AutoFilter on text, numeric, date fields; cleared filters.
  • [ ] Built Pareto table with % and cumulative %; identified “vital few.”
  • [ ] Designed Criteria Range; ran Advanced Filter with AND/OR logic.
  • [ ] Avoided calculation pitfalls on hidden rows; copied filtered dataset when needed.
  • [ ] Reflected on real‐world case insights and ethical considerations.