Module 2 – Tables, Sorting & Filtering (Comprehensive Notes)
Module Scope & Learning Objectives
- Covers content from Evan’s “Business Analytics: Methods, Models, and Decisions,” 3rd Ed., Ch. 2, Module 2: “Tables, Sorting & Filtering.”
- By the end of the module you should be able to:
- Create and name Excel data Tables.
- Recognize different business‐dataset structures.
- Manipulate data via range names, Table formulas, sorting, basic & advanced filtering.
- Perform exploratory analysis on large datasets.
- Conduct Pareto (80–20) Analysis to locate the few items that explain most of an outcome.
- Appreciate real-world impact through two Analytics-in-Practice cases (Town of Cary; Allders International).
Fundamental Concepts
- Dataset
- A finite collection of related data values.
- Common examples: marketing survey answers, historical stock price table, production-line measurement log.
- Database
- Highly structured set of related files containing records (rows) and fields / attributes (columns).
- Physically resembles a two-dimensional table: each row = 1 record about a “thing,” each column = 1 variable describing that “thing.”
- Record vs. Field
- Record = entire row (e.g., a single purchase order).
- Field = individual data element (e.g., Order No., Item Cost).
- Range Names
- Human-readable aliases for rectangular blocks of cells.
- Useful for quick formulas, improved readability, and error reduction.
- Example set-up:
- Name each row by month: January, February, …
- Name each column by product: Product A, Product B, …
- Retrieve annual sales for Product A: =SUM(ProductA) (sums the entire named column regardless of future row additions).
Big-Data Illustration – Town of Cary, NC
- 2010 installation of Aquastar wireless water meters for 60,000 customers.
- Hourly reads → 8,760 observations / customer / year vs. 12 monthly reads.
- Aggregate ≈ 5×108 (half-billion) data points annually.
- SAS visual analytics used to:
- Notify households of leaks within days / hours.
- Allow self-service usage dashboards & custom alerts.
- Improve conservation programs & infrastructure planning (capacity expansions or drought response).
- Financial impact: >$10 million in post-project savings (elimination of manual reads) plus additional avoided costs via leak detection.
Creating an Excel Table
- Select entire data block (include header row).
- Ribbon: Insert → Table
- Ensure “My Table Has Headers” is checked → Enter.
- Default name sequence: Table1, Table2, …
- Rename: Table Design → Table Name → type new descriptive name → Enter.
Advantages
- Structured references (no $A$1 notation) in formulas, e.g., =SUM(SalesTbl[Cost]).
- Auto-expands formulas & formatting when rows/columns are appended.
- Filters & slicers attach instantly.
Table-Based Calculations (Example 2.4)
- Sum an entire numeric column: \text{=SUM}(CreditRisk[#All][Loan\ Amount]).
- Count categorical condition (e.g., number of homeowners): =COUNTIF(CreditRisk[Homeowner], "Yes").
- Benefits: live-update when new records are appended.
Sorting Data
- Simple sort: click any cell in target column (not the header) then Data → Sort A→Z (ascending) or Sort Z→A (descending).
- Advanced multi-key sort: Data → Sort opens dialog; stack sort levels (e.g., Supplier ascending, then Order Date descending).
- Example 2.5 sorts Purchase-Orders by Supplier to cluster related vendors.
Pareto Analysis
- Origin: Vilfredo Pareto (1906) – ~80 % of Italy’s wealth held by ~20 % of citizens.
- Business parallel: small percentage of customers/products drive majority of sales/profits.
- Steps (Example 2.6 – Bicycle Inventory):
- Compute each item’s Inventory Value =(Quantity on Hand)×(Purchase Cost).
- Sort items in descending Inventory Value.
- Compute % of total for each row: Pct<em>i=∑<em>jValue</em>jValue</em>i.
- Compute Cumulative %: CumPct<em>i=∑</em>k=1iPctk.
- Mark cut-off where CumPct ≈ 80% to isolate the “vital few.”
Allders International Case
- 82 duty-free outlets (airports, seaports, ferries).
- Massive POS data moved into a data warehouse → rapid queries.
- Pareto on product lines: ≈20 % of lines generated 80 % of profit.
- Action: drop under-performers, free shelf space, cut inventory & supplier costs.
Filtering Data
AutoFilter (simple criteria)
- Place cursor inside Table/database.
- Data → Filter toggles drop-down arrows on headers.
- Text filters, number filters, date filters accessible per column.
Example 2.7 – Item Description filter
- Goal: Show only Bolt-nut package orders.
- Steps: Column D drop-down → uncheck all → tick “Bolt-nut package” → OK.
- Restore: same menu → “Clear filter from ‘Item Description’.”
Example 2.8 – Numeric threshold filter
- Objective: Item Cost ≥ $200.
- Column “Item Cost” drop-down → Number Filters → Greater Than… → enter 200 → OK.
- Uses Custom AutoFilter dialog with AND/OR logic (Windows); Mac lacks visible operators but logic still applied implicitly.
Advanced Filter (complex multi-field, multi-condition logic)
- Layout requirements:
- Copy the exact header row to a spare part of worksheet (criteria range).
- Under copied headers, enter criteria values.
- Same row ⇒ AND logic.
- Different rows ⇒ OR logic.
- Execution: Data → Advanced → specify List Range (DB), Criteria Range (headers + criteria rows).
Example 2.9 Criteria Table
| Supplier | Order Qty | Order Date |
|---|
| Hulkey Fasteners | >5000 | <(date1) |
| Steelpin Inc. | <5000 | <(date1) |
- After Advanced Filter only matching records display.
Caution When Filtering
- AutoFilter merely hides non-matching rows; hidden rows are still part of worksheet calculations.
- Summing a visible block via =SUM(B2:B30) will include hidden values.
- Best practice: copy filtered rows to a new sheet or range before aggregating.
- Shortcut within Advanced Filter dialog: “Copy to another location.”
Practical & Ethical Connections
- Data quality & timeliness can influence public policy (water conservation) and profitability (retail mix management).
- Pareto insights can lead to downsizing of SKUs or focused marketing, but may neglect niche customer needs—balance required.
- Responsible analytics must consider privacy (e.g., hourly water data could reveal occupancy patterns).
- Total Column (Table): =SUM(TableName[Field])
- Conditional Count: =COUNTIF(Range, Criterion)
- Conditional Sum: =SUMIF(Range, Criterion, SumRange)
- Structured reference arithmetic example: =Inventory[Qty On Hand]*Inventory[Purchase Cost] (entered in calculated column – auto-fills).
Summary Checklist
- [ ] Created Excel Table & renamed it.
- [ ] Applied structured formulas; verified auto-extend.
- [ ] Performed ascending/descending sorts; used multi-level Sort dialog.
- [ ] Executed AutoFilter on text, numeric, date fields; cleared filters.
- [ ] Built Pareto table with % and cumulative %; identified “vital few.”
- [ ] Designed Criteria Range; ran Advanced Filter with AND/OR logic.
- [ ] Avoided calculation pitfalls on hidden rows; copied filtered dataset when needed.
- [ ] Reflected on real‐world case insights and ethical considerations.