AH

Chapter 2: SQL Overview — Key Concepts and Terms

Quick Course Orientation

  • Chapter 2 = SQL overview focused on reporting data from a database.
  • Not a full SQL command guide; emphasis on ad hoc queries for retrieving data.

Core SQL Concepts

  • Ad hoc query: type it on the command line (as opposed to a canned/query saved for you).
  • Canned query: prebuilt query; not the focus for learning here.
  • Single-table retrieval is straightforward; grouping data (e.g., average, sum) is important.
  • Use of comparison and logical operators (e.g., first name = 'Bob') is essential.
  • Retrieving data from multiple tables is not covered in this video.

Learning Approach and Resources

  • GUI-based query tools exist but are not emphasized for learning.
  • Query by Example is neat but not ideal for building foundational SQL skills.
  • Consider additional resources (e.g., SQL for Dummies) for broader practice.

OLTP, BI, and Data Warehousing: Big Picture

  • Business intelligence (BI) deals with turning lots of data into insights.
  • OLTP (Online Transaction Processing): live transactional systems (multiple stores, online orders) feeding a single database.
  • BI and data warehousing distinguish operational data from historical analysis.
  • ETL = Extract, Transform, Load: thinning data to reduce volume before loading into BI systems.
  • BI users are typically analysts or executives performing what-if analyses.

Data Architecture: ETL, Warehouses, and Data Marts

  • Operational database plus internal/external data sources feed BI.
  • Data warehouse: separate machine tuned for read-heavy analytics; often not on the live system.
  • Historical data is typical in data warehouses; often read-only after extraction.
  • Data mart: a small, focused data warehouse for a specific need (e.g., location-based analysis).

Cape Cod Outdoor Sports: OLTP Example and Tables

  • OLTP system handles retail orders, multiple order line items per order.
  • Key tables: Order (header), Order Item (line items), SKU data (inventory).
  • Orphan tables: historical catalogs (e.g., 2020, 2021) not live, not linked to current data.

Relational Notation (Text-Based)

  • Notation concept reinforced on page 52-55: table notation uses table name and a list of columns in parentheses.
  • Primary key (PK) is indicated by underlining in the notation.
  • Foreign keys (FK) are italicized; a column can be both PK and FK.
  • Mouthful note: the side of a foreign key that references the other table doesn’t have a named “receptor” in the notation.

Schema: What it Is and How it Differs

  • Schema = collection of tables and their relational notations.
  • Different from Oracle’s schema, which is more like a user profile; here it’s about the structure of the data (tables and relationships).
  • A schema helps describe how tables connect and where primary keys live.

Relational Design, Normalization, and Joining

  • Relational design favors splitting data into multiple related tables (normalization) rather than one big table.
  • Normalization: process of dividing a table into smaller, related tables to reduce redundancy.
  • Joining: how you combine related tables back together to reconstruct the original view.
  • This will be covered in more depth in Chapter 3.

SQL Standards and History

  • SQL standardization is driven by committees; debates over proper syntax for clauses like FROM and SELECT.
  • Historically, US ANSI character set; ISO standard aimed for global compatibility.
  • After 1996, many vendors moved to self-certification for compliance, making conformance harder to verify.
  • To manage changes, the standard was broken into smaller chunks for faster releases.

Quick Wrap-Up

  • The video runs up to the 15-minute mark; the material lays the groundwork for learning SQL, BI concepts, and data modeling.