Chapter 2: SQL Overview — Key Concepts and Terms
Quick Course Orientation
- Chapter 2 = SQL overview focused on reporting data from a database.
- Not a full SQL command guide; emphasis on ad hoc queries for retrieving data.
Core SQL Concepts
- Ad hoc query: type it on the command line (as opposed to a canned/query saved for you).
- Canned query: prebuilt query; not the focus for learning here.
- Single-table retrieval is straightforward; grouping data (e.g., average, sum) is important.
- Use of comparison and logical operators (e.g., first name = 'Bob') is essential.
- Retrieving data from multiple tables is not covered in this video.
Learning Approach and Resources
- GUI-based query tools exist but are not emphasized for learning.
- Query by Example is neat but not ideal for building foundational SQL skills.
- Consider additional resources (e.g., SQL for Dummies) for broader practice.
OLTP, BI, and Data Warehousing: Big Picture
- Business intelligence (BI) deals with turning lots of data into insights.
- OLTP (Online Transaction Processing): live transactional systems (multiple stores, online orders) feeding a single database.
- BI and data warehousing distinguish operational data from historical analysis.
- ETL = Extract, Transform, Load: thinning data to reduce volume before loading into BI systems.
- BI users are typically analysts or executives performing what-if analyses.
Data Architecture: ETL, Warehouses, and Data Marts
- Operational database plus internal/external data sources feed BI.
- Data warehouse: separate machine tuned for read-heavy analytics; often not on the live system.
- Historical data is typical in data warehouses; often read-only after extraction.
- Data mart: a small, focused data warehouse for a specific need (e.g., location-based analysis).
Cape Cod Outdoor Sports: OLTP Example and Tables
- OLTP system handles retail orders, multiple order line items per order.
- Key tables: Order (header), Order Item (line items), SKU data (inventory).
- Orphan tables: historical catalogs (e.g., 2020, 2021) not live, not linked to current data.
Relational Notation (Text-Based)
- Notation concept reinforced on page 52-55: table notation uses table name and a list of columns in parentheses.
- Primary key (PK) is indicated by underlining in the notation.
- Foreign keys (FK) are italicized; a column can be both PK and FK.
- Mouthful note: the side of a foreign key that references the other table doesn’t have a named “receptor” in the notation.
Schema: What it Is and How it Differs
- Schema = collection of tables and their relational notations.
- Different from Oracle’s schema, which is more like a user profile; here it’s about the structure of the data (tables and relationships).
- A schema helps describe how tables connect and where primary keys live.
Relational Design, Normalization, and Joining
- Relational design favors splitting data into multiple related tables (normalization) rather than one big table.
- Normalization: process of dividing a table into smaller, related tables to reduce redundancy.
- Joining: how you combine related tables back together to reconstruct the original view.
- This will be covered in more depth in Chapter 3.
SQL Standards and History
- SQL standardization is driven by committees; debates over proper syntax for clauses like FROM and SELECT.
- Historically, US ANSI character set; ISO standard aimed for global compatibility.
- After 1996, many vendors moved to self-certification for compliance, making conformance harder to verify.
- To manage changes, the standard was broken into smaller chunks for faster releases.
Quick Wrap-Up
- The video runs up to the 15-minute mark; the material lays the groundwork for learning SQL, BI concepts, and data modeling.