Landscape: The slide displays a variety of database technologies and vendors.
Structured DB: Oracle, MySQL, PostgreSQL, etc.
Infrastructure Operational.
As A Service.
Technologies: HBASE, Cassandra, Hadoop, etc.
New technology maintains the same SQL Principles.
SQL Standards
Vendor Conformance: Major database vendors (Oracle, IBM, Microsoft, Sybase) generally conform to the SQL standard.
Proprietary Extensions: Database companies often add proprietary extensions, leading to different SQL dialects.
Incompatibilities:
Exist between systems.
Can arise with newer standards (e.g., triggers in SQL:1999).
Focus: The course concentrates on the principles of SQL, primarily using SQL92 as a basic subset.
Learning SQL: Good Practices
Hands-on Practice:
Install a DBMS (e.g., PostgreSQL, MySQL) on your machine.
Set up a database and tables with sample data.
Run SQL queries and debug.
Resources:
SQL server in the school lab.
Google errors (Stack Overflow).
Consult textbooks.
Online resources like w3schools.com.
Comparison of SQL implementations by Troels Arvin.
SQL Operations
Actions on Relations:
Retrieve data.
Insert data.
Delete data.
Update data.
Purpose: SQL is used to manage and query databases.
Declarative Language
SQL is Declarative: SQL is a declarative (non-procedural) language.
Specifies what to retrieve, not how to retrieve it.
Procedural Language Example:
C, C++, Java, where you write step-by-step instructions.
SQL Limitations:
SQL is not a complete programming language.
It lacks control or iteration commands.
SQL Capabilities
Queries: SQL is used to perform queries.
Updates: SQL is used to add, delete, and modify data.
Data Definition Language (DDL):
Creates databases, tables, and indices.
Creates views.
Specifies integrity constraints.
Embedded SQL:
Wrap a high-level programming language around DML to perform more advanced queries/updates.
Data Manipulation Language (DML):
Tables in SQL
Relation/Table: A multiset of tuples (rows) with attributes defined by the schema.
Schema: Contains the name of the relation and the set of attributes.
Example:
Table Name: Product
Attributes: PName, Price, Category, Manufacturer
Multiset:
An unordered list that allows duplicate instances.
List: [1, 1, 2, 3]
Set: {1, 2, 3}
Multiset: {1, 1, 2, 3}
Attributes (Columns) in a Table
Definition: An attribute (or column) is a typed data entry in each tuple of the relation.
Example from Product Table:
PName, Price, Category, Manufacturer are attributes.
Each tuple contains values for these attributes.
Tuples (Rows) in a Table
Definition: A tuple (or row) is a single entry in the table with the attributes specified by the schema.
Alternate Name: Also referred to as a record.
Example from Product Table:
(iPhone x, 888, Phone, Apple) is a tuple.
Data Types in SQL
Character Strings:
CHAR(20)
VARCHAR(50)
Numbers:
INT
FLOAT
Others:
BOOLEAN
DATETIME
Every attribute must have a type.
Key of a Table
Definition: A key is an attribute (or set of attributes) whose values are unique.
Primary Key: A key that uniquely identifies each tuple in the table.
Underlined in the schema definition, e.g., Product(PName, Price, Category, Manufacturer).
Principle Form of SQL
Basic Structure:
SELECT desired_attributes (A1, A2, …, An)
FROM one_or_more_tables (R1, R2, …, Rm)
WHERE condition_about_tuples_of_the_tables (P)
Mapping to Relational Algebra:
\Pi A1, A2, …, An (\sigma P (R1 \times R2 \times … \times Rm))
Simple SQL Query
Product Table Example:
Shows sample data in the Product table.
SQL Query:
SELECT * FROM Product WHERE Category = ‘Phone’
Explanation:
Retrieve all attributes (*) from the Product table where the Category is 'Phone'.
This is a “selection” operation.
Simple SQL Query (cont.)
SQL Query:
SELECT * FROM Product WHERE Category <> ‘Phone’
Explanation:
The <> operator means "not equal to."
This query selects all rows where the category is not 'Phone'.
Simple SQL Query (AND)
SQL Query:
SELECT * FROM Product WHERE Category = ‘Phone’ AND Price > 800
Explanation:
Selects all attributes from the Product where the Category is 'Phone' AND the Price is greater than 800.
Simple SQL Query (OR)
SQL Query:
SELECT * FROM Product WHERE Category = ‘Tablet’ OR Price > 1000
Explanation:
Selects all attributes from the Product table where the Category is 'Tablet' OR the Price is greater than 1000.
Simple SQL Query (Selection and Projection)
SQL Query:
SELECT PName, Price, Manufacturer FROM Product WHERE Price > 800
Explanation:
Selects the PName, Price, and Manufacturer attributes from the Product table where the Price is greater than 800.
Combines selection and projection operations.
Simple SQL Query (WHERE Clause: BETWEEN)
SQL Query:
SELECT PName, Price, Manufacturer FROM Product WHERE Price BETWEEN 800 AND 1200
Explanation:
Selects the PName, Price, and Manufacturer from the Product table where the Price is between 800 and 1200 (inclusive).
SQL Query:
SELECT PName, Price, Manufacturer FROM Product WHERE Manufacturer IN (‘Huawei’, ‘Canon’)
Explanation:
Selects the PName, Price, and Manufacturer from the Product table where the Manufacturer is either 'Huawei' or 'Canon'.
SQL Syntax
Reserved Words:
A set of reserved words (e.g., SELECT, FROM, WHERE) cannot be used as names for tables or attributes.
Constants:
Use single quotes for constants (e.g., 'abc' is okay, "abc" is not).
Case-Insensitivity:
SQL is generally case-insensitive.
Exception: string constants are case-sensitive (e.g., 'FRED' is not the same as 'fred').
Whitespace:
Whitespace is ignored.
Semicolon:
All statements end with a semicolon (;).
Eliminating Duplicates
SQL Query:
SELECT Category FROM Product
Output:
May contain duplicate categories.
Eliminating Duplicates (cont.)
SQL Query:
SELECT DISTINCT Category FROM Product
Explanation:
DISTINCT keyword eliminates duplicate values in the result set.
AS: Renaming Attributes
SQL Query:
SELECT PName AS Product, Price AS Cost, Manufacturer FROM Product WHERE Category = ‘Phone’
Explanation:
AS keyword renames attributes in the output.
PName is renamed to Product, and Price is renamed to Cost.
Expressions in SELECT Clause
SQL Query:
SELECT PName, Price*1.4 AS Cost_IN_SGD, Manufacturer FROM Product WHERE Category = ‘Phone’
Explanation:
Calculates a new attribute Cost_IN_SGD by multiplying Price by 1.4.
SQL Queries:
SELECT * FROM Product WHERE PName LIKE ‘iPh%’
SELECT * FROM Product WHERE PName LIKE ‘%Phone x%’
SELECT * FROM Product WHERE PName LIKE ‘%P%e%’
SELECT * FROM Product WHERE PName LIKE ‘ _Phone x’
SELECT * FROM Product WHERE PName LIKE ‘ Phone_ ’
SELECT * FROM Product WHERE PName NOT LIKE ‘ Phone_ ’
Explanation:
% stands for "any string"
_ stands for "any character"
Patterns for Strings (cont.)
'_ _ _' – Matches any string of exactly three characters
'_ _ _%' – Matches any string of at least three characters
'ab\%cd%' – Match all strings beginning with "ab%cd"
Ordering the Results (cont.)
SQL Query:
SELECT PName, Price FROM Product WHERE Price < 800 ORDER BY PName DESC
Explanation:
Orders the result set by PName in descending order.
Ordering the Results (cont.)
SQL Query:
SELECT PName, Category FROM Product WHERE Price < 1000 ORDER BY Category, PName
Explanation:
Orders the result set first by Category (ascending by default) and then by PName (ascending by default).
Ordering the Results (cont.)
SQL Query:
SELECT PName, Category FROM Product WHERE Price < 1000 ORDER BY Category DESC, PName
Explanation:
Orders the result set first by Category in descending order and then by PName in ascending order.
Ordering the Results (cont.)
SQL Query:
SELECT PName, Category FROM Product WHERE Price < 1000 ORDER BY Category DESC, PName DESC
Explanation:
Orders the result set first by Category in descending order and then by PName in descending order.
Exercise
What is the output of the following query?
SELECT DISTINCT Category FROM Product ORDER BY Category
Exercise (cont.)
Corrected Query and Explanation:
The correct query should extract distinct categories and order them.
The final correct query:
SELECT DISTINCT Category FROM Product ORDER BY Category
Exercise
Why the following query is incorrect?
SELECT DISTINCT Category FROM Product ORDER BY Category WHERE Price < 1000
Exercise (cont.)
Explanation:
"WHERE" should always precede “ORDER BY”.
Exercise (cont.)
The following query is incorrect:
SELECT DISTINCT Category FROM Product ORDER BY PName
Explanation:
“ORDER BY” items must appear in the select list if “SELECT DISTINCT” is specified.
Joins
Scenario:
Given two tables, Product and Company.
Problem:
A user wants to know the names and prices of all products by Japan companies.
SQL Query:
SELECT PName, Price FROM Product, Company WHERE Country = ‘Japan’ AND Manufacturer = CName
Meaning (Semantics) of Join – An Example
SQL Query:
SELECT R.A FROM R, S WHERE R.A = S.B
Steps:
Take cross product: X = R \times S
Apply selections / conditions:
Apply projections to get the final output.
Filtering! => Returning only some attributes
How Join is Actually Executed
The preceding slides show what a join means (i.e., semantics).
Not actually how the DBMS executes it under the covers.
Joins
Find the names of the persons who work for companies in USA
SELECT PName FROM Person, Company WHERE Country = 'USA' AND WorksFor = CName
Joins
Find the names of the persons who work for companies in USA, as well as their company addresses
SELECT PName, Company.Address FROM Person, Company WHERE Country = 'USA' AND WorksFor = CName
Joins
Find the names of the persons who work for companies in USA, as well as their company addresses
SELECT X.PName, Y.Address FROM Person AS X, Company AS Y WHERE Y.Country = 'USA' AND X.CName = Y.CName
Exercise
Find the names of the companies in China that produce products in the ‘tablet’ category
SELECT DISTINCT CName FROM Company, Product WHERE Manufacturer = CName AND Country = ‘China’ AND Category = ‘Tablet’
Exercise
Find the names of the companies in China that produce products in the ‘tablet’ or ‘phone’ category
SELECT DISTINCT CName FROM Company, Product WHERE Manufacturer = CName AND Country = ‘China’ AND (Category = ‘Tablet’ OR Category = ‘Phone’)
Exercise
Find the manufacturers that produce products in both the ‘tablet’ and ‘phone’ categories
SELECT DISTINCT X.Manufacturer FROM Product AS X, Product AS Y WHERE X.Manufacturer = Y.Manufacturer AND X.Category = ‘Tablet’ AND Y.Category = ‘Phone’$$