dbms

Database System Concepts - Chapter 3: Introduction to SQL

Overview of The SQL Query Language

  • SQL is a standardized query language for managing and manipulating databases.

Data Definition

  • In SQL, data definition includes defining schemas, attributes, and integrity constraints.

Basic Query Structure

  • SQL queries typically use the structure:
      - SELECT A1, A2, …, An FROM r1, r2, …, rm WHERE P

Additional Basic Operations

  • These include set operations, handling null values, and working with aggregate functions.

Set Operations

  • SQL allows operations like UNION, INTERSECT, and EXCEPT to manipulate sets of tuples.

Null Values

  • SQL can handle null values, representing unknown or nonexistent data.

Aggregate Functions

  • Functions like COUNT, SUM, AVG, MIN, and MAX operate on sets of values.

Nested Subqueries

  • SQL supports queries that are nested within other queries.

Modification of the Database

  • SQL allows for the modification of data using commands such as INSERT, UPDATE, DELETE, and DROP.


History of SQL

  • IBM Sequel language, later renamed SQL.

  • ANSI and ISO standards for SQL:
      - SQL-86
      - SQL-89
      - SQL-92
      - SQL:1999
      - SQL:2003

  • Commercial systems may not support all standards.

Data Definition Language (DDL)

  • SQL DDL permits the definition of schema for each relation, data types, integrity constraints, indices, and security information.

Domain Types in SQL

  • char(n): Fixed length character string with user-specified length n.

  • varchar(n): Variable length character string with user-specified maximum length n.

  • int: Integer, machine-dependent.

  • smallint: Small integer, machine-dependent.

  • numeric(p,d): Fixed point number with precision p and d decimal places. Example: numeric(3,1) allows 44.5, not 444.5 or 0.32.

  • real and double precision: Floating point numbers, machine-dependent.

  • float(n): Floating point number with user-specified precision of at least n digits.

Create Table Construct

  • Defined using CREATE TABLE r (A1 D1, A2 D2, ..., An Dn, (integrity-constraint1), ..., (integrity-constraintk)). Example:
    sql CREATE TABLE instructor (ID char(5), name varchar(20), dept_name varchar(20), salary numeric(8,2));   

Integrity Constraints in Create Table

  • Common constraints include:
      - NOT NULL
      - PRIMARY KEY (A1, …, An)
      - FOREIGN KEY (Am, …, An) REFERENCES r.

  • Example:
    sql CREATE TABLE instructor (ID char(5), name varchar(20) NOT NULL, dept_name varchar(20), salary numeric(8,2), PRIMARY KEY (ID), FOREIGN KEY (dept_name) REFERENCES department);   

Additional Relation Definitions

  • Example for student and takes tables:
    ```sql
    CREATE TABLE student (ID varchar(5), name varchar(20) NOT NULL, dept_name varchar(20), tot_cred numeric(3,0), PRIMARY KEY (ID), FOREIGN KEY (dept_name) REFERENCES department);

    CREATE TABLE takes (ID varchar(5), course_id varchar(8), sec_id varchar(8), semester varchar(6), year numeric(4,0), grade varchar(2), PRIMARY KEY (ID, course_id, sec_id, semester, year), FOREIGN KEY (ID) REFERENCES student, FOREIGN KEY (course_id, sec_id, semester, year) REFERENCES section);
      ```

Updates to Tables

  • Insert: INSERT INTO instructor VALUES ('10211', 'Smith', 'Biology', 66000);

  • Delete: DELETE FROM student;

  • Drop Table: DROP TABLE r;

  • Alter Table:
      - Add: ALTER TABLE r ADD A D;
      - Drop: ALTER TABLE r DROP A;

Basic Query Structure

  1. A common SQL query will be structured as follows:
       - SELECT A1, A2, ..., An FROM r1, r2, ..., rm WHERE P

  2. Where:
       - Ai: Attribute
       - Ri: Relation
       - P: Predicate

Select Clause

  • Lists desired attributes:

  • Example: to find instructor names: SELECT name FROM instructor;

  • SQL identifiers are case insensitive:
       - Name ≡ NAME ≡ name

  • To eliminate duplicates, add DISTINCT:
       - Example: SELECT DISTINCT dept_name FROM instructor;

Additional Select Clause Features

  • Use * to select all attributes: SELECT * FROM instructor;

  • Literal without FROM clause returns a single row: SELECT '437';

  • Arithmetic operations allowed:
       - SELECT ID, name, salary / 12 FROM instructor;

  • Alias using AS:
       - SELECT salary / 12 AS monthly_salary;

Where Clause

  • Specifies conditions for result:
       - Corresponds to selection predicate of relational algebra.

  • Example: find instructors in computer science:
       - SELECT name FROM instructor WHERE dept_name = 'Comp. Sci.';

  • Comparisons can combine with logical operators:
       - SELECT name FROM instructor WHERE dept_name = 'Comp. Sci.' AND salary > 80000;

From Clause

  • Lists relations involved in the query:
       - Corresponds to Cartesian product operation.

  • Cartesian product of relations will generate possible pairs:
       - Example: SELECT * FROM instructor, teaches;

  • Common attributes with renaming in resultant table:
       - Example: attributes named instructor.ID if both tables have ID.

Cartesian Product Example

  • To illustrate Cartesian product, consider:
       - Relations: instructor and teaches.
       - Example outputs would include pairs of instructors and their course teachings.

Examples of Queries

  • Find the names of instructors who teach:
    sql SELECT name, course_id FROM instructor, teaches WHERE instructor.ID = teaches.ID;    

  • Find instructors in Art department:
    sql SELECT name, course_id FROM instructor, teaches WHERE instructor.ID = teaches.ID AND instructor.dept_name = 'Art';    

The Rename Operation

  • SQL allows renaming relations and attributes using AS:
       - Example: SELECT DISTINCT instructor AS T, instructor AS S WHERE T.salary > S.salary AND S.dept_name = 'Comp. Sci.';

  • AS keyword can be omitted.

String Operations

  1. String comparisons performed using LIKE operator:
       - Example: SELECT name FROM instructor WHERE name LIKE '%dar%';
       - % matches any substring, _ matches a single character.

  2. SQL supports other string functions and operations including:
       - Concatenation, case conversion, finding string length, and extracting substrings.

Ordering the Display of Tuples

  • Results can be ordered alphabetically:
       - Example: SELECT DISTINCT name FROM instructor ORDER BY name;

  • Specify order as ASC for ascending or DESC for descending. Example: ORDER BY name DESC;

  • Multiple attributes can be specified for ordering.
       - Example: ORDER BY dept_name, name;

Where Clause Predicates

  • BETWEEN operator for ranging queries:
       - Example: SELECT name FROM instructor WHERE salary BETWEEN 90000 AND 100000;

  • Tuple comparison example:
    sql SELECT name, course_id FROM instructor, teaches WHERE (instructor.ID, dept_name) = (teaches.ID, 'Biology');    

Duplicates

  • SQL refers to how many copies of tuples appear in the result:
       - Queries utilize multisets for tuples.
       - If r1 has c1 copies of t1 and that satisfies selections, sq(r1) returns c1 copies of t1.

Set Operations

  • Examples of set operations include:
       - Courses in Fall 2009:
    sql SELECT course_id FROM section WHERE sem = 'Fall' AND year = 2009;    
       - Courses in Spring 2010 not in Fall 2009:
    sql SELECT course_id FROM section WHERE sem = 'Fall' AND year = 2009 UNION SELECT course_id FROM section WHERE sem = 'Spring' AND year = 2010;    

  • INTERSECT and EXCEPT can also be utilized.

Null Values

  • Null indicates an unknown value.

  • Result of operations with null returns null:
       - Example: 5 + null returns null.

  • To check null, use: SELECT name FROM instructor WHERE salary IS NULL;

Null Values and Three-Valued Logic

  • SQL uses three values: true, false, and unknown.
       - Any comparison with null returns unknown.
       - Logical operations involving unknown follow three-valued logic rules.

Aggregate Functions

  • Aggregate functions deal with multiset values to return summary statistics:
       - AVG: Average value.
       - MIN: Minimum value.
       - MAX: Maximum value.
       - SUM: Sum of values.
       - COUNT: Number of values.

  • Examples of usage:
    sql SELECT AVG(salary) FROM instructor WHERE dept_name = 'Comp. Sci.';    

Aggregate Functions - Group By

  • Example to find average salary in each department:
    sql SELECT dept_name, AVG(salary) AS avg_salary FROM instructor GROUP BY dept_name;    

  • Attributes outside aggregate in SELECT must be in GROUP BY.

Aggregate Functions – Having Clause

  1. HAVING filters results after grouping:
       - Example: SELECT dept_name, AVG(salary) FROM instructor GROUP BY dept_name HAVING AVG(salary) > 42000;

Null Values and Aggregates

  1. SUM ignores nulls; returns null if no non-null values.
       - Example: If all salaries are null, SUM(salary) is null and COUNT(salary) is 0.

Nested Subqueries

  1. A subquery is a select-from-where expression nested within another query:
       - Example: SELECT A1, A2, … FROM r1, r2, … WHERE P;

  2. Nested structure can replace Ai or ri.
       - Can also be used in predicates.

Set Membership

  1. Example of finding courses offered in Fall and not in Spring:
    sql SELECT distinct course_id FROM section WHERE semester = 'Fall' AND year = 2009 AND course_id NOT IN (SELECT course_id FROM section WHERE semester = 'Spring' AND year = 2010);    

Set Comparison - “some” Clause

  • Example:
    sql SELECT name FROM instructor WHERE salary > SOME (SELECT salary FROM instructor WHERE dept_name = 'Biology');    

“all” Clause

  • Find instructors whose salary is greater than all in Biology:
    sql SELECT name FROM instructor WHERE salary > ALL (SELECT salary FROM instructor WHERE dept_name = 'Biology');    

Test for Empty Relations

  • EXISTS returns true if subquery is non-empty:
    sql EXISTS r NOT EXISTS r;    

Use of “exists” Clause

  • Find courses taught in both Fall and Spring:
    sql SELECT course_id FROM section AS S WHERE semester = 'Fall' AND year = 2009 AND EXISTS (SELECT * FROM section AS T WHERE semester = 'Spring' AND year = 2010 AND S.course_id = T.course_id);    

Use of “not exists” Clause

  • Find students who have taken all Biology courses:
    sql SELECT DISTINCT S.ID, S.name FROM student AS S WHERE NOT EXISTS (SELECT course_id FROM course WHERE dept_name = 'Biology') EXCEPT (SELECT T.course_id FROM takes AS T WHERE S.ID = T.ID);    

Test for Absence of Duplicate Tuples

  • Use UNIQUE to test for duplicates in results:
    sql SELECT T.course_id FROM course AS T WHERE UNIQUE (SELECT R.course_id FROM section AS R WHERE T.course_id = R.course_id AND R.year = 2009);    

Subqueries in the From Clause

  1. SQL supports subqueries in the FROM clause to create temporary relations:
    sql SELECT dept_name, avg_salary FROM (SELECT dept_name, AVG(salary) AS avg_salary FROM instructor GROUP BY dept_name) WHERE avg_salary > 42000;    

With Clause

  1. WITH allows temporary relations:
    sql WITH max_budget (value) AS (SELECT MAX(budget) FROM department) SELECT department.name FROM department, max_budget WHERE department.budget = max_budget.value;    

  2. Can be used for complex queries with multiple layers of logic.

Subqueries in the Select Clause

  • Scalar subquery returns a single value:
    sql SELECT dept_name, (SELECT COUNT(*) FROM instructor WHERE department.dept_name = instructor.dept_name) AS num_instructors FROM department;    
       - Will result in runtime errors if subquery returns multiple results.

Modification of the Database

  • Involves:
      - Deletion of tuples
      - Insertion of new tuples
      - Updating values in tuples

Deletion

  1. Delete all instructors:
    sql DELETE FROM instructor;    

  2. Conditional deletion based on attributes:
    sql DELETE FROM instructor WHERE dept_name = 'Finance';    

Insertion

  1. Inserting into a table can be done as follows:
    sql INSERT INTO course VALUES ('CS-437', 'Database Systems', 'Comp. Sci.', 4);    

  2. Using attributes explicitly:
    sql INSERT INTO course (course_id, title, dept_name, credits) VALUES ('CS-437', 'Database Systems', 'Comp. Sci.', 4);    

Updates

  1. Updating salaries conditionally:
       - Raise by 3% if over $100,000, otherwise by 5%.
    sql UPDATE instructor SET salary = salary * CASE WHEN salary <= 100000 THEN 1.05 ELSE 1.03 END; &nbsp;&nbsp;&nbsp;

  2. Use of CASE for conditional updates:
    sql UPDATE instructor SET tot_cred = (SELECT SUM(credits) FROM takes, course WHERE takes.course_id = course.course_id AND S.ID = takes.ID AND takes.grade <> 'F' AND takes.grade IS NOT NULL); &nbsp;&nbsp;&nbsp;


End of Chapter 3

  • Notes summarized from Database System Concepts, 6th Edition by Silberschatz, Korth, and Sudarshan.