Database Concepts and SQL

Databases

This note covers why databases are essential for storing and manipulating data, addressing the limitations of file systems, and introducing key database concepts.

Why Store Data in Files?

Storing data in files is a fundamental approach, but it presents several challenges, especially when dealing with large datasets or complex relationships.

Issues with File Systems for Data Storage

  1. Data Storage: Data isn't inherently stored on disk in a structured manner.
  2. Logical vs. Physical Structure: Programmers must define both the logical data structure and the physical storage structure, including access methods and I/O modes.
  3. Data Redundancy: High data redundancy occurs because each program often has its own dataset. Data redundancy is the practice of keeping the same data in multiple locations within a storage system.
  4. Lack of Persistence: Data lacks persistence without explicit handling.

File-Based Solution

Files can store data with interfaces between programs, but this requires extensive programming and managing various access methods like sequential, indexed, and random.

Issues with File-Based Solutions

  1. Separation and Isolation: Each program maintains its own data, which can lead to unawareness or blocking by other programs.
  2. Duplication: Data is duplicated across different programs, wasting space and resources.
  3. High Maintenance Costs: Maintaining data consistency and controlling access results in high costs.
  4. Weak Security: File systems often offer inadequate security measures.

Database as the Ultimate Solution

Databases overcome the limitations of file systems by providing a structured and efficient way to store and manage data.

What is a Database?

A database is a structured collection of information with the following key features:

  • Proper Storage: Efficiently stores large amounts of data.
  • Easy and Fast Access: Provides quick and easy data retrieval.
  • Facilitates Data Processing: Supports effective data manipulation.

Why Use a Database?

  1. Efficient Storage: Databases store large numbers of records efficiently, minimizing space usage.
  2. Fast Information Retrieval: Quickly and easily find information.
  3. Easy Data Management: Easily add, edit, or delete data.
  4. Advanced Search Capabilities: Data can be easily searched, e.g., 'find all Ford cars'.
  5. Sorting: Data can be sorted easily, for example, by 'date first registered'.
  6. Data Import: Data can be imported into other applications, such as mail merge.
  7. Multi-Access: Multiple users can access the database simultaneously.
  8. Enhanced Security: Databases offer better security compared to files.

Database Management Systems (DBMS)

A Database Management System (DBMS) is a set of software tools used to define, store, manipulate, and control data within a database.

Advantages of DBMS

  1. Efficient Application Programming: Provides an efficient way to program complex applications without the need to rewrite data access and retrieval functions for each application.
  2. Data Sharing: Offers a simple, standard way to share data among multiple applications and users.
  3. Specialized User-Oriented Languages: Includes specialized languages for users to interact with the database.
  4. Standard Interfaces: Provides standard interfaces for data communication programs, enabling efficient construction, testing, and maintenance of online transaction processing applications.

Database Model: Two-Dimensional Tables

According to Edgar F. Codd, a database is conceptually a collection of two-dimensional tables.

Key Database Terms


  • Attribute: A column in a table.

  • Entity: A real-world object or concept represented in a database.

  • Tuple/Record: A row in a table representing a single instance of an entity.

    *Example Table:

    EMPLOYEES

IDLAST NAMEFIRSTNAMEBIRTHDATEPAY RATEHOURSWORKED
116KayJanet3/29/1956$16.6094
123PerreiraFrancine8/15/1987$8.50185
149TakasanoFrederick5/23/1966$12.35250
171KayJohn11/17/1954$17.80245
165HonouMorris6/9/1988$6.7053

Primary and Foreign Keys

  • Primary Key: A uniquely identifying attribute for each record in a table.
  • Foreign Key: A column in one table that references a column in another table, establishing a link between the tables.

Accessing Data in a Database with SQL

Structured Query Language (SQL) is a special-purpose programming language designed for managing data held in a relational database management system (RDBMS).

SQL Example 1
SELECT ID, LastName, FirstName
FROM Employees
WHERE ID = 123;

This query selects the ID, LastName, and FirstName from the Employees table for the employee with an ID of 123.

SQL Example 2
SELECT FirstName, BirthDate
FROM Employees
WHERE LastName = "Kay";

This query selects the FirstName and BirthDate from the Employees table for all employees with the last name "Kay".

SQL Example 3
SELECT FirstName, BirthDate
FROM Employees
WHERE LastName = "Kay"
AND PayRate = "$17.80";

This query selects the FirstName and BirthDate from the Employees table for all employees with the last name "Kay" and a PayRate of $17.80.

Sorting Data

The ORDER BY keyword is used to sort records in ascending order.

SELECT *
FROM Employees
ORDER BY ID;

Operations on Database Records

  1. Insert: Add new records to a table.

    Insert into Table_Name values();
    
  2. Delete: Remove records from a table.

    Delete from Customers where age=31;
    
  3. Update: Modify existing records in a table.

    Update Customers set first_name="EMan" where first_name="John";
    

Query Writing

  1. Create Database:

    Create database database_Name;
    
  2. Use Database:

    -- No specific command shown in the reference.
    
  3. Create Table:

    create table Faculty (
        id int,
        name varchar(20)
    );
    
  4. Primary Key:

    create table Faculty (
        id int primary key,
        name varchar(20)
    );
    
  5. Foreign Key:

    create table Student (
        stu_id int,
        name varchar(20),
        Faculty_id int,
        FOREIGN KEY (Faculty_id)
        REFERENCES Faculty (id)
    );
    
  6. Alter Table:

    Alter table Table_Name
    add GPA float;
    
  7. Drop Table:

    Drop table Table_Name;
    

Data Definition Language (DDL)

DDL commands are used to define the structure of the database.

  • CREATE TABLE: Creates a new table.

    CREATE TABLE students (
        id INT PRIMARY KEY,
        name VARCHAR(50),
        age INT
    );
    
  • ALTER TABLE: Modifies an existing table.

    ALTER TABLE students ADD grade VARCHAR(2);
    
  • DROP TABLE: Deletes a table completely.

    DROP TABLE students;
    

Data Manipulation Language (DML)

DML commands are used to manipulate the data within the database.

  • INSERT INTO: Adds new data to a table.

    INSERT INTO students (id, name, age) VALUES (1, 'Alice', 20);
    
  • SELECT: Retrieves data from a table.

    SELECT * FROM students;
    
  • UPDATE: Modifies existing data in a table.

    UPDATE students SET age = 21 WHERE id = 1;
    
  • DELETE FROM: Deletes data from a table.

    DELETE FROM students WHERE id = 1;