Database Systems: Lecture 1

Lecturer: Dr. Salma Abdullah
Outline of the Lecture:

  • Database Introduction

  • An Example

  • Characteristics of the Database

  • Actors on the Scene

  • Advantages of using the DBMS approach

Database Introduction

Definition of Database

A database is a collection of related data.

  • Data: Known facts that can be recorded and have an implicit meaning.

  • Database Management System (DBMS): A software package/system designed to facilitate the defining, constructing, manipulating, and sharing functions of a computerized database.

Database Examples

Databases are ubiquitous in modern society. They are utilized in various sectors and for numerous purposes:

  • Banking: Involves the processes of depositing and withdrawing money.

  • Travel Industry: Used for making hotel and airline reservations.

  • E-commerce: Allows purchasing goods online.

  • Grocery Stores: Used for tracking purchases and managing inventories.

Database Applications

Different types of database applications exist, categorized into traditional and more recent applications:

Traditional Database Applications

  • Example applications include simple systems used in banks, retail, and educational institutions.

Recent Database Applications

  • YouTube

  • iTunes

  • Geographic Information Systems (GIS)

  • Data Warehouses

  • Various other applications expanding the scope of databases.

Database Size and Complexity

Databases can vary widely in size and complexity.

  • Simple Database: A basic list of names and addresses.

  • Complex Database Example:
      - IRS (Internal Revenue Service) as a vast database managing over 100 million taxpayers, where each taxpayer may file approximately 5 forms with 400 characters of information per form. This results in about 800 gigabytes of data.

  • E-commerce Example:
      - Amazon.com processes data for around 15 million daily visitors, with approximately 100 staff responsible for updating the database.

Typical DBMS Functionality

A DBMS performs several key functions:

  1. Database Definition: Defines the database in terms of its data types, structures, and constraints.

  2. Database Construction: Loads the initial database contents on a secondary storage medium.

  3. Data Manipulation:
       - Retrieval: Querying and generating reports.
       - Modification: Includes insertions, deletions, and updates to the database's contents.
       - Accessing the Database: Through web applications.

  4. Sharing Functions: Allows multiple users and programs to access the database simultaneously.

Database System Structure

A database system consists of both the DBMS and the database it manages.

  • Representation of database system:
        Database System = DBMS + Database

Simplified Database System Environment

The simplified structure can be illustrated:

  • Users/Programmers

  • Database System (DBMS, Software)

  • Application Programs/Queries

  • Connections among the above elements facilitate the processing of queries and access to stored data.

University Database Example

A practical example of a database within a university setting:

  • STUDENT file: Stores data on each student.

  • COURSE file: Contains details on each course offered.

  • SECTION file: Maintains records of each section for all courses.

  • GRADE_REPORT file: Documents the grades students receive.

  • PREREQUISITE file: Lists prerequisites for each course.

Example of University Database Structure

STUDENT Table

Name

Student number

Class

Major

Smith

17

1

CS

Brown

8

2

CS

COURSE Table

Course Name

Course Number

Credit Hours

Department

Intro to Computer Science

CS1310

4

CS

Data Structures

CS3320

4

CS

Discrete Mathematics

MATH2410

3

MATH

Database

CS3380

3

CS

SECTION Table

Section Identifier

Course Number

Semester

Year

Instructor

85

MATH2410

Fall

04

King

92

CS1310

Fall

04

Anderson

102

CS3320

Spring

05

Knuth

GRADE REPORT Table

Student Number

Section Identifier

Grade

17

112

BC

17

119

B

85

102

B

8

135

A

PREREQUISITE Table

Course Number

Prerequisite Number

CS3380

CS3320

CS3380

MATH2410

CS3320

CS1310

Database Manipulation

Database manipulation comprises querying and updating as primary activities.

  • Querying Example: For instance, listing the prerequisites for the “Database” course.

  • Updating Example: Entering a grade of “A” for “Smith” in the “Database” course.

Database vs File Management

Key Differences Between Database and File Approaches:

  1. Self-describing Nature of a Database: Contains metadata about its own structure.

  2. Insulation Between Programs and Data: Changes in data structure do not necessitate program changes.

  3. Support for Multiple Views of Data: Different users can have tailored views.

  4. Data Sharing & Multi-user Transaction Processing: Enables concurrent access and updates to database content.

Self-describing Nature of Database

A database maintains metadata defining its structure and constraints, allowing for self-description.

  • Metadata: Data describing the structure of the primary database, held in the system’s catalog.

Insulation Between Programs and Data

In file processing systems, structural changes require modification of all associated programs.

  • Program-data Independence: In database systems, the structure is maintained separately from programs, which enhances flexibility.

Support of Multiple Views

This allows each user to see a customized view of the database, tailored to their data needs, and sometimes includes derived data as well.

Sharing of Data and Multi-user Transaction Processing

Databases support concurrent access from multiple users, with concurrency control ensuring each transaction is accurately executed or aborted.

  • A practical example includes multiple reservation clerks working on seat assignments for flights, demonstrating the online transaction processing (OLTP) applications.

Actors in the Database Environment

Database Users

  1. Database Administrators:
       - Responsible for granting access, monitoring usage, managing resources, and ensuring operational efficiency.

  2. Database Designers:
       - Define database content, structure, constraints, and transactions. Involve communication with end-users to gather requirements.

  3. End Users: Can be classified as follows:
       - Casual: Use databases occasionally with simple queries (e.g., managers).
       - Naïve: Frequent users with basic query skills (e.g., bank clerks).
       - Sophisticated: Advanced users with in-depth knowledge of the database (e.g., analysts, engineers).
       - Stand-alone: Maintain personal databases via GUI applications (e.g., TurboTax users).

Advantages of Using a DBMS

Controlling Redundancy

Controlling redundancy is crucial in database systems as opposed to traditional file systems where data is often duplicated.

  • Problems Resulting from Redundancy:
      1. Logic Update Issues: Requires multiple updates leading to potential inconsistencies.
      2. Wasted Storage Space: Inefficient use of storage resources.
      3. Inconsistencies Among Files: Duplicated data may reflect conflicting information.

Other Advantages of the DBMS Approach

  1. Restricts unauthorized access to data.

  2. Provides storage structures (indexes) for efficient query processing.

  3. Offers backup and recovery services.

  4. Supports various interfaces for different user classifications.

  5. Represents complex relationships among data effectively.

Disadvantages of Using a DBMS

When NOT to Use a DBMS

  1. Cost Implications: High initial investments in hardware, software, and training.

  2. Data Processing Costs: Expenses associated with defining and processing data can be substantial.

  3. Performance Overhead: Overhead linked to security, concurrency control, recovery, and integrity management may hinder performance.

Situations Favoring Regular Files

  1. Simplicity of the application that doesn’t anticipate future changes.

  2. Real-time requirements that may not align with the DBMS overhead.

  3. Applications without needs for multiple user access to the data.