Database Concepts and SQL Fundamentals

Data, Information, Database, and DBMS

Data, Information, Database, and DBMS

  • Data: A set of facts, observations, or measurements expressed in quantities, symbols, or numbers.

    • Examples: Names, addresses, telephone numbers (e.g., Ali, Cairo, 150, 010024586, 75).

  • Information: Processed, organized, or summarized data that is meaningful and useful for decision-making. Data are processed to create information.

    • Example: A table showing Name, Address, Telephone number, and Weight; Ali, Cairo, 010024586, 75

  • Database: A huge collection of data handled by specific software, or a collection of logically related data stored together to meet the information requirements of an organization.

  • Database Management: The ability to organize, store, and retrieve data from a computer, including data storage, operations, and security aspects.

  • Database Management System (DBMS): A program or group of programs that work with the operating system to create, process, store, retrieve, control, and manage data. It acts as an interface between application programs and the database.

    • Examples:

      • Microsoft Access

      • MySQL

      • Oracle

      • Microsoft SQL Server

      • SQLite

Components of DBMS

A DBMS has three main components:

  • Data Definition Language (DDL)

  • Data Manipulation Language (DML) and Query Facilities (SQL)

  • Software for Controlled Access of Database (Data Control Language - DCL)

Data Definition Language (DDL)

  • Allows users to define the database, specify data types, data structures, and constraints on the data to be stored.

Data Manipulation Language (DML) and Query Language

  • DML allows users to insert, update, delete, and retrieve data from the database.

Software for Controlled Access of Database

  • Provides controlled access to the database, concurrency control for shared access, and a recovery control system to restore the database in case of failure.

  • Also known as Data Control Language (DCL).

Meta Data and Data Dictionary

  • Meta Data: Data about the data, describing the database structure, data type sizes, constraints, applications, authorizations, etc.

  • Data Dictionary: Contains information about the data stored in the database and is consulted by the DBMS before any manipulation operation.

    • Types of Data Dictionaries:

      • Active Data Dictionary: Managed automatically by the DBMS and always consistent with the current structure and definition of the database.

      • Passive Data Dictionary: Used for documentation purposes, maintaining data about fields, files, and people for cross-references.

Database System Environment

  • A database system consists of the DBMS software together with the database.

  • Composed of database, software, hardware, and people.

File-Based System (Before DBMS)

  • Before database systems, information was stored in files.

  • A company system had several application programs, each designed to manipulate data files.

Disadvantages of Traditional File System

  1. Lack of Data Integration: Difficulty in getting information on ad hoc queries that require accessing data from multiple files.

  2. Data Redundancy: The same data may be recorded and stored in many files.

    • Example: Employee name and designation in both personal and payroll files.

  3. Data Inconsistency: Data redundancy leads to inconsistency when data is updated.

    • Occurs when the same data items in multiple files are not updated simultaneously.

  4. Inadequate Data Manipulation Capability: Limited data manipulation capabilities due to the lack of strong relationships between data in different files.

  5. Limited Data Sharing: Each application has its own private files, limiting data sharing outside of the application.

    • Complex programs are required to obtain data from several incompatible files.

  6. Poor Data Control: No centralized control at the data element level, leading to decentralization.

    • Data fields may have multiple names defined by different departments, leading to different meanings in different contexts.

Database Organization

  • Databases are organized by fields, records, and files.

  • Fields: The smallest unit of data that has meaning to users, also called data item or data element.

    • Examples: Name, Address, Telephone number.

  • Records: A collection of logically related fields, each with a fixed number of bytes and a fixed data type.

  • Files: A collection of related records.

Components of a Database

A database consists of four components:

  • Data Item: A distinct piece of information.

  • Relationships: Correspondence between various data elements.

  • Constraints: Predicates that define correct database states.

  • Schema: Describes the organization of data and relationships within the database.

    • Consists of definitions of record types, data items, and sets.

    • The storage structure is described by the storage schema.

    • The conceptual schema defines the stored data structure.

    • The external schema defines a view of the database for users.

Database Development Life Cycle

  • Database design is part of system development, especially in business-oriented systems.

DBMS Users

The users of a database system are classified based on their interaction and expertise with the DBMS:

  • End Users or Naive Users: Use the database through menu-oriented application programs without needing to know about the database system.

    • Example: ATM users.

  • Application Programmers: Professional programmers or software developers who develop application programs or user interfaces.

    • Must have knowledge of programming languages like Assembly, C, C++, Java, or SQL.

  • System Analyst: Collects requirements and decides the conceptual schema (ER model) or contents of the database like data fields, tables, queries, data types, attributes, relations, and entities.

    • Responsible for overall logical design of the database.

  • Database Designer: Decides the internal schema or structure of physical storage.

    • DBA decides how the data is stored at physical storage and how data is represented.

  • Database Administrator (DBA): Has complete control over the database.

    • Responsible for overall performance, making decisions, and providing technical support.

    • Concerned with the back end of any project.

    • Responsibilities:

      1. Deciding users: Giving permission to users to use the database.

      2. Deciding user views: Defining different views for different users.

      3. Deciding constraints: Setting various constraints to maintain consistency and validity.

Levels or Views Architecture of Database Systems (DBMS)

The architecture is a framework for describing database concepts and specifying the structure of database system.

  1. Internal Level: Describes the actual physical storage of data or the way in which the actual data is stored in memory.

  2. Conceptual Level: Known as the logical level, describes the overall logical structure of the whole database for a community of users.

    • Data visible at this level will be relational tables, and operators will be relational operators.

  3. External Level: Concerned with individual users and describes the actual view of data seen by individual users.

    • The external schema is defined by the DBA for every user.

Data Models

  • Data modeling is the first step in the process of database design, considered a high-level and abstract design phase (conceptual design).

  • A data model is a collection of concepts used to describe the structure of the database, including data types, relationships, and constraints.

  • Helps in understanding the meaning of the data and ensuring understanding.

Types of Data Models

  • Hierarchical Data Model

  • Network Data Model

  • Relational Data Model

Hierarchical Data Model
  • One of the oldest database models.

  • Became popular with IBM’s Information Management System (IMS).

  • Organizes records in a tree structure (hierarchy of parent and child records relationships).

  • Employs two main concepts: Record and Parent-Child Relationship.

    • Record: A collection of field values that provide information of an entity.

    • Parent-Child Relationship: A 1:N relationship between two record types.

Network Data Model
  • Uses directed graphs instead of tree structure, allowing a node to have more than one parent.

  • Designed to handle non-hierarchical relationships.

  • Relationships between specific records of 1:1, 1:N, or N:N are explicitly defined in the database definition.

E-R Model
  • A high-level conceptual data model developed by Chen in 1976 to facilitate database design.

  • Generalization of earlier commercial models like hierarchical and network models.

  • Allows representation of various constraints and relationships.

  • Relationship between entity sets is represented by a name and is of type 1:1, 1:N, or N:N, indicating the mapping from one entity set to another.

Types of Database Systems

  • Classified into three categories:

    • According to the number of users

    • According to the type of data models used

    • According to database site locations

Categories According to the Number of Users

  1. Single-user database systems

  2. Multiuser database systems

Categories According to Database Site Locations

  1. Centralized database systems

  2. Client/Server database systems

  3. Distributed database systems

Database Applications Examples

  • Banking and finance

  • Universities: registration, grades

  • Airlines: reservations, schedules

Conceptual Database Design - Entity Relationship (ER) Modeling

  • A graphical technique for understanding and organizing data independent of the actual database implementation.

Key Terms

  • Entity and Entity Type

  • Attributes and Types of Attributes

  • Keys and Types of Keys

  • Relationships and Types of Relationships

Entity
  • Anything that has an independent existence and about which we collect data.

  • May be any object, class, person, or place.

    • Example: Manager, product, employee, department.

    • Types of entities:

      • Regular (Strong) Entity: An entity which has its own key attribute.

        • Represented as rectangles in the ER diagram.

        • Example: Employee.

      • Weak Entity: An entity which depends on another entity for its existence and doesn't contain any key attribute of its own.

        • Represented by a double rectangle.

Attributes
  • Properties or characteristics which describe entities.

  • Used to describe the property of an entity.

  • Represented by an ellipse (oval).

    • Examples: ID, age, contact number, name.

    • Types of Attributes

      1. Candidate Key: A simple or composite key that is unique and minimal.

        • Unique because no two rows in a table may have the same value at any time.

        • Minimal because every column is necessary to attain uniqueness.

      2. Primary Key: A candidate key selected by the database designer to be used as an identifying mechanism for the whole entity set.

        • Must uniquely identify tuples in a table and not be null.

        • Indicated in the ER model by underlining the attribute.

      3. Composite Attribute: An attribute composed of many other attributes.

        • Represented by an ellipse, and those ellipses relate to an ellipse.

      4. Multivalued Attribute: An attribute that can have more than one value.

        • Represented by a double oval.

        • Example: A student can have more than one phone number.

      5. Derived Attribute: An attribute that can be derived from another attribute.

        • Represented by a dashed ellipse.

        • Example: A person's age can be derived from their date of birth.

Relationship
  • Used to describe the relation between entities.

  • Represented by a diamond.

Types (Degree) of Relationships
  1. One-to-One Relationship: When only one instance of an entity is associated with the relationship.

    • Example: A female can marry one male, and a male can marry one female.

  2. One-to-Many Relationship: When only one instance of the entity on the left, and more than one instance of an entity on the right associates with the relationship.

    • Example: A student enrolls for only one course, but a course can have many students.

  3. Many-to-Many Relationship: When more than one instance of the entity on the left, and more than one instance of an entity on the right associates with the relationship.

    • Example: An employee can be assigned to many projects, and a project can have many employees.

  4. Recursive Relationships

Examples

Example 1: COMPANY Database

  • Contains information about employees, departments, and projects.

  • The company is organized into departments with a unique name, number, and employee manager.

    • The start date when that employee began managing the department is recorded.

    • A department may have several locations.

  • A department controls several projects, each with a unique name, number, and single location.

  • The database stores each employee’s name, Social Security number, address, salary, sex (gender), and birth date.

    • An employee is assigned to one department but may work on several projects, which are not necessarily controlled by the same department.

    • The current number of hours per week that an employee works on each project is recorded, as well as the direct supervisor of each employee.

  • The database keeps track of the dependents of each employee for insurance purposes, including each dependent’s first name, sex, birth date, and relationship to the employee.

Example 2: Hospital Database

  • Contains information about specialized wards, patients, consultants, nurses, and drugs.

  • A General Hospital consists of several specialized wards, each described by ward_id and Name.

  • The system records Patientid, name, and DateOf_Birth for patients.

  • Each ward may host one or more patients, each hosted by only one ward.

  • Each patient is assigned to one leading consultant but may be examined by another consultant, if required.

  • Each consultant may be assigned one or more patients and may examine one or more patients.

  • Consultants are described by consultant_id and Name.

  • The system records all required data each time the Nurse gives a patient a certain drug with specified dosage at a certain time.

  • Each ward is under the supervision of one nurse, and a nurse may supervise only one ward.

  • Each Nurse must serve in one ward.

  • Data about the nurse is recorded as her name, number, and address.

  • A drug has a code number, recommended dosage, and more than one brand name.

Example 3: Banking System Database

  • Used to control withdrawal, deposit, and loan transactions with customers.

  • Banks using this system have many branches.

  • Each branch has a unique name, address (unique), and phone number.

  • The system stores information about customers as cust_id (unique), name, address, and phones.

  • Each customer has one credit identified by creditno (unique), amount, and lasttransactiondate (day, month, hour), and transactiontime.

  • Customers can make any type of transactions (withdrawal, deposit) from any branch of the bank.

  • The system records transno, trans type, transdate, trans_amount.

  • The system records the branch name where the transaction occurred.

Exercises

  1. Define the following terms: entity, attribute, relationship instance, composite attribute, multivalued attribute, and derived attribute.

  2. True or False

    • ER diagram is used to describe the database graphically. (True)

    • The underlined attribute in E-R diagram represents a primary key. (True)

    • Attributes are types of entities. (False)

    • A composite key is a primary key composed of more than one attribute. (False)

    • All attributes are either simple or composite. (False)

    • All simple attributes are also single-valued. (True)

    • Composite attribute is an attribute value can be derived from another attribute. (False)

    • In ERD, the entity is represented by a rectangle. (True)

    • In ERD, the attribute is represented by a rectangle. (False)

    • In ERD, the weak entity is represented by a double rectangle. (True)

    • In ERD, the composite attribute is represented by a double oval. (False)

    • In ERD, the attribute is represented by an oval. (True)

    • In ERD, the Derived attribute is represented by a double oval. (False)

    • In ERD, the relation between entities is represented by a Diamond. (True)

  3. Consider the ER diagram shown in the following Figure for part of a BANK database.

    • Each bank can have multiple branches, and each branch can have multiple accounts and loans.

      • List the strong (non-weak) entity types in the ER diagram.

      • Is there a weak entity type? If so, give its name, partial key, and identifying relationship.

      • Identify the participation (Optional or mandatory)

  4. Consider the ER diagram shown in the following Figure for School database.

    • Extract (identify) entities and their attributes.

    • Identify the primary key.

    • Identify the relationships.

    • Identify the participation (Optional or mandatory)

Database Schema (Schema Diagrams)

  • Describes how the data is organized and how the relations among them are associated and formulates all the constraints that are to be applied to the data.

ER-to-Relational Schema Mapping Algorithm

  1. For each regular (strong) entity type E in the ERD, create a relation R including all simple attributes of E.

  2. Include only the simple component attributes of a composite attribute.

  3. Choose one key attribute of E as the primary key for R. If the chosen key is composite, the set of simple attributes that form it will together form the primary key of R.

  4. For each weak entity type W in the ERD with owner entity type E, create a relation R, and include all simple attributes (or simple components of composite attributes) of W as attributes of R.

  5. Include as foreign key attributes of R the primary key attribute(s) of the relation(s) that correspond to the owner entity type(s); this takes care of the identifying relationship type of W. The primary key of R is the combination of the primary key(s) of the owner(s) and the partial key of the weak entity type W, if any.

  6. For each binary 1:1 relationship type R in the ER schema, identify the relations S and T that correspond to the entity types participating in R. Choose one of the relations—S, say—and include as foreign key in S the primary key of T. Include all the simple attributes (or simple components of composite attributes) of the 1:1 relationship type R as attributes of S. NOTE: It is better to choose an entity type with total participation in R in the role of S.

  7. For each regular binary 1: N relationship type R, identify the relation S that represents the participating entity type at the N-side of the relationship type. Include as foreign key in S the primary key of the relation T that represents the other entity type participating in R.

  8. For each binary M:N relationship type R, create a new relation S to represent R. Include as foreign key attributes in S the primary keys of the relations that represent the participating entity types; their combination will form the primary key of S. Note: We cannot represent an M:N relationship type by a single foreign key attribute in one of the participating relations—as we did for 1:1 or 1:N relationship types.

  9. For each multivalued attribute A, create a new relation R. This relation R will include an attribute corresponding to A, plus the primary key attribute K—as a foreign key in R—of the relation that represents the entity type or relationship type that has A as an attribute. The primary key of R is the combination of A and K. If the multivalued attribute is composite, we include its simple components.

  10. Create a table named the same as the independent or strong entity name, where the entity's attributes serve as the table's columns.

  11. Eliminate the derived attribute, which is drawn in the ERD as a dashed oval, and it does not appear as a field in the table.

  12. A composite field is not written, but its subfields forming it are written.

  13. Designate a field as the primary key (PK) for the table, which has a value in each row and its value in each row is unique from the values in the remaining rows of the table, i.e., there is no duplication.

  14. When there is more than one field that meets the criteria for being a primary key, these fields are called candidate key.

  15. If the primary key conditions do not apply to a field, more than one field is specified so that the primary key conditions are met.

  16. To convert a multi-valued field, a table is created for the multi-valued field containing the primary key of the table from which it is derived, followed by the field itself (the multi-valued field), and the two fields become the primary key of the new table (i.e., the primary key is a composite field), and the name of the new table is derived from the base table followed by _ followed by the name of the multi-valued field, as in the telephone field in the employee in the previous drawing.

  17. To convert the weak entity, a table is created for the dependent (weak) entity containing first the primary key of the strong entity followed by the partial key of the dependent entity, and both form a composite key that acts as a single primary key for the weak table, and we complete the transformations according to the remaining rules.

  18. converting 1:N relationship: the entity that is directly preceded by one is the one that participates once, and its key goes as a foreign key (FK) to the other entity that is directly preceded by many and participating multiple times.

  19. Converting M:N relationship: represented in a separate table containing the primary key of the first entity followed by the primary key of the second entity, with each of them as a foreign key, and becoming a combined key acting as the primary key for the new table.

  20. The field on any relationship goes with the foreign key field

  21. Converting a 1: 1 relation, the primary key field in the first table is added as a foreign field to the second table, so as to achieve the lowest number of empty values Null in the foreign field. In the following example, this is illustrated, as the department is mandatory to have a manager, but the employee is optional to be the director of a department, and since it is mandatory for the department to have employees, the foreign key is directed to the department table, as shown.

  22. Converting the Recursive Relationship a foreign key enters the same table that contains the primary key if it is one to many or one to one, and if it is many to many, it is separated into a new table with two foreign keys representing the primary key of the table, both of which are the primary key of the new table.

We use the COMPANY database example to illustrate the mapping procedure. The COMPANY schema is shown again. Map the following ERD of Bank database into a relational database schema.

There is an error in the previous conversion, where the primary key pk in the Department, project in ERD was two columns and in the conversion it became one column

SQL

  • SQL is a language to operate databases; it includes database creation, deletion, fetching rows, modifying rows, etc.

  • SQL is Structured Query Language, which is a computer language for storing, manipulating, and retrieving data stored in a relational database.

  • All the Relational Database Management Systems (RDMS) like MySQL, MS Access, Oracle, Sybase, Informix, Postgres and SQL Server use SQL as their standard database language.

SQL Commands

  • The standard SQL commands to interact with relational databases are CREATE, SELECT, INSERT, UPDATE, DELETE and DROP.

  • These commands can be classified into the following groups based on their nature.

Data Definition Language (DDL)
  • Allows the users to define the database, specify the data types, data structures and the constraints on the data to be stored in the database.

  • SQL Commands of DDL are CREATE, ALTER, and DROP.

    • CREATE: Creates a new table, or a new database.

    • ALTER: Modifies an existing table.

    • DROP: Deletes an entire table.

Data Manipulation Language (DML) and Query Language
  • DML allows users to insert, update, delete and retrieve data from the database.

  • SQL Commands of DML are INSERT, UPDATE, DELETE, and SELECT.

    • INSERT: Enter data in a table (Creates a record).

    • UPDATE: Modifies an existing data in a table (Modifies records).

    • DELETE: Deletes a record from the table.

    • SELECT: Retrieves certain records from one or more tables.

Software for Controlled Access of Database
  • Provides the facility of controlled access of the database by the users, concurrency control to allow shared access of the database and a recovery control system to restore the database in case of hardware or software failure.

  • Also called Data Control Language (DCL).

  • SQL Commands of DCL are GRANT and REVOKE.

    • GRANT: Gives a privilege to a user.

    • REVOKE: Takes back privileges granted from a user.

Create a New SQL Database
  • The basic syntax of this statement is as follows:

    CREATE DATABASE databasename;
    
  • For example:

    CREATE DATABASE tests;