Database Management Systems – Unit 1 Notes

Overview of DBMS

  • DBMS = Database + Management System

    • Database: collection of inter-related data.

    • Management System: set of programs that store & retrieve data efficiently.

  • Formal definition: “A DBMS is a collection of inter-related data and a set of programs to store and access those data in an easy and effective manner.”

  • Core purpose: manage data for organisations (e.g., University storing students, teachers, courses, books, etc.)

Need for DBMS

  • Storage efficiency

    • Redundant (duplicate) data removed → occupies less space.

    • Example: A person with two bank accounts stored once instead of twice.

  • Fast retrieval

    • Structured querying provides quick access.

  • Basic data-management operations supported

    • Add, store, update, delete, and retrieve data.

Applications of DBMS (Industry Examples)

  • Airlines: reservations, flight schedules.

  • Telecom: call detail records, customer data, network usage.

  • Universities: course registration, results, grades.

  • Sales: products, purchases, customer relationships.

  • Banking: all financial transactions.

  • Education sector: staff, student, course information.

  • Online shopping: product catalogue, billing, shipping.

File Processing System vs. DBMS

Advantages of File Processing System

  • Cost-friendly (often free, OS-supplied tools).

  • Easy to learn and use.

  • Highly scalable in file size.

Drawbacks of File Processing System

  • Data redundancy → higher storage cost & slower access.

  • Data inconsistency (mismatched copies after updates).

  • Data isolation (scattered files, varied formats → complex coding for retrieval).

  • Difficulty in ad-hoc access; every new requirement needs a new program.

  • Application-data dependency (file changes ⇒ program changes).

  • Integrity problems: enforcing complex constraints across files is hard.

  • Atomicity issues: hard to ensure “all-or-nothing” execution (e.g., fund transfer).

  • Concurrency anomalies: uncontrolled simultaneous updates yield inconsistency.

  • Security limitations: fine-grained user access difficult.

Advantages of DBMS over File System

  • Eliminates redundancy via normalization.

  • Enforces data consistency & integrity.

  • Easier implementation of security & privacy (access control).

  • Faster, easier data access using declarative queries.

  • Built-in backup & recovery → simple disaster recovery.

  • Greater flexibility and extensibility.

Data Abstraction & Three-Level Architecture

  • Purpose: hide complexity and provide user-specific views.

  • Levels of abstraction:

    1. Internal / Physical Level

    • Lowest level; describes how data are physically stored (files, indices, byte offsets).

    1. Conceptual / Logical Level

    • Middle level; describes what data are stored and relationships.

    • Example record-type declaration:
      type instructor = record ID:string; name:string; deptname:string; salary:integer\text{type instructor = record }{ ID:string;\ name:string;\ dept_name:string;\ salary:integer }

    1. External / View Level

    • Highest level; tailored views for end-users or programs, hiding irrelevant details (e.g., salary hidden from students).

  • Multiple external views can coexist → limitless personalized perspectives.

Three-Schema Architecture (Mapping of Schemas)

  • Internal Schema = Physical Schema (maps conceptual schema to storage structures).

  • Conceptual Schema = Logical Schema (overall logical design).

  • External Schema = View Schema (user-specific views).

  • Objectives

    • Users share one database yet see customized views.

    • Users unaffected by storage details.

    • DBA can change storage without altering user views.

    • Physical changes should not disturb logical structure.

Schema vs. Instance

  • Schema: formal design/blueprint (structure, rules, relationships).

    • Three kinds: Physical, Logical, View schemas.

    • Relatively stable; rarely changes.

    • Analogy: standard-house plan or table template.

  • Instance: actual data stored at a specific moment.

    • Highly dynamic; changes with every insert/delete/update.

    • Analogy: individual built houses or data-filled table rows.

  • Question answered: It is the instance that changes frequently, not the schema.

Data Independence

  • Ability to change one schema level without affecting higher levels.

  • Two flavours:

    1. Physical Data Independence

    • Modify physical storage (e.g., new indexes, file organisation, SSD swap) without changing logical or external schemas.

    • Enables performance tuning.

    1. Logical Data Independence

    • Modify logical schema (e.g., add/remove attributes, apply constraints) without altering external views or application code.

  • Visual summary (as presented):
    VIEW LEVELLogical IndependenceLOGICAL LEVELPhysical IndependencePHYSICAL LEVEL\text{VIEW LEVEL} \xleftrightarrow{\text{Logical Independence}} \text{LOGICAL LEVEL} \xleftrightarrow{\text{Physical Independence}} \text{PHYSICAL LEVEL}

  • Importance: supports evolution of DBMS while protecting applications and users from cascading changes.

Database Languages (SQL Components)

  • In practice, SQL integrates all components, but conceptually we distinguish four sub-languages.

1. Data Definition Language (DDL)

  • Defines database schema objects (tables, indexes, constraints).

  • Major commands: CREATE, ALTER, DROP, TRUNCATE.

  • Example:

  CREATE TABLE instructor (
      ID         CHAR(5),
      name       VARCHAR(20),
      dept_name  VARCHAR(20),
      salary     NUMERIC(8,2)
  );
  • DDL compiler produces templates stored in data dictionary (metadata repository) containing schema, integrity constraints, authorizations.

2. Data Manipulation Language (DML)

  • Accesses & manipulates data.

  • Commands: SELECT, INSERT, UPDATE, DELETE.

3. Data Control Language (DCL)

  • Grants or revokes privileges.

  • Commands: GRANT, REVOKE.

4. Transaction Control Language (TCL)

  • Manages transaction boundaries & persistence.

  • Commands: COMMIT (make changes permanent), ROLLBACK (undo changes).

Analogies & Metaphors Mentioned

  • Data abstraction ≈ using a remote control to operate a TV without knowing internal circuits.

  • Schema ≈ blueprint of the TV showing all components.

  • Schema vs. Instance: blueprint (schema) vs. actual built houses (instances).

Practical / Ethical / Security Implications

  • Security & privacy: DBMS provides fine-grained access control (students cannot read teacher payroll).

  • Integrity constraints: ensures business rules (e.g., department account balance ≥ 0).

  • Concurrency control: prevents anomalies such as two withdrawals reading same balance (though ACID details covered in later units, concurrency motivation stated in file-system drawbacks).

Connections to Later Units (Preview)

  • Relational design theory, normalization (1NF ⇒ BCNF ⇒ 4NF) builds on the logical schema concepts introduced here.

  • SQL forms the foundation for Intermediate & Advanced SQL in Unit-2.

  • ACID, serializability, and NoSQL topics in Unit-3 rely on understanding data independence and abstraction layers.