Database Management Systems – Unit 1 Notes
Overview of DBMS
DBMS = Database + Management System
Database: collection of inter-related data.
Management System: set of programs that store & retrieve data efficiently.
Formal definition: “A DBMS is a collection of inter-related data and a set of programs to store and access those data in an easy and effective manner.”
Core purpose: manage data for organisations (e.g., University storing students, teachers, courses, books, etc.)
Need for DBMS
Storage efficiency
Redundant (duplicate) data removed → occupies less space.
Example: A person with two bank accounts stored once instead of twice.
Fast retrieval
Structured querying provides quick access.
Basic data-management operations supported
Add, store, update, delete, and retrieve data.
Applications of DBMS (Industry Examples)
Airlines: reservations, flight schedules.
Telecom: call detail records, customer data, network usage.
Universities: course registration, results, grades.
Sales: products, purchases, customer relationships.
Banking: all financial transactions.
Education sector: staff, student, course information.
Online shopping: product catalogue, billing, shipping.
File Processing System vs. DBMS
Advantages of File Processing System
Cost-friendly (often free, OS-supplied tools).
Easy to learn and use.
Highly scalable in file size.
Drawbacks of File Processing System
Data redundancy → higher storage cost & slower access.
Data inconsistency (mismatched copies after updates).
Data isolation (scattered files, varied formats → complex coding for retrieval).
Difficulty in ad-hoc access; every new requirement needs a new program.
Application-data dependency (file changes ⇒ program changes).
Integrity problems: enforcing complex constraints across files is hard.
Atomicity issues: hard to ensure “all-or-nothing” execution (e.g., fund transfer).
Concurrency anomalies: uncontrolled simultaneous updates yield inconsistency.
Security limitations: fine-grained user access difficult.
Advantages of DBMS over File System
Eliminates redundancy via normalization.
Enforces data consistency & integrity.
Easier implementation of security & privacy (access control).
Faster, easier data access using declarative queries.
Built-in backup & recovery → simple disaster recovery.
Greater flexibility and extensibility.
Data Abstraction & Three-Level Architecture
Purpose: hide complexity and provide user-specific views.
Levels of abstraction:
Internal / Physical Level
Lowest level; describes how data are physically stored (files, indices, byte offsets).
Conceptual / Logical Level
Middle level; describes what data are stored and relationships.
Example record-type declaration:
External / View Level
Highest level; tailored views for end-users or programs, hiding irrelevant details (e.g., salary hidden from students).
Multiple external views can coexist → limitless personalized perspectives.
Three-Schema Architecture (Mapping of Schemas)
Internal Schema = Physical Schema (maps conceptual schema to storage structures).
Conceptual Schema = Logical Schema (overall logical design).
External Schema = View Schema (user-specific views).
Objectives
Users share one database yet see customized views.
Users unaffected by storage details.
DBA can change storage without altering user views.
Physical changes should not disturb logical structure.
Schema vs. Instance
Schema: formal design/blueprint (structure, rules, relationships).
Three kinds: Physical, Logical, View schemas.
Relatively stable; rarely changes.
Analogy: standard-house plan or table template.
Instance: actual data stored at a specific moment.
Highly dynamic; changes with every insert/delete/update.
Analogy: individual built houses or data-filled table rows.
Question answered: It is the instance that changes frequently, not the schema.
Data Independence
Ability to change one schema level without affecting higher levels.
Two flavours:
Physical Data Independence
Modify physical storage (e.g., new indexes, file organisation, SSD swap) without changing logical or external schemas.
Enables performance tuning.
Logical Data Independence
Modify logical schema (e.g., add/remove attributes, apply constraints) without altering external views or application code.
Visual summary (as presented):
Importance: supports evolution of DBMS while protecting applications and users from cascading changes.
Database Languages (SQL Components)
In practice, SQL integrates all components, but conceptually we distinguish four sub-languages.
1. Data Definition Language (DDL)
Defines database schema objects (tables, indexes, constraints).
Major commands: CREATE, ALTER, DROP, TRUNCATE.
Example:
CREATE TABLE instructor (
ID CHAR(5),
name VARCHAR(20),
dept_name VARCHAR(20),
salary NUMERIC(8,2)
);
DDL compiler produces templates stored in data dictionary (metadata repository) containing schema, integrity constraints, authorizations.
2. Data Manipulation Language (DML)
Accesses & manipulates data.
Commands: SELECT, INSERT, UPDATE, DELETE.
3. Data Control Language (DCL)
Grants or revokes privileges.
Commands: GRANT, REVOKE.
4. Transaction Control Language (TCL)
Manages transaction boundaries & persistence.
Commands: COMMIT (make changes permanent), ROLLBACK (undo changes).
Analogies & Metaphors Mentioned
Data abstraction ≈ using a remote control to operate a TV without knowing internal circuits.
Schema ≈ blueprint of the TV showing all components.
Schema vs. Instance: blueprint (schema) vs. actual built houses (instances).
Practical / Ethical / Security Implications
Security & privacy: DBMS provides fine-grained access control (students cannot read teacher payroll).
Integrity constraints: ensures business rules (e.g., department account balance ≥ 0).
Concurrency control: prevents anomalies such as two withdrawals reading same balance (though ACID details covered in later units, concurrency motivation stated in file-system drawbacks).
Connections to Later Units (Preview)
Relational design theory, normalization (1NF ⇒ BCNF ⇒ 4NF) builds on the logical schema concepts introduced here.
SQL forms the foundation for Intermediate & Advanced SQL in Unit-2.
ACID, serializability, and NoSQL topics in Unit-3 rely on understanding data independence and abstraction layers.