Database Management Systems – Week 1 Comprehensive Notes

Introduction to Data vs. Information

  • Data

    • Raw, unprocessed facts, figures, observations, statistics and values.

    • Examples: individual exam scores, customer phone numbers, sensor readings.

  • Information

    • Data that has been processed or interpreted so it gains meaning and context for decision-making.

    • Information=f(Data,Context,Processing)\text{Information} = f(\text{Data}, \text{Context}, \text{Processing})

  • Key takeaway: Data ≠ Information; processing is required to convert the former into the latter.

Database & DBMS Fundamentals

  • Database

    • Structured record-keeping system that stores inter-related data sets.

    • Enables storage, retrieval and management of large volumes of information (e.g., Student, Subject, Faculty, Marks tables).

  • DBMS (Database Management System)

    • Software suite that creates, accesses, manages and secures databases.

    • Acts as the interface between application programs and physical data.

    • Core operations: Add/Insert, Modify/Update, Delete, Search, Report generation.

  • Why use a DBMS instead of simple files?

    • Minimises redundancy; enforces consistency; supports multiple users; provides security, backup & recovery; offers query language; optimises performance.

Core Characteristics of Modern DBMS

  • Treats data as real-world entities with attributes & behaviour.

  • Represents entities/relations as TABLES (relation-based model).

  • Data & application isolation via metadata; changes to data structure rarely require application rewrites.

  • Supports transactions (ACID), security, concurrency, query languages, and data independence.

  • Reduces redundancy, assures consistency, allows concurrent multi-user access.

Where Databases Appear (Real-World Domains)

  • Banking (accounts, transactions).

  • Airlines (reservations, schedules).

  • Universities (registrations, grades).

  • E-commerce & online retail (orders, recommendations).

  • Manufacturing & supply-chain (inventory, production, orders).

  • HR (employee records, payroll, tax).

Common DBMS Products

  • Commercial RDBMS: Oracle, Microsoft SQL Server, IBM DB2.

  • Open-source RDBMS: MySQL, PostgreSQL, MariaDB.

  • NoSQL / Document: MongoDB.

  • Desktop: Microsoft Access.

Types of Database Systems

  • RDBMS (Relational DBMS)

    • Data organised as tables (relations) with rows (tuples) and columns (attributes).

    • Uses primary keys (PK) & foreign keys (FK) to enforce relationships.

    • Example mini-schema:

    • Students(ID#, Name, Phone, DOB)

    • Courses(ClassID, Title, Credits)

    • Takes_Course(ID#, ClassID, Semester)

  • OODBMS (Object-Oriented DBMS)

    • Stores objects that encapsulate state (data) + behaviour (methods).

    • Can persist complex types: photos, signatures, audio, video, geometries.

  • ORDBMS (Object-Relational DBMS)

    • Hybrid; extends relational engine with object features (user-defined types, inheritance, multimedia columns).

    • Example domain: Bank with shared CITY/STATE/COUNTRY domain across Customer, Employee, Account tables.

Traditional File System Overview

  • Electronic data kept as independent files.

  • Flat file = single table in one file; searching requires row-by-row parsing into an in-memory array → inefficient.

  • File types in legacy processing:

    • Master files: rarely updated (e.g., Employee-Master).

    • Transaction files: day-to-day activity (e.g., Loan release records).

    • Table files: periodically updated reference tables (e.g., Interest Rate table).

    • Security/Recovery files: backup snapshots.

File System vs. Database System (Comparative)

  • Data redundancy & inconsistency: HIGH in FMS, MINIMAL in DBMS.

  • Data isolation/scattering: Present in FMS, eliminated in DBMS via unified schema.

  • Transactions & concurrency: Limited/none in FMS, full ACID & locking in DBMS.

  • Security: Basic OS-level in FMS vs granular privileges in DBMS.

  • Recovery: Manual & error-prone in FMS vs automated logging & backup in DBMS.

Problems Addressed by DBMS

  • Data Redundancy

    • Duplicate data across departments (e.g., student phone stored in Admin, Accounts, Exams) wastes space.

  • Data Inconsistency

    • Updates in one location not propagated to others → conflicting versions.

  • Concurrent Access & Recovery

    • Multiple users (ATMs) need simultaneous balance updates; DBMS locks/transactions preserve integrity.

    • Recovery subsystem tracks free disk blocks, directory trees, journals for crash restore.

Data Models

Hierarchical Model

  • Parent-child tree; single parent per child.

  • Fast, predictable access but rigid links; poor for many-to-many.

  • Example: College → Department → Programme → Student.

Network Model

  • Extends hierarchy: record can have multiple parents; supports 1:1, 1:N, M:N.

  • More flexible but link maintenance complex.

  • Example: ITEM related to both STORE and TRANSACTION segments.

Relational Model

  • Data stored in 2-D tables; SQL used for manipulation.

  • Advantages: structural flexibility, multiple user views, strong mathematical foundation.

  • Limitation: Distributed partitioning needs extra mechanisms.

Object-Oriented Model

  • Everything is an OBJECT with attributes & methods.

  • Allows user-defined access, custom data types; suited for complex multimedia data.

  • Challenges: Lacks universal standard model; fewer mature tools.

DBMS Architectures

One-Tier (Single-Tier)

  • Presentation + Business + Data in one machine (e.g., MS-Access, desktop games).

Two-Tier (Client/Server)

  • Client = GUI (presentation).

  • Server = combined Business logic + Database.

  • Clients communicate via API/ODBC for direct, faster calls.

Three-Tier (n-Tier)

  • Separate layers:

    1. Presentation (browser/mobile).

    2. Business/Application server (validation, calculations).

    3. Database server (data persistence).

  • Used by web apps (e.g., Facebook); promotes scalability, multiple views, security.

Schema Levels & Data Abstraction

  • Internal (Physical): disk blocks, indices, file structures.

  • Conceptual (Logical): global schema; entities, relationships, constraints.

  • External (View): user-specific subsets or perspectives.

  • Benefit: Data Independence – ability to change one level without affecting others.

Core Components Inside a DBMS

Query Processor

  • DDL Interpreter: parses schema definitions → data dictionary.

  • DML Compiler: converts SQL to low-level evaluation plan.

  • Query Optimizer: chooses least-cost plan.

  • Query Evaluation Engine: executes plan.

Storage Manager

  • Interface between logical requests & physical data.

  • Sub-modules:

    • Authorization & Integrity Manager (security, constraints).

    • Transaction Manager (ACID, concurrency control).

    • File Manager (disk space allocation, directory structure).

    • Buffer Manager (caches disk blocks in RAM; vital for large datasets).

Disk Storage Elements

  • Data files: actual table rows.

  • Data dictionary: schema metadata.

  • Indices: auxiliary structures for fast lookup.

People in the DB Environment

  • Naïve Users: invoke predefined apps (ATM teller, web customer).

  • Application Programmers: write code; may use RAD 4GL tools.

  • Sophisticated Users: analysts using ad-hoc SQL/OLAP.

  • Specialized Users: build non-traditional apps (AI knowledge bases, graphics, audio DBs).

  • System Administrators: manage hardware/OS.

  • Database Designers/Analysts: model schema & constraints.

  • Database Administrators (DBA)

    • Select HW/DB products, set standards, tune performance, ensure security, backup/recovery, permissions.

Contemporary Challenges in DBMS Construction

  1. Data Security

    • >100k publicly exposed DB instances hacked recently.

    • Must implement built-in DB security + organisational controls.

  2. Performance

    • Growing data volumes with stricter response times; DB should deliver high native performance even on modest hardware.

  3. Data Safety / Integrity

    • Must guarantee no data loss while maintaining throughput; balance between ACID and speed.

  4. Resource Utilization

    • Optimise CPU, RAM, I/O to extract maximum work per nanosecond.

  5. High Availability

    • Clustering, replication to avoid single-server downtime; akin to adding extra cash registers during rush hour.

Ethical & Practical Implications

  • Protecting confidential data (GDPR, HIPAA compliance) is both a legal and moral requirement.

  • Accurate, consistent data underpins fair decisions (e.g., loan approvals).

  • Efficient resource usage lowers energy footprint, contributing to sustainability.

Quick Numerical / Formal References

  • ACID properties:

    1. Atomicity

    2. Consistency

    3. Isolation

    4. Durability

  • Basic SQL selection: \text{SELECT } * \text{ FROM Students WHERE Age} > 18

Summary Checklist (Week 1)

  • Differentiated Data vs Information.

  • Defined Database, DBMS, RDBMS, OODBMS, ORDBMS.

  • Analysed File types and limitations vs DBMS strengths.

  • Explored four classical Data Models.

  • Detailed 1-, 2- and 3-tier Architectures; Internal–Conceptual–External schemas.

  • Mapped DBMS internals: Query Processor, Storage Manager, Disk structures.

  • Identified user roles & DBA responsibilities.

  • Highlighted five modern DB challenges: Security, Performance, Safety, Resource, Availability.

Use these structured notes as a comprehensive replacement for the original Week-1 slides and lectures.