Introduction to Database Systems

Data and Information

  • Data: Raw facts or details; can be text, numbers, figures, images, etc.
    • Example: Temperatures, list of names, weights, or prices.
    • Meaningless on its own.
  • Types of Data:
    • Quantitative: Numbers (e.g., weight, age, price).
    • Qualitative: Descriptions (e.g., eye color, name, gender).
  • Information: Analyzed and organized data that has meaning and value.
    • Example: Analyzing temperature data to understand climate trends.
  • Data ➜ Analyzed ➜ Information
  • Example:
    • Data: 32°C, 34°C, 30°C, 31°C.
    • Information: "The average temperature in May is 31.75°C."

Importance of Database

  • Efficient Data Management: Organized storage for quick access.
  • Data Integrity and Accuracy: Rules (primary/foreign keys) ensure correctness.
    • Example: No duplicate student IDs.
  • Data Security: Access control via passwords and permissions.
  • Minimized Data Redundancy: Avoids duplicate data storage.
    • Example: Student name stored once.
  • Concurrent Access: Multiple users can access simultaneously without issues.
  • Backup and Recovery: Automatic data saving and restoration.
  • Data Independence: Storage changes don't break applications.
  • Easy Data Retrieval: SQL queries for quick answers.
    • Example: "Who got the highest marks?"
  • Support for Transactions (ACID properties):
    • Atomicity: All or nothing.
    • Consistency: Data stays correct.
    • Isolation: No interference between users.
    • Durability: Data stays safe after completion.

Limitations of File Systems

  • Data Redundancy: Same data stored in multiple files.
    • Example: Student’s name in multiple files.
  • Data Inconsistency: Updates missed across files.
  • No Data Integrity Checks: No automatic correctness rules.
    • Example: Age entered as 500 without checks.
  • No Concurrent Access: Multiple users can’t safely update simultaneously.
  • Weak Security: Simple read/write protections only.
  • No Backup & Recovery: Manual backups required.
  • No Support for Complex Queries: Requires detailed code.
  • Poor Scalability: Hard to manage large data.
  • Tight Coupling of Data & Program: File format changes require program changes.
  • Difficult Relationship Management: Complex links between data.
    • Example: Linking students and courses.

History of Database Systems

  • 1960s – The Beginning
    • Charles Bachman created IDS (Integrated Data Store).
    • IBM created IMS (Information Management System).
    • Early navigational databases.
      • Hierarchical model: Parent-child relationships.
      • Network model: Records with multiple parents and children.
  • 1970s – Relational Databases
    • Invented by Ted Codd.
      • Data in tables (rows and columns).
      • Easy to search.
      • More space-efficient.
    • Ted Codd won the ACM Turing Award.
    • IBM developed System R, the first relational database with SQL.
  • 1980s – Object-Oriented Databases
    • Data treated as objects.
    • OODBMS.
    • Worked well with Java, C++.
  • 1990s – The Internet Era
    • Increased demand for databases due to WWW.
    • Client-server databases became popular.
    • MySQL launched in 1995 (free, open-source).
  • 2000s – NoSQL Databases
    • NoSQL term appeared in 1998.
    • For unstructured data.
    • Great for large & fast-growing data.
    • Popular with web apps and cloud computing.
  • 2010s – Big Data & Cybersecurity
    • Huge data amounts = Big Data.
    • Databases became distributed.
    • Increased focus on security.
  • Timeline Summary:
    • 1960s: Navigational Databases (IDS, IMS)
    • 1970s: Relational Databases (SQL, System R)
    • 1980s: Object-Oriented Databases
    • 1990s: Internet & MySQL
    • 2000s: NoSQL for Unstructured Data
    • 2010s: Big Data, Distributed DBs, Cybersecurity

Functions of a DBMS

  • Data Storage, Retrieval, and Update
    • Stores data in tables.
    • SQL queries for searching and updating.
    • Example: Update student address.
  • User Access Control / Security Management
    • Controls access.
    • Logins and passwords.
    • Example: Teachers update marks; students view only.
  • Data Integrity Management
    • Ensures correctness.
    • Primary Key (unique ID).
    • Example: Can’t enter marks for a non-existent student ID.
  • Transaction Management
    • Handles actions safely (ACID properties).
      • Atomicity: All or nothing.
      • Consistency: Data remains valid.
      • Isolation: No interference.
      • Durability: Once saved, stays saved.
    • Example: Canceling transaction if the power goes off.
  • Concurrency Control
    • Many users at once.
    • Prevents update conflicts.
    • Example: Two teachers updating marks for different subjects.
  • Backup and Recovery
    • Regular backups.
    • Recovers data after crashes.
    • Example: Restore from yesterday’s backup.
  • Data Abstraction
    • Hides complexity.
      • Physical level: How data is stored.
      • Logical level: What data is stored (tables, columns).
      • View level: What users see (custom views).
    • Example: Students see only their records.
  • Data Independence
    • Change structure without changing the app.
      • Logical independence: Table design changes.
      • Physical independence: Storage changes.
    • Example: Rename column without changing the program.
  • Data Dictionary Management
    • Metadata management.
    • Table names, column types, relationships.
    • Example: