module 1 dbms

Database Management Systems (BCS403) - 2024-25

Module Overview

  • Instructor: Dr. Narender M

  • Position: Associate Professor, Department of Computer Science and Engineering

  • Institution: The National Institute of Engineering

Course Topics

  • Introduction to Databases

    • Characteristics of the database approach

    • Advantages of using the DBMS approach

    • History of database applications

  • Data Models, Schemas, and Instances

    • Database languages and interfaces

    • Database system environment

  • Overview of Database Languages and Architectures

    • Entity types, entity sets, and structural constraints

    • Weak entity types

    • ER diagrams

    • Specialization and Generalization

    • Conceptual Data Modeling using Entities and Relationships

Academic Calendar

  • Includes: Syllabus, Evaluation Pattern, Abridged Lesson Plan

Introduction to Databases

  • Definition: A database is a collection of related data that represents known facts with implicit meaning.

  • Example: Names, telephone numbers, addresses.

  • Storage: Can be indexed in an address book or stored in software like Microsoft Access or Excel.

Implicit Properties of Databases
  • Real World Representation: A database reflects some aspect of the real world (miniworld or universe of discourse).

  • Coherent Collection: Data must be logically coherent and meaningful.

  • Specific Purpose: Designed for a specific group of users and applications.

Accuracy and Reliability
  • Must reflect the miniworld accurately and update changes promptly.

  • Can be of any size and complexity; can be maintained manually or computerized.

Database Management System (DBMS)
  • Definition: A computerized system enabling the creation and maintenance of databases.

  • General-Purpose Software: Facilitates defining, constructing, manipulating, and sharing databases.

  • Defining: Involves specifying data types, structures, and constraints, stored as metadata in a catalog or dictionary.

Database Functions
  1. Constructing: Storing data on a medium controlled by the DBMS.

  2. Manipulating: Querying, updating, and generating reports from the database.

  3. Sharing: Multiple users and programs access the database simultaneously, referring to the combined software as a database system.

Characteristics of the Database Approach

  • Traditional File Processing vs. Database Approach:

    • In file processing, users define and implement files, leading to redundancy and wasted storage.

    • In a database approach, a single repository maintains data defined once and accessed repeatedly through queries and transactions.

Main Characteristics

  1. Self-describing Nature

    • Contains descriptions of database structure and constraints (meta-data stored in the DBMS catalog).

    • Newer NOSQL systems may not require meta-data as they store self-describing data.

  2. Insulation Between Programs and Data

    • Changes in data structures do not typically necessitate changes in application programs due to program-data independence.

    • Users can define operations on data as part of database definitions (in object-oriented database systems).

  3. Support of Multiple Views of Data

    • Users may require different perspectives; views can be subsets or derived virtual data.

  4. Sharing of Data and Multiuser Transaction Processing

    • Multiuser DBMS must permit simultaneous access, ensuring concurrency control and transaction processing (OLTP).

    • Transactions ensure data integrity through isolation (appearing to execute independently) and atomicity (all or nothing execution).

Advantages of Using the DBMS Approach

  • Additional capabilities for design, administration, and use of multiuser databases.

Key Advantages

  1. Controlling Redundancy

    • Traditional processing results in file redundancy causing wasted space and inconsistencies.

    • DBMS aims for storing logical data items in one place (data normalization) while allowing controlled redundancy for performance (denormalization).

  2. Restricting Unauthorized Access

    • Users have varied permissions (e.g., read-only vs. read/update).

    • DBMS should provide security subsystems for account management and access restrictions.

  3. Providing Persistent Storage for Program Objects

    • Allows data persistence beyond program execution, integrating easily with programming languages (C++, Java).

    • Helps in solving the impedance mismatch problem between language and data structures.

  4. Efficient Query Processing

    • Specialized structures (indices) enhance query response speed, with buffering mechanisms for temporary storage in main memory.

    • DBMS query optimization enhances overall efficiency.

  5. Backup and Recovery

    • Essential for recovering from failures, ensuring data restoration post-update failures.

    • Disk backups are vital for catastrophic failures.

  6. Multiple User Interfaces

    • User interfaces should cater to casual users, programmers and DBA with various styles (GUIs, forms, query languages).

  7. Complex Relationship Representation

    • DBMS can define complex relationships and dynamically update as necessary.

  8. Enforcing Integrity Constraints

    • DBMS must ensure integrity constraints are met, such as key uniqueness and referential integrity.

  9. Inferencing and Actions with Rules and Triggers

    • Some DBMS allow definitions for rules that infer new information and triggers that automate actions upon data modifications.

  10. Standards Enforcement

    • Ensures consistency in data element naming, formats, and definitions across an organization.

  11. Reduced Application Development Time

    • Utilizing DBMS capabilities significantly cuts the time for creating new applications.

  12. Flexibility and Up-to-Date Information

    • DBMS can adapt to structure changes, allowing users to view updates immediately, vital in transactional systems (e.g., banks, reservations).

  13. Economies of Scale

    • Consolidation of resources leads to cost savings and allows investments in better technology instead of redundant purchases.

History of Database Applications

  1. Early Applications: Used hierarchical and network systems, mixing conceptual relationships with physical storage; lacked sufficient abstraction and independence.

  2. Relational Databases: Separated physical from conceptual representations; introduced higher-level querying efficiency.

  3. Object-Oriented Databases: Emerged in response to complex object storage needs; integrated into relational systems to create ORDBMSs.

  4. Big Data and NOSQL Databases: Arma of applications required new databases for handling volume, speed, and verification outpacing traditional SQL systems.

Data Models, Schemas, and Instances

  • Data Abstraction: Hiding details of data organization for an improved understanding of essential database features.

  • Data Model: Concepts that describe the structure, manipulative operations, and constraints of databases.

Types of Data Models

  1. High-Level or Conceptual Models: Overall structure, understanding entities, attributes, relationships (e.g., ER models).

  2. Low-Level or Physical Models: Deal with storage aspects (e.g., file structures, indexing).

  3. Representational Models: Bridge between conceptual and physical levels; implementation of conceptual models in DBMS languages.

Key Concepts
  • Database Schema: Description of overall database structure, types, and constraints.

  • Database State: The actual data at a specific moment, differing from schemas which change infrequently compared to the rapidly updated state.

Three-Schema Architecture
  • Proposed to separate user applications from the physical database to promote independence and multiple views.

  • Levels: Internal schema (storage structures), Conceptual schema (user interactions), External schemas (user views).

  • Data Independence:

    • Logical Independence: Changes to the conceptual schema without external schema changes.

    • Physical Independence: Internal schema changes without implications for the conceptual schema.

Database Languages and Interfaces

Types of Database Languages

  1. Data Definition Language (DDL): For defining schemas (CREATE, ALTER, DROP).

  2. Data Manipulation Language (DML): For retrieving and updating data (SELECT, INSERT, DELETE).

  3. Data Control Language (DCL): For controlling access (GRANT, REVOKE).

Database Interfaces

  • User-friendly interfaces include menu-based, form-based, graphical, natural language, speech, and specific designs for database programmers or the DBA.

Entity-Relationship Modeling

Concepts

  1. Entity: Independent existence about which data is collected.

  2. Attributes: Characteristics describing entities; can be simple, composite, single-valued, multi-valued, stored, or derived.

  3. Relationships: Associations between entities categorized by degree (Unary, Binary, Ternary).

  4. Weak Entity Types: Dependent on a strong entity for existence, identified by partial keys.

  5. Specialization and Generalization: Creating subclasses of entities based on distinguishing characteristics or unifying attributes across different types.

ER Diagrams

  • Visual representations of entities, attributes, and their relationships structured to capture functional requirements clearly.

Constraints

  1. Cardinality Ratios: Maximum participation in relationships (1:1, 1:N, M:N).

  2. Participation Constraints: Determine whether entities must be related (Total vs. Partial participation).


Conclusion

The first module of Database Management Systems establishes foundational knowledge covering database definitions, characteristics, DBMS advantages, historical evolution, and data modeling concepts vital for advanced learning in data management practices and systems design.