Database Systems Notes

Database Systems

Data vs. Information

/

  • Data:

    • Raw facts that have not been processed to reveal their meaning.

  • Information:

    • The result of processing raw data to reveal its meaning.

    • Used to facilitate decision making.

    • Consists of transformed data.

Database Definition

  • A shared, integrated computer structure that houses a collection of:

    • End-user data: Raw facts of interest to the end user.

    • Metadata: Data about data, used to integrate and manage end-user data.

Database Management System (DBMS)

  • Definition:

    • A collection of programs that manages the database structure and controls access to the data stored in the database.

    • Acts as an intermediary between the user and the database.

    • The database structure is stored as a collection of files, accessible only through the DBMS.

Advantages of DBMS

  • Improved data sharing:

    • serves as the intermediary between the user and the database.

    • structure itself is stored as a collection of files

  • Improved data security:

    • the more users access the data, the greater the risks of data security breaches.

  • Better data integration:

    • wider access to well-managed data promotes and integrated view of the organization’s operations

  • Minimized data inconsistency:

    • Data inconsistency: different versions of the same data appearing in different places.

  • Improved data access:

    • possible quick answers to ad hoc queries.

    • Query: a specific request issued to the DBMS for data manipulation (e.g., read or update the data).

  • Improved decision making:

    • Better-managed data and improved data access lead to generate better-quality information.

    • The quality of information depends on the quality of the underlying data.

  • Increased end-user productivity:

    • Availability of data and tools transform data to make quick, informed decisions.

Types of Databases

  • Classification based on the number of users:

    • Single-user database: Supports only one user at a time.

      • Desktop database: A single-user database that runs on a personal computer.

    • Multiuser database: Supports multiple users at the same time.

      • Workgroup database: Supports a relatively small number of users or a specific department within an organization.

      • Enterprise database: Used by the entire organization and supports many users across many departments.

  • Classification based on location:

    • Centralized database: Supports data located at a single site.

    • Distributed database: Supports data distributed across several different sites.

  • Cloud database: Created and maintained using cloud services (e.g., Microsoft Azure, Amazon AWS).

  • Classification based on purpose:

    • General-purpose database: Contains a wide variety of data used in multiple disciplines.

    • Discipline-specific database: Contains data focused on specific subject areas.

    • Operational database: Designed primarily to support a company's day-to-day operations.

    • Analytical database: Focused on storing historical data and business metrics for tactical or strategic decision making.

Importance of Database Design

  • Database design focuses on the structure that will be used to store and manage end-user data.

  • A well-designed database is crucial for meeting user requirements.

  • Insufficient attention to database design can occur due to:

    • Insufficient specifications and/or poor logical data modeling.

    • Not enough time in the development schedule.

    • Too many changes during the development cycle.

    • Database design assigned to novices.

  • Transforming a logical design to a physical database includes:

    • Transforming entities into tables.

    • Transforming attributes into columns.

    • Transforming domains into data types and constraints.

    • Transforming relationships into primary and foreign keys.

File System Data Processing Issues

  • File systems were an improvement over manual systems but had limitations.

  • Understanding the shortcomings of file systems helps in understanding the development of modern databases.

  • Problems with file systems:

    • Lengthy development times:

      • Simple data-retrieval tasks require extensive programming.

      • Older file systems required programmers to specify what must be done and how to do it.

      • Modern databases use nonprocedural data manipulation languages, allowing users to specify what must be done without specifying how.

    • Difficulty of getting quick answers:

      • The need to write programs for even simple reports makes ad hoc queries impossible.

    • Complex system administration:

      • System administration becomes difficult as the number of files expands.

    • Lack of security and limited data sharing:

      • Data sharing and security are closely related.

      • Sharing data among multiple users introduces security risks.

    • Extensive programming:

      • Making changes to an existing file structure can be difficult in a file system environment.

Data Dependence and Independence

  • Structural dependence:

    • A change in the database schema affects data access, requiring changes in all access programs.

  • Structural independence:

    • Changes in the database schema do not affect data access.

  • Data dependence:

    • Data representation and manipulation are dependent on the physical data storage characteristics.

  • Data independence:

    • Data access is unaffected by changes in the physical data storage characteristics.

Data Redundancy

  • Data redundancy: The same data is stored unnecessarily at different places.

  • Uncontrolled data redundancy leads to:

    • Poor data security:

      • Multiple copies of data increase the chances of unauthorized access.

    • Data inconsistency:

      • Different and conflicting versions of the same data appear in different places.

    • Data-entry errors:

      • More likely to occur when complex entries are made in several different files or recur frequently in one or more files.

    • Data integrity problems:

      • Errors such as entering a nonexistent sales agent's name and phone number.

Data Anomalies

  • Data abnormality: Inconsistent changes have been made to a database.

  • Develops when not all required changes in the redundant data are made successfully.