Database Systems Notes
Database Systems
Data vs. Information
/
Data:
Raw facts that have not been processed to reveal their meaning.
Information:
The result of processing raw data to reveal its meaning.
Used to facilitate decision making.
Consists of transformed data.
Database Definition
A shared, integrated computer structure that houses a collection of:
End-user data: Raw facts of interest to the end user.
Metadata: Data about data, used to integrate and manage end-user data.
Database Management System (DBMS)
Definition:
A collection of programs that manages the database structure and controls access to the data stored in the database.
Acts as an intermediary between the user and the database.
The database structure is stored as a collection of files, accessible only through the DBMS.
Advantages of DBMS
Improved data sharing:
serves as the intermediary between the user and the database.
structure itself is stored as a collection of files
Improved data security:
the more users access the data, the greater the risks of data security breaches.
Better data integration:
wider access to well-managed data promotes and integrated view of the organization’s operations
Minimized data inconsistency:
Data inconsistency: different versions of the same data appearing in different places.
Improved data access:
possible quick answers to ad hoc queries.
Query: a specific request issued to the DBMS for data manipulation (e.g., read or update the data).
Improved decision making:
Better-managed data and improved data access lead to generate better-quality information.
The quality of information depends on the quality of the underlying data.
Increased end-user productivity:
Availability of data and tools transform data to make quick, informed decisions.
Types of Databases
Classification based on the number of users:
Single-user database: Supports only one user at a time.
Desktop database: A single-user database that runs on a personal computer.
Multiuser database: Supports multiple users at the same time.
Workgroup database: Supports a relatively small number of users or a specific department within an organization.
Enterprise database: Used by the entire organization and supports many users across many departments.
Classification based on location:
Centralized database: Supports data located at a single site.
Distributed database: Supports data distributed across several different sites.
Cloud database: Created and maintained using cloud services (e.g., Microsoft Azure, Amazon AWS).
Classification based on purpose:
General-purpose database: Contains a wide variety of data used in multiple disciplines.
Discipline-specific database: Contains data focused on specific subject areas.
Operational database: Designed primarily to support a company's day-to-day operations.
Analytical database: Focused on storing historical data and business metrics for tactical or strategic decision making.
Importance of Database Design
Database design focuses on the structure that will be used to store and manage end-user data.
A well-designed database is crucial for meeting user requirements.
Insufficient attention to database design can occur due to:
Insufficient specifications and/or poor logical data modeling.
Not enough time in the development schedule.
Too many changes during the development cycle.
Database design assigned to novices.
Transforming a logical design to a physical database includes:
Transforming entities into tables.
Transforming attributes into columns.
Transforming domains into data types and constraints.
Transforming relationships into primary and foreign keys.
File System Data Processing Issues
File systems were an improvement over manual systems but had limitations.
Understanding the shortcomings of file systems helps in understanding the development of modern databases.
Problems with file systems:
Lengthy development times:
Simple data-retrieval tasks require extensive programming.
Older file systems required programmers to specify what must be done and how to do it.
Modern databases use nonprocedural data manipulation languages, allowing users to specify what must be done without specifying how.
Difficulty of getting quick answers:
The need to write programs for even simple reports makes ad hoc queries impossible.
Complex system administration:
System administration becomes difficult as the number of files expands.
Lack of security and limited data sharing:
Data sharing and security are closely related.
Sharing data among multiple users introduces security risks.
Extensive programming:
Making changes to an existing file structure can be difficult in a file system environment.
Data Dependence and Independence
Structural dependence:
A change in the database schema affects data access, requiring changes in all access programs.
Structural independence:
Changes in the database schema do not affect data access.
Data dependence:
Data representation and manipulation are dependent on the physical data storage characteristics.
Data independence:
Data access is unaffected by changes in the physical data storage characteristics.
Data Redundancy
Data redundancy: The same data is stored unnecessarily at different places.
Uncontrolled data redundancy leads to:
Poor data security:
Multiple copies of data increase the chances of unauthorized access.
Data inconsistency:
Different and conflicting versions of the same data appear in different places.
Data-entry errors:
More likely to occur when complex entries are made in several different files or recur frequently in one or more files.
Data integrity problems:
Errors such as entering a nonexistent sales agent's name and phone number.
Data Anomalies
Data abnormality: Inconsistent changes have been made to a database.
Develops when not all required changes in the redundant data are made successfully.