Database Concepts - Chapter 1: Database Systems
What is Data?
- Raw, unprocessed facts
- Examples: John, 90, Red, 2025-04-14
- Data constitutes the building blocks of information.
- No meaning without context
- In a school system, '30' could mean a test score or student age
- Raw data must be properly formatted for storage, processing, and presentation. For example, dates might be stored in day-month-year format or month/day/year format.
- Information: Processed data that is meaningful.
- Example: 'John scored 30 on his math test'
- Accurate, relevant, and timely information is the key to good decision making.
- Good decision making is the key to organizational survival in a global environment.
- Knowledge: The body of information and facts about a specific subject.
- Knowledge implies familiarity, awareness, and understanding of information as it applies to an environment.
- Figure 1.2 Transforming Raw Data into Information illustrates how raw data becomes information and then knowledge.
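The points about formatting and context can be sketched in a few lines of Python. This is only an illustration: the raw date string and the record layout are hypothetical, not taken from the textbook. It shows that the same raw characters parse to different dates under different formats, and that adding context turns raw facts into information.

```python
from datetime import datetime

raw = "04/05/2025"  # hypothetical raw value with no stated format

# The same characters yield two different dates depending on the assumed format.
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()

print(as_day_first)    # 2025-05-04
print(as_month_first)  # 2025-04-05

# Adding context (who, what, which subject) turns raw facts into information.
record = {"student": "John", "subject": "math", "score": 30}
information = (
    f"{record['student']} scored {record['score']} on his {record['subject']} test"
)
print(information)  # John scored 30 on his math test
```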
Data Management
- A process that focuses on data collection, storage, and retrieval.
- Common data management functions include: addition, deletion, modification, listing.
- Efficient data management typically requires the use of a computer database.
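As a rough sketch of the four common functions (addition, modification, deletion, listing), the following uses Python's built-in sqlite3 module with an in-memory database. The student table and its values are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Addition
conn.execute("INSERT INTO student (id, name) VALUES (?, ?)", (1, "John"))
conn.execute("INSERT INTO student (id, name) VALUES (?, ?)", (2, "Maria"))

# Modification
conn.execute("UPDATE student SET name = ? WHERE id = ?", ("Jon", 1))

# Deletion
conn.execute("DELETE FROM student WHERE id = ?", (2,))

# Listing
rows = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()
print(rows)  # [(1, 'Jon')]
```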
What is a Database?
- A shared, integrated computer structure that houses a collection of related data.
- A database contains two types of data:
- End-user data: raw facts of interest to the end user.
- Metadata: Data about data. It describes the data characteristics and relationships (type, size, constraints).
- Example: StudentID is an integer, primary key, non-null.
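One way to see the difference between end-user data and metadata is to ask a DBMS to describe a table. The sketch below assumes SQLite (through Python's sqlite3 module) and a hypothetical student table; other DBMSs expose metadata through different catalog views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " StudentID INTEGER NOT NULL PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
conn.execute("INSERT INTO student VALUES (1, 'John')")

# End-user data: the facts the user cares about.
end_user_data = conn.execute("SELECT StudentID, name FROM student").fetchall()

# Metadata: PRAGMA table_info returns one row per column, laid out as
# (cid, name, type, notnull, dflt_value, pk).
metadata = conn.execute("PRAGMA table_info(student)").fetchall()
for cid, name, col_type, notnull, default, pk in metadata:
    print(name, col_type, "NOT NULL" if notnull else "NULL allowed", "PK" if pk else "")
```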
What is a DBMS?
- The collection of programs that manages the database structure and controls access to the data stored in the database.
- Serves as the intermediary between the user and the database.
- The database structure is stored as a collection of files that can be accessed through the DBMS.
- Examples: MySQL, Oracle, SQL Server, PostgreSQL.
DBMS Advantages
- Improved data sharing.
- Improved data security.
- Better data integration.
- Minimized data inconsistency. Data inconsistency exists when different versions of the same data appear in different places.
- Improved data access.
- Improved decision making.
- Data quality: A comprehensive approach to ensuring the accuracy, validity, and timeliness of data.
- Increased end-user productivity.
Types of Databases
- A DBMS can be used to build many types of databases. Each database stores a particular collection of data and is used for a specific purpose.
- Databases can be classified by:
- Number of users: single-user database, desktop database, multiuser database, workgroup database, enterprise database
- Data location: centralized database, distributed database, cloud database
- Data type: general-purpose database, discipline-specific database
- Data usage: operational database, analytical database
- Data structure: unstructured data, structured data, semistructured data
Types of Databases (cont.)
- single-user database: A database that supports only one user at a time.
- desktop database: A single-user database that runs on a personal computer.
- multiuser database: A database that supports multiple concurrent users.
- workgroup database: A multiuser database that usually supports fewer than 50 users or is used for a specific department in an organization.
- enterprise database: The overall company data representation, which provides support for present and expected future needs.
Types of Databases (cont.)
- centralized database: A database located at a single site.
- distributed database: A logically related database that is stored in two or more physically independent sites.
- cloud database: A database that is created and maintained using cloud services, such as Microsoft Azure or Amazon AWS.
- general-purpose database: A database that contains a wide variety of data used in multiple disciplines.
Types of Databases (cont.)
- discipline-specific database: A database that contains data focused on a specific subject area.
- operational database: A database designed primarily to support a company’s day-to-day operations. Also known as a transactional database, OLTP database, or production database.
Types of Databases (cont.)
- analytical database: A database focused primarily on storing historical data and business metrics used for tactical or strategic decision making. Analytical databases comprise two main components: a data warehouse and an online analytical processing front end.
- data warehouse: A specialized database that stores historical and aggregated data in a format optimized for decision support.
- online analytical processing (OLAP): A set of tools that provide advanced data analysis for retrieving, processing, and modeling data from the data warehouse.
- business intelligence: A set of tools and processes used to capture, collect, integrate, store, and analyze data to support business decision making.
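The contrast between operational and analytical use can be sketched with a single hypothetical sale table in SQLite. In practice the analytical side usually runs against a separate data warehouse; one table is used here only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, sale_date TEXT, amount REAL)")

# Operational (transactional) work: record individual events as they happen.
sales = [("2025-01-10", 20.0), ("2025-01-15", 35.0), ("2025-02-03", 50.0)]
conn.executemany("INSERT INTO sale (sale_date, amount) VALUES (?, ?)", sales)

# Analytical work: summarize historical data to support decisions.
monthly = conn.execute(
    "SELECT substr(sale_date, 1, 7) AS month, SUM(amount) "
    "FROM sale GROUP BY month ORDER BY month"
).fetchall()
print(monthly)  # [('2025-01', 55.0), ('2025-02', 50.0)]
```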
Types of Databases (cont.)
- unstructured data: Data that exists in its original, raw state; that is, in the format in which it was collected.
- structured data: Data that has been formatted to facilitate storage, use, and information generation.
- semistructured data: Data that has already been processed to some extent.
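To make the three categories concrete, the sketch below represents the same fact in three forms. The sample values are hypothetical, and JSON is used as a commonly cited example of a semistructured format; the notes themselves do not name it.

```python
import json
import re

# Unstructured: free text exactly as it was collected.
unstructured = "John scored 30 on his math test"

# Semistructured: partly processed, with self-describing keys but no fixed schema.
semistructured = json.loads(
    '{"student": "John", "score": 30, "notes": ["retake allowed"]}'
)

# Structured: formatted to a fixed layout (name, subject, score) for storage and use.
structured = ("John", "math", 30)

# Using the unstructured text in a calculation requires extra processing first.
score_from_text = int(re.search(r"\d+", unstructured).group())
print(score_from_text + 5, semistructured["score"] + 5, structured[2] + 5)
```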
Types of Databases (cont.)
- Extensible Markup Language (XML): A metalanguage used to represent and manipulate data elements. Unlike other markup languages, XML permits the manipulation of a document’s data elements.
- XML database: A database system that stores and manages semistructured XML data.
- NoSQL (Not only SQL): A new generation of DBMS that is not based on the traditional relational database model. NoSQL databases are designed to handle the unprecedented volume of data, variety of data types and structures, and velocity of data operations.
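The claim that XML permits manipulation of a document's data elements can be illustrated with Python's standard xml.etree.ElementTree module. The document below is a made-up example; note that the second student omits the score element, which is typical of semistructured data.

```python
import xml.etree.ElementTree as ET

xml_text = """
<students>
  <student id="1"><name>John</name><score>30</score></student>
  <student id="2"><name>Maria</name></student>
</students>
"""

root = ET.fromstring(xml_text)

# Read data elements; findtext returns None where an element is missing.
for student in root.findall("student"):
    print(student.get("id"), student.findtext("name"), student.findtext("score"))

# Manipulate one data element in place and serialize the document again.
root.find("student[@id='1']/score").text = "35"
updated = ET.tostring(root, encoding="unicode")
```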
Database Design
- The process that yields the description of the database structure and determines the database components.
- The second phase of the database life cycle.
- A well-designed database facilitates data management and generates accurate and valuable information.
- A poorly designed database can lead to poor decision making, and poor decision making can lead to the failure of an organization.
Evolution of File System Data Processing
- Manual File Systems: Data was kept in paper-and-pencil systems, organized through file folders and filing cabinets.
- As organizations grew and reporting requirements became more complex, keeping data in a manual file system became more difficult.
- Computerized File Systems: Data was stored in computer files within the file system.
- DP (data processing) specialist: The person responsible for developing and managing a computerized file processing system.
Evolution of File System Data Processing (cont.)
- Manual file systems, computerized file systems, database systems.
Evolution of File System Data Processing (cont.)
- Basic File Terminology:
- field: A character or group of characters (alphabetic or numeric) that has a specific meaning. A field is used to define and store data.
- record: A logically connected set of one or more fields that describes a person, place, or thing.
- file: A collection of related records. For example, a file might contain data about the students currently enrolled at Gigantic University.
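The three terms map naturally onto a small comma-separated file. The sketch below uses Python's csv module on an in-memory string; the student rows are invented for illustration.

```python
import csv
import io

# A "file": a collection of related records (kept in memory here instead of on disk).
file_text = "id,name,major\n1,John,Math\n2,Maria,Physics\n"

reader = csv.DictReader(io.StringIO(file_text))
records = list(reader)             # each record describes one student
first_record = records[0]
name_field = first_record["name"]  # a field holds one value with a specific meaning

print(len(records), dict(first_record), name_field)
```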
Problems with File Systems Data Processing
- Lengthy development times.
- Difficulty of getting quick answers.
- Complex system administration.
- Lack of security and limited data sharing.
- Extensive programming.
- structural dependence: A data characteristic in which a change in the database schema affects data access, thus requiring changes in all access programs.
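Structural dependence can be sketched with a hypothetical access program that reads fields by position. When the file structure changes, the unchanged program no longer works.

```python
# Old file layout: name,score. The access program reads fields by position.
def read_scores(lines):
    return [int(line.split(",")[1]) for line in lines]

old_file = ["John,30", "Maria,42"]
print(read_scores(old_file))  # [30, 42]

# The structure changes: an id field is added in front.
new_file = ["1,John,30", "2,Maria,42"]
try:
    read_scores(new_file)
except ValueError as exc:
    # Position 1 now holds the name, so the unchanged program fails.
    print("access program broke:", exc)
```

Every program that reads this file by position would need the same fix, which is why file systems required extensive programming.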
Problems with File Systems Data Processing (cont.)
- data dependence: A data condition in which data representation and manipulation are dependent on the physical data storage characteristics.
- data redundancy: Exists when the same data is stored unnecessarily at different places.
- islands of information: In the old file system environment, pools of independent, often duplicated, and inconsistent data created and managed by different departments.
Problems with File Systems Data Processing (cont.)
- data redundancy (cont.): Uncontrolled data redundancy sets the stage for poor data security, data inconsistency, data-entry errors, and data integrity problems.
- data integrity: In a relational database, a condition in which the data in the database complies with all entity and referential integrity constraints.
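A referential integrity constraint can be demonstrated with SQLite, which enforces foreign keys once the foreign_keys pragma is switched on. The department and employee tables are assumptions for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE employee ("
    " id INTEGER PRIMARY KEY,"
    " dept_id INTEGER NOT NULL REFERENCES department(id))"
)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (10, 1)")  # valid: department 1 exists

try:
    conn.execute("INSERT INTO employee VALUES (11, 99)")  # no department 99
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```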
Problems with File Systems Data Processing (cont.)
- data anomaly: A data abnormality in which inconsistent changes have been made to a database. For example, an employee moves, but the address change is not corrected in all files in the database.
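The address example can be sketched with two hypothetical departmental files, each holding its own copy of the same employee's address.

```python
# Two departmental "files" that each store their own copy of the address.
payroll_file = {"E100": {"name": "Ana", "address": "12 Oak St"}}
benefits_file = {"E100": {"name": "Ana", "address": "12 Oak St"}}

# The employee moves, but only the payroll copy is updated.
payroll_file["E100"]["address"] = "7 Elm Ave"

# The redundant copies now disagree: a data anomaly.
inconsistent = [
    emp_id
    for emp_id in payroll_file
    if payroll_file[emp_id]["address"] != benefits_file[emp_id]["address"]
]
print(inconsistent)  # ['E100']
```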
Database System
- An organization of components that defines and regulates the collection, storage, management, and use of data in a database environment.
Database System (cont.)
- Figure: A database system contrasted with a file system.
Database System Environment
- Hardware: computers, storage devices, printers, network devices, and other devices.
- Software:
- Operating system software. Examples: Microsoft Windows, Linux, macOS, UNIX, MVS.
- DBMS software. Examples: Microsoft SQL Server, Oracle, IBM DB2.
- Application programs and utility software.
Database System Environment (cont.)
- People:
- System administrators oversee the database system's general operations.
- Database administrators (DBAs) manage the DBMS and ensure that the database functions properly.
- Database designers design the database structure.
- System analysts and programmers design and implement the application programs.
- End users use the application programs to run daily operations.
Database System Environment (cont.)
- Procedures: Instructions and rules that govern the design and use of the database system.
- Data: The collection of facts stored in the database.
Figure 1.10 The Database System Environment (textual description)
- Diagrammatic representation of roles: general description of interactions among data, DBMS, administrators, designers, analysts, end users, and hardware/software.
DBMS Functions
- Data dictionary management: The DBMS maintains a data dictionary, a component that stores metadata (data about data). The data dictionary contains data definitions as well as data characteristics and relationships.
- Data storage management: The DBMS creates and manages the structures required for data storage. This function also supports performance tuning: activities that make a database perform more efficiently in terms of storage and access speed.
DBMS Functions (cont.)
- Data transformation and presentation: The DBMS transforms entered data to conform to required data structures.
- Security management: The DBMS creates a security system that enforces user security and data privacy.
- Multiuser access control: To provide data integrity and data consistency, the DBMS uses sophisticated algorithms to ensure that multiple users can access the database concurrently without compromising its integrity.
DBMS Functions (cont.)
- Backup and recovery management: The DBMS provides backup and data recovery to ensure data safety and integrity.
- Data integrity management: The DBMS promotes and enforces integrity rules, thus minimizing data redundancy and maximizing data consistency.
- Database access languages and application programming interfaces (APIs).
- Query language: A nonprocedural language that is used by a DBMS to manipulate its data. Example: SQL.
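"Nonprocedural" means the user states what data is wanted and the DBMS works out how to retrieve it. The sketch below contrasts an SQL query with a hand-written loop that produces the same result; the student table and the threshold of 70 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("John", 30), ("Maria", 85), ("Li", 72)],
)

# Nonprocedural: state the condition; the DBMS decides how to find the rows.
declarative = conn.execute(
    "SELECT name FROM student WHERE score >= 70 ORDER BY name"
).fetchall()

# Procedural equivalent: the program spells out each step itself.
procedural = []
for name, score in conn.execute("SELECT name, score FROM student"):
    if score >= 70:
        procedural.append((name,))
procedural.sort()

print(declarative)  # [('Li',), ('Maria',)]
```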
DBMS Functions (cont.)
- Database access languages and application programming interfaces (cont.): Structured Query Language (SQL) is a powerful and flexible relational database language composed of commands that enable users to create database and table structures, perform various types of data manipulation and data administration, and query the database to extract useful information.
- Database communication interfaces.
- application programming interface (API): Software through which applications interact with each other, transmitting data, messages, status, etc.
Database Systems Challenges
- Increased costs. Database systems require sophisticated hardware and software and highly skilled personnel.
- Management complexity. Database systems interface with many technologies and have a significant impact on a company’s resources and culture.
- Maintaining currency. To maximize the efficiency of the database system, you must keep your system current.
Database Systems Challenges (cont.)
- Vendor dependence. Given the heavy investment in technology and personnel training, companies might be reluctant to change database vendors.
- Frequent upgrade/replacement cycles. DBMS vendors frequently upgrade their products by adding new functionality.
End of Chapter 1 notes