Database Systems Notes
Data vs. Information
- Data: Raw facts that have not yet been processed to reveal their meaning.
- Considered the building blocks of information.
- Information: The result of processing raw data to reveal its meaning.
- Requires context to be meaningful.
- Forms the bedrock of knowledge, and should be accurate, relevant, and timely.
Introducing the Database
- A shared, integrated computer structure that stores data.
- Consists of:
- End-user data: Raw facts of interest to the end-user.
- Metadata: Data about data which provides description of data characteristics and relationships.
- Utilizes a Database Management System (DBMS) that is a collection of programs to manage the database structure and control access to the data.
Role and Advantages of the DBMS
- The DBMS acts as an intermediary between the user and the database.
- Enables data sharing by presenting an integrated view of the data to the end-user.
- Offers more efficient and effective data management.
- Improves sharing, security, integration, access, decision-making capabilities and overall productivity.
Types of Databases
- Classification by number of users:
- Single-user database: Supports one user at a time; a desktop database falls under this category.
- Multiuser database: Supports multiple users simultaneously; categorized into workgroup (small number of users) and enterprise (many users across many departments) databases.
- Classification by location:
- Centralized database: Data is located at a single site.
- Distributed database: Data is distributed across different sites.
- Cloud database: Created and maintained using cloud data services with defined performance measures.
- Classification by data type:
- General-purpose database: Contains a wide variety of data used in multiple disciplines.
- Discipline-specific database: Contains data focused on specific subject areas.
- Operational database: Supports a company’s day-to-day operations.
- Analytical database: Stores historical data and business metrics used for tactical or strategic decision-making.
- Data warehouse: Stores data optimized for decision support.
- Online analytical processing (OLAP): Tools for retrieving, processing, and modeling data from the data warehouse.
- Business intelligence: Captures and processes business data to generate information that supports decision-making.
- Classification by degree of structure:
- Unstructured data: Exists in its original (raw) state.
- Structured data: Results from formatting, with structure applied based on the type of processing to be performed.
- Semistructured data: Processed to some extent, often using Extensible Markup Language (XML) to represent data elements in textual format.
Why Database Design Is Important
- Database design focuses on the structure that will be used to store and manage end-user data.
- A well-designed database facilitates efficient data management and generates accurate, valuable information.
- A poorly designed database can lead to errors that are difficult to trace, potentially resulting in poor decision-making.
Evolution of File System Data Processing
- Manual file systems: Accomplished through a system of file folders and filing cabinets.
- Computerized file systems: Data processing (DP) specialists created computer-based systems to track data and produce required reports.
- File system redux: modern end-user productivity tools, including spreadsheet programs like Microsoft Excel.
Basic File Terminology
| TERM | DEFINITION |
|---|---|
| Data | Raw facts, such as a telephone number, a birth date, a customer name, and a year-to-date (YTD) sales value. Data has little meaning unless it has been organized in some logical manner. |
| Field | A character or group of characters (alphabetic or numeric) that has a specific meaning. A field is used to define and store data. |
| Record | A logically connected set of one or more fields that describes a person, place, or thing. |
| File | A collection of related records. |
Problems with File System Data Processing
- Problems include:
- Lengthy development times
- Difficulty obtaining quick answers
- Complex system administration
- Lack of security and limited data sharing
- Extensive programming requirements
Structural and Data Dependence
- Structural dependence: Access to a file is dependent on its structure; file system programs must be modified to conform to new file structures.
- Structural independence: File structure can be changed without affecting the application’s ability to access the data.
- Data dependence: Data access changes when data storage characteristics change.
- Data independence: Data storage characteristics can be changed without affecting the program’s ability to access the data.
- The practical significance of data dependence is the difference between logical and physical format.
Data Redundancy
- Unnecessarily storing the same data in different places, leading to scattered data locations (islands of information).
- Increases the probability of having inconsistent versions of the same data.
- Possible results of uncontrolled data redundancy include:
- Poor data security
- Data inconsistency
- Data-entry errors
- Data integrity problems
Data Anomalies
- Develop when not all of the required changes in the redundant data are made successfully
- Types of anomalies:
- Update anomalies
- Insertion anomalies
- Deletion anomalies
Database Systems
- Logically related data stored in a single logical data repository but physically distributed among multiple storage facilities.
- DBMS eliminates most of the file system's data inconsistency, data anomaly, data dependence, and structural dependence problems.
- Current generation DBMS software:
- Stores data structures, relationships between structures, and access paths.
- Defines, stores, and manages all access paths and components.
The Database System Environment
- Defined as the organization of components that define and regulate the collection, storage, management, and use of data within a database environment.
- Composed of:
- Hardware
- Software
- People
- Procedures
- Data
DBMS Functions
- Data dictionary management: Stores definitions of data elements and their relationships.
- Data storage management: Performance tuning ensures efficient performance.
- Data transformation and presentation: Data is formatted to conform to logical expectations.
- Security management: Enforces user security and data privacy.
- Multiuser access control: Sophisticated algorithms ensure that multiple users can access the database concurrently without compromising its integrity.
- Backup and recovery management: Enables recovery of the database after a failure.
- Data integrity management: Minimizes redundancy and maximizes consistency.
- Database access languages and application programming interfaces:
- The query language lets the user specify what must be done without having to specify how.
- Structured Query Language (SQL) is the standard query language supported by most DBMS vendors.
- The query language lets the user specify what must be done without having to specify how.
- Database communication interfaces: Accept end-user requests via multiple different network environments.
Managing the Database System: A Shift in Focus
- Disadvantages of database systems:
- Increased costs
- Management complexity
- Maintaining currency
- Vendor dependence
- Frequent upgrade/replacement cycles
Preparing for Your Database Professional Career
| JOB TITLE | DESCRIPTION | SAMPLE SKILLS REQUIRED |
|---|---|---|
| Database Developer | Create and maintain database-based applications | Programming, database fundamentals, SQL |
| Database Designer | Design and maintain databases | Systems design, database design, SQL |
| Database Administrator | Manage and maintain DBMS and databases | Database fundamentals, SQL, vendor courses |
| Database Analyst | Develop databases for decision support reporting | SQL, query optimization, data warehouses |
| Database Architect | Design and implementation of database environments (conceptual, logical, physical) | DBMS fundamentals, data modeling, SQL, hardware knowledge, etc. |
| Database Consultant | Help companies leverage database technologies to improve business processes | Database fundamentals, data modeling, database design, SQL, DBMS, hardware, vendor-specific technologies, etc. |
| Database Security Officer | Implement security policies for data administration | DBMS fundamentals, database administration, SQL, data security technologies, etc. |
| Cloud Computing Data Architect | Design and implement the infrastructure for next-generation cloud database systems | Internet technologies, cloud storage technologies, data security, performance tuning, large databases, etc. |
| Data Scientist | Analyze large amounts of varied data to generate insights and predictable behaviors | Data analysis, statistics, advanced mathematics, SQL, programming, data mining, machine learning, data visualization. |
Summary
- Data consists of raw facts and is usually stored in a database.
- Database design defines the structure, and database types can be classified according to number of users, location, data usage, and structure.
- Databases evolved from manual and computerized file systems, but file system data management has limitations.
- DBMSs were developed to address the file system's inherent weaknesses.