Databases & data management
Principles of Information Systems, Chapter 5: Databases and Data Management
Chapter Objectives
By the end of this chapter, you should be able to:
Identify the benefits of quality data and information.
Distinguish between data management and data governance.
Explore the fundamentals of databases.
Identify the characteristics of a relational database.
Why Learn about Databases and Data Management?
Databases capture and report consumer data.
Business data helps leaders to:
Understand operations within an organization.
Identify operating issues.
Examine causes of issues.
Identify business opportunities.
Understanding database systems and data management is essential for effective business operations.
Benefits of Quality Data and Information (1 of 3)
Characteristics of Quality Data and Information:
Accessible: Data is easy to retrieve and use.
Accurate: Data must be correct and precise.
Complete: All necessary data is present.
Economical: Cost-effective in terms of collection and storage.
Relevant: Data must pertain to the subject at hand.
Reliable: Data can be trusted.
Secure: Protection from unauthorized access.
Timely and Verifiable: Data must be available when needed and verifiable for credibility.
Benefits of Quality Data and Information (2 of 3)
Improves Decision Making:
High-quality data reduces guesswork and risk.
Increases Customer Satisfaction:
Quality data enables personalization in customer service.
Increases Sales:
Provides accurate targeting and communication with consumers.
Benefits of Quality Data and Information (3 of 3)
Enhances Innovation:
Improves efficiency, product/service quality, and customer experience.
Raises Productivity:
Employees can focus on core tasks rather than correcting errors.
Ensures Compliance:
Adherence to regulations concerning data management is essential.
Data Management (1 of 5)
Definition of Data Management:
An integrated set of functions defining the processes for obtaining, certifying, storing, securing, and processing data.
Ensures data is accessible, reliable, and timely for organizational needs.
Data Governance:
Defines roles, responsibilities, and processes to ensure trust and usability of data.
Requires business leadership and active participation for effectiveness.
Data Management (2 of 5)
Functions of Data Management:
Basic functions identified by the Data Management Association International are important for effective data management strategies.
Data Management (3 of 5)
Role of a Database Administrator (DBA):
Skilled IS professional responsible for:
Discussing data needs with business users.
Applying programming languages to create databases.
Testing and evaluating databases for performance.
Monitoring performance and implementing changes.
Ensuring data security against unauthorized access.
Data Management (4 of 5)
Data Steward:
Typically a non-IS employee responsible for managing critical data entities.
Responsibilities include:
Identifying and acquiring new data sources.
Maintaining data definitions and analyzing data quality.
Reconciling data issues effectively.
Data Lifecycle Management (DLM):
A policy-based approach managing data flow from acquisition or creation to deletion of outdated data.
Data Management (5 of 5)
Data Lifecycle indicated:
Policy-based management ensuring that data is efficiently handled from creation to deletion.
Knowledge Check Activity 5-1
Which of the following is NOT a key responsibility of the data governance team?
a. Develop accountability policies for data aspects.
b. Determine required database technology (Correct Answer).
c. Define data storage, archiving, and protection processes.
d. Develop access and update standards.
Database Fundamentals
Contribution to Organizational Success:
Provides timely, accurate, and relevant information for decision-making.
Benefits of Databases:
Help reduce costs and increase profits.
Aid in acquiring new customers and tracking past activities.
Enhance customer service and market opportunity identification.
The Database Approach (1 of 2)
Database Approach to Data Management:
Many organizations use this approach where multiple systems share related data.
Database Management System (DBMS):
Software for accessing and managing databases and interfaces between the database and users or applications.
Provides centralized management and control over data resources.
The Database Approach (2 of 2)
Figure 5-4 illustrates the database approach showing shared data across multiple information systems.
Database Design (1 of 7)
Definitions:
Entity: A person, place, or thing for which data is collected (e.g., employee).
Attribute: A characteristic of an entity (e.g., employee number, last name, first name, hire date).
Database Design (2 of 7)
Design Principles:
A database should efficiently store relevant data and allow for quick access and modifications.
Considerations include:
Content and access.
Logical structure and physical organization.
Response time, archiving, and security.
Database Design (3 of 7)
Data Modeling:
A tool used for designing databases.
Can occur at organizational or project/application levels.
Enterprise Data Modeling:
Investigates organizational needs at a strategic level before examining functional area specifics.
Database Design (4 of 7)
Purpose of Enterprise Data Model:
Identifies data entities and attributes of interest.
Establishes standard data definitions, lengths, formats, valid value domains, and associated business rules.
Database Design (5 of 7)
Figure 5-5 illustrates the roadmap provided by the enterprise data model for building databases and information systems.
Database Design (6 of 7)
Entity-Relationship (ER) Diagram:
A data model used to analyze and communicate data needs at project-level.
Uses graphical symbols to represent data entities, attributes, and relationships.
Various notation styles exist for drawing ER diagrams.
Database Design (7 of 7)
Figure 5-6 shows an ER diagram for a customer order database, emphasizing the importance of logical consistency in application programming.
The Data Hierarchy (1 of 2)
Definitions in Data Hierarchy:
Field: A single attribute about an entity.
Record: A collection of fields about a specific entity.
Primary Key: A field or set that uniquely identifies a record in a database table.
Foreign Key: A field in one table that refers to the primary key in another table.
The Data Hierarchy (2 of 2)
Figure 5-7 illustrates how each level of the hierarchy adds meaning and context to the data structure.
Database Activities (1 of 6)
Structured Query Language (SQL):
A special-purpose programming language for accessing and manipulating database data.
Providing a User View:
The schema defines tables, fields in tables, and relationships.
DBMS references schemas for data access.
Users utilize Data Query Language (DQL) to select data.
Database Activities (2 of 6)
Figure 5-8 provides a visual representation of a database schema, showing the database objects and inter-relationships.
Database Activities (3 of 6)
Creating and Modifying Databases:
Data Definition Language (DDL):
Instructions and commands used to define data and relationships.
Data Dictionary:
Collection of metadata describing data, format, structure, and relationships.
Metadata: Descriptive data about other data.
Database Activities (4 of 6)
Storing and Retrieving Data:
DBMS acts as an interface to applications requesting data.
Data Control Language (DCL):
Subset of SQL managing user privileges.
GRANT command: Defines user privileges.
Database Activities (5 of 6)
Manipulating Data:
Data Manipulation Language (DML):
Used for modifying database data.
Transaction Control Language (TCL):
Maintains integrity by committing changes or rolling back errors.
Database Activities (6 of 6)
Security Management:
Ensures data protection from unauthorized access and damages.
Defines user access privileges.
Backup and Recovery:
Provides backup services allowing database copies to be created securely.
Data Cleansing
Definition of Data Cleansing:
Process of detecting and correcting or deleting records that are incomplete, incorrect, inaccurate, or irrelevant.
Difference from Data Validation:
Data cleansing identifies inaccuracies with cross-checks against validated datasets.
Data Enhancement:
Adds related information to existing data in a database.
Relational Databases (1 of 2)
Characteristics of a Relational Database:
Organized into relations with rows as entities and columns as attributes.
Rows uniquely identified by primary key.
Data types include integers, decimals, dates, texts, etc.
Tables allow data constraints (type, length, values).
Relational Databases (2 of 2)
Relations and User Queries:
Primary and foreign keys define table relationships.
User queries perform database operations like adding, changing, or selecting data.
Manipulating Data in a Relational Database (1 of 4)
Data Operations:
Selecting: Eliminating rows based on criteria.
Projecting: Eliminating columns in tables.
Joining: Connecting tables through common fields.
Advantage of Relational Databases:
Linking reduces data redundancy.
Manipulating Data in a Relational Database (2 of 4)
Figure 5-10 illustrates the two-dimensional table model within the relational database structure.
Manipulating Data in a Relational Database (3 of 4)
Figure 5-12 displays linking of data tables required to answer specific inquiries, such as identifying a manager for a project using three tables.
Manipulating Data in a Relational Database (4 of 4)
Figure 5-13 depicts tools available for building and modifying a relational database.
Popular Relational Database Management Systems (1 of 2)
Database as a Service (DaaS):
Data is stored on service provider’s servers, accessed over the internet.
Administration handled by the service provider, eliminating in-house maintenance.
Popular Relational Database Management Systems (2 of 2)
Open-Source and Proprietary Systems:
Popular systems include:
Open-Source: MySQL, PostgreSQL, MariaDB, SQLite, CouchDB.
For Individuals/Workgroups: Microsoft Access, Google Base, IBM Lotus Approach.
For Workgroups/Enterprises: Oracle, IBM DB2, Sybase Adaptive Server, Microsoft SQL Server.
Knowledge Check Activity 5-2
Question: A collection of commands to define data and relationships in a database is a _.
a. database schema
b. data definition language (Correct Answer)
c. data model
d. data manipulation language
Self Assessment
Questions:
How can quality registry data benefit medical professionals and patients?
What are the potential consequences of incomplete or inaccurate registry data?
Summary
By the end of this chapter, you should be able to:
Identify the benefits of quality data and information.
Distinguish between data management and data governance.
Explore the fundamentals of databases.
Identify the characteristics of a relational database.