Databases & data management

Principles of Information Systems, Chapter 5: Databases and Data Management

Chapter Objectives

By the end of this chapter, you should be able to:

  • Identify the benefits of quality data and information.

  • Distinguish between data management and data governance.

  • Explore the fundamentals of databases.

  • Identify the characteristics of a relational database.

Why Learn about Databases and Data Management?

  • Databases capture and report consumer data.

  • Business data helps leaders to:

    • Understand operations within an organization.

    • Identify operating issues.

    • Examine causes of issues.

    • Identify business opportunities.

  • Understanding database systems and data management is essential for effective business operations.

Benefits of Quality Data and Information (1 of 3)

  • Characteristics of Quality Data and Information:

    • Accessible: Data is easy to retrieve and use.

    • Accurate: Data must be correct and precise.

    • Complete: All necessary data is present.

    • Economical: Cost-effective in terms of collection and storage.

    • Relevant: Data must pertain to the subject at hand.

    • Reliable: Data can be trusted.

    • Secure: Protection from unauthorized access.

    • Timely and Verifiable: Data must be available when needed and verifiable for credibility.

Benefits of Quality Data and Information (2 of 3)

  • Improves Decision Making:

    • High-quality data reduces guesswork and risk.

  • Increases Customer Satisfaction:

    • Quality data enables personalization in customer service.

  • Increases Sales:

    • Provides accurate targeting and communication with consumers.

Benefits of Quality Data and Information (3 of 3)

  • Enhances Innovation:

    • Improves efficiency, product/service quality, and customer experience.

  • Raises Productivity:

    • Employees can focus on core tasks rather than correcting errors.

  • Ensures Compliance:

    • Adherence to regulations concerning data management is essential.

Data Management (1 of 5)

  • Definition of Data Management:

    • An integrated set of functions defining the processes for obtaining, certifying, storing, securing, and processing data.

    • Ensures data is accessible, reliable, and timely for organizational needs.

  • Data Governance:

    • Defines roles, responsibilities, and processes to ensure trust and usability of data.

    • Requires business leadership and active participation for effectiveness.

Data Management (2 of 5)

  • Functions of Data Management:

    • Basic functions identified by the Data Management Association International are important for effective data management strategies.

Data Management (3 of 5)

  • Role of a Database Administrator (DBA):

    • Skilled IS professional responsible for:

    • Discussing data needs with business users.

    • Applying programming languages to create databases.

    • Testing and evaluating databases for performance.

    • Monitoring performance and implementing changes.

    • Ensuring data security against unauthorized access.

Data Management (4 of 5)

  • Data Steward:

    • Typically a non-IS employee responsible for managing critical data entities.

    • Responsibilities include:

    • Identifying and acquiring new data sources.

    • Maintaining data definitions and analyzing data quality.

    • Reconciling data issues effectively.

  • Data Lifecycle Management (DLM):

    • A policy-based approach managing data flow from acquisition or creation to deletion of outdated data.

Data Management (5 of 5)

  • Data Lifecycle indicated:

    • Policy-based management ensuring that data is efficiently handled from creation to deletion.

Knowledge Check Activity 5-1

Which of the following is NOT a key responsibility of the data governance team?

  • a. Develop accountability policies for data aspects.

  • b. Determine required database technology (Correct Answer).

  • c. Define data storage, archiving, and protection processes.

  • d. Develop access and update standards.

Database Fundamentals

  • Contribution to Organizational Success:

    • Provides timely, accurate, and relevant information for decision-making.

  • Benefits of Databases:

    • Help reduce costs and increase profits.

    • Aid in acquiring new customers and tracking past activities.

    • Enhance customer service and market opportunity identification.

The Database Approach (1 of 2)

  • Database Approach to Data Management:

    • Many organizations use this approach where multiple systems share related data.

  • Database Management System (DBMS):

    • Software for accessing and managing databases and interfaces between the database and users or applications.

    • Provides centralized management and control over data resources.

The Database Approach (2 of 2)

  • Figure 5-4 illustrates the database approach showing shared data across multiple information systems.

Database Design (1 of 7)

  • Definitions:

    • Entity: A person, place, or thing for which data is collected (e.g., employee).

    • Attribute: A characteristic of an entity (e.g., employee number, last name, first name, hire date).

Database Design (2 of 7)

  • Design Principles:

    • A database should efficiently store relevant data and allow for quick access and modifications.

    • Considerations include:

    • Content and access.

    • Logical structure and physical organization.

    • Response time, archiving, and security.

Database Design (3 of 7)

  • Data Modeling:

    • A tool used for designing databases.

    • Can occur at organizational or project/application levels.

    • Enterprise Data Modeling:

    • Investigates organizational needs at a strategic level before examining functional area specifics.

Database Design (4 of 7)

  • Purpose of Enterprise Data Model:

    • Identifies data entities and attributes of interest.

    • Establishes standard data definitions, lengths, formats, valid value domains, and associated business rules.

Database Design (5 of 7)

  • Figure 5-5 illustrates the roadmap provided by the enterprise data model for building databases and information systems.

Database Design (6 of 7)

  • Entity-Relationship (ER) Diagram:

    • A data model used to analyze and communicate data needs at project-level.

    • Uses graphical symbols to represent data entities, attributes, and relationships.

    • Various notation styles exist for drawing ER diagrams.

Database Design (7 of 7)

  • Figure 5-6 shows an ER diagram for a customer order database, emphasizing the importance of logical consistency in application programming.

The Data Hierarchy (1 of 2)

  • Definitions in Data Hierarchy:

    • Field: A single attribute about an entity.

    • Record: A collection of fields about a specific entity.

    • Primary Key: A field or set that uniquely identifies a record in a database table.

    • Foreign Key: A field in one table that refers to the primary key in another table.

The Data Hierarchy (2 of 2)

  • Figure 5-7 illustrates how each level of the hierarchy adds meaning and context to the data structure.

Database Activities (1 of 6)

  • Structured Query Language (SQL):

    • A special-purpose programming language for accessing and manipulating database data.

  • Providing a User View:

    • The schema defines tables, fields in tables, and relationships.

    • DBMS references schemas for data access.

    • Users utilize Data Query Language (DQL) to select data.

Database Activities (2 of 6)

  • Figure 5-8 provides a visual representation of a database schema, showing the database objects and inter-relationships.

Database Activities (3 of 6)

  • Creating and Modifying Databases:

    • Data Definition Language (DDL):

    • Instructions and commands used to define data and relationships.

    • Data Dictionary:

    • Collection of metadata describing data, format, structure, and relationships.

    • Metadata: Descriptive data about other data.

Database Activities (4 of 6)

  • Storing and Retrieving Data:

    • DBMS acts as an interface to applications requesting data.

    • Data Control Language (DCL):

    • Subset of SQL managing user privileges.

    • GRANT command: Defines user privileges.

Database Activities (5 of 6)

  • Manipulating Data:

    • Data Manipulation Language (DML):

    • Used for modifying database data.

    • Transaction Control Language (TCL):

    • Maintains integrity by committing changes or rolling back errors.

Database Activities (6 of 6)

  • Security Management:

    • Ensures data protection from unauthorized access and damages.

    • Defines user access privileges.

  • Backup and Recovery:

    • Provides backup services allowing database copies to be created securely.

Data Cleansing

  • Definition of Data Cleansing:

    • Process of detecting and correcting or deleting records that are incomplete, incorrect, inaccurate, or irrelevant.

  • Difference from Data Validation:

    • Data cleansing identifies inaccuracies with cross-checks against validated datasets.

  • Data Enhancement:

    • Adds related information to existing data in a database.

Relational Databases (1 of 2)

  • Characteristics of a Relational Database:

    • Organized into relations with rows as entities and columns as attributes.

    • Rows uniquely identified by primary key.

    • Data types include integers, decimals, dates, texts, etc.

    • Tables allow data constraints (type, length, values).

Relational Databases (2 of 2)

  • Relations and User Queries:

    • Primary and foreign keys define table relationships.

    • User queries perform database operations like adding, changing, or selecting data.

Manipulating Data in a Relational Database (1 of 4)

  • Data Operations:

    • Selecting: Eliminating rows based on criteria.

    • Projecting: Eliminating columns in tables.

    • Joining: Connecting tables through common fields.

  • Advantage of Relational Databases:

    • Linking reduces data redundancy.

Manipulating Data in a Relational Database (2 of 4)

  • Figure 5-10 illustrates the two-dimensional table model within the relational database structure.

Manipulating Data in a Relational Database (3 of 4)

  • Figure 5-12 displays linking of data tables required to answer specific inquiries, such as identifying a manager for a project using three tables.

Manipulating Data in a Relational Database (4 of 4)

  • Figure 5-13 depicts tools available for building and modifying a relational database.

Popular Relational Database Management Systems (1 of 2)

  • Database as a Service (DaaS):

    • Data is stored on service provider’s servers, accessed over the internet.

    • Administration handled by the service provider, eliminating in-house maintenance.

Popular Relational Database Management Systems (2 of 2)

  • Open-Source and Proprietary Systems:

    • Popular systems include:

    • Open-Source: MySQL, PostgreSQL, MariaDB, SQLite, CouchDB.

    • For Individuals/Workgroups: Microsoft Access, Google Base, IBM Lotus Approach.

    • For Workgroups/Enterprises: Oracle, IBM DB2, Sybase Adaptive Server, Microsoft SQL Server.

Knowledge Check Activity 5-2

  • Question: A collection of commands to define data and relationships in a database is a _.

    • a. database schema

    • b. data definition language (Correct Answer)

    • c. data model

    • d. data manipulation language

Self Assessment

  • Questions:

    • How can quality registry data benefit medical professionals and patients?

    • What are the potential consequences of incomplete or inaccurate registry data?

Summary

By the end of this chapter, you should be able to:

  • Identify the benefits of quality data and information.

  • Distinguish between data management and data governance.

  • Explore the fundamentals of databases.

  • Identify the characteristics of a relational database.