Reference: Bidgoli, MIS, 11th Edition (© 2024 Cengage)
By the end of this module, you should be able to:
3.1 Define a database and a DBMS.
3.2 Explain logical database design and the relational database model.
3.3 Define the five components of a database management system.
3.4 Summarize three recent trends in database design and use.
3.5 Analyze the four major components and functions of a data warehouse and their use for business.
3.6 Describe the functions of a data mart.
3.7 Compare and contrast data lakes with data warehouses.
3.8 Describe the role of business analytics in the decision-making process.
3.9 Examine the advantages and challenges of big data and predictive analytics for a business.
3.10 Explain database marketing and its business applications.
3.11 Explain key features of Tableau and Power BI as two popular business intelligence and visualization platforms.
Definition: Collection of related data stored centrally or across multiple locations.
Data Hierarchy:
Structure and organization of data including fields, records, and files.
DBMS: Software for creating, storing, maintaining, and accessing database files.
Internal Data:
Stored within the organization’s databases for functional information systems.
Examples: Transactions, sales, personnel records.
External Data:
Stored in data warehouses.
Examples: Competitors, customers, tax records.
Sequential Access File Structure:
Organized and processed numerically or sequentially based on a primary key.
Suitable for backup and archived files, typically on magnetic tape.
Random Access File Structure:
Records accessed in any order regardless of physical location.
Fast processing, typically stored on magnetic disks.
Indexed Sequential Access Method (ISAM):
Records accessed either sequentially or randomly, based on holdings.
Uses an index structure comprising indexed value and pointer to disk location.
Viewed in two ways:
Physical View: How data is stored and retrieved (e.g., hard disks).
Logical View: Appearance and organization of data viewed by users.
Data Model: Determines data creation, representation, organization, and maintenance.
Includes:
Data structure (organization)
Operations (methods, calculations)
Integrity rules (boundaries)
Hierarchical Model:
Relationships between records organized as branches.
Network Model:
Similar to the hierarchical model but allows multiple parent and child records.
Relational Model:
Utilizes a two-dimensional table of rows (records) and columns (fields).
Data Dictionary: Stores definitions of fields, types, defaults, and validation rules.
Keys:
Primary Key: Uniquely identifies records.
Foreign Key: Links child table to parent table.
Normalization: Eliminates redundant data through defined normal forms (1NF to 5NF).
DBMS Software Components:
Database Engine: Storage and retrieval of data.
Data Definition: Creates and manages data dictionaries.
Data Manipulation: Adds, deletes, modifies, and retrieves records.
Application Generation: Designs application interface elements.
Data Administration: Security, backup, recovery, and CRUD permissions.
Database Administrators (DBAs):
Responsible for design, management, security, recovery, and performance.
Interface to databases for:
Data retrieval
User data entry
Benefits include improved access and reduced maintenance overhead.
DDBMS:
Stores data across multiple servers.
Advantages include better reflective design, reduced response time, and fault tolerance.
Setup Approaches:
Fragmentation: Tables divided by location.
Replication: Sites store data copies.
Allocation: Combination of fragmentation and replication.
Objects contain data and related methods.
Advantages:
Supports complex data management and multimedia.
Definition: Collection of data from various sources for decision-making support.
Characteristics:
Subject-oriented, integrated, time-variant, and focused on analytical purpose.
ETL Process:
Extraction, Transformation, and Loading of data.
Generates reports via OLTP and OLAP for decision-making.
Data mining for pattern discovery and text mining for analyzing textual information.
Benefits include fast query responses and the ability to analyze historical data.
Definition: Smaller, focused versions of data warehouses.
Advantages: Faster access and improved response time.
Disadvantages: Limited scope and consolidation difficulties.
Stores both structured and unstructured data for analytics and machine learning.
Utilizes data and statistical methods to provide actionable insights.
Methods include:
Descriptive: Reviews past data.
Predictive: Prepares for future events.
Prescriptive: Suggests likely outcomes for decisions.
Definition: Characterized by high volume, variety, velocity, veracity, and value (5 Vs).
Privacy risks include discrimination and loss of anonymity.
Leverages an organization’s customer database for proactive marketing initiatives.
Successful tasks include Customer Lifetime Value (CLTV) and RFM analysis.
Reaffirmation of objectives covered in the module.