Course Title: Management Information Systems
Week: 3 - 2
Date: February 5, 2025
Instructor: Yuyang Ye
Contact: yuyang.ye@rutgers.edu
Definition: A database is a collection of data in a structured format.
Examples: Personal experiences where data is organized, such as spreadsheets or contact lists.
Features:
Modern databases are stored electronically on computers.
Standardizes storage of similar data.
Can contain trillions of bytes of data.
Supports thousands of simultaneous users.
Definition: Data Management involves organizing and maintaining data within and across database systems.
Operational Efficiencies:
Adding new information.
Updating existing data.
Erasing outdated or unnecessary data.
Storage:
Data is typically stored in relational database tables for ease of access and management.
Well-organized databases ensure smooth interactions and data integrity across systems.
Definition: DBMS is software that manages data in a database.
Types of Databases:
Relational Databases
NoSQL Databases
Cloud Databases
Functions: Ensures data security, consistency, and availability.
Complexity: Managing large databases with many users adds complexity to DBMS.
Considerations:
Number of tables in a database.
Types of data each table should store.
How tables are linked (relationships).
Example: Designing a library database (tables for books, members, loans).
System-level Questions:
How many systems an organization needs.
Functions and information each system contains.
How systems interact.
Increasing Data Volume: Organizations are accumulating massive amounts of data.
Historical data retention and slow data deletion rates contribute to increased volume.
Accelerating Data Growth: New technologies make data collection easier and faster, leading to accelerated accumulation.
Data Gathering: Organizations need to decide which types of data to collect (e.g., clickstream data from websites).
Data Value: Just because data accumulates quickly, does not mean it is valuable.
Organizations may collect data without a clear purpose first, then seek its value.
Traditional Approach: Data was traditionally gathered in response to specific operational questions or problems.
Data Fusion: The act of combining data from multiple sources, which can enhance operations (e.g., sharing inventory data).
Data Quality Issues: Organizations often face data accuracy problems due to redundancy, poor system coordination, and insufficient error checks.
Data Security: The protection of data against unauthorized deletion, corruption, or misuse. Threats can be intentional (hacking) or unintentional.
Definition: Involves tracking day-to-day events, such as orders and accounting entries.
Characteristics:
Involves ongoing updates to data tables (adding/modifying/deleting rows).
Example actions include adding new customer orders and modifying customer information.
Definition: Combines data from many or all rows to derive insights via complex queries and summarization.
Examples: Total sales computation, summarizing costs by category.
Types of Processing:
Classic MIS: Basic grouping and summarization.
Advanced: Data mining for pattern discovery using statistical techniques.
Definition: Utilizes data for complex decision formulation (e.g., logistics planning for shipments).
Limitations:
Simple systems like Access can manage basic decisions but cannot perform data mining independently without additional software integration.
Issues:
High transaction volumes may hinder analytical processing and vice versa.
Analytical processes may cause data inconsistencies if data changes during calculation.
Solution: Create a data warehouse for analytical processing without disrupting transaction processes.
Example: Consider a simple loan database with one table—LOAN, containing multiple attributes.
Issues: High potential for data redundancy, update anomalies, and lack of normalization.
Data Redundancy:
Problem: Repetitive customer information leads to increased risk of inconsistencies.
Update Anomalies:
Problem: Different addresses may exist for the same customer across records.
Insertion Anomalies:
Problem: Difficulties in adding new data without existing examples (e.g., a customer without a loan).
Deletion Anomalies:
Problem: Losing important customer information after deleting their last loan record.
Single-table databases result in:
Data redundancy
Update anomalies
Insertion anomalies
Deletion anomalies
Solution: Normalize with multiple related tables to overcome these issues.
Definition: Allows multiple fields for repeated data to avoid anomalies.
Drawbacks: Fixed maximum number of repeats can lead to excessive storage and may not fit varying data needs.
Multiple Tables Approach:
Customer Table: Stores customer information.
Loan Table: Stores loan information.
Using foreign keys to link tables effectively eliminates redundancy and anomalies, ensuring single storage of customer information.
Definition: Entity-Relationship (ER) modeling visualizes relationships between entities in a database.
Example: Customers and loans in a one-to-many relationship.
Customer Entity Attributes:
Examples: CustomerID, FirstName, LastName, Address, etc.
Loan Entity Attributes:
Examples: LoanID, Date, Amount, Rate, etc.
Relationship representation: Each Loan references a Customer via CustomerID (foreign key).
Detailed attributes of loan records displayed, demonstrating the relationship with the customer.
Definition: Specification of database objects such as tables, columns, data types, and indexes.
Design Process: Involves conceptual, logical, and physical design phases for large databases.
Components:
Entities: Represented as rectangles with rounded corners (e.g., Person, place, activity).
Relationships: Lines that define connections between entities.
Attributes: Descriptive properties of entities.
ER Diagrams: Visual representation of entities, relationships, and attributes supplemented by textual descriptions.
Examples of ER diagrams presented to illustrate relationships and entities in database design.