DATA AND INFORMATION MANAGEMENT
Data and Information Management: Database Management (Gr 11, 12)
Collecting Data
• Definition: The process of gathering information from various sources for storage, analysis, and decision-making.
• Function: Forms the foundation for creating databases and making informed decisions.
• Examples: Surveys, online forms, sensors.
Data Warehouse and Data Mining
• Data Warehouse:
○ Definition: A large storage system that consolidates data from multiple sources for analysis and reporting.
○ Advantages: Centralizes data, supports complex queries, improves decision-making.
○ Examples: Amazon Redshift, Google BigQuery.
• Data Mining:
○ Definition: The process of analyzing large datasets to discover patterns, correlations, and trends.
○ Advantages: Helps in predictive analysis, customer segmentation, fraud detection.
○ Examples: Market basket analysis, customer behavior prediction.
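As a rough illustration of market basket analysis, the sketch below counts how often pairs of items appear together in the same purchase. The baskets and item names are made-up sample data, and real data mining tools use far more sophisticated algorithms than this simple count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: each set is one customer's purchase.
baskets = [
    {"bread", "milk", "eggs"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]


def count_pairs(baskets):
    """Count how many baskets contain each pair of items."""
    pair_counts = Counter()
    for basket in baskets:
        # sorted() gives every pair a consistent order, e.g. ("bread", "milk")
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_pairs(baskets)
most_common_pair, times_seen = pair_counts.most_common(1)[0]
print(most_common_pair, times_seen)
```

A shop could use a pattern like this to decide which products to place near each other.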
Database Management System (DBMS)
• Definition: Software that provides an interface for users to create, manage, and interact with databases.
• Function: Manages data efficiently, ensures data integrity, and supports multiple users.
• Examples: MySQL, Oracle, Microsoft SQL Server.
Database Types
• Desktop/Personal Database:
○ Definition: A database that runs on a personal computer and is used by a single user.
○ Examples: Microsoft Access.
• Server/Centralized Database:
○ Definition: A database hosted on a server, accessed by multiple users over a network.
○ Examples: Oracle Database, SQL Server.
• Distributed Database:
○ Definition: A database where data is stored across multiple locations but appears as a single database to users.
○ Examples: Apache Cassandra, MongoDB.
Blockchain
• Definition: A decentralized ledger technology that records transactions in linked blocks copied across many computers, which makes past records very difficult to alter.
• Advantages: Enhances transparency, security, and traceability.
• Examples: Bitcoin, Ethereum.
Database-Related Careers
• Database Administrator (DBA):
○ Function: Manages and maintains database systems.
○ Skills: Data security, backup and recovery, performance tuning.
• Database Programmers:
○ Function: Writes programs that manage and manipulate data within a database.
○ Skills: SQL, database scripting.
• Database Analysts:
○ Function: Analyzes data to support decision-making processes.
○ Skills: Data modeling, statistical analysis.
• Database Project Managers:
○ Function: Manages database projects, ensuring they are delivered on time and within budget.
○ Skills: Project management, team leadership.
Data Collection Methods
• Online Forms: Digital forms used to collect data via the internet.
• RFID: Radio-frequency identification used for tracking items.
• Digital Sensors: Devices that collect data from the physical environment.
• Invisible Data Collection: Data gathered without the user explicitly entering it, for example through cookies.
• Digital Footprint: The trail of data left by users' interactions online.
• Transaction Tracking: Monitoring and recording the details of transactions.
Data Warehousing vs. Data Mining
• Data Warehousing:
○ Function: Storing large volumes of structured data for analysis.
• Data Mining:
○ Function: Extracting useful patterns and knowledge from large datasets.
Computer Systems for Problem Solving
• DSS (Decision Support System):
○ Definition: A computer-based system that supports business or organizational decision-making.
○ Examples: Financial planning software.
• Expert System:
○ Definition: A computer system that mimics human expertise to solve specific problems.
○ Examples: Medical diagnosis systems.
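To make the idea of an expert system more concrete, here is a minimal rule-based sketch. The rules, symptoms, and advice are hypothetical and describe computer troubleshooting rather than medicine. A real expert system would have a much larger knowledge base built with the help of human experts.

```python
# Hypothetical knowledge base: each rule pairs a set of required symptoms
# with the advice an expert might give when all of them are present.
RULES = [
    ({"no power light"}, "Check the power cable and wall socket."),
    ({"power light on", "blank screen"}, "Check the monitor cable."),
    ({"power light on", "beeping"}, "Reseat the memory modules."),
]


def diagnose(symptoms):
    """Return the advice of every rule whose conditions are all observed."""
    observed = set(symptoms)
    # "conditions <= observed" is True when conditions is a subset of observed
    return [advice for conditions, advice in RULES if conditions <= observed]


print(diagnose(["power light on", "blank screen"]))
```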
Data Security
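• Record Lock: Prevents other users from changing a record while one user is editing it, so that conflicting updates do not overwrite each other.
• Audit Trail: A record of who changed what in a database and when, kept for security and accountability.
• Rollback: Reverting a database to a previous consistent state when a transaction fails or an error occurs.
• Parallel Data Sets: Duplicate copies of data kept up to date alongside the original so that the data stays available if one copy fails.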
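Rollback and an audit trail can be sketched with Python's built-in sqlite3 module. The table names, account numbers, and the transfer function below are assumptions made for illustration only. The example shows one successful transfer that is logged and one failed transfer that is undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE audit_trail (id INTEGER PRIMARY KEY, action TEXT NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()


def transfer(conn, from_id, to_id, amount):
    """Move money between accounts; undo every step if the transfer fails."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, from_id),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (from_id,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?",
            (amount, to_id),
        )
        # Audit trail: log what was done as part of the same transaction.
        conn.execute(
            "INSERT INTO audit_trail (action) VALUES (?)",
            (f"transfer {amount} from account {from_id} to account {to_id}",),
        )
        conn.commit()
        return True
    except ValueError:
        conn.rollback()  # rollback: return to the last committed state
        return False


ok_first = transfer(conn, 1, 2, 200)    # succeeds
ok_second = transfer(conn, 2, 1, 1000)  # fails and is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts").fetchall())
audit_rows = conn.execute("SELECT action FROM audit_trail").fetchall()
print(ok_first, ok_second, balances, len(audit_rows))
```

Because the failed transfer is rolled back, the deduction made in its first step does not remain in the database.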
Database Design Concepts (Gr 11, 12)
Relationship between Data, Information, Knowledge, and Decision-Making
• Data: Raw facts and figures without context.
• Information: Data processed to be meaningful.
• Knowledge: Information interpreted based on experience or learning.
• Decision-Making: The process of making choices by analyzing information and knowledge.
Characteristics of Quality Data
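• Accuracy: Data should be recorded precisely and at the right level of detail (e.g., a full street address rather than only a suburb).
• Correctness: Data should be true and match the real-world facts (e.g., the address is where the person actually lives).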
• Currency: Data should be up-to-date.
• Completeness: Data should be complete and contain all necessary information.
• Relevance: Data should be relevant to the context and purpose.
Verification and Validation Techniques
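• Verification: Checks that data was entered exactly as it appears in the source, for example by entering it twice or proofreading it against the original.
• Validation: Checks that data is sensible and follows set rules before it is accepted. Common checks: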
○ Format Check: Ensures data is in the correct format.
○ Data Type Check: Ensures data is of the correct type (e.g., numerical, textual).
○ Range Check: Ensures data falls within a specified range.
○ Check Digit: An extra digit calculated from the other digits of a number and added to the end, used to detect errors such as mistyped or swapped digits.
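The four validation checks above could be sketched as small Python functions. The student code format and the mark range are hypothetical rules chosen for the example. The check digit function follows the rule used for 13-digit ISBN and EAN barcodes, where the first twelve digits are weighted 1, 3, 1, 3 and so on.

```python
import re


def format_check(student_code):
    """Format check: two capital letters then four digits (hypothetical rule)."""
    return re.fullmatch(r"[A-Z]{2}[0-9]{4}", student_code) is not None


def data_type_check(text):
    """Data type check: the text must be convertible to a whole number."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def range_check(mark, low=0, high=100):
    """Range check: the mark must lie between low and high, inclusive."""
    return low <= mark <= high


def check_digit_ok(number):
    """Check digit: verify a 13-digit code with the ISBN-13 / EAN-13 rule."""
    if re.fullmatch(r"[0-9]{13}", number) is None:
        return False
    digits = [int(ch) for ch in number]
    # Weight the first 12 digits 1, 3, 1, 3, ... and add them up.
    total = sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12]))
    expected = (10 - total % 10) % 10
    return digits[12] == expected


print(format_check("AB1234"), data_type_check("42"), range_check(101))
print(check_digit_ok("9780306406157"))
```

Changing any single digit of a valid code makes the calculated check digit differ from the one supplied, which is how many typing mistakes are caught.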
Database Design
• Elements of a Database: Tables, fields, records, and relationships.
• Field Data Types: Specifies the type of data stored in a field (e.g., integer, text, date).
• Tables and Relationships: Organizes data into tables, with relationships defining how tables relate to each other.
• Keys:
○ Primary Key: Uniquely identifies a record in a table.
○ Foreign Key: A field in one table that links to the primary key in another table.
• Entity Relationship Diagram (ERD): Visual representation of entities and their relationships in a database.
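The design elements above (tables, fields, data types, and keys) can be sketched with Python's built-in sqlite3 module. The tables classes and learners, their fields, and the sample records are hypothetical and serve only to show how a foreign key links one table to another.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.executescript("""
    CREATE TABLE classes (
        class_id   INTEGER PRIMARY KEY,   -- primary key of classes
        class_name TEXT NOT NULL
    );
    CREATE TABLE learners (
        learner_id INTEGER PRIMARY KEY,   -- primary key of learners
        full_name  TEXT NOT NULL,
        birth_date TEXT,                  -- date stored as YYYY-MM-DD text
        class_id   INTEGER REFERENCES classes(class_id)  -- foreign key
    );
""")
conn.execute("INSERT INTO classes VALUES (1, '11A')")
conn.executemany(
    "INSERT INTO learners VALUES (?, ?, ?, ?)",
    [(1, "Thandi M.", "2008-03-14", 1), (2, "Liam P.", "2008-11-02", 1)],
)

# Follow the relationship: match each learner's foreign key to a class.
rows = conn.execute("""
    SELECT learners.full_name, classes.class_name
    FROM learners
    JOIN classes ON learners.class_id = classes.class_id
    ORDER BY learners.learner_id
""").fetchall()
print(rows)
```

Each record in learners stores only the class_id, so the class name is kept in one place and looked up through the relationship.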
Characteristics of a Good Database
• Data Integrity: Ensures the accuracy and consistency of data.
• Data Independence: Data is independent of the applications that use it.
• Data Security: Protects data from unauthorized access or alterations.
• Data Maintenance: Ensures data is regularly updated and maintained.
Problems with Databases (Anomalies)
• Types:
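○ Insertion Anomaly: New data cannot be added unless other, unrelated data is added with it.
○ Update Anomaly: The same data is stored in several places, so changing one copy and not the others leaves the database inconsistent.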
○ Deletion Anomaly: Unintended loss of data when deleting records.
• Solutions: Normalization, proper database design.
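The update and deletion anomalies can be seen in a small example. The flat table below is hypothetical sample data in which each learner's record repeats the teacher's details.

```python
# Hypothetical flat table: the teacher's phone number is repeated per learner.
flat_rows = [
    {"learner": "Thandi", "teacher": "Mr Naidoo", "teacher_phone": "555-0101"},
    {"learner": "Liam", "teacher": "Mr Naidoo", "teacher_phone": "555-0101"},
    {"learner": "Aisha", "teacher": "Ms Botha", "teacher_phone": "555-0202"},
]

# Update anomaly: only one copy of Mr Naidoo's number is changed,
# so the table now holds two different numbers for the same teacher.
flat_rows[0]["teacher_phone"] = "555-0999"
naidoo_phones = {
    row["teacher_phone"] for row in flat_rows if row["teacher"] == "Mr Naidoo"
}

# Deletion anomaly: removing Ms Botha's only learner also removes
# every trace of Ms Botha and her phone number.
remaining = [row for row in flat_rows if row["learner"] != "Aisha"]
teachers_left = {row["teacher"] for row in remaining}

print(naidoo_phones, teachers_left)
```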
Preventing Anomalies (Normalization)
• Definition: The process of organizing data to reduce redundancy and improve integrity.
• Entities as Tables: Each entity is represented as a table.
• Multiple Values in a Field: Avoid storing multiple values in a single field.
• Redundant Fields: Eliminate unnecessary duplication of data.
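One way to picture normalization is to split a flat table into one table per entity. The sketch below uses hypothetical learner and teacher data, and the function name normalize is an assumption made for this example. It stores each teacher once and replaces the repeated details with a teacher ID.

```python
# Hypothetical flat table: (learner, teacher, teacher phone).
flat_rows = [
    ("Thandi", "Mr Naidoo", "555-0101"),
    ("Liam", "Mr Naidoo", "555-0101"),
    ("Aisha", "Ms Botha", "555-0202"),
]


def normalize(flat_rows):
    """Split the flat rows into a teacher table and a learner table."""
    teachers = {}  # teacher name -> (teacher_id, phone)
    learners = []  # (learner name, teacher_id)
    for learner, teacher, phone in flat_rows:
        if teacher not in teachers:
            # First time this teacher is seen: give them the next ID.
            teachers[teacher] = (len(teachers) + 1, phone)
        teacher_id = teachers[teacher][0]
        learners.append((learner, teacher_id))
    teacher_table = [(tid, name, phone) for name, (tid, phone) in teachers.items()]
    return teacher_table, learners


teacher_table, learner_table = normalize(flat_rows)
print(teacher_table)
print(learner_table)
```

After the split, a teacher's phone number appears in only one record, so it can be updated in one place.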
Entity Integrity
• Definition: Ensures each table has a primary key and that the values are unique and not null.
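Entity integrity can be demonstrated with Python's built-in sqlite3 module. The learners table and the helper function try_insert are hypothetical. Note that SQLite only rejects empty values in a text primary key when the field is also declared NOT NULL, so the sketch declares both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute(
    "CREATE TABLE learners ("
    " learner_code TEXT PRIMARY KEY NOT NULL,"
    " full_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO learners VALUES ('L001', 'Thandi M.')")


def try_insert(conn, code, name):
    """Try to add a learner and report whether the database accepted it."""
    try:
        conn.execute("INSERT INTO learners VALUES (?, ?)", (code, name))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_result = try_insert(conn, "L001", "Liam P.")  # key already used
null_result = try_insert(conn, None, "Aisha K.")        # key is missing
new_result = try_insert(conn, "L002", "Aisha K.")       # unique, not null
print(duplicate_result, null_result, new_result)
```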
Relational Database
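• Definition: A database that stores data in tables and links related tables through key fields.
• Referential Integrity: Ensures that every foreign key value matches an existing primary key value in the related table, so that relationships between tables remain consistent.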
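Referential integrity can be sketched with Python's built-in sqlite3 module. The tables, sample records, and the helper function attempt are hypothetical. SQLite enforces foreign keys only after the PRAGMA foreign_keys setting is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary database held in memory
conn.execute("PRAGMA foreign_keys = ON")  # enable foreign key enforcement
conn.executescript("""
    CREATE TABLE classes (
        class_id   INTEGER PRIMARY KEY,
        class_name TEXT NOT NULL
    );
    CREATE TABLE learners (
        learner_id INTEGER PRIMARY KEY,
        full_name  TEXT NOT NULL,
        class_id   INTEGER NOT NULL REFERENCES classes(class_id)
    );
""")
conn.execute("INSERT INTO classes VALUES (1, '11A')")
conn.execute("INSERT INTO learners VALUES (1, 'Thandi M.', 1)")


def attempt(conn, sql, params=()):
    """Run one statement and report whether the database accepted it."""
    try:
        conn.execute(sql, params)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# A learner cannot point to a class that does not exist.
orphan_result = attempt(
    conn, "INSERT INTO learners VALUES (?, ?, ?)", (2, "Liam P.", 99)
)
# A class cannot be deleted while learners still refer to it.
delete_result = attempt(conn, "DELETE FROM classes WHERE class_id = ?", (1,))
print(orphan_result, delete_result)
```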
Social Implications (Gr 10, 11, 12)
Computer and Society
• Impact: Computers have transformed communication, education, business, and daily life.
• Ethics: Addresses issues like privacy, intellectual property, and the digital divide.