DATA AND INFORMATION MANAGEMENT

Data and Information Management: Database Management (Gr 11, 12)

Collecting Data

• Definition: The process of gathering information from various sources for storage, analysis, and decision-making.

• Function: Forms the foundation for creating databases and making informed decisions.

• Examples: Surveys, online forms, sensors.

Data Warehouse and Data Mining

• Data Warehouse:

○ Definition: A large storage system that consolidates data from multiple sources for analysis and reporting.

○ Advantages: Centralizes data, supports complex queries, improves decision-making.

○ Examples: Amazon Redshift, Google BigQuery.

• Data Mining:

○ Definition: The process of analyzing large datasets to discover patterns, correlations, and trends.

○ Advantages: Helps in predictive analysis, customer segmentation, fraud detection.

○ Examples: Market basket analysis, customer behavior prediction.
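
Market basket analysis can be illustrated with a minimal sketch that counts which pairs of items appear together most often. The baskets below are hypothetical, and real data mining tools use far larger datasets and more refined measures than a simple count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (one set of items per customer visit).
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# The most frequent pair is a candidate "bought together" pattern.
top_pair, top_count = pair_counts.most_common(1)[0]
print(top_pair, top_count)  # ('bread', 'milk') 3
```

A shop could use such a pattern to place the two items near each other or to offer them as a combined special.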

Database Management System (DBMS)

• Definition: Software that provides an interface for users to create, manage, and interact with databases.

• Function: Manages data efficiently, ensures data integrity, and supports multiple users.

• Examples: MySQL, Oracle, Microsoft SQL Server.
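
The create, manage, and interact roles of a DBMS can be sketched with SQLite, accessed through Python's built-in sqlite3 module. SQLite is not one of the examples listed above; it is used here only because it needs no installation. The table name learners and its fields are hypothetical.

```python
import sqlite3

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Create: define a table (table and field names are hypothetical).
conn.execute(
    "CREATE TABLE learners (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)

# Manage: add records through the DBMS instead of editing a file by hand.
conn.executemany(
    "INSERT INTO learners (name, grade) VALUES (?, ?)",
    [("Thandi", 11), ("Sipho", 12), ("Aisha", 11)],
)
conn.commit()

# Interact: ask a question in SQL and let the DBMS find the answer.
grade_11 = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM learners WHERE grade = 11 ORDER BY name"
    )
]
print(grade_11)  # ['Aisha', 'Thandi']
conn.close()
```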

Database Types

• Desktop/Personal Database:

○ Definition: A database that runs on a personal computer and is used by a single user.

○ Examples: Microsoft Access.

• Server/Centralized Database:

○ Definition: A database hosted on a server, accessed by multiple users over a network.

○ Examples: Oracle Database, SQL Server.

• Distributed Database:

○ Definition: A database where data is stored across multiple locations but appears as a single database to users.

○ Examples: Apache Cassandra, MongoDB.

Blockchain

• Definition: A decentralized ledger technology that records transactions across multiple computers securely.

• Advantages: Enhances transparency, security, and traceability.

• Examples: Bitcoin, Ethereum.
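
The traceability of a blockchain comes from each block storing a hash of the block before it. The sketch below shows only this hash-chaining idea. It leaves out everything that makes a real blockchain decentralized (multiple computers, consensus, digital signatures), and the function names and transactions are hypothetical.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(data, prev_hash):
    # The hash depends on the block's data AND the previous block's hash.
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()


def build_chain(transactions):
    chain, prev_hash = [], GENESIS
    for data in transactions:
        current = block_hash(data, prev_hash)
        chain.append({"data": data, "prev_hash": prev_hash, "hash": current})
        prev_hash = current
    return chain


def is_valid(chain):
    # Recompute every hash and check that each block links to the one before.
    prev_hash = GENESIS
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["data"], block["prev_hash"]):
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
valid_before = is_valid(chain)      # True
chain[1]["data"] = "B pays C 200"   # tamper with an earlier transaction
valid_after = is_valid(chain)       # False: the stored hash no longer matches
print(valid_before, valid_after)
```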

Database-Related Careers

• Database Administrator (DBA):

○ Function: Manages and maintains database systems.

○ Skills: Data security, backup and recovery, performance tuning.

• Database Programmer:

○ Function: Develops and writes programs to manage and manipulate data within a database.

○ Skills: SQL, database scripting.

• Database Analyst:

○ Function: Analyzes data to support decision-making processes.

○ Skills: Data modeling, statistical analysis.

• Database Project Manager:

○ Function: Manages database projects, ensuring they are delivered on time and within budget.

○ Skills: Project management, team leadership.

Data Collection Methods

• Online Forms: Digital forms used to collect data via the internet.

• RFID: Radio-frequency identification used for tracking items.

• Digital Sensors: Devices that collect data from the physical environment.

• Invisible Data Collection: Data collected without explicit user input, for example through cookies.

• Digital Footprint: The trail of data left by users' interactions online.

• Transaction Tracking: Monitoring and recording the details of transactions.

Data Warehousing vs. Data Mining

• Data Warehousing:

○ Function: Storing large volumes of structured data for analysis.

• Data Mining:

○ Function: Extracting useful patterns and knowledge from large datasets.
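
The warehousing side of this comparison can be sketched as follows: records from two differently structured sources are loaded into one table, which can then be queried as a whole. The sources, table, and field names are hypothetical, and a real data warehouse works at a much larger scale.

```python
import sqlite3

# Two hypothetical source systems, each with its own record layout.
online_sales = [("2024-01-05", "bread", 3), ("2024-01-06", "milk", 2)]
store_sales = [{"day": "2024-01-05", "item": "bread", "qty": 4}]

# Warehousing: consolidate both sources into one structured table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (day TEXT, item TEXT, qty INTEGER, source TEXT)"
)
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, 'online')", online_sales
)
warehouse.executemany(
    "INSERT INTO sales VALUES (:day, :item, :qty, 'store')", store_sales
)
warehouse.commit()

# Analysis across all sources is now a single query.
totals = warehouse.execute(
    "SELECT item, SUM(qty) FROM sales GROUP BY item ORDER BY item"
).fetchall()
print(totals)  # [('bread', 7), ('milk', 2)]
```

Data mining would then search the consolidated table for patterns, rather than simply totalling it.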

Computer Systems for Problem Solving

• DSS (Decision Support System):

○ Definition: A computer-based system that supports business or organizational decision-making activities.

○ Examples: Financial planning software.

• Expert System:

○ Definition: A computer system that mimics human expertise to solve specific problems.

○ Examples: Medical diagnosis systems.
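
An expert system is often described as a knowledge base of rules plus an inference engine that applies them to known facts. The sketch below assumes that structure and uses a hypothetical computer-troubleshooting rule set instead of a medical one; the rules are illustrative only.

```python
# Knowledge base: each rule pairs the facts it requires with a conclusion.
RULES = [
    ({"no power light"}, "Check the power cable"),
    ({"power light", "blank screen"}, "Check the monitor cable"),
    ({"power light", "beeping"}, "Reseat the memory modules"),
]


def diagnose(facts):
    # Inference engine: a rule fires when all of its conditions
    # are among the observed facts.
    return [
        conclusion for conditions, conclusion in RULES if conditions <= facts
    ]


advice = diagnose({"power light", "blank screen"})
print(advice)  # ['Check the monitor cable']
```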

Data Security

• Record Lock: Prevents other users from changing a record while one user is editing it, to avoid conflicting updates.

• Audit Trail: A record of who changed what in a database and when, kept for security and accountability.

• Rollback: Reverting a database to a previous state in case of errors.

• Parallel Data Sets: Duplicates of data to ensure availability and reliability.
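
Rollback can be demonstrated with a minimal sketch using Python's built-in sqlite3 module. The accounts table and the transfer function are hypothetical; the point is that a failed operation leaves the database exactly as it was before the operation started.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; undo everything if a step fails."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )
        conn.commit()    # make both changes permanent together
        return True
    except ValueError:
        conn.rollback()  # revert to the state before the transfer began
        return False


ok = transfer(conn, "A", "B", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)  # False {'A': 100, 'B': 50}
```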

Database Design Concepts (Gr 11, 12)

Relationship between Data, Information, Knowledge, and Decision-Making

• Data: Raw facts and figures without context.

• Information: Data processed to be meaningful.

• Knowledge: Information interpreted based on experience or learning.

• Decision-Making: The process of making choices by analyzing information and knowledge.
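
The chain from data to decision can be traced in a short worked example. The marks, the pass mark of 50, and the 30% threshold are all hypothetical values chosen for illustration.

```python
from statistics import mean

# Data: raw facts without context.
marks = [45, 62, 38, 71, 55]

# Information: the data processed into something meaningful.
average = mean(marks)
below_pass = [m for m in marks if m < 50]  # assumes a pass mark of 50

# Knowledge: experience suggests that many failing marks signal a problem.
# Decision: act when more than 30% of learners are below the pass mark.
needs_support = len(below_pass) / len(marks) > 0.3

print(average, below_pass, needs_support)  # 54.2 [45, 38] True
```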

Characteristics of Quality Data

• Accuracy: Data should be free from errors.

• Correctness: Data should match the real-world facts it describes, so that it can be relied on.

• Currency: Data should be up-to-date.

• Completeness: Data should be complete and contain all necessary information.

• Relevance: Data should be relevant to the context and purpose.

Verification and Validation Techniques

• Verification: Checks that data has been entered exactly as it appears in the source, for example by entering it twice or proofreading it.

• Validation: Checks that data is sensible and follows defined rules before it is accepted. Common checks:

○ Format Check: Ensures data is in the correct format.

○ Data Type Check: Ensures data is of the correct type (e.g., numerical, textual).

○ Range Check: Ensures data falls within a specified range.

○ Check Digit: A digit calculated from the other digits of a number and added to the end, used to detect data-entry errors.
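
The four validation checks can each be sketched as a small function. The date format, the 0 to 100 range, and the function names are hypothetical. The check digit example uses the Luhn algorithm, which is one common scheme among several; the notes above do not name a specific one.

```python
import re


def format_check(value):
    # Format check: assumes dates must look like YYYY-MM-DD.
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is not None


def data_type_check(value):
    # Data type check: can the text be read as a whole number?
    try:
        int(value)
        return True
    except ValueError:
        return False


def range_check(value, low, high):
    # Range check: the value must lie between low and high, inclusive.
    return low <= value <= high


def luhn_check(number):
    # Check digit (Luhn): from the right, double every second digit,
    # subtract 9 from any result above 9, and add everything up.
    # The number is valid if the total is a multiple of 10.
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(format_check("2024-05-17"), format_check("17/05/2024"))    # True False
print(data_type_check("42"), data_type_check("4x"))              # True False
print(range_check(85, 0, 100), range_check(150, 0, 100))         # True False
print(luhn_check("79927398713"), luhn_check("79927398710"))      # True False
```

Note that validation only shows that a value is plausible. A date such as 2024-05-17 passes the format check even if the wrong date was typed, which is why verification is still needed.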

Database Design

• Elements of a Database: Tables, fields, records, and relationships.

• Field Data Types: Specifies the type of data stored in a field (e.g., integer, text, date).

• Tables and Relationships: Organizes data into tables, with relationships defining how tables relate to each other.

• Keys:

○ Primary Key: Uniquely identifies a record in a table.

○ Foreign Key: A field in one table that links to the primary key in another table.

• Entity Relationship Diagram (ERD): Visual representation of entities and their relationships in a database.
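
The design elements above (tables, fields, data types, primary and foreign keys) can be seen together in a small schema. This is a sketch using Python's built-in sqlite3 module; the classes and learners tables and all their fields are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Each table has a primary key that uniquely identifies a record.
CREATE TABLE classes (
    class_id INTEGER PRIMARY KEY,
    teacher  TEXT NOT NULL
);
-- class_id in learners is a foreign key linking to classes.
CREATE TABLE learners (
    learner_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    birth_date TEXT,
    class_id   INTEGER REFERENCES classes(class_id)
);
INSERT INTO classes VALUES (1, 'Ms Naidoo'), (2, 'Mr Botha');
INSERT INTO learners VALUES
    (1, 'Thandi', '2008-03-14', 1),
    (2, 'Sipho',  '2007-11-02', 2),
    (3, 'Aisha',  '2008-07-30', 1);
""")

# The relationship lets a query combine fields from both tables.
rows = conn.execute("""
    SELECT learners.name, classes.teacher
    FROM learners
    JOIN classes ON learners.class_id = classes.class_id
    ORDER BY learners.name
""").fetchall()
print(rows)
# [('Aisha', 'Ms Naidoo'), ('Sipho', 'Mr Botha'), ('Thandi', 'Ms Naidoo')]
```

In an ERD, this design would be drawn as two entities, with one class linked to many learners.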

Characteristics of a Good Database

• Data Integrity: Ensures the accuracy and consistency of data.

• Data Independence: Data is independent of the applications that use it.

• Data Security: Protects data from unauthorized access or alterations.

• Data Maintenance: Ensures data is regularly updated and maintained.

Problems with Databases (Anomalies)

• Types:

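○ Insertion Anomaly: A new fact cannot be added unless other, unrelated data is captured with it.

○ Update Anomaly: The same fact is stored in several places, so changing only some copies leaves the data inconsistent.

○ Deletion Anomaly: Deleting a record unintentionally removes other facts that were stored only in that record.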

• Solutions: Normalization, proper database design.

Preventing Anomalies (Normalization)

• Definition: The process of organizing data to reduce redundancy and improve integrity.

• Entities as Tables: Each entity is represented as a table.

• Multiple Values in a Field: Avoid storing multiple values in a single field.

• Redundant Fields: Eliminate unnecessary duplication of data.
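
An update anomaly, and how removing redundant fields prevents it, can be shown with plain Python data structures. The learner and class records are hypothetical, and this is a sketch of the idea, not a formal normalization procedure.

```python
# Unnormalized: the teacher's name is repeated in every learner record.
flat = [
    {"learner": "Thandi", "class": "11A", "teacher": "Ms Naidoo"},
    {"learner": "Aisha", "class": "11A", "teacher": "Ms Naidoo"},
]

# Update anomaly: the teacher changes, but only one copy is updated.
flat[0]["teacher"] = "Mr Botha"
teachers_flat = {row["teacher"] for row in flat if row["class"] == "11A"}
print(len(teachers_flat))  # 2, so class 11A now appears to have two teachers

# Normalized: each entity has its own table and each fact is stored once.
classes = {"11A": {"teacher": "Ms Naidoo"}}
learners = [
    {"learner": "Thandi", "class": "11A"},
    {"learner": "Aisha", "class": "11A"},
]

# A single update is now seen by every learner in the class.
classes["11A"]["teacher"] = "Mr Botha"
teachers_norm = {classes[row["class"]]["teacher"] for row in learners}
print(teachers_norm)  # {'Mr Botha'}
```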

Entity Integrity

• Definition: Ensures each table has a primary key and that the values are unique and not null.
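
A DBMS can enforce entity integrity by rejecting duplicate or missing primary key values. The following is a minimal sketch with Python's built-in sqlite3 module; the learners table and the helper function try_insert are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is written explicitly because SQLite, unlike most DBMSs,
# does not automatically forbid NULL in a non-integer primary key.
conn.execute(
    "CREATE TABLE learners (learner_id TEXT PRIMARY KEY NOT NULL, name TEXT)"
)
conn.execute("INSERT INTO learners VALUES ('L001', 'Thandi')")


def try_insert(learner_id, name):
    try:
        conn.execute("INSERT INTO learners VALUES (?, ?)", (learner_id, name))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    try_insert("L002", "Sipho"),  # new, unique key
    try_insert("L001", "Aisha"),  # duplicate key
    try_insert(None, "Zola"),     # missing (null) key
]
print(results)  # ['accepted', 'rejected', 'rejected']
```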

Relational Database

• Definition: A database that stores data in tables and uses keys to recognize the relationships between them.

• Referential Integrity: Ensures that relationships between tables remain consistent: every foreign key value must match an existing primary key value in the related table.
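
Referential integrity can be enforced by the DBMS itself. The sketch below uses Python's built-in sqlite3 module, where foreign key checking must be switched on with a PRAGMA statement. The tables and the helper function attempt are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.execute("CREATE TABLE classes (class_id INTEGER PRIMARY KEY, teacher TEXT)")
conn.execute("""
    CREATE TABLE learners (
        learner_id INTEGER PRIMARY KEY,
        name       TEXT,
        class_id   INTEGER NOT NULL REFERENCES classes(class_id)
    )
""")
conn.execute("INSERT INTO classes VALUES (1, 'Ms Naidoo')")
conn.execute("INSERT INTO learners VALUES (1, 'Thandi', 1)")
conn.commit()


def attempt(sql):
    try:
        conn.execute(sql)
        conn.commit()
        return "accepted"
    except sqlite3.IntegrityError:
        conn.rollback()
        return "rejected"


# A learner cannot point to a class that does not exist.
orphan = attempt("INSERT INTO learners VALUES (2, 'Sipho', 99)")
# A class cannot be deleted while learners still refer to it.
delete_parent = attempt("DELETE FROM classes WHERE class_id = 1")
print(orphan, delete_parent)  # rejected rejected
```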

Social Implications (Gr 10, 11, 12)

Computer and Society

• Impact: Computers have transformed communication, education, business, and daily life.

• Ethics: Addresses issues like privacy, intellectual property, and the digital divide.