Paper 2 Option A Databases HL SL
Option A Databases IB Computer Science
Introduction
Includes both SL and HL content.
Expect the longest video in this series.
Refer to the description for study guides.
Data vs. Information
Data: Unprocessed raw facts, lacking context (e.g., just text and numbers).
Information: Data that is processed and structured, providing context and meaning (e.g., Jason got a 95/100 on the Final Exam).
Example: Jason scored 12 points higher than the class average. He got the 2nd highest score in the class.
What is a Database?
An organized collection of related data.
Comprised of tables interconnected through relationships.
Managed by software called Database Management System (DBMS) (e.g., SQLite, MySQL, PostgreSQL).
Utilizes queries in SQL for data manipulation (store, retrieve, delete, edit).
Foreign Key Example
Orders Table
order_id, product, total, customer_id
Customers Table
id, first_name, last_name, age, country
Why Use a Database?
Enables data processing/querying (search/sort/analyze).
Generates reports in specific formats.
Supports data validation and verification during interaction.
Standardized across various systems.
Handles more records than spreadsheets.
Data Validation vs. Data Verification
Data Verification: Ensures data matches expected values.
Example: Checking login details against stored values.
Data Validation: Ensures data input meets certain rules.
Example: Checking credit card number format.
What is a Table?
Contains records.
Each row represents a unique record, while each column is a field.
Example: Castle name, construction date.
What is an Entity?
A real-world object or person represented by a row in a database.
Referred to as a record or tuple in database terminology.
What is a Primary Key?
A unique field for each record, used to identify records in a database.
What is a Foreign Key?
A field in one table that refers to the primary key in another.
Establishes relationships between records in different tables.
Data Types
Different types include:
Text: Combination of letters, numbers (e.g., “Amith”).
Boolean: True/False, Yes/No.
Integer: Whole numbers (e.g., 5).
Real: Decimal numbers (e.g., 22.39).
Datetime: Date and/or time.
Creating Our First Table
Connect to the database.
Create a SQL query to define a table with fields: NAME, AGE, ADDRESS, SALARY.
Execute the query.
Close the database connection.
SQL Basics: SELECT Statements
Used to retrieve data from tables.
Example:
SELECT ProductName FROM Products.Use
WHEREclause for filtering data.SQL Cheat Sheet:
Select Field1, Field2, …*FROM TableName
Select Field1, Field2, …*FROM TableName WHERE Some_Criteria
Select Field1, Field2, …*FROM TableName WHERE Some_Criteria LIKE Some_String
Data Integrity and Redundancy
Data Integrity: Ensures accuracy and consistency of data.
Data Redundancy: Occurs when data is duplicated across tables, leading to inconsistencies.
Normalization
Process of structuring a relational database to reduce redundancy and improve data integrity.
Aim to divide larger tables into smaller related tables.
Advantages of Normalization
Efficient data storage.
Consistency in data across the database.
Enhanced data security.
Simplified updates and queries due to organized structure.
Database Administration and Security
Database Administrator (DBA) ensures security, performance, and recoverability of the database.
Implements measures like access rights, audit trails, and data locking.
Database Management Systems (DBMS)
Programs that handle database operations.
Components include data dictionaries, query processors, and storage engines.
ETL Process in Data Warehousing
Extraction: Collecting data from various sources.
Transformation: Formatting and cleansing data for quality assurance.
Loading: Transmitting data to the data warehouse.
Data Warehouse vs. Database
Data warehouses are optimized for complex querying and historical data.
Databases are for current data and transactions.