Instructor: Yuyang Ye
Email: yuyang.ye@rutgers.edu
Date: January 29, 2025
Definition: A database model is a conceptual framework for organizing and managing data.
Data Structures: Prescribe how data is organized.
Operations: Manipulate data structures.
Rules: Govern valid data use.
Relational Model:
Strengths:
Fast queries and efficient storage.
Productivity and simplicity in transactional applications.
Integration with object-oriented programming languages.
Flexible schema accommodating business requirements.
Support for unstructured and semi-structured data.
Other Models:
Hierarchical
Network
Object
Graph
Document (e.g. JSON, XML)
Examples:
IMS, IDMS, Oracle Database, ObjectStore, Neo4j, MongoDB
Primary Data Structures:
Tree, Linked list, Table, Class, Vertex and edge
Introduced by: E. F. Codd of IBM in 1970
Standardization:
Operations and rules standardized in SQL, the universal query language for relational databases.
Tables (relations) store data.
Rows (tuples) represent records.
Columns represent attributes.
Set: Unordered collection of elements (e.g. {a, b, c}
).
Tuple: Ordered collection enclosed in parentheses (e.g. (a, b, c)
).
A table:
Has a unique name.
Contains a fixed tuple of columns with varying sets of rows.
Column:
Has a name and a data type.
Row:
Unnamed tuple of values corresponding to columns.
Types of values determined by column's data type.
Select: Chooses a subset of rows.
Project: Chooses specific columns.
Product: Lists all combinations of rows from two tables.
Join: Combines tables based on related columns.
Union: Selects all rows from two tables.
Intersect: Selects common rows between two tables.
Difference: Selects rows unique to one table.
Rename: Changes a table's name.
Aggregate: Functions over multiple rows (e.g. sum, count).
Unique Primary Key:
All tables must include a unique primary key.
Unique Column Names:
No duplicate names within the same table.
No Duplicate Rows:
No identical rows across all columns.
Definition:
Based on specific business policies.
Examples:
Employee records must include valid department codes.
Passport Numbers must be unique.
SQL: Standard query language for relational databases.
Data Definition Language (DDL):
Defines database structure (CREATE, ALTER, DROP).
Data Manipulation Language (DML):
Manages data within schema objects (SELECT, INSERT, UPDATE, DELETE).
Data Query Language (DQL):
Queries data (e.g. SELECT).
Data Control Language (DCL):
Controls data access (e.g. GRANT, REVOKE).
Definition: Complete, executable instructions ending with a semicolon.
Literal: Explicit value (e.g. 'Maria', 129).
Keyword: Special meaning in SQL (e.g. SELECT, INSERT).
Identifier: Database object names (e.g. FirstName, Population).
Expression: Sequence evaluating to a value (e.g. Population > 1000).
Comment: Ignored by the processor.
Table:
Must have a unique name and at least one column.
Column:
Each column must have a unique name and data type.
Row:
Each row is an unnamed sequence of values corresponding to column data types.
Definition: Identifies each row uniquely in a table.
Types:
Simple Primary Key: Single field.
Composite Primary Key: Multiple fields.
Candidate key: Possible choices for primary keys.
Definition: Attribute matching primary key from another table.
Example: In CITY and STATE tables, StateAbbrev serves as a foreign key.
Visual representation of relationships between database tables.
Entities correspond to tables; relationships shown by connecting lines.
Open diagrams.net: Start a new diagram.
Add Entities: Use rectangles for entities and label them.
Define Attributes: Use ovals for attributes connected to entities.
Specify Relationships: Use diamonds for relationships, labeling connections.
Add Cardinality: Indicate relationship types (e.g. 1:M).
Entities: Customer, Orders, Product, Ordersdetail.
Relationships:
Customer has many Orders (1:M).
Orders has many Orderdetails (1:M).
Product can appear in many Orderdetails (1:M).
Common design issues include redundancy and poor normalization affecting data integrity.
Reduce redundancy by restructuring tables ensuring dependencies link to primary keys only.
Types of anomalies resolved through normalization:
Insertion Anomaly: Inability to add data due to missing information.
Deletion Anomaly: Unintended data loss via record deletion.
Update Anomaly: Multiple updates required for single information change.
Readings: Cover Chapter 4, SQL Workbench Tutorial sections.
Practice: Create ERDs using diagrams.net.
CREATE TABLE & DROP TABLE statements.
Example of ALTER TABLE in MySQL.
Familiarize with MySQL data types and characteristics.
Integer: INT, SMALLINT, etc.
Decimal: FLOAT, DECIMAL(M,D).
Character: VARCHAR, CHAR.
Date and Time: DATETIME, DATE.