Database System
computerized record keeping system that consists of related data
Database System Components
Users, Hardware, Software
User Support
Supports single or many users
Data is Integrated - unify distinct files while eliminating redundancy
Data is shared - different users will have different access and views to the same data
Database Administrators
create the data base and incorporate technical controls to enforce decisions made by the DBA
Why Database - Data Dictionary
stores definition of data and relationships, changes are automatically recorded
Why Database - multiaccess user support
Shared Data between users, multiple users can access the database at the same time without compromising integrity
Why Database - Reduced redundancy and inconsistency
Inconsistent data happens when one redundant data has been updated and the other has not
Why database - Transaction Support
Several updates in one transaction guarantees all or none are updated
Why Database - Data integrity support
ensures that the data in the database is accurate, reduce redundancy, increase consistency
Why Database - security
only means of access to database is through proper channels, promotes data privacy
Why database - backup/recovery
recovery after possible failures
Data Independence
changing info in one place without changing it in another
Changes in application programs without changing the structure of the underlying data
Database can grow without impairing existing applications
Disadvantages of the DBMS
Increased Costs: hardware and software costs, hiring experts
Management Complexity: different types of technology is harder to manage
Vendor Dependence: reluctantly to change the system once it has been established
Frequent Update/Replacement Styles: costs incurred to update and train people
Conceptual Schema Design - Stage 1
Choice of model: User requirements and real world concepts should go into design
Conceptual Schema Design - Stage 2
Normalization: Reduces complexity and avoids redundancy by adjusting diagrams
Conceptual Schema Design - Stage 3
Optimization: Promotes good performance of the database
Entities
real world entity that has an independent existence, Represented by a table and attributes
Strong Entity
exist on thier own and do not rely on others
Weak Entity
existence depends on the existence of another strong entity
Relationships
associates entities with one another
Mapping Constraints - One-to-One
manager can manage only one department but each department can only have one manager
Mapping Constraints - One-to-Many
Department can have several employees but each employee can only work in one department
Mapping Constraints - Many-to-Many
A supplier can supply parts to several jobs a jobs can receive parts for several suppliers
Total Participation
Relationship is total - (indicated by double lines):
EX: every department must be managed by a manager
Partial Participation
Relationship is partial between two entities (indicated by a single line):
EX: not every employee manages a department
Simple/Atomic Attributes
Not divisible (ex. Part #, weight)
Composite Attributes
Consists of several simple attributes (ex. Address)
Single valued attributes
unique value, not shared among data entries (ex. SIN#)
Multi valued attributes
attributes that can be shared among data (ex. College majors, skills)
Stored Attribute
are stored in the database (ex. Date of birth)
Derived Attribute
can be derived from stored information (ex. Age)
Key Attribute
unique attribute that is distinct for each individual entity instance and can be used to identify an entity
Candidate Key
minimal subset attributes that uniquely identifies an entity (ex. Employee #)
Can be specified using the unique clause
Primary key
candidate key chosen by designer to access each entity
Only one primary key per table
Can be specified after the variable type or before with the values that make up the key
Null Values
Represents not applicable, unknown missing information or not known information
Database Schema
created in the database design phase to describe the database and not expected to change
Database Instance
data in the database at a particular moment in time
Levels of Database Architecture (ANSI/SPRAC) - Internal Level
shows how data is stored inside the system such as file organization and access paths
Levels of Database Architecture (ANSI/SPRAC) - Conceptual Level
deals with the modeling of the whole database, conceptual schema is defined
Levels of Database Architecture (ANSI/SPRAC) - External Level
models user oriented description of parts of the database, views for users
Logical Data Independence
(external/conceptual): ability to modify conceptual schema without changing external views and application programs
Physical Data Independence
(internal/conceptual): modify internal or physical schema without changing conceptual or view level schema or application programs
Data Definition Language (DDL)
For defining schemas at various levels
Used by database designers and administrators
Data Manipulation Language (DML)
Used to construct and use the database
Used by end users for insertion, deletion, updates, retrievals
Subclass
allows sub groupings of entities
Members of a subclass inherit all the attributes of the superclass
Each sub class has its own attributes along with inherited attributes
Superclass
main type of the sub grouping
Specialization
Process of defining a et of subclasses of an entity type
Based on some distinguishing characteristic of the entity type
Multiple specializations can be defined on a single entity type
Specialization - Disjointness Constraint
Disjoint: an entity can be a member of at most one of the subclasses of the specialization
Overlapping: the same entity can be a member of more than one subclass of the specialization
Specialization - Completeness Constraint
Total: every entity in superclass must be a member of some subclass
Partial: an entity might not belong to any of the subclasses in the specialization
Generalization
Result of taking the union of two of more lower level entity types to produce a higher level entity type
Aggregation
Abstraction through which relationships are treated as higher level entities
Create new higher level entity called assignment to treat as the new entity
Relational Model
Uses tables called relations to represent a collection of related data values
Rows are called tuples
Columns are called attributes
Number of attributes (# of columns) is called the degree/arity
Relational Model - Domain
data type describing the values that can appear in each column
Domain is a set of atomic values that are indivisible
Super Key
key attributes and other attributes of a relation
Allows you to be more specific to find a record
Relations
values that can be read but not updated by definition
Relvars
variables that can be read and updated by definition
Optimization
system component that determines how to implement user requests
Catalog
set of system relvars that contain descriptors regarding various objects that are of interest to the system itself
Transaction
logical unit of work involving several operations that begins by begin transaction and terminates normally or abnormally
Atomicity
Transactions are guaranteed either to execute or not execute at all
Durability
once a transaction successfully commits its updates are guaranteed to be applied to the database even if the system fails
Isolation
database updates by given transactions are kept hidden from all distinct transactions unless the first transactions are successfully committed
Serializability
interleaved execution of concurrent transaction is guaranteed to produce the same result as executing the same transactions in an unspecified order
Commit (normal Termination)
operation that signals the end of a successful transaction
Rollback (abnormal termination)
operation that signals the unsuccessful end of transaction
SQL
Structural Query Language
Non procedural Language - no loops, conditionals or functions
Combine with other programming languages for more functionality
SQL Data Types
Numeric
Integer and floating point numbers
Integer - size to 40 digits, 4 bytes in DB2
Small int - smaller storage space, 2 bytes in DB2
Fractional Numbers - specify scale(default) and size or none
Character String
Fixed: CHAR(size) specified, reserved fixed amount in memory
VARCHAR(size), does not reserve max space
Date / Time
YYYY-MM-DD for dates
HH.MM.SS for times
Foreign Key
Combination of columns in one relationship that references primary key attributes of a second relationship
Uses the keyword references
Default Value
specifies a value to when no value is given
Check Conditions
Ensures that every row in the table satisfies a set condition
Condition can be any valid expression that evaluates to true or false
Referential Integrity Actions
Set Default - the attribute value is set to default value
Set Null - attribute value is set to null
Cascade - updates are propagated, each attribute value is updated
No Action
Alter Table
to update and modify an existing table by adding/update/delete an existing column
Add/Delete column by using ADD command
Add column at any time if not null is not specified
Modify column definition using MODIFY command
Can increase character column width, number of decimal place and number of digits at any time
Drop Table
Use DROP TABLE command at the bottom of the dependency chain to properly delete
Use CASCADE CONSTRAINTS to drop all constraints that refer to professor table
INSERT
Insert - adds a new row to the selected table
Insert by explicitly or implicitly specify a query
Insert results of a query into a table
DELETE
UPDATE
Views
Virtual Table opposed to a base table - specific view of a database group for users
Characteristics of Views
View is not created at definition, instead materializes on the first use
Do not need to create a physical table --> derive table from implementation
View tables are updated automatically if the base table info is updated
Purpose of Views
Performance optimization
Used when subqueries are repeated through existing queries
Security
Users only can be provided with data that is needed and not all of it
Provide read only access
Creating Views
SELECT
Use * to list all the attributes
If Using * the WHERE clause is optional
Can pass strings and conditional statements through
ORDER BY
Order by Descending (DESC) or Ascending (ASC)
Ascending order is the default
Specify ordering with multiple different attributes
AS
Serves as an Alias to rename an attribute to be more specific
DISTINCT
Eliminates duplicate values
LOGICAL OPERATORS
AND, OR, NOT can be used to combine simple conditionals
Precedence: NOT --> AND --> OR
JOINING TABLES
Same attribute names in different tables
Use . To specify the attributes from the tables
EXISTS
Used to test he existence of any record subquery
Returns true of conditions are met
UNION
Combines the result set of two or more select statements
Both tables must have the same number of columns and be similar types
MINUS/NOT IN
Not conditions for entries
Minus does not work for MySQL, you must use NOT IN
LIKE
Allows you to pull entries that are similar to a given condition
Can be used in place of an equals sign to a string
Case sensitive matching
% will match 0 or more characters
_ will match a single character
BETWEEN
Simple shorthand for a range restriction instead of using relational operators
Aggregate Functions
Take a multiset of values and return a single value
Average, Minimum, Maximum, Total/Sum and Count
GROUP BY
Group by a certain table column that have the same value
HAVING
List records that have a certain condition
Can be combined with conditionals to create constraints
Can be used with nested queries
Problems with Poor database design
Redundancy - duplicate data
Update anomalies - Direct consequence of redundancy
Insertion Anomalies - Cannot insert new data unless there is a record that applies to it
Deletion Anomalies - Lose important information from deletion
Decomposition
Split the relation into two or more relations
Use join to get back to the original information
1NF Normalization
A relation is said to be in 1NF if it contains only atomic, simple and scalar data
This data is usually comma delimited and contains multiple values associated with one record
Full Functional Dependency
FD between X -> Y is full if the removal of any attribute from X means that the dependency does not hold
Partial Dependency
FD between X -> Y is partial if there is some attribute AeX that can be removed from X and the dependency will still hold
2NF Normalization
A relational schema is in 2NF if it is at first in 1NF and every non key attribute is fully functionally dependent on the primary key / candidate key of the relation