Looks like no one added any tags here yet for you.
definition: database
A database is a shared collection of logically related data and its description, designed to meet the information needs of an organization.
data:
integrated from several sources: different departments in a company
stored in a standared format for many different application
definition: database management system
a database management system is a software system that enables users to define, create, maintain and control access to the database
database management system: DBMS (1)
general purpose software used to manage electronically stored database
maintains data independent of application programs
minimizes data redundancy
e.g. - ms access, MySql
what does a database system consist of
database
DBMS
computer hardware
why reduce redundancy
data redundancy can lead to inconsistencies
DBMS: (2)
allows re-use and sharing of the data by many users
ensures: data consistency and integrity
data consistency
same data across applications
same data even when replicated
integrity
up-to-date data
correct data
DBMS: (3)
enables many users to access data concurrently
allows data acess control appropriate to individual user
A simplified database system environment
roles and skills in a db environment
database administrator (dba)
database designer
app programmers
end users
database administrators
oversee and manage resources: vm, utilized hardware
authorization: for common usage
security policy: app dependant
performance, tuning: can change over time
database designers
identify data to be stored and structured to be used
identify constraints
design structural and functional aspects
application programmers
develop and implement end-cuser requirements
end users
may be sophisticated/naive, regular/casual.
varying degrees of access
stages in database development
problem analysis
database design
database implementation
database monitoring/tuning
problem analysis
identify and understand:
data to be collected
requirements
constraints
facts and enterprise rules etc…
database design
produce data models
entities and their relationships
database implementation
structured data in the physical database
database monitoring/tuning
monitor db usage
re-structure and optimise databse
data models
specify the concepts that are used to describe the db : its content
three categories of data models
high level - conceptual data model
representational level - logical data model
low level -physical data model
conceptual data model
concepts at user level:
entities/ classes
attributes
relationships/ associations.
independent of DBMS
may be presented to user
conceptual data model example
logical data model
e.g relational, network, hierarchical
may be record based, object oriented
physical data model
Details of physical data storage, formats, access paths, ordering…
Both depend on type of DBMS
And are typically not to be accessible by user
logical data model example
physical data model example
types of logical data modes
relational model
entity-relationship (ER) model
object-oriented model
relational model
forms basis of current database management systems
entity relationship model
a diagrammatic technique that proves a generalised approach to the representation of data
helpful in the design of the relational database system
object oriented model
Has become more prominent in recent years particularly applied to data management in the form of the object-oriented database
Now competes with the relational database for certain applications
What are the two main types of Database Languages?
Data Definition Language (DDL)
Data Manipulation Language (DML)
What does DDL include?
DDL includes structure and view definition languages.
What are the characteristics of DML?
High or low level (declarative or procedural)
Standalone or embedded in a host procedural language
Set-oriented or record-oriented
What is the single language most DBMS provide?
Most DBMS provide a single Structured Query Language (SQL) consisting of all required functionality.
definition: relational model
a relational db is a collection of related tables
purposes of a function
mapping sets of inputs to sets of outputs
mapping from an input value to a specific output value
each input relates to exactly one output
what is a relation/table
is a logical representation of data
is physically represented as a table
it allows us the precisely explain what happens in a db
types of relational model
logical
formal
data
logical model
independent of physical implementation
formal model
uses set theory and predicate logic
data model
Concepts used to describe the database, i.e., the contain information
Simple but powerful; still the foundation for databases
three aspects of data
data structure
data integrity
data manipulation
data structure
table, column, domain, row
entities and their relations
data integrity
null, entity Integrity, referential integrity, general constraints
(rules that ensure the data is accurate, consistent and valid)
data manipulation
relational algebra
(using relational algebra to retrieve, edit and add data to our database)
what does a relational model consist of
data structure
data constraints
relational operators
data constraints
Restrictions on data, relations, …
relational operators
Restrict, project, cartesian product, union, difference, …
objective of a relational model
To obtain a high degree of data independence – allow changes without affecting the applications that use it
To control redundancy – data not stored in multiple places/ tables
To maintain consistency – the state of copies are the same, accurate & up-to-date
To enable set-oriented manipulation languages to be used (e.g. relational algebra)
row
records or tuple
A row in a table represents individual instance of the entity
Each row is uniquely identified by one or more attributes, i.e. the primary key
No part of the primary key may have a null value (entity integrity rule)
column
field or attribute
Each column heading in a table has a distinct name
Its value must be atomic, i.e.,a single value drawn from the domain of the column
table
relation
has rows and columns
represents a kind of entity or class
each table is has a unique name for id
properties: no duplicate rows, no significance to row and col order, col has unique name
domain (1)
Set of possible values that a column can take
Every column in a table is defined on a domain
Different domains are unrelated: restricts comparison between col and prevents meaningless operations
columns in table may have the same domain: SID, CID
domain (2)
may be potentially infinite - in the real world
constrained
Key
one or more columns that determine (imply) other columns
Or: one or more columns to uniquely identify a row
types of keys:
candidate
primary
alternate
foreign
composite
candidate key
A minimal set of columns which uniquely identifies each row
primary key
The chosen candidate key to uniquely identify each row
alternate key
the remaining candidate keys
If there are more than one candidate keys in a table and one of these is chosen as the primary key, the remaining candidate keys are referred to as alternate keys
foreign key
Columns (or a set of columns) that matches the candidate key of the associated table
Used to represent relationship between tables
composite key
Refers to a key consisting of two or more columns
null values
A NULL value is required to deal with missing information: value unknown, inserted later..
not the same as “ “ (blank) or 0
db constraints
The definition of a database encompasses integrity constraints:
rules to ensure that the db doesn’t have ‘invalid’ values and configs
means the data in the db is accurate and reliable
what could be db constraints
app/db specific: equal to enterprise or business rules
integirty constrains:
Entity integrity – each row is unique and identifiable
Referential integrity – rules ensuring the accuracy and consistency of data between tables
multiplicity
required data- primary key not null
domain constraints
integrity constraints
Entity integrity rule:
constraint on primary keys
Value of a primary key must be unique and
not null
Referential integrity rule:
constraint on foreign keys
Value of a foreign key must either match the candidate key it refers to
or be null
what does a relational model comprise of
structural definition
integrity rules
data manipulation
main languages used to deal with data manipulation
relational algebra
relational calculus
relational algebra (1)
Is a theoretical language
Provides a collection of operations used to manipulate relations
Is implementable
Basis for SQL query execution
visual summary of operations
natural join
restrict operation
defines horizontal subset table
called a unary operator as it works on a single tables
project operation ⫪
defines vertical partitioning of a table
intersection operation
set intersection on tables:
only rows contained in both
natural join operation ⋈
cartesian product, removing common duplicated columns
relational algebra is a procedural language
query is evaluated in the order specified
complex queries are evaluated by:
Nesting operations to create a relational expression
or applying operations one at a time and creating temporary tables
processing complex queries
most efficient one is preferable
relational algebra Vs Relational calculus - similarities
they are formal, theoretical - based on maths principals, non user friendly
used as a basis for data manipulation
lang for relational db
used to define operations to manipulate tables
equivalent in terms of expressive power
relation algebra Vs relational calculus - differences
relational algebra - is procedural , specifies how to achieve
relational calc - is declarative , specifies what to achieve
relational algebra (2)
operations work on one or more tables to define another table without editing the og tables
both operands and results are tables so output from one operation can become input to another operation
allows expressions to be nested
closure
closure
property where operations produce only tables based on elements of the input tables
unary
unary - 1
relational algebar operations 5+2
5 basic operations:
restrict (unary)
projection(unary)
cartesian product
union
set difference
2 operations expressed in terms of basic operations:
join
intresections
relational algebra symbols
visual interpretation summary
comparison operators
<, >, <=, >=, =, ≠
boolean operators
∧ - and ∨- or ∾- not
set operations
∪ = union
∩= intersection
- = difference
∪ union
combines two tables
removes all duplicate rows
∩ intersection
returns rows that appear in both tables
- = difference
returns rows that are in on and not the other table
T ← R - S
output all the rows that are in R but not in S
cartesian product (X)
multiplies two tables
PQ ← P x Q
types of join operations
equi-join
natrual join
outer join : left outer join, right outer join, full outer join
join ⋈
combines 2 tables to form a new one
derivative of the cartesian product where all elements satisfy p
col involves in ajoin condition p must have the same domain
equi - join
condition of equi- join contains only equality comparisons
outer join
used when displaying rows in the result that don’t have matching values in the join col
avoids info loss and includes unmatched rows in the result table
problems with natural join
rows not included when:
join cols are null or dont have matching rows in another tables
can cause loss of data