What is an entity?
An item of interest about which information is stored
What is a relational database?
A database which recognises the differences between entities
What is a flat file?
A database that consists of a single file, doesn’t necessarily need a primary key
What is a primary key?
A unique identifier for each record in the table, for example CarID for a table of car information
What is a foreign key?
The attribute which links two tables together. Exists as primary in one table and foreign in another.
What is a secondary key?
Allows a database to be searched quickly, acts as a secondary index to order and search
What is one-to-one database relation?
Each entity can be linked to 1 other entity, such as husband and wife
What is one-to-many database relation?
One entity can be related to many other entities, such as a mother with multiple children
What is many-to-many database relation?
One entity can be associated with many other entities and the same applies the other way round
Entity relationship diagrams
Bird foot diagram
What is normalisation?
Process of coming up with the best possible layout for a relational database
What are the 3 types of normalistation?
First normal form, second normal form and third normal form
What is first normal form?
No attribute that contains more than a single value, as well as no repeating groups of attributes
What is second normal form?
A database which doesn’t have any partial dependencies and is in first normal form.
What is 3rd normal form?
If database is in 2nd form and contains no non-key dependencies it is in third normal form
Ways of capturing data
Manual entry, Magnetic Ink Character Recognition, Optical Mark Recognition, Optical Character Recognition
MICR
Magnetic Ink Character Recognition, used by banks for cheques
OMR
Optical Mark Recognition, used for multiple choice questions on a test
OCR
Optical Character Recognition, used for other forms of responses on tests than multiple choice
Selecting and Managing data
Important part of data pre-processing, involving selecting data that fits a certain criteria to reduce the volume of input
Exchanging data
Transferring the collected data, commonly EDI (electronic data interchange), which requires no human interaction
SQL
Structured query language, declarative language used to manipulate databases
SELECT
Used to collect fields from a given table
FROM
Specifies which table information will come from
WHERE
Specifies the search criteria
ORDER BY
Specifies whether you want it in ascending or descending order
JOIN
Combines rows from multiple tables based on a common field between them
CREATE
Allows you to create new databases
ALTER
Add, delete, modify
UPDATE
Update a record in a database table
DELETE
Delete a record from a database table
Data types
CHAR, VARCHAR, BOOLEAN, INT, FLOAT, DATE, TIME, CURRENCY
Referential Integrity
Process of ensuring consistency, such as making sure data is not removed if it is required elsewhere. No foreign key in one table can reference a non-existence record in a related table.
Transaction processing
Single operation executed on data
ACID
Atomicity, consistency, isolation, durability
Atomicity
Transaction must be completed in its entirety or not at all
Consistency
Must maintain referential integrity rules
Isolation
Simultaneous executions of transactions should lead the to the same result as if they were run one after another
Durability
Once a transaction is executed it will remain so regardless of other circumstances
Record Locking
Preventing simultaneous access to records in a database in order to prevent inconsistencies or loss of updates
Deadlock
Problem with record locking where two or more transactions are waiting for each other to release locks, causing a standstill.
Database
The simplest kind is a flat file, containing information about at least one entity
How are entity descriptions written
Customer (custID, title, firstName, surname. email), where custID is the primary key
Composite primary key
A primary key that consists of two or more attributes to uniquely identify a record in a database table, such as OrderLine (OrderNumber, OrderLine, ProductID)
Timestamp ordering
Deadlock prevention - when a user saves an update if the read timestamp is not the same as it was when they started the transaction then another user has accessed the same object
Commitment ordering
Transactions are ordered in terms of dependencies on one another as well as time they were initiated.
Redundancy
Backup systems which automatically take over when the main system fails