1.3.2 Databases

studied byStudied by 6 people
5.0(1)
Get a hint
Hint

File (table)

1 / 86

flashcard set

Earn XP

Description and Tags

87 Terms

1

File (table)

A collection of related records which contain the same type of information as all the other records

New cards
2

Record

All of the data relating to one item or “thing”.

New cards
3

Field

Stores one item of data in a record

New cards
4

Flat file databases

Usually only has one table (it may have more than one table, but with no links between them)

New cards
5

Advantages of Flat file databases

Easy to set up and understand – no specialist knowledge needed

New cards
6

Disadvantages of Flat file databases

  • May have redundant data

  • Potential duplication of data

  • Harder to update the data due to duplication

  • Records are not unique

  • Security is poor as if access is gained all data is readily available

  • Harder to perform complex queries

New cards
7

Relational databases

This type of database has more than one table which are linked together by primary and foreign keys.

New cards
8

Primary key

Used to uniquely identify a record

New cards
9

Foreign key

A foreign key is a primary key in one table which is sued as an attribute in another table to provide a link between table.

A foreign key in one table is always linked to a key field in another table.

New cards
10

Compound key

A combination of fields that makes a record unique (but breaks 2 Normal Form)

New cards
11

Secondary key

Made on a field that you would like to be indexed for faster searches

They may not be unique

They are used for sorting or searching data

New cards
12

Advantages of relational data bases

  • Data is only stored once – this avoids data duplication and means that data is only stored once.

  • Data integrity is maintained

  • Easy to modify the format

  • Security is better as sections of the data can be protected individually using authorisation

  • More “future proof” as it is easier to update and expand the basic structure than a flat file.

  • More complex queries can be run over the data

New cards
13

Disadvantages of relational databases

They are difficult to set up

New cards
14

Data integrity

  • The correctness of the data over the lifetime of that data.

  • This means that the data is correct when it enters the database, as it is queried and updated up until the point it is removed from the system.

  • Data must be complete, accurate and consistent.

New cards
15

Entity integrity

Every table must have a primary key which is unique and not empty

New cards
16

Referential integrity

Foreign keys must point to the primary key of another table but can be empty (meaning that there is no relationship)

New cards
17

Domain integrity

All attributes in the database are related to the overall domain that the database is working on.

New cards
18

Data redundancy

This refers to the unnecessary duplication of data.

New cards
19

Data consistency

  • When data is held in more than one file it should be stored in a consistent way.

  • A date field could be stored as a text field in one table but as a date/time field in another, meaning the data would be incompatible.

  • In a relational database because the attributes of any one entity are contained within one table, there is no risk of it being stored in a different format in a different file

New cards
20

Data independence

  • This refers to the separation of the data from the programs that use it.

  • This means that users can view data without having to be aware of the basic structure of the database, and allows them to manipulate it in many different ways.

  • User can also not make changes to the underlying structure of the database.

New cards
21

Flat file vs Relational data bases

Flat file

  • May have redundant data, which increases the overall size of the database and reduces data integrity

  • Harder to update the database design

  • No specialist knowledge needed to set one up

Relational Databases

  • Data integrity is maintained

  • Easier to update the database design

  • Provides security features e.g. individual tables can be protected

  • Need specialist knowledge to set up and maintain

New cards
22

ERD

Entity relationship diagram

New cards
23

Attributes

Describes the facts, details or characteristics of an entity.

These are the fields within a table

New cards
24

Relationships

The links between entities.

They can be one to one, one to many, or many to many.

New cards
25

One to one relationship

One thing to one thing

e.g A Husband can only have one wife and a wife can only have one husband

New cards
26

One to Many relationship

One thing to many things

e.g A mother can have many children but a child can only have one mother.

New cards
27

Many to many relationships

Many things to many things

e.g A book can be loaned by many customers in a library and a customer can loan many books

New cards
28

How can a many to many relationship be changed to a one to many relationship?

This is usually achieved by adding another entity which acts as a link table and:

  • Creates one to many relationships

  • Makes the relationship in to 3NF

  • Reduces data inconsistencies

New cards
29

What is the standard notation for describing entities and attributes?

  • The entity is always written in capital letters

  • The primary key is underlined

  • Foreign keys are overlined

New cards
30

Normalisation

The process of refining the structure of a database to minimise redundancy and improve integrity

When a database has been normalised, it is said to be in normal form

Helps organise the data efficiency, eliminate redundant data, ensure that only related data are stored in a table

New cards
31

UNF

Un-normalised Form

Data that has not been normalised

It potentially has:

  • repeated data

  • inconsistent data

New cards
32

1NF - First Normal Form

A table is in this form if it contains no repeating attributes or groups of attributes, data is atomic, AND has a primary key

New cards
33

Atomic data

Information which cannot be logically broken down in to smaller parts.

E.g. Full name is not atomic as it can be split in to first name and surname

First name is atomic as it cannot be split any further without losing detail.

New cards
34

2NF - Second Normal Form

In 1NF AND it contains no partial key dependencies

If a table has a single attribute for a primary key it is usually in 2NF as all of the fields depend on that key.

It is when there is a compound primary key (a key made up of 2 or more fields) that something needs to be changed.

New cards
35

3NF - Third Normal Form

A table is in this form if data items are dependent on the key, the whole key and nothing but the key

Many-many relationships are not allowed in this form as it causes duplication of data.

New cards
36

Partial key dependencies

Occurs when one primary key determines some other attribute/attributes

New cards
37

Serial Access

  • The records are stored chronologically

  • New records are appended onto the end of the file

  • There is no way to find a particular record except by searching through all of the preceding records

New cards
38

When is Serial file organisation used?

  • The order of the data doesn’t really matter

  • The total number of records is small

  • Data MUST be stored in the order it is received

New cards
39

Sequential Access

  • Records are stored one after another

  • Ordered in a logical sequence based on a key field (usually primary key)

  • In order to find a record, the whole file may still need to be searched (like in serial), however, it tends to be faster and more efficient than serial

New cards
40

Name two types of random access methods

  • Indexed file

  • Hashed file

New cards
41

How do you delete records in a serial file?

  • Deleting records in a serial file is complex

  • The file is scanned until the record is found

  • Everything is copied except for the file you want to delete, the copy then replaces the old list

New cards
42

How do you add/delete a record in a sequential file?

  • All records are in order of the key field so you cannot just add a new record to the end of the file

  • The file is copied up to the point of insertion, the data is inserted and then the remainder of the file is copied

  • The file is copied up to the point of the data to be deleted, this record is skipped and then the remainder of the file is copied

New cards
43

Indexed Sequential File Organisation

  • Records are sorted according to a primary key but a separate index is also kept.

  • The index is an additional file that contains information about where records are stored and allows them to be accessed directly.

  • This could mean that groups of related records are physically stored in groups

New cards
44

How do you add/delete records in an indexed sequential file?

  • Every time a record is added or deleted, it can affect the start and end point of the indexed “chunks” and so all indexes from that point onwards need to be updated.

    • The index needs to be re-created after each addition or deletion, which is performed as per standard sequential files.

New cards
45

Random (direct) Access

  • A hash algorithm (a mathematical algorithm) is used on key fields to generate an  address for the file - this calculates a disk address/location of a record based on its key field.

  • The algorithm MUST produce the same address every time a key is used so that the data doesn’t get lost!

  • Sometimes several records are linked to the same location, which means you have to search sequentially through those records

New cards
46

Collision

When too simple a hashing algorithm is used, several records could hash to the same address.

New cards
47

RDBMS

Relational Database Management System

  • Used to maintain data integrity

    Uses SQL which allows you to crate, delete, and alter the database

  • Has a way of documenting all the internal structures that make up the database

  • The ability to view the database from different viewpoints according to the requirements of the user

  • Provide some level of security and access control to the data

New cards
48

Data dictionary

  • Used by database managers

  • Is a file containing descriptions of the data stored in tables in a database

  • Provides metadata (information about data and how it is stored)

  • Describes the structure and attributes of data 'items' and includes the names and descriptions of tables and the fields contained in each table, as well as information about the data type, field length etc.

    • Used when altering the structure of the database

New cards
49

Field valuation

Presence checks, lookup, range checks, picture check which are applied to a field

New cards
50

Indexes

Any field that has been indexed to improve search speed

New cards
51

Access rights

Which users can change / edit / modify etc. data

New cards
52

Data Manipulation Language (DML)

  • Provides a means of manipulating data within the database

  • Includes operations such as CRUD (Create, Read, Update, Delete)

  • Many systems allow the user to set a task by using ‘Query by Example’

Used to:

  • access/query/sort/search data

  • store/insert data

  • change/update/delete data

New cards
53

Query by Example

  • Users are presented with a graphical view of the tables and use a number of icons such as 'filter' to manipulate the data.

  • With some systems you can drag fields graphically into appropriate areas on the screen to set up the query.

  • Behind the scenes, the QBE tool is compiling and running the required DML commands.

  • This means that users do not need to have a sophisticated understanding of DMLs  to use the database.

New cards
54

Data Definition Language (DDL)

  • Helps to provide a method of creating a database from scratch

  • Used to write schema (designs) which describes data items to be stored and the relationships between them.

  • Allows tables to be created and primary/foreign keys to be defined.

  • Also defines validation rules

New cards
55

Name three views of data

  • Physical view

  • Logical view

  • User view

New cards
56

Physical view

  • This view exposes all the internal details of how the data is stored on the physical computer

  • For example the data structures used in memory, the nature of the indexes used for searching, what physical devices are used to store the database and so on.

    • At this level the operating system is interacting with the database

New cards
57

Logical view

It shows:

  • a model of all the information within the database

  • The kind of records it holds

  • The type of authorisation and validation methods used.

This is the kind of view the database designer is using to set out the whole database

The data dictionary is produced at this level.

New cards
58

External view

This is the view that sets out what an authorised user can see of the database.

It sets out the data they are allowed to see and their authority level to change the records.

New cards
59

Opening a file

myFile = openRead(“sample.txt”)

x = myFile.readLine()

myFile.close()

New cards
60

Opening a file and printing its contents

myFile = openRead(“sample.txt”)

while NOT myFile.endOfFile()

print(myFile.readLine())

endwhile

myFile.close()

New cards
61

Writing to a file

myFile = openWrite(“sample.txt”)

myFile.writeLine(“Hello World”)

myFile.close()

New cards
62

Indexing

  • A method used to store the position of each record ordered by a certain attribute.

  • This is used to look up and access data quickly.

  • The primary key is automatically indexed; however, the primary key is almost never queried since it is not normally remembered.

  • This is why secondary keys are used. Secondary keys are indexed to make the table easier and faster to search through on those particular attributes.

New cards
63

Name three ways of capturing data

  • Optical Mark Recognition (OMR)

  • Optical Character Recognition (OCR)

  • Manually entering it

New cards
64

What is exchanging data?

The process of transferring the collected data to a database

New cards
65

Selecting and Managing data

  • Selecting the correct data is an important part of data preprocessing

  • This could involve only selecting data that fits a certain criteria to reduce the volume of input.

  • Collected data can alternatively be managed using SQL to sort, restructure and select certain sections.

New cards
66

Things to know about SQL commands

  • SQL commands are in capital letters

  • Multiple fields are separated by commas

  • Alter and drop are to do with structure

  • Update and delete from are to do with data

New cards
67

How do you add in a new field?

ALTER TABLE tablename ADD columnname datatype

New cards
68

How do you remove an existing field?

ALTER TABLE tablename DROP COLUMN columnname

New cards
69

How do you create a table within a database? (e.g Clubs)

CREATE TABLE Clubs ( ClubID INT NOT NULL, ClubName VARCHAR(50) NOT NULL, PRIMARY KEY (ClubID))

New cards
70

How do you insert data into a database?

INSERT INTO databasename.tablename (field1, field2) VALUES ('1', 'The Pullman Club');

New cards
71

How do you change a record?

UPDATE databasename.tablename Fieldname = ‘data’ WHERE tablename.primarykeyname = “primarykey”

New cards
72

How do you extract a sub-set of the table records?

SELECT * FROM tablename WHERE primarykey>1

New cards
73

How do you return a single record?

SELECT * FROM tablename WHERE primarykey=2

New cards
74

How do you include more than one table in the search?

SELECT fieldname(s) FROM table1

JOIN table2

ON table1.primarykey = table2.foreignkey

New cards
75

How do you delete records?

DELETE FROM tablename WHERE primarykey = 1

New cards
76

Methods of exchanging data

Data can be transferred:

  • Automatically using the DBMS software

  • By typing it in using a customised form

  • Importing it from another type of file

  • Using EDI (Electronic Data Interchange) this is used to transfer data between one computer system and another

  • APIs (application programming interfaces) uses web services to communicate using the HTTP protocol

  • SOAP a standardised protocl that sends messages using HTTP and SMTP

  • JSON (JavaScript Object Notation)

  • XML (Extensible Markup Language) used to separate data from its presentation on web pages

No matter which method is used, validation should occur to ensure that the data is sensible along with verification to ensure that the data entered matches the original

New cards
77

Transaction Processing

A single logical operation on data (e.g a customer booking a cinema ticket online, even though it involves multiple actions)

The database must ensure that it is not possible to only complete part of a transaction.

e.g. if seats have been successfully reserved, but payment fails, the seats should not stay reserved or change status to booked.

ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that transactions are processed reliably.

New cards
78

Atomicity

Ensures that a transaction must be processed in its entirety or not at all

It must guarantee that in any situation, including power cuts or hard disk failure, it is not possible to only process part of a transaction.

Transactions should either succeed or fail but never partially succeed.  

New cards
79

Consistency

The transaction should only change the database according to the rules of the database in order to maintain its integrity

Changes to data in one table must take into account data in linked tables

This means that it will not be possible to create or delete a record if related records exist in linked tables e.g.

  • an employee cannot be deleted if they have salary details in another table

  • you can’t create a salary for an employee that doesn’t exist

New cards
80

Isolation

This property ensures that concurrent execution of transactions leads to the same results as if the transactions were processed one after the other.

i.e. Each transaction should not overwrite other transactions that are being processed at the same time

This is particularly relevant in a multi-user database where each transaction must not affect/overwrite other transactions being processed at the same time

New cards
81

Durability

Ensures that once a transaction has been committed, it will remain so, even in the result of a power cut.

e.g. in an online cinema ticket purchase, it should not be possible for the number of seats sold to be updated if the credit card transaction fails

As each part of the transaction is completed, it is held in a buffer on disk until all elements of the transaction have been completed.

Only then will the changes to the database be made.

New cards
82

Multi-user Database

These need to allow for concurrent processing

If multiple users need to read the same data, it generally isn't a problem, but allowing multiple users to simultaneously update or  write to a database table may cause one of the updates to be lost unless measures are taken to prevent it.

When an item is updated, the entire record will be copied into the user’s own local memory (on their local device).

When the record is saved, the data in memory is then rewritten in its entirety.

This can lead to unexpected results

New cards
83

Record locking

In its simplest form, the record is “locked” when the first user obtains the data and cannot be accessed by any other user until it is “unlocked” (saved).

However, this can cause problems…

If two users try to access a single record at the same time a situation can arise where neither can proceed, known as deadlock.

The DBMS (Database Management System) must recognise if this happens and must deal with the deadlock.

New cards
84

Serialisation

When two or more transactions are executed concurrently, the effect should be the same as if they had been executed serially i.e. one after the other.

It ensures that transactions do not overlap in time and therefore cannot interfere with each other or lead to updates being lost.

A transaction cannot start until the previous one has finished and this can be implemented using:

  • Timestamp ordering

  • Commitment ordering

New cards
85

Timestamp Ordering

Whenever a transaction starts, it is given a timestamp, so that it two transactions affect the same object, the transaction with the earliest timestamp should be applied first.

Each transaction is given a read timestamp and a write timestamp.

When a transaction starts, it reads a record of data, causing it to be read timestamped.

When the updated data is written back, the read timestamp is checked. 

If it is not the same in the database as the current record in memory steps are taken to solve the problem

New cards
86

Commitment Ordering

Transactions are ordered in terms of their dependencies on one another as well as the time they were initiated.

This involves evaluating the impact of the transactions before they are executed i.e. which data will be updated and how that will potentially affect other transactions

It can be used to prevent deadlock by blocking one request until another is completed.

Transactions are not completed in time order, but in commitment order

New cards
87

Redundancy

This could be a RAID(Redundant Array of Inexpensive Disks) setup or mirroring servers.

In its simplest(!) form, these organisations maintain two or three identical systems in different geographical locations, so that every transaction is written to two or three storage facilities.

This hardware redundancy protects against the loss of data in the event of power failure or other disasters.

If one system fails, one of the backup systems automatically takes over and processing continues.

If part of a database is lost, it can be recovered from one of the other versions.

New cards

Explore top notes

note Note
studied byStudied by 23 people
Updated ... ago
5.0 Stars(2)
note Note
studied byStudied by 41 people
Updated ... ago
5.0 Stars(1)
note Note
studied byStudied by 11 people
Updated ... ago
5.0 Stars(1)
note Note
studied byStudied by 46 people
Updated ... ago
4.0 Stars(1)
note Note
studied byStudied by 91 people
Updated ... ago
5.0 Stars(1)
note Note
studied byStudied by 9 people
Updated ... ago
5.0 Stars(1)
note Note
studied byStudied by 26 people
Updated ... ago
5.0 Stars(1)
note Note
studied byStudied by 30060 people
Updated ... ago
4.4 Stars(24)

Explore top flashcards

flashcards Flashcard36 terms
studied byStudied by 9 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard117 terms
studied byStudied by 66 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard27 terms
studied byStudied by 16 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard103 terms
studied byStudied by 16 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard47 terms
studied byStudied by 7 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard29 terms
studied byStudied by 15 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard46 terms
studied byStudied by 4 people
Updated ... ago
5.0 Stars(1)
flashcards Flashcard40 terms
studied byStudied by 65 people
Updated ... ago
5.0 Stars(1)