Database Systems

0.0(0)
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/101

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

102 Terms

1
New cards

Database

An integrated collection of data with structure

2
New cards

Gigabyte

230

3
New cards

Terabyte

240

4
New cards

Petabyte

250

5
New cards

Exabyte

260

6
New cards

Zettabyte

270

7
New cards

DBMS

Database Management System

8
New cards

Views (External Schema)

How users see and access data

(Many views)

9
New cards

Conceptual Schema

Logical structure of stored data

(One conceptual schema)

10
New cards

Physical Schema

How data is organized on disk

(One physical schema)

11
New cards

DBMS Developer

Designs and implements DBMS software

12
New cards

Database Administrator

Designs schemas for DB, handles security and authorization, data ability, crash recovery, tuning

13
New cards

Database User

Interacts with DMBS UI, query language or GUI

14
New cards

DBMS Architecture

UI (SQL), Query Optimization and Execution, Relational Operators, Files and Access Methods, Buffer Manager (moves pages to/from main mem), Disk Space Management, DB

15
New cards

Record

One instance of data/associated data

i.e. name, email, address

Stored in a page then file

16
New cards

File

Stores pages of records

Needs to keep track of pages in file, first page, next page, free space on pages, etc.

17
New cards

Page

stores records, stored in file

18
New cards

Disk HDD Pros/Cons

Pros:
Cheaper
No write wear out
Larger capacity
Cons:
Slower

19
New cards

Disk Sector

Disk pizza slice

20
New cards

Disk Track

Disk donut slice

21
New cards

Disk Block

Section of disk at (track, sector)

22
New cards

Seek Time

Time for disk arm to move to track

23
New cards

Rotational Latency

Time for sector to rotate around

(End of sector back to beginning of sector so < 1 rot)

24
New cards

Transfer Time

Time to R/W data to a data block

25
New cards

Time to R/W to Disk

Seek time + rotational latency + transfer time

26
New cards

Disk Address of a Page

Surface #, Track #, Sector #

27
New cards

Parallel Disk Storage

Improves R/W times, N pages can be R/W as fast as 1

28
New cards

RAID 0

Data striping over disks

Increased likelihood of a data block failing

29
New cards

Striping

Pages of files split into datablocks across multiple disks

30
New cards

RAID 1

Mirroring

Duplicate datablocks on another disk as backup
Reads can be done in parallel but writes cannot

31
New cards

RAID 0+1

Striping and Mirroring

D reads in parallel

D/2 writes in parallel

Each disk/mirror pair can fail once

Half of total disk capacity must be for mirrors

32
New cards

RAID 4

Parity page for each stripe

Parity page stores redundant info of each page

Parity page must be rewritten with ever page update

D-1 data pages per stripe

33
New cards

Parity Bit

Can tolerate up to one bit failure, otherwise not possible to determine which bit failed in what way

Parity bit must be updated with every page update by XOR-ing old page, new page, and old parity bit

34
New cards

RAID 5

Distributed Parity

Put parity pages on different disks

Won’t fail the whole drive if parity disk fails

Multiple page updates take less time because the wait for a page to finish before updating the parity page on the same disk takes less time than if all parity pages are on one disk

More even data distribution

35
New cards

Buffer Manager

Gets and returns pages from database

36
New cards

Buffer Pool

Divided into page-sized Frames

Pages are brought in to be used and stored in Frames

37
New cards

Frames

Page-sized chunks of memory that can be R/W

38
New cards
39
New cards
40
New cards
41
New cards

Dirty Bit

Signifies whether or not a page has been edited

If true, page must be rewritten to disk before deallocation

42
New cards

Valid Bit

Signifies whether or not a page is “valid”

Can only be 0 if pin count is 0

43
New cards

Replacement Policy

Algorithm used to pick which page in BP to replace upon page fault

Examples: FIFO, MRU, LRU, Clock, Random

44
New cards

Sequential Flooding

Repeatedly having to replace frames that were just evicted

Worst case scenario for LRU

For n pages and n-1 size BP, pages 1 to n-1 are loaded into the BP, being pin/unpinned, then page 1 is evicted in favor of page n, then page 2 must be evicted when page 1 is requested again , 3 evicted upon request for 2, etc.

45
New cards

Clock Replacement

Better replacement algorithm (approx of LRU)

Adds refbit for each page which must be 0 for the page to be replaced

Refbit is set to 1 when pin count is set to 0

Basically figures out which valid, unpinned page can be replaced

Adds clock hand that searches through each page to find a page for replacement in loops

<p>Better replacement algorithm (approx of LRU)</p><p>Adds refbit for each page which must be 0 for the page to be replaced</p><p>Refbit is set to 1 when pin count is set to 0</p><p>Basically figures out which valid, unpinned page can be replaced</p><p>Adds clock hand that searches through each page to find a page for replacement in loops</p>
46
New cards

Implementing LRU and MRU

Not really worth it because an ordered list of historical page access needs to be constantly updated with each change in pin count

Costs time, computing power, memory

47
New cards

Improving Replacement Methods

Prefetching Pages - Replacing pages before they are requested based on historical patterns

Keeping free list empty as often as possible

Clean pages - Don’t wait to replacement to write and deallocate pages

48
New cards

Why not use OS to manage Memory?

Need to PIN a page in memory

Need to force page to be written to disk (part of support for transactions)

DBMS workload has well-known reference patterns, not always good fit with general OS policies

MRU is terrible for general purpose OS

MRU works well for some DBMS access patterns

49
New cards

ER Model

Expressive Model

Diagram showing relationships between Entities, Entity Sets, Attributes

Provides abstraction

50
New cards

Entity

A real world object

An instance of something

51
New cards

Entity Set

A collection of similar items

A set of Entities

52
New cards

Attributes

Set of identifying characteristics

Ex: Age, Height, Weight, Eye Color

53
New cards

Key

Minimal subset of attributes that uniquely id an entity from entity set

Sometimes an ID attribute (ID num) is added just to use as an ID

54
New cards

Candidate Key

Any attribute (or combination of keys?) that could be a key (i.e. unique id)

55
New cards

Primary Key

Designated unique id

56
New cards

Relationship

Association among two or more entities

Can have attributes in ER diagrams (i.e. manage “SINCE”)

Can be represented as a separate table joining two other tables

Grover appears on Sesame Street

57
New cards

Relationship Set

Collection of similar relationships

Muppets appear on TV Shows

58
New cards

Mappings (Relationships)

N-ary relationships

1 to 1

1 to many

Many to many

59
New cards

Key Constraint

An entity entity can participate at most one time in a relationship set

1 x per y (1 to many)

Bold arrow

60
New cards

Participation Constraint

Each entity in an entity set must participate in the relationship

Bold line

61
New cards

Aggregation

Using relationships to group together related records from multiple tables and display as one

Treating a relationship set as an entity set

62
New cards

Relational Database

Collection of Relations

63
New cards

Relation

Schema + TableS

64
New cards

Schema (Relational DB)

Name of relation + fields/domains

65
New cards

Field/Attributes (Relational DB)

Columns in a table

66
New cards

Domain (Relational DB)

Type

67
New cards

Instances/Tuples/Records (Relational DB)

Rows in a table

68
New cards

Cardinality (Relational DB)

Number of rows

69
New cards

Arity/Degree (Relational DB)

Number of columns

70
New cards

Instance of a Relation

Represented by a table

A table shows related records?

The set of attributes grouped together

71
New cards

Instance

Tuple/Record/Row

72
New cards

SQL

Structured Query Language

73
New cards

Integrity Constraint

Condition that must be true for every instance in the relation

74
New cards

Foreign Key

Set of fields of one relation used to refer to tuple in another relation

Must be a candidate key in source (refers to a unique tuple in another relation) (and must exist in source)

75
New cards

Referential Integrity

Foreign key must exist in source relation

76
New cards

ON DEL/UPDT CASCADE (SQL)

Del/Updt field value here too

Change the associated foreign keys or delete the record

77
New cards

ON DEL/UPDT NO ACTION (SQL)

If trying to del/update a record that other records reference with a foreign key, disallow

78
New cards

ON DEL/UPDT SET DEFAULT (SQL)

If trying to del/update a record that other records reference with a foreign key, set the foreign keys to a specified default value

79
New cards

ON DEL/UPDT SET DEFAULT (SQL)

If trying to del/update a record that other records reference with a foreign key, set the foreign keys to a pre-specified default value

80
New cards

View

Relation whose rows and columns are not necessarily stored in the DB

A way of visually organizing, associating, or filtering data together by specific selections to show association

81
New cards

Fixed-Length Records

All records in a page are the same length

All records have same attributes at same offsets, attribute offset info can be stored in meta data

Fixed qty of records per page

82
New cards

Variable-Length Records

Records can differ in length

Fixed number of fields

Field size can vary

Can separate fields with delimiters or stored offsets/pointers

83
New cards

Packed Page (w/ Fixed Length Records)

Page of records where all records are compacted so there is a section of continuous records then continuous free space

Fixed number of records per page based on record size

On delete, the last record is moved to take the deleted records place essentially compacting the records

“Fast and easy insert”

84
New cards

Unpacked Page (w/ Fixed Length Records)

Page of records where records do not compact, and there is a slot directory to keep track of record location and validity

Fixed number of records per page base on record and slot size

Records do not need to be compacted on delete, slot directory just has to update

85
New cards

Heappage

Slotted Page for Variable Length Records

Don’t know capacity of page due to differing record sizes

Slot offsets are not fixed

Requires compaction

Need to maintain offset of each record, length of each record, free space start and end

86
New cards

Double Linked List Heapfile

Two double linked lists for keeping track of pages in the file, one list for full pages, one list for pages with available space

Uses one page as header page with pointers to beginning of both lists and other metadata

More efficient inserts/deletes

87
New cards

Directory of Pages Heapfile

Uses header pages of file to store ordered pointers to individual pages of records, may require multiple header pages

Faster file access if specific page id is known

88
New cards

Heapfile Scan Time Complexity

O(B * D)
B = # of pages, D = I/O Cost

89
New cards

Heapfile Equality Search Time Complexity

O(½ * B * D)
B = # of pages, D = I/O Cost

½ because average number of pages

90
New cards

Heapfile Range Search

O(B * D * N)
B = # of pages, D = I/O Cost, N + # of pages in range

91
New cards

Heapfile Insert Time Complexity

O(2 D) for two double linked lists, O(2 * B * D) for page directory
B = # of pages, D = I/O Cost

2 because header must also be updaed

92
New cards

Heapfile Delete Time Complexity

O(½ * B * D)
B = # of pages, D = I/O Cost

Essentially cost to search then update page and header

93
New cards

Sorted File Scan Time Complexity

O(B * D)
B = # of pages, D = I/O Cost

94
New cards

Sorted File Equality Search Time Complexity

O(D * logB)
B = # of pages, D = I/O Cost

95
New cards

Sorted File Range Search Time Complexity

O(D * logB + N)
B = # of pages, D = I/O Cost, N = # of pages in range

96
New cards

Sorted File Insert Time Complexity

O(D * logB + 2 * (½ * B * D)
B = # of pages, D = I/O Cost

Search for page, insert into page, then shift (reinsert) half the records

97
New cards

Sorted File Delete Time Complexity

O(D * logB + 2 * (½ * B * D)
B = # of pages, D = I/O Cost

Same as insert

98
New cards

DBMS File Layer

Manages DB Relation and Index Files

Acts as a File Interface

Organizes Files

99
New cards

B+ Tree of Auxiliary Indices

B+ tree where search keys of non-leaf nodes are fields of records and leaf nodes contain pointers to records

100
New cards

B+ Tree of Files

B+ tree where search keys of non-leaf nodes are primary keys of records and leaf nodes contain files