Databases - Exam 3

0.0(0)

Studied by 0 people

Call Kai

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Knowt Play

Card Sorting

1/98

Earn XP

Description and Tags

Professor Dudenhoffer

Last updated 2:31 AM on 4/26/26

Name	Mastery	Learn	Test	Matching	Spaced	Call with Kai

No analytics yet

Send a link to your students to track their progress

99 Terms

New cards

Databases are stored in a collection of files. Each file is a sequence of records and a record is a sequence of fields.

How do files, fields, and records relate to how databases are stored?

New cards

4 kb

How much storage does a single node have?

New cards

100 keys/pointers

How many keys or pointers does a single block hold?

New cards

1 block

How many blocks can be held in a single node?

New cards

n * (i - 1)

For fixed-length records, what equation is used to store the record i?

New cards

Move records i+1 until n-1

move record n to i

do not move the records, link all unused records on a free list

Briefly describe the different ways to delete record i from fixed length record model. Where records is i and the total number of records is n.

New cards

A list of pointers that point to the location of a free record location.

How would you simply describe a free list?

New cards

Slotted page

This type of variable length record structure contains the number of record entries, end of free space in the block, and the location/size of each record. Records are moved around so there is no empty space between them and pointers point to the entry for the record in the header.

New cards

Heap

This allows a record to be placed anywhere in the file where there is space available.

New cards

Squential

This allows a record to be stored in a specific order based on the value of the record’s search key.

New cards

Hashing

This utilizes a computed function onto an attribute of a record. The result of the function determines where the record is placed.

New cards

Multitable clustering file organization

This allows records of several different relations to be stored in the same file and minimizes I/O time. This is unlike the other organization of records in files, where the records of each relation are stored in different files.

New cards

sequential processing, search-key, pointer chains, insertion (locate where record should be inserted, if there’s space proceed and update pointer chain, else insert into an overflow block and update pointer chain.)

For sequential file organization:

What type of processing is it best used for?
How are the records stored/ordered by?
What do you use when deleting a record?
How do you insert?

New cards

Queries involving the joining of two relations, queries involving only one relation

For multitable clustering file organization

What types of queries is it useful for
What types of queries would make it useless?

New cards

Data dictionary (AKA system catalog)

This type of storage stores metadata (data about data) such as…

information about relations (relation names, types,
user and accountability information (such as passwords)
statistical and descriptive data
physical file organization information (how it’s stored/physical location of relation)
information on indices

New cards

Blocks

What are the fixed-length storage units called that a file is partitioned into?

New cards

Buffer

This portion of main memory is available to store copies of disk blocks.

New cards

Buffer manager

This subsystem is responsible for allocating buffer space in main memory.

New cards

When a block from disk is needed. If the block is in the buffer the manager returns the address of block in main memory. If it is not in the buffer, the manager will allocate space for it and then read the block from the disk to the buffer and return the address.

When do programs call on buffer managers? How does it work?

New cards

Least recently used

What is the main strategy for buffer replacement?

New cards

Pinned block, Toss Immediate, Most recently used

These are different buffer replacement policies:

The main memory block isn’t allowed to be written back to disk.
As soon as the final tuple of the block was processed, the space is freed
- The block currently being processed is pinned to the top.

New cards

Search Key

This is an attribute or set of attributes used to look up a particular record in a file.

New cards

Search Key Value

This is the value that is being searched for in a file.

New cards

Index entries

What are the records in an index file called?

New cards

Index file

This type of file is smaller than the original file and only hold the search key value and pointer.

New cards

Ordered indices

In an index file, this type of set up mean the indexes are stored in a sorted order.

New cards

Hash indices

In an index file, this type of set up means the search key values are distributed uniformly across buckets using a function.

New cards

Access time, insertion time, deletion time

In regards to index evaluation metrics:

The time to retrieve a record
The time to add a new record, taking into account the time to update the index
The time to remove a record, taking into account the time to update the index.

New cards

Primary index (clustering index)

When records are stored in sorted order of one of the attributes, this type of index will point to the attribute that the table is sorted by.

New cards

Secondary index (non-clustering index)

When records are stored in sorted order of one of the attributes, this type of index will point to an attribute that the table is not using to be sorted. This type of index must be Dense.

New cards

Dense index

This is when every search-key value is represented in the index.

New cards

Sparce index

This is when not all search-key values are represented in the index and only apply to the primary indices.

New cards

Treat the primary index as a sequential file and construct a sparce index on it. Called multi-level index.

What should we do if a primary index does not fit in memory and access becomes expensive?

New cards

outer index

When treating primary indices as a sequential file, this is the sparce index of a primary index.

New cards

inner index

When treating primary indices as a sequential file, this is the primary index file.

New cards

Same as file record deletion (3 methods)

For a single level index entry with dense indices, how would you perform deletion?

New cards

If the entry for the search key exists in the index, delete it and check if the next search-key value is in the index. If it is not in the index, replace the deleted entry with the next search-key value. If it is already in the index, replacing isn’t necessary.

For a single level index entry with sparce indices, how would you perform deletion?

New cards

Using the search-key value, look to see if it’s already in the index. If it’s not, insert.

For single level indexes, how does insertion work for dense indices?

New cards

Using the search-key value, look to see if it’s already in the index. If the index stores an entry for each block of the file, no change needed unless a new block is created. If the new block is created, then the first search-key value appearing in the new block is inserted into the index.

For single level index, how does insertion work for sparce indices?

New cards

Reorganizes itself with small changes

What is the main advantage of using a B+ tree?

New cards

space overhead for insertion and deletion.

What is the main disadvantage of B+ trees?

New cards

Performance degrades when the file gets bigger and reorganization is required periodically. B+ trees

What are the disadvantages of index sequential files? What’s the alternative?

New cards

Performs small reorganization on its own/doesn’t need to be completely reorganized to maintain performance. Insertion/deletion overhead.

What’s the advantage of using B+ trees? What’s the main (minor) disadvantage?

New cards

All paths from root to leaf are same length

Each node (not root/leaf) has between n/2 and n children.

A leaf node has between (n-1)/2 and n-1 values

What properties must a B+ tree satisfy?

New cards

Log_n/2(K)

The height of a B+ tree is no more than what equation?

New cards

Find which leaf node the search-key value would appear

If value is already there: add record to file and pointer to the bucket

If value isn’t there: add record to main file (create bucket if needed)

If there is room in leaf node, insert. Otherwise, split node and place first n/2 in original and the rest in new node. Propagate upwards if necessary

How do you insert in a B+ tree?

New cards

Extendable hashing

This is a type of dynamic hashing that generates values over a large range of 32-bit.

New cards

2ⁱ

Assume that for extendable hashing, the length of the prefix is i bits. How can we determine the bucket address table size?

New cards

Locate the bucket using hash function, check if there’s room. If there is room insert, if there isn’t room split and insert.

How do you insert into an extendable hash structure?

New cards

extra level of indirection to find record, address tables may become very large, changing size of bucket address table is expensive

What are the 3 main disadvantages of extendable hashing?

New cards

Linear hashing, B+ trees

What are the two alternatives of extendable hash tables?

New cards

better option: Retrieving records that have value of key

hindrance: Range queries

When is hashing the better option? When is it more of a hindrance?

New cards

PostgreSQL

This type of database software supports hash indices, but discourages use due to poor performance.

New cards

Oracle

This type of database software supports static hash organization, but not hash indices.

New cards

SQLServer

This type of database software supports only B+ trees.

New cards

create index, create unique index, drop index

What are the three main SQL index commands? (Mentioned in Chapter 11)

New cards

To find the evaluation plan with the lowest cost

What is the main goal of query optimization?

New cards

Cost

This is the elapsed time to answer the query.

New cards

Disk access time

What is the most driving factor of query cost?

New cards

BT + Ss

T is the time to transfer one block. S is the time for a single seek. How do we calculate the cost for block transfers B plus seeks S?

New cards

Linear Search (A1)

This algorithm is considered the “file scan” algorithm when it comes to finding the cost for a selection operation. It will scan each file block and test all records to see whether they satisfy the selection condition.

New cards

b_r + 1s

What is the cost estimate for a selection operation using linear search?

New cards

primary index, equality on key

This algorithm is considered one of the two “index scan” algorithms when it comes to selection operations using indices. It retrieves a single record that satisfies the corresponding equality condition.

New cards

(h + 1) * (T + S)

What is the cost for a primary index, equality on key algorithm for selections using indices?

New cards

primary index, equality on nonkey

This algorithm is considered one of the two “index scan” algorithms when it comes to selection operations using indices. It retrieves multiple records.

New cards

h (T + S) + S + T * B

What is the cost equation for the primary index, equality on nonkey algorithm for selections using indices?

New cards

Nested Loop Join

This algorithm helps to estimate cost for a join operation. It compares every tuple of one relation (outer) with every tuple of another relation (inner).

For each tuple in the outer relation, it will scan all the tuples in the inner relation.

New cards

Take the total number of tuples in the outer relation and multiply it by the total number of blocks in both relations. Add that to the number of seeks (total number of tuples in outer added to the total blocks of outer).

How do you calculate the worst case cost for the nested-loop join algorithm? Make sure to take into account block transfers and seeks.

New cards

Take the total number of blocks in both relations and two seeks

What is the best case cost for the nested-loop join algorithm?

New cards

If the smaller relation fits entirely in memory, use that as the inner relation.

What is a rule of thumb to remember when estimating cost using the Nested-Loop Join algorithm? Hint: this has to do with choosing which relation is inner and which is outer.

New cards

Block Nested-Loop Join

In regards to estimating the cost of the join operation, this algorithm improves another algorithm by comparing blocks of tuples instead of one tuple at a time.

For each block of the outer relation, scan all the blocks of the inner relation.

New cards

Take the number of blocks in the outer relation and multiply that to the sum of all the blocks (both outer and inner). Then add it to double the amount of blocks from the outer relation.

(b_r * b_s + b_r) + 2*b_r

How do you calculate the worst case estimate for the block nested-loop join algorithm?

New cards

Sum up the total number of blocks in both relations and add it to 2 seeks

How do you calculate the best case estimate for the block nested-loop join algorithm?

New cards

Indexed Nested-Loop Join

In regards to estimating the cost of the join operation, this algorithm uses an index on the inner relation to avoid scanning it fully.

For each tuple in the outer relation, use an index on the inner relation to find matching tuples.

New cards

Sum up the cost for one block transfer and one seek, multiply that by the total number of blocks in the outer relation. Add that to the number of tuples in the outer relation and then multiple all of it by the cost of an index lookup on the inner relation.

How do you estimate the worst case cost using the index nested-loop join algorithm?

New cards

Choose the relation that doesn’t use indices.

If both relations use indices, make the relation with the lesser number of tuples be the outer relation.

What is the general rule of thumb for choosing the outer relation for the Indexed Nested-Loop Join algorithm?

New cards

Transaction

This is a unit of program execution that accesses and possibly updates various data items.

New cards

Atomicity, Consistency, Isolation, Durability

What are the properties of ACID?

New cards

Atomicity

This is a property of ACID ensures a system can go back to its original state in the case that a transaction didn’t finish.

New cards

Durability

This is a property of ACID that ensures updates continue even if it fails the first time. Eventually, it will finish.

New cards

Consistency

This property of ACID ensures the transaction sees a database that won’t change unexpectedly.

New cards

Explicit and implicit integrity constraints

What are the requirements for the consistency property in ACID transactions?

New cards

primary key or foreign keys

sum of A and B unchanging after transaction execution

In regards to the Consistency requirements in ACID properties, what is an explicit integrity constraint? What about implicit?

New cards

Isolation

This property of ACID ensures that multiple concurrent processes can happen at the same time.

New cards

Recovery system, Programmers, Concurrency control system, recovery system

Who is responsible for the ACID properties? Go in this order: Atomicity, Consistency, Isolation, Durability.

New cards

Active, partially committed, failed, aborted, committed

To help think about Atomicity and Durability, what are the 5 states of the transaction state diagram?

New cards

Active

In the transaction state diagram, this state is where the transaction stays while it’s executing.

New cards

Partially committed

In the transaction state diagram, this state is where the transaction is after the final statement has been executed.

New cards

Failed

In the transaction state diagram, this state is where the transaction lands after the discovery that normal execution can no longer proceed.

New cards

Aborted

In the transaction state diagram, this state rolls back the transaction and restores the database to its prior state before the start of execution. From there, the transaction can either be restarted or killed.

New cards

Committed

In the transaction state diagram, this state means execution was successful and finished.

New cards

Logical error, System error, System Crash, Disk failure

What type of failure is it?

A transaction cannot complete because of some internal error condition.
An active transaction is stopped due to an error condition such as deadlock.
A power failure or some other hardware/software failure.
A head crash or similar destroys all or parts of a disk storage

New cards

Fail-Stop Assumption

This claims that non-volatile storage contents are assumed to not be corrupted as a result of system crash.

New cards

Redo and Undo phase

Action taken during the normal transaction processing, action taken after a failure to recover.

What are the two parts of a recovery algorithm?

New cards

Volatile, non-volatile, stable

What are these storage structures?

Doesn’t survive system crashes (RAM)
Survives a system crash but may still fail and lose data (disk)
Perfect storage that survives all failures.

New cards

Maintain multiple copies of each block on separate disks

How is stable storage implemented?

New cards

successful completion, partial failure - destination block is updated but incorrect, total failure - destination block was never updated

What are the 3 ways a block transfer can result in? Give a brief description of each.

New cards

Write the information onto first block. Once complete write same info onto second block. Once complete, the output is complete.

What are the 3 steps to attempt to protect storage media from failure during data transfer?

New cards

In the case that