defines how a database system organizes, manages, and accesses data physically on storage devices, while also providing logical views to users and applications
Data Storage Architecture
is organized logically as a sequence of records.
File
are provided as a basic construct in operating systems, so we shall assume the existence of an underlying file system
File
also logically partitioned into fixed-length storage units.
File Organization
are the units of both storage allocation and data transfer.
Blocks
Each record fits entirely within one block.
File Organization
Store record i starting from byte n * (i – 1), where n is the size of each record
Fixed Length Records
do not allow records to cross block boundaries
Fixed Length Records
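The two fixed-length cards above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the 53-byte record size, and the 4096-byte block size are assumptions, not values from the cards.

```python
def record_offset(i: int, n: int) -> int:
    """Byte offset where record i (1-indexed) starts when records are n bytes each."""
    if i < 1 or n < 1:
        raise ValueError("i and n must be positive")
    return n * (i - 1)


def locate(i: int, n: int, block_size: int) -> tuple:
    """(block number, offset inside that block) for record i when records
    are not allowed to cross block boundaries."""
    per_block = block_size // n          # whole records that fit in one block
    if per_block == 0:
        raise ValueError("record is larger than a block")
    block = (i - 1) // per_block
    offset = ((i - 1) % per_block) * n
    return block, offset


# Hypothetical sizes: 53-byte records in 4096-byte blocks.
print(record_offset(3, 53))
print(locate(78, 53, 4096))
```

With these assumed sizes, 77 records fit in a block and the last 15 bytes of each block stay unused, which is the price of keeping every record inside one block.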
Occur when records have fields of different sizes (e.g., strings).
Variable-Length Records
Also caused by repeating fields (arrays, multisets).
Variable-Length Records
Can result from multiple record types in one file.
Variable-Length Records
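The initial part of a record with variable-length attributes, whose structure is the same for all records of the same relation
fixed length
The part of a record that holds the contents of the variable-length attributes
variable length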
are allocated as many bytes as required to store their value.
Fixed-length attributes
are represented in the initial part of the record by a pair (offset, length)
Variable-length attributes
denotes where the data for that attribute begins within the record
offset
is the length in bytes of the variable-sized attribute.
length
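One way to picture the (offset, length) pairs is to build a record by hand. This is a minimal sketch under stated assumptions: each pair is stored as two 2-byte little-endian integers, the pairs come first, then the fixed-length bytes, then the variable-length data. The function names and sample values are made up, and features of real systems such as a null bitmap are left out.

```python
import struct
from typing import Sequence


def encode_record(var_attrs: Sequence[bytes], fixed_part: bytes) -> bytes:
    """Lay out a record as: (offset, length) pairs, fixed-length data, variable data."""
    header_size = 4 * len(var_attrs)            # 2 bytes offset + 2 bytes length each
    offset = header_size + len(fixed_part)      # where the first variable value begins
    header = b""
    for value in var_attrs:
        header += struct.pack("<HH", offset, len(value))
        offset += len(value)
    return header + fixed_part + b"".join(var_attrs)


def read_var_attr(record: bytes, k: int) -> bytes:
    """Fetch the k-th variable-length attribute (0-indexed) via its (offset, length) pair."""
    offset, length = struct.unpack_from("<HH", record, 4 * k)
    return record[offset:offset + length]


# Illustrative record: three strings plus one 8-byte fixed-length number.
rec = encode_record([b"10101", b"Srinivasan", b"Comp. Sci."], struct.pack("<d", 65000.0))
print(read_var_attr(rec, 1))
```

Because the pairs sit at fixed positions at the start of the record, any attribute can be reached without scanning the attributes before it.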
is commonly used for organizing records within a block
slotted-page structure
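The card does not describe the layout, so the following is only a rough sketch of the usual idea: a header with one (offset, length) entry per record, and record bytes packed from the end of the block toward the header. The class name, the 4-bytes-per-entry header estimate, and the omission of deletion and compaction are all simplifying assumptions.

```python
class SlottedPage:
    """Toy slotted page: slot entries in a header, records filled in from the block's end."""

    def __init__(self, size: int = 4096):
        self.data = bytearray(size)
        self.slots = []            # one (offset, length) entry per record
        self.free_end = size       # records grow downward from the end of the block

    def insert(self, record: bytes) -> int:
        # Assumed header cost: 4 bytes fixed + 4 bytes per slot entry.
        header_bytes = 4 + 4 * (len(self.slots) + 1)
        if self.free_end - len(record) < header_bytes:
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1          # slot number identifies the record

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])


page = SlottedPage(64)
first = page.insert(b"abc")
second = page.insert(b"hello")
print(page.read(second), page.slots)
```

Other structures refer to a record by its slot number rather than its byte position, so records can be moved inside the block as long as the slot entry is updated.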
may be stored either as files in a file system area managed by the database, or as file structures stored in and managed by the database
Large objects
Any record can be placed anywhere in the file where there is space for the record.
Heap file organization
Records are stored in sequential order, according to the value of a “search key” of each record
Sequential file organization
records from different relations are stored in the same file or block.
Multitable clustering file organization
Reduces join operation costs.
Multitable clustering file organization
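Provides efficient ordered access even with a large number of insert, delete, and update operations. The traditional sequential file organization also supports ordered access under such operations, which may change the ordering of records, but its efficiency suffers when there are many of them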
B+-tree file organization
A hash function is computed on some attribute of each record.
Hashing file organization
The result of the hash function specifies in which block of the file the record should be placed
Hashing file organization
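A small sketch of the placement rule in the two hashing cards: hash one attribute, then use the result to pick a block. The choice of `zlib.crc32` as the hash function, the four-block file, and the sample records are assumptions for illustration. Python's built-in `hash` is avoided because it is salted per process for strings.

```python
import zlib


def block_for(key: str, num_blocks: int) -> int:
    """Block number chosen by hashing the search attribute (CRC-32 is an assumed choice)."""
    return zlib.crc32(key.encode("utf-8")) % num_blocks


# Hypothetical file of 4 blocks; each block is just a list of records here.
blocks = [[] for _ in range(4)]
for record in [("10101", "Srinivasan"), ("12121", "Wu"), ("15151", "Mozart")]:
    blocks[block_for(record[0], len(blocks))].append(record)

# A lookup hashes the same attribute and searches only that one block.
target = blocks[block_for("12121", len(blocks))]
print([r for r in target if r[0] == "12121"])
```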
Once placed in a particular location, the record is not usually moved.
heap file organization
tracks which blocks have free space to store records.
free-space map
is commonly represented by an array containing 1 entry for each block in the relation
free-space map
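The array described above might look like the sketch below. The cards do not say what each entry holds, so this assumes each entry is a small integer f meaning "at least f/8 of the block is free". The class name, the eight levels, and the block size are illustrative assumptions.

```python
class FreeSpaceMap:
    """Toy free-space map: one small integer per block, in eighths of a block."""

    def __init__(self, num_blocks: int, block_size: int = 4096, levels: int = 8):
        self.block_size = block_size
        self.levels = levels
        self.entries = [levels - 1] * num_blocks   # new blocks start out (almost) empty

    def update(self, block: int, free_bytes: int) -> None:
        # Round down so an entry never promises more space than the block has.
        level = free_bytes * self.levels // self.block_size
        self.entries[block] = min(self.levels - 1, level)

    def find_block(self, needed_bytes: int):
        # Round up so any block at this level is guaranteed to fit the record.
        need_level = -(-needed_bytes * self.levels // self.block_size)
        for block, level in enumerate(self.entries):
            if level >= need_level:
                return block
        return None      # caller would allocate a new block


fsm = FreeSpaceMap(3)
fsm.update(0, 100)
fsm.update(1, 2048)
fsm.update(2, 4000)
print(fsm.entries, fsm.find_block(1000))
```

The rounding makes the map conservative: it can miss a block that would just barely fit, but it never sends an insert to a block that is too full.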
is designed for efficient processing of records in sorted order based on some search key.
Sequential file
is any attribute or set of attributes; it need not be the primary key, or even a superkey
search key
is a file organization that stores related records of two or more relations in each block
multitable clustering file
is the attribute that defines which records are stored together
cluster key
is typically done on the basis of an attribute value
Table Partitioning
data about data
metadata
Relational schemas and other metadata about relations are stored in a structure
Data Dictionary or System Catalog
is that part of main memory available for storage of copies of disk blocks.
buffer
responsible for the allocation of buffer space is called the buffer manager.
subsystem
in which the block that was referenced least recently is written back to disk and is removed from the buffer
LRU
memory block that is not allowed to be written back to disk
Pinned Block
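The LRU and pinned-block cards can be combined in one toy buffer manager. This is a sketch under assumptions, not how any particular system does it: the "disk" is a plain dict, every evicted block is written back, and the names `BufferPool`, `fetch`, `pin`, and `unpin` are hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer manager: LRU replacement that skips pinned blocks."""

    def __init__(self, capacity: int, disk: dict):
        self.capacity = capacity
        self.disk = disk                      # stand-in for disk blocks
        self.frames = OrderedDict()           # block_id -> {"data", "pins"}, oldest first

    def fetch(self, block_id):
        if block_id in self.frames:
            self.frames.move_to_end(block_id)         # now the most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[block_id] = {"data": self.disk[block_id], "pins": 0}
        return self.frames[block_id]["data"]

    def pin(self, block_id):
        self.frames[block_id]["pins"] += 1

    def unpin(self, block_id):
        self.frames[block_id]["pins"] -= 1

    def _evict(self):
        # Least recently used block that is not pinned.
        victim = next((b for b, f in self.frames.items() if f["pins"] == 0), None)
        if victim is None:
            raise RuntimeError("all buffered blocks are pinned")
        self.disk[victim] = self.frames[victim]["data"]   # write back, then drop
        del self.frames[victim]


disk = {i: "block%d" % i for i in range(4)}
pool = BufferPool(2, disk)
pool.fetch(0)
pool.pin(0)
pool.fetch(1)
pool.fetch(2)          # block 0 is older but pinned, so block 1 is evicted
print(list(pool.frames))
```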
frees the space occupied by a block as soon as the final tuple of that block has been processed
Toss-immediate strategy
system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block
MRU strategy
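To see why MRU can beat LRU, the sketch below counts buffer hits for a repeated sequential scan, the access pattern of a relation scanned again and again. It is a simplified simulation: pinning is left out, and the function name, the three-frame buffer, and the reference string are assumptions chosen for illustration.

```python
def simulate(refs, capacity: int, policy: str) -> int:
    """Count buffer hits for a reference string under 'LRU' or 'MRU' replacement."""
    buffer = []          # ordered from least recently used to most recently used
    hits = 0
    for block in refs:
        if block in buffer:
            hits += 1
            buffer.remove(block)
        elif len(buffer) >= capacity:
            # LRU drops the oldest entry, MRU drops the newest one.
            buffer.pop(0 if policy == "LRU" else -1)
        buffer.append(block)
    return hits


# Three passes over a 4-block relation with only 3 buffer frames.
refs = [0, 1, 2, 3] * 3
print(simulate(refs, 3, "LRU"), simulate(refs, 3, "MRU"))
```

Under LRU each block is evicted just before it is needed again, so the scan never hits; MRU keeps the older blocks around for the next pass.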
can use statistical information regarding the probability that a request will reference a particular relation
Buffer Manager
support forced output of blocks for the purpose of recovery
Buffer Managers
write buffers speed up disk writes by writing blocks to a non-volatile RAM or flash buffer immediately
Non-volatile
a disk devoted to writing a sequential log of block updates
Log Disk
Used exactly like nonvolatile RAM
Log Disk
each attribute of a relation is stored separately, with values of the attribute from successive tuples stored at successive positions in the file
Column-oriented storage
Reduced I/O if only some attributes are accessed; improved CPU cache performance; improved compression; vector processing on modern CPU architectures
Column-oriented storage
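A minimal sketch of the row-versus-column layout, using plain Python lists as stand-ins for files. The sample tuples and the helper names `to_columns` and `reconstruct` are made up for illustration.

```python
def to_columns(rows):
    """Turn a list of tuples (row layout) into one list per attribute (column layout)."""
    return [list(col) for col in zip(*rows)]


def reconstruct(columns, i):
    """Rebuild tuple i by taking position i from every column."""
    return tuple(col[i] for col in columns)


# Illustrative relation: (id, name, salary).
rows = [
    ("10101", "Srinivasan", 65000),
    ("12121", "Wu", 90000),
    ("15151", "Mozart", 40000),
]
ids, names, salaries = to_columns(rows)

# An aggregate over one attribute touches only that column.
print(sum(salaries) / len(salaries))
print(reconstruct([ids, names, salaries], 1))
```

The average reads only the salary column, which is the reduced-I/O benefit listed above. Rebuilding a whole tuple has to visit every column, which is the corresponding cost.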
found to be more efficient for decision support than row-oriented representation
Columnar
Can store records directly in memory without a buffer manager
Main Memory Databases
can be used in-memory for decision support applications
Column-oriented storage
reduces memory requirement
Compression