1/73
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
CPU Registers
Small, extremely fast storage inside the CPU. Temporarily holds data or instructions currently being used by the CPU.
Cache memory
A small, high-speed memory located close to the CPU.
Stores frequently accessed data or instructions to reduce latency and improve performance
RAM (Main memory)
Volatile. Stores data for programs that are currently open
SSD
storage that uses flash memory to store data, with no moving parts. Uses flash storage.
Hard (magnetic) drive
storage device that uses spinning disks (platters) to store data magnetically. (photos, files)
Tape/ Optical Disk
very large, non-volatile, sequential-access storage used for archiving data or backups. (blu ray disks)
Primary Storage
Storage directly accessible by the CPU that holds temporary information for computations. (CPU registers, cache memory/RAM)
Secondary Storage
Storage that is remote to the CPU and permanently holds data, even when the PC is turned off. ( SSD, Hard drive)
Teritary Storage
High-capacity external storage used mainly for back ups (Tape, optical disk)
Volatile Storage
Storage which loses its contents when the power is removed
NonVolatile Storage
Storage that retains its data after the power is turned off
Annualized Failure Rate
Percentage of drives failing in a year. Estimated AFR for HDDs 1-3 years old is around 2%.
SSD Pros
no mechanical failures (non moving)
good read -consume less space
SSD cons
erase then write, writing is slower then reading
number of erasures limited, 0.1M-M
more expensive then HDD
Main sources of R/W delay
Seek Time (wait to move head to track)
Rotational Latency (time for sector to rotate to r/w head)
Transfer time (r/w data)
RAID
Redundant Array of Independent Disks (multiple hd or ssd into single unit)
RAID : Disk Mirroring
creating two copies of data on 2 or more drives
RAID : Disk Mirroring PROS
can survive failed drive since we have multiple -Maybe parallel reading
RAID : Disk Mirroring CONS
2 vs 1 drives, cost, power
RAID : Disk Stripping
Data divided into blocks written sequentially across multiple drives. Block #/# of drives
RAID : Disk Stripping PROS
parallel read and write
RAID : Disk Stripping CONS
increase chance of system failure
no data replication
RAID : Parity bits
uses extra bits to recreate lost data in a failed drive
(even, odd, mark, none, # of 1's)
RAID : Parity bits PROS
can rebuild failed drives
RAID : Parity bits CONS
need parity drive
cannot recover from 2 or more failures
Demorgans law for sets
not(A U B) = notA intersect notB
RAID Level 0
Striped Volume N data disks Data striping, no parity or mirroring. Cannot recover from failed drive. Good Write.
RAID Level 1
disk mirroring, opportunity for parallel reads, need 2x disks
RAID Level 5 Block Interleaved Distributed Parity
Combines stripping with parity, parity drives are across all drives, stripping is one block
RAID Level 6
RAID 5 add a second parity scheme, VS RAID 5 can recover from 2 stimutaneous drive failures, same read slower write
Blocking Factor (bf)
records per block, ⌊block size /record size⌋
Internal Fragmentation
unallocated memory within a block. block size - (records per block * record size)
Packed (Contiguous) Allocation
For fixed-length records. Records are stored one after the other. Store the quantity of records at the end
Unpacked Allocation
For fixed-length records. Records are stored anywhere and a bitmap
Variable-Length Record Allocation
similar to unpacked, except bitmap is a slot directory holding starting byte position and length
Primary Index
An index created on a chosen candidate key where both the index and the database are sorted on that key. Only 1 per file
Secondary Index
An index created on any field where only the index is sorted on that key. Can have as many as you want. Index points to linked list of pointers that each point to corresponding record.
Clustered Index
An index created on a chosen secondary key where both the index and the database are sorted on that key. Index points to either a record or blocks. Can have either a primary or a clustered index per file.
Secondary Index point to what
linked list of pointers that each point to corresponding record
Clustered Index point to what
records or blocks
Dense Indices
An index that contains an entry for every record
Sparse Indices
An index that doesn't contain an entry for every record. Each key points to a whole block. If searching for something between two keys, have to search the block. Performance of contains() suffers.
Only index small subset of field values
Dynamic Hashing
Tree directory
Leaves are blocks holding index entries
Insert keys and dynamically split blocks when needed to resize
Types of keys: Canidate Key
a key able to uniquely identify a record (UAID)
Types of keys: Primary Key
The chosen canidate key
Types of keys: Secondary Key
Any non Canidate key
Types of keys: Sort Key
A key used to order records in the data base
Classification of Indexes: Ordered
Single level, Multi-level (sorted file, b+ tree)
Classification of Indexes: Unordered
hashing (extendible/dynamic)
Dynamic hashing
tree directory
leaves are blocks holding entries
insert keys and dynamic resize
Extendible hashing
Directory is array
Extendible hashing advantage
only adjust directory size instead of restructuring and rehashing
Extendible hashing: Local Depth
each index block has it
bigger than 1
Looks at first bit
Extendible hashing: global depth
a directory has 1, looks at first bit
Extendible hashing: directory size
k^g
k is alphabet set number, binary is 2 g is global depth
Extendible hashing: when do buckets split?
buckets only split if l<g, else directory doubles
Extendible hashing: Deletion if you have disk space
delete index entry and keep blocks for future insertion
Extendible hashing: Deletion if you don't have disk space
-merge sibling blocks -shrink directory when merging if able
B-Tree:
a self-balancing search tree used in databases and file systems to organize and store data for efficient searching, insertion, and deletion.
B-Tree: Structure
A node has n keys and n+1 pointers -Dense -Ascending order
B-Tree of Order M (D.comer)
-Each node has at most 2M keys -at most 2M+1 Pointers
at least M keys per node
Non leaf has at least 2 children
balanced (leaf nodes same level)
a non leaf node with n keys has n+1 children
B-Tree of Order M: Insertion
Insert in order
if full (2M) split into 2 nodes with median as parent
B-Tree of Order M: Deletion if underfull node is leaf and Sibling has extra keys
move separating parent to node and move smallest or largest from neighbor to parent
B-Tree of Order M: Deletion if underfull node is leaf and sibling don't have extras
Merge with sibling by adding node, neighboring sibling and a parent
B-Tree of Order M: Deletion if underfull node is an internal node
replace deleted key with its inorder predecessor or successor
recurse if necessary
Whats inorder Predecessor
Largest key from left subtree
Whats inorder Sucessor
smallest key from right subtree
Capacity of a B-tree: How many keys in a B tree with M=128 and has 3 levels (height of 2)
Level 0= 256 keys (2M) Level 1= 257 children, each with 256 keys = 65,792 keys Level 2 = 257^2 children, each with 256 keys
Capacity of a B-tree: How many keys in a B tree with M=3 and has 3 levels (height of 2)
Level 0= 6 keys, 2M, 7 Pointers Level 1 = 7 children with 6 keys each = 42 keys Level 2 = 7^2 children with 6 keys each, 294 keys
How to determine B-tree order with block of 8k, 12 keys and 8 ptrs
2m Keys, 2M+1 pointers (2M)*12 + (2M+1)*8 = 8192
B+ Tree
A B+ Tree is a self-balancing search tree used in databases and file systems. It stores all keys in the leaf nodes and provides efficient sequential access with linked leaf nodes.
B+ Tree Vs B-Tree Structure
Each Key is stored in Leaf nodes
copies of some key values occupy internal nodes
leaf nodes are linked sequentially to form sequence set
Keep copy of parent and also has arrow
B+ Tree PROS
Built in pointers for keys in leaf nodes
support exact match and range
B+ Tree CONS
bit of wasted storage in upper levels
insert and delete more complex