1/61
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
|---|
No study sessions yet.
Database
A collection of logically interconnected data
Database Management System (DMBS)
Software system that is used to store and query the information in a database
Drawbacks of using file systems such as .csv to store data (7)
Data redundancy and inconsistency
Difficulty in accessing data
Data isolation
Integrity problems
Atomicity of updates
Concurrent access by multiple users
Security problems
Transaction
A collection of operations that performs a single logical function in a database application
Foreign key
Set of attributes that point to a candidate key of another table or have NULL values
Inner join
Returns records that have matching values in both tables
Left join
Returns all records from the left table, and the matched records from the right table
Right join
Returns all records from the right table, and the matched records from the left table
Full join
Returns all records when there is a match in either left or right table
Set operations in SQL (3)
union
intersect
except
Aggregate functions in SQL (5)
avg
min
max
sum
count
Attributes in the select clause outside of aggregate functions must appear in the _ list
GROUP BY
HAVING clause
Used to filter groups of rows based on conditions applied to aggregate functions. Used in conjunction with the GROUP BY clause
Typed order of SQL clauses (6)
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
Executed order of SQL clauses (6-7)
FROM
WHERE
GROUP BY
HAVING
SELECT
DISTINCT (if applicable)
ORDER BY
Subquery
select-from-where expression that is nested within another query
When to use subqueries in the WHERE clause (3)
For set membership
For set comparisons
For set cardinality
When to use a subquery in the FROM clause
To create a temporary dataset from existing tables
View
Virtual table that represents the result of a stored SQL query
Materialized view
The view gets updated whenever the underlying relations are updated
Indexes
Data structures used to speed up access to records with specified values for index attributes
JDBC
Java API that connects java applications to DBMSs
Steps to have JDBC communicate with DB (4)
Open a connection
Create a "statement" object
Execute queries using the statement object to send queries and fetch results
Exception mechanism to handle errors
Super key
A set of one or more attributes whose values uniquely determine each entity
Candidate key
Entity set that loses its uniqueness when an attribute is removed
Primary key
Unique identifier for each record in a table
Weak entity set
An entity set that does not have a primary key
Discriminator (partial key)
The set of attributes that distinguishes among all the entities of a weak entity set
Aggregation in E-R diagrams
A way to treat a relationship between entities as a higher-level entity itself, allowing that relationship to participate in other relationships
Functional dependency theory
Defines constraints where one attribute (determinant) uniquely determines another (dependent), forming the basis for data integrity, consistency, and normalization by organizing data to avoid anomalies and improve database design by identifying keys and relationships
Volatile storage
Loses contents when power is switched off
Non-volatile storage
Contents persist even when power is switched off
Includes secondary and tertiary storage, as well as batter-backed up main-memory
Factors affecting choice of storage media include (3)
Speed with which data can be accessed
Cost per unit of data
Reliability
Redundant arrays of independent disks (RAID)
Disk organization techniques that manage a large number of disks, providing a view of a single disk of high capacity, high speed, and high reliability
Factors in choosing RAID level (4)
Monetary cost
Performance
Performance during failure
Performance during rebuild of failed disk
Disk block
Logical unit of storage allocation and retrieval
Heap (organization of records in files)
Record can be placed anywhere in the file where there is space
Sequential (organization of records in files)
Store records in sequential order, based on the value of the search key of each record
Hashing (organization of records in files)
A hash function computed on some attribute of each record; the result specifies in which block of the file the record should be placed
Data dictionary (also called system catalog)
Stores data such as:
Information about relations
User and accounting information, including passwords
Statistical and descriptive data
Physical file organization information
Information about indices
Block
Unit of both storage allocation and data transfer
Buffer
Portion of main memory available to store copies of disk blocks
Buffer manager
Subsystem responsible for allocating buffer space in main memory
Dense index
Every search key value has an entry in the index
Sparse index
Only some search key values have an entry in the index
Primary index
Key of the index is the attribute of the primary key of the table
Secondary index
Key of the index is the attribute of a non-primary key of the table
Difference between B-tree and B+-tree
B+-trees store all keys and data in leaf nodes, while B-trees store each value only once
B+-trees leaf nodes are connected via a linked list
Bucket
Unit of storage containing one or more records (typically a disk block)
Overflow chaining
The overflow buckets of a given bucket are chained together in a linked list
Factors considered when deciding on a file organization for a table (6)
Types of queries on the table
Frequencies of queries
Access time
Inserting time
Deletion time
Space overhead
ACID
Atomicity, Consistency, Isolation, Durability
Atomicity
the all-or-nothing property: either the transaction completes successfully or nothing at all
Durability
The updates to the database by the transaction must persist even if there are software or hardware failures
Isolation
The system must run transactions providing the illusion that each transaction is the only one running in the system
2 parts of recovery algorithms
Actions taken during normal transaction processing to ensure enough information exists to recover from failures
Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency, and durability
Deferred update
Only make changes to disk after all log entries and the commit are in the log
Immediate update
Entries on disk can be changed before the transaction commits
REDO
Scan the log from the beginning to the end
UNDO
Scan the log from the end backwards
Recovery from a deferred update failure
REDO on committed transactions
Recovery from an immediate update failure
REDO on committed transactions
UNDO on uncommitted transactions