Looks like no one added any tags here yet for you.
Data can vary by
scope, format, access
3 software tools to manage database (and what they do):
database management system (reads and writes data)
query (retrieves)
database application (for user interaction)
Requirements for large databases (5):
Performance
Authorization (managed by db admin)
Security
Rules
Recovery
what is a transaction
a group of queries that must either be completed or rejected as a whole
3 rules for processing transactions
Ensure transactions are processed completely or not at all
Prevent conflicts between concurrent transactions
Ensure transactions results are never lost
define the term architecture
describes the internal components and relationships between them
List and define the 3 main architecture components of databases (varies in real life)
Query processor interprets queries, creates a plan to modify the database or retrieve data, and returns query results to the application. Performs query optimization to ensure the most efficient instructions are executed on the data.
Storage manager translates the query processor instructions into low-level file-system commands that modify or retrieve data. Uses indexes to quickly locate data. (access to data must go through SM!)
Transaction manager ensures transactions are properly executed. Prevents conflicts between concurrent transactions and restores the database to a consistent state in the event of a transaction or system failure.
What is a catalog? What is a log?
Catalog - directory of tables, columns, indexes, and other objects. Describes the database, “data dictionary”
Log - a files consisting a complete record of all inserts, updates, and deletes. written by the transaction manager
What is a relational database?
stores data in tables, columns, and rows, similar to a spreadsheet
All relational databases support the Structured Query Language (SQL)
What is NoSQL? Provide an example
newer non-relational systems that are optimized for big data
example: MongoDB - NoSQL- Open source - 5
What are the 3 phases for database design? Include diagram types for each
Analysis phase specifics database requirements (as entities, relationships, and attributes ) without regard to a specific database system
ER Diagram
Logical design - This phase implements database requirements in a specific database system. Converts entities, relationships, and attributes into tables, keys, and columns
Table Diagram
Physical design - This phase add indexes and specifies how tables are organized on storage media
No diagram, specified with CREATE INDEX
NOTE: logical affects query results, but physical only affects processing speed. (data independence)
What is an API and how is it used in database programming
Since db programs are written with SQL and other object-oriented languages (like C++, Java, Python), an API is needed
API – a library of procedures or classes that links a host programming language to a database
Host language calls library procedures, which do things like connect to database, execute queries, and returning results
Who sponsors MySQL
Oracle
What is the MySQL Command-Line Client
MySQL Command-Line Client – a text interface included in the MySQL Server download. It allows developers to connect to the database server, perform admin functions, and execute SQL statements
What is a database model (and its 3 parts)? What is the most common model?
Database model – a conceptual framework for database systems, with 3 parts :
Data structures – prescribe how data is organized
Operations – manipulate data structures
Rules – govern valid data
Relational model – a database model based on tabular structure
Initially designed for transactional data, but has improved their support for big data
Explain the difference between a set and a tuple
Set – an unordered collection of elements enclosed in braces
Ie {a,b,c} is the same as {c,b,a}
Tuple – an ordered collect of elements enclosed in braces
Ie (a,b,c) and (c,b,a) are different
List synonyms for Table, Column, and Row
Table, File, Relation
Row, Record, Tuple
Column, Field, Attribute
In SQL how do you insert a comment?
- - for single line comment
/* for multi- line */
What are the 5 SQL sub-languages?
Data Definition Language (DDL) defines the structure of the database.
Creates, alters, and drops tables (ex CREATE)
Data Query Language (DQL) retrieves data from the database.
Selects data from a table (ex SELECT)
Data Manipulation Language (DML) manipulates data stored in a database.
Inserts, updates, and deletes data in a table (ex INSERT)
Data Control Language (DCL) controls database user access.
Grants and revokes permissions to and from users (ex GRANT)
Data Transaction Language (DTL) manages database transactions.
Commits data to database, rolls back data from db, and creates savepoints (ex COMMIT)
What does SHOW do? Syntax?
SHOW provides users and admin with info about db, its contents, and server status
You can SHOW DATABASES, columns, tables, show create table, etc
first need to use USE to select database
What are the 4 common table rules?
Exactly one value per cell
No duplicate column names
No duplicate rows (often allowed for temp tables)
No row order
What is data independence? Why is it beneficial?
Data independence – row order does not affect query results
This allows users/admin to improve query performance by changing organization of data on storage devices, without affecting query results
List 3 ways to alter a table
ALTER TABLE tablename
ADD columnname data type;
ALTER TABLE tablename
CHANGE currentcol newcol newdatatype;
ALTER TABLE tablename
DROP colname;
Integer data type ranges
TINYINT 1 byte Signed range: -128 to 127
Unsigned range: 0 to 255
SMALLINT 2 bytes Signed range: -32,768 to 32,767
Unsigned range: 0 to 65,535
MEDIUMINT 3 bytes Signed range: -8,388,608 to 8,388,607
Unsigned range: 0 to 16,777,215
INTEGER or INT 4 bytes Signed range: -2,147,483,648 to 2,147,483,647
Unsigned range: 0 to 4,294,967,295
BIGINT 8 bytes Signed range: -2^63 to 2^63 -1
Unsigned range: 0 to 2^64 -1
Syntax for inserting rows into table
INSERT [INTO] tablename (columns,…)
VALUES (values, …);
Syntax for Updating rows
UPDATE tablename
SET col1=value1, col2=val2,…
WHERE condition;
Syntax for deleting rows
DELETE FROM tablename
WHERE condition;
What is a primary key, its two types, and its requirements? Column or table constraint?
PK - a column, or group of columns used to identify a row
types : single PK (one column) or composite PK (multiple columns)
Requirements:
unique
Not NULL
minimal (for composite)
Table constraint
What is a foreign key and its requirements? Syntax?
FK - a column or group of columns that refer to a primary key (must be same data type)
Requirements:
Referential integrity - foreign keys must either be NULL or match some value of the reference primary key
(table constraint)
FOREIGN KEY (FKcol) REFERENCES (PKcol)
What is referential integrity
all FK values must match primary or be fully NULL
What are the 4 common constraints used for referential integrity violations
RESTRICT rejects an insert, update, or delete that violates referential integrity.
SET NULL sets invalid foreign keys to NULL.
SET DEFAULT sets invalid foreign keys to the foreign key default value.
CASCADE propagates primary key changes to foreign keys.
How do you update or delete foreign and primary keys?
ON UPDATE and ON DELETE clauses
For foreign key updates and deletes, mySQL supports RESTRICT
For primary, mySQL supports all (RESTRICT, SET NULL, SET DEFAULT, CASCADE)
Actions are specified in ON UPDATE and ON DELETE clauses of FORIEGN /PRIMARY KEY constraint
What are the purpose of the UNIQUE and CHECK constraints?
UNIQUE ensures values in a column, or group of columns, are unique
CHECK constraint that specifies an expression on one or more columns (either a col or table constraint)
How do you add or drop constraints?
In ALTER TABLE, use ADD, DROP, or CHANGE
ALTER TABLE tablename
DROP CHECK constraintname;
ALTER TABLE
ADD CONSTRAINT constraintname PRIMARY KEY (column);
Does GROUP BY appear before or after ORDER BY
Before
Appears between WHERE and ORDER BY (if any)
What is the HAVING clause for?
Used with GROUP BY to filter group results
HAVING COUNT(*) > 1 : a having clause that selects only groups with more than one row count
What is an INNER JOIN
INNER JOIN
selects only matching left and right table rows.
INNER
is optional.
appears between FROM and ON, followed by optional WHERE
What is ON used for in a join
specifies which columns to join on
What is a FULL JOIN
selects all left and right rows regardless of match
is not supported by mySQL
What is a LEFT JOIN? What is a RIGHT JOIN
LEFT JOIN selects all left table rows, but only matching right table rows.
RIGHT JOIN selects all right table rows, but only matching left table rows.
Since MySQL doesn’t support FULL JOIN, what can be used instead?
UNION - combines the two results into one table
Whats the difference between an equijoin and a non-equijoin?
equijoin
An equijoin compares columns of two tables with the = operator.
non-equijoin
A non-equijoin compares columns with an operator other than =, such as < and >.
What’s a cross join?
A cross-join combines two tables without comparing columns, aka no ON clause
What is a materialized view?
A materialized view is a view for which view table data is stored at all times.
must be refreshed when data is changed
What is WITH CHECK OPTION used for?
used in CREATE VIEW, it rejects and generates error when inserts and updates do not satisfy the view query WHERE clause
What is the Entity-Relationship Model? What are the 3 objects included?
An entity-relationship model is a high-level representation of data requirements, ignoring implementation details.
Includes 3 kinds of objects:
An entity is a person, place, product, concept, or activity.
A relationship is a statement about two entities.
An attribute is a descriptive property of an entity.
What are the two parts of an ER Model
ER Diagram (Entity Relationship Diagram) - schematic picture of entities, relationships, and attributes
Glossary – also known as data dictionary or repository, documents additional details in text form
What is a type? How are they usually implemented?
In entity-relationship modeling, a type is a set:
An entity type is a set of things.
A relationship type is a set of related things.
An attribute type is a set of values.
Entity, relationship, and attribute types usually become tables, foreign keys, and columns, respectively.
What are instances? How are they implemented?
An instance is an element of a set:
An entity instance is an individual thing.
A relationship instance is a statement about entity instances.
An attribute instance is an individual value.
Entity, relationship, and attribute instances usually become rows, foreign key values, and column values, respectively
What are the 3 phases of Database Design in relation to ER Models?
Analysis develops an entity-relationship model, capturing data requirements while ignoring implementation details.
Logical design converts the entity-relationship model into tables, columns, and keys for a particular database system.
Entities become tables, relationships become keys (sometimes tables), and attributes become columns
Physical design adds indexes and specifies how tables are organized on storage media.
What are the first 4 steps in database design (analysis phase)
discover entities, relationships, and attributes
determine cardinality
distinguish strong and weak entities
create supertype and subtype entities
What are steps 5-9 of database design (logical)
Implement entities
Implement relationships
Implement attributes
Apply Normal form
What is cardinality?
Cardinality – refers to maxima and minima of relationships and attributes
Explain cardinality syntax for relationships and attributes
Max (Min)
either in relationship line or by attribute
for mins - 1 is required, 0 optional
NOTE Unique notation appears before max ( U-Max(Min))
What is an identifying attribute?
An identifying attribute is unique, singular, and required. Identifying attribute values correspond one-to-one to, or identify, entity instances.
example : ProjectNumber 1-1(1)
What is a strong entity?
A strong entity has one or more identifying attributes. When a strong entity is implemented as a table, one of the identifying attributes may become the primary key.
What is a weak entity?
A weak entity does not have an identifying attribute. Instead, a weak entity usually has a relationship, called an identifying relationship, to another entity, called an identifying entity. Cardinality of the identifying entity is 1(1).
A weak entity can also be identified by another (or multiple) weak entities
What are Supertype and Subtype entities?
A subtype entity is a subset of another entity type, called the supertype entity
A supertype entity identifies its subtype entities. The identifying relationship is called an IsA relationship
What is Crow’s foot notation?
crow's foot notation - depicts cardinality as a circle (zero), a short line (one), or three short lines (many)
What are 3 best practices for selecting primary Keys?
stable: value should not change.
simple: easy to type and store.
meaningless: no descriptive information.
What is functional dependency?
Dependence of one column on another
Column A depends on Column B is denoted B->A (basically B is associated with 1 A)
Each value of B relates to at most one value of A
What is redundancy, when does it occur, and what design rules are used to lower redundancy?
Redundancy is the repetition of related values in a table.
Redundancy occurs when a dependence is on a column that is not unique
Normal forms are rules for designing tables with less redundancy.
What is first normal form (1NF)?
A table is in first normal form when each cell contains one value and the table has a primary key. This definition has two corollaries:
In a first normal form table, every non-key column depends on the primary key.
A first normal form table has no duplicate rows.
What is second normal form?
A table is in second normal form when all non-key columns depend on the whole primary key. In other words, a non-key column cannot depend on part of a composite primary key or a candidate key
What is third normal form?
Redundancy can occur in a second normal form table when a non-key column depends on another non-key column. Informally, a table is in third normal form when all non-key columns depend on the key, the whole key, and nothing but the key
What is Boyce-Codd Normal Form?
Every attribute should be dependent on the key, the whole key, and nothing but the key
Boyce-Codd normal form eliminates all redundancy arising from functional dependence
What is Normalization and what are the 3 steps to normalize tables?
Normalization eliminates redundancy by decomposing a table into two or more tables in higher normal form
List all unique columns. U
Identify dependencies on non-unique columns. Non-unique columns are either external to all unique columns or contained within a composite unique column.
Eliminate dependencies on non-unique columns by creating new tables