1/148
Looks like no tags are added yet.
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
database application
software that helps business users interact with database systems
database administrator
responsible for securing the database system against unauthorized users.
enforces procedures for user access and database system availability.
query processor
interprets queries, creates a plan to modify the database or retrieve data, and returns query results to the application.
performs query optimization to ensure the most efficient instructions are executed on the data.
storage manager
translates the query processor instructions into low-level file-system commands that modify or retrieve data.
uses indexes to quickly locate data.
transaction manager
ensures transactions are properly executed.
prevents conflicts between concurrent transactions.
restores the database to a consistent state in the event of a transaction or system failure.
writes log records before applying changes to the database.
Metadata
data about the database, such as column names and the number of rows in each table.
relational database
stores data in tables, columns, and rows, similar to a spreadsheet. All data in a column has the same format.
All data in a row represents a single object, such as a person, place, product, or activity.
are ideal for databases that require an accurate record of every transaction, such as banking, airline reservation systems, and student records.
MongoDB
a NoSQL database designed for storing, retrieving, and managing large amounts of unstructured or semi-structured data.
stores data in JSON-like documents, making it flexible and scalable.
Open source
SQL CREATE TABLE statement
creates a new table by specifying the table and column names.
Each column is assigned a data type that indicates the format of column values. Data types can be numeric, textual, or complex
INT
DECIMAL
VARCHAR
DATE
analysis phase
focuses on gathering, understanding, and documenting user requirements to ensure the system meets business needs.
specifies database requirements without regard to a specific database system.
Requirements are represented as entities, relationships, and attributes.
An entity is a person, place, activity, or thing.
A relationship is a link between entities, and an attribute is a descriptive property of an entity.
has many alternative names, such as conceptual design, entity-relationship modeling, and requirements definition.
Rectangles represent entities. Entity names appear at the top of rectangles.
logical design phase
converts entities, relationships, and attributes into tables, keys, and columns.
affects the query result
physical design phase
focuses on how data is stored, accessed, and optimized for performance on the hardware and software platform.
adds indexes and specifies how tables are organized on storage media.
affects query processing speed but never affects the query result.
Data independence
the ability to change the data storage structure or schema without affecting the applications or programs that use the data.
It ensures that data access remains unaffected by changes in how data is stored or organized.
Application Programming Interface (API)
a library of procedures or classes that links a host programming language to a database.
MySQL Command-Line Client
a text interface included in the MySQL Server download.
allows developers to connect to the database server, perform administrative functions, and execute SQL statements.
MySQL Workbench
is installed with MySQL Server and allows developers to execute SQL commands using an editor.
database model
a conceptual framework for database systems, with three parts:
Data structures that prescribe how data is organized.
Operations that manipulate data structures.
Rules that govern valid data.
set
an unordered collection of elements enclosed in braces.
Ex: {a, b, c} and {c, b, a} are the same, since sets are not ordered
tuple
an ordered collection of elements enclosed in parentheses.
Ex: (a, b, c) and (a, b, c)
Relational data structure
A table has a name, a fixed tuple of columns, and a varying set of rows.
A column has a name and a data type.
A row is an unnamed tuple of values. Each value corresponds to a column and belongs to the column's data type.
A data type is a named set of values, from which column values are drawn.
Synonyms
Table, File, Relation
Row, Record, Tuple
Column, Field, Attribute
Relational operations
“Select” - selects a subset of rows of a table
Join - allows for the combination of information from two or more tables
Union - selects all rows of two tables.
Aggregate - computes functions over multiple table rows, such as sum and count.
Relational rules
Rules are logical constraints that ensure data is valid
Unique primary key. All tables have a primary key column, or group of columns, in which values may not repeat.
Unique column names. Different columns of the same table have different names.
No duplicate rows. No two rows of the same table have identical values in all columns.
Business rules
are based on business policy and specific to a particular database.
Ex: All rows of the Employee table must have a valid entry in the DepartCode column.
Ex: PassportNumber values may not repeat in different Employee rows.
Structured Query Language (SQL)
a high-level computer language for storing, manipulating, and retrieving data.
is the standard language for relational databases, and is commonly supported in non-relational databases.
clause
groups SQL keywords like SELECT, FROM, and WHERE with table names like City, column names like Name, and conditions like Population > 100000.
Literals
Explicit values that are string, numeric, or binary
must be surrounded by single quotes or double quotes.
Binary values are represented with x'0' where the 0 is any hex value.
‘string’
“string"
keywords
Words with special meaning
SELECT
FROM
WHERE
identifiers
Objects from the database like tables, columns, etc.
City
Name
Population
Data Definition Language (DDL)
defines the structure of the database.
Data Query Language (DQL)
retrieves data from the database.
Data Manipulation Language (DML)
manipulates data stored in a database.
Data Control Language (DCL)
controls database user access.
Data Transaction Language (DTL)
manages database transactions, used to rollback database changes
Table
has a name, a fixed sequence of columns, and a varying set of rows.
must have at least one column but any number of rows. A table without rows is called an empty table.
No row order. Rows are not ordered. The organization of rows on a storage device, such as a disk drive, never affects query results.
ALTER TABLE statement
adds, deletes, or modifies columns on an existing table.
ADD - adds a column
CHANGE - modifies a column
TINYINT
1 byte (8 bits)
Signed range: -128 to 127
Unsigned range: 0 to 255
SMALLINT
2 bytes (16 bits)
signed range: -32,768 to 32,767
unsigned range: 0 to 65,535
MEDIUMINT
3 byes (24 bits)
unsigned range: -8,388,608 to 8,388,607
signed range: 0 to 16,777,215
INTEGER or INT
4 bytes (32 bits)
signed range: -2,147,483,648 to 2,147,483,647
unsigned range: 0 to 4,294,967,295
% (modulo)
Divides one numeric value by another and returns the integer remainder
5 % 2 = 1
^
Raises one numeric value to the power of another
5^2 = 25
INSERT statement
The INSERT statement adds rows to a table. The INSERT statement has two clauses:
The INSERT INTO clause names the table and columns where data is to be added. The keyword INTO is optional
The VALUES clause specifies the column values to be added.
UPDATE statement
The UPDATE statement modifies existing rows in a table.
The UPDATE statement uses the SET clause to specify the new column values.
An optional WHERE clause specifies which rows are updated.
DELETE statement
The DELETE statement deletes existing rows in a table.
The FROM keyword is followed by the table name whose rows are to be deleted.
An optional WHERE clause specifies which rows should be deleted.
Omitting the WHERE clause results in all rows in the table being deleted.
TRUNCATE statement
deletes all rows from a table.
MERGE statement
selects data from one table, called the source, and inserts the data to another table, called the target.
auto-increment column
is a numeric column that is assigned an automatically incrementing value when a new row is inserted.
MySQL allows insertion of a specific value to an auto-increment column. However, overriding auto-increment for a primary key is usually a mistake.
foreign key
obey a relational rule called referential integrity. Referential integrity requires foreign key values must either be NULL or match some value of the referenced primary key.
When a foreign key constraint is specified, the database rejects insert, update, and delete statements that violate referential integrity.
RESTRICT rejects an insert, update, or delete that violates referential integrity.
CASCADE propagates primary key changes to foreign keys.
constraint
a rule that governs allowable values in a database. Constraints are based on relational and business rules
The UNIQUE constraint ensures that values in a column, or group of columns, are unique.
The CHECK constraint specifies an expression on one or more columns of a table.
Constraints are added and dropped with the ALTER TABLE TableName followed by an ADD, DROP, or CHANGE clause.
LIKE operator
when used in a WHERE clause, matches text against a pattern using the two wildcard characters % and _.
% matches any number of characters
Ex: LIKE 'L%t' matches "Lt", "Lot", "Lift", and "Lol cat".
_ matches exactly one character. Ex: LIKE 'Lt' matches "Lot" and "Lit" but not "Lt" and "Loot".
ORDER BY clause
orders selected rows by one or more columns in ascending (alphabetic or increasing) order.
The DESC keyword with the ORDER BY clause orders rows in descending order.
ABS(n)
Returns the absolute value of n
SELECT ABS(-5); will return 5.
LOWER(s)
Returns the lowercase s
SELECT LOWER('MySQL');
returns 'mysql'
TRIM(s)
TRIM(s) Returns the string s without leading and trailing spaces
SELECT TRIM(' test ');
returns 'test’
aggregate function
processes values from a set of rows and returns a summary value.
Common aggregate functions are:
COUNT() counts the number of rows in the set.
MIN() finds the minimum value in the set.
MAX() finds the maximum value in the set.
SUM() sums all the values in the set.
AVG() computes the arithmetic mean of all the values in the set.
HAVING clause
The HAVING clause is used with the GROUP BY clause to filter group results.
join
is a SELECT statement that combines data from two tables, known as the left table and right table, into a single result.
The tables are combined by comparing columns from the left and right tables, usually with the = operator.
The columns must have comparable data types.
INNER JOIN
selects only matching left and right table rows
FULL JOIN
selects all left and right table rows, regardless of match.
LEFT JOIN
selects all left table rows, but only matching right table rows.
RIGHT JOIN
selects all right table rows, but only matching left table rows.
outer join
any join that selects unmatched rows, including left, right, and full joins.
UNION
combines the two results into one table.
equijoin
combines rows from two tables where the values in the specified columns are equal.
compares columns of two tables with the = operator.
Most joins are equijoins.
A non-equijoin compares columns with an operator other than =, such as < and >.
cross-join
combines two tables without comparing columns.
A cross-join uses a CROSS JOIN clause without an ON clause.
As a result, all possible combinations of rows from both tables appear in the result.
subquery
sometimes called a nested query or inner query, is a query within another SQL query.
materialized view
is a view for which data is stored at all times.
Whenever a base table changes, the corresponding view tables can also change, so materialized views must be refreshed.
WITH CHECK OPTION
the database rejects inserts and updates that do not satisfy the view query WHERE clause.
entity-relationship model
a high-level representation of data requirements, ignoring implementation details.
includes three kinds of objects:
Entity - person, place, product, concept, or activity
Relationship - a statement about two entities
Attribute - a descriptive property of an entity
reflexive relationship
the relationship is between two instances of the same entity.
entity-relationship diagram
a schematic picture of entities, relationships, and attributes.
Entities are drawn as rectangles.
Relationships are drawn as lines connecting rectangles.
Attributes appear as additional text within an entity rectangle, under the entity name.
entity type
is a set of things. Ex: All employees in a company.
relationship type
a set of related things. Ex: Employee-Manages-Department is a set of (employee, department) pairs, where the employee manages the department.
attribute type
is a set of values. Ex: All employee salaries
entity instance
is an individual thing. Ex: The employee Sam Snead.
relationship instance
is a statement about entity instances. Ex: "Maria Rodriguez manages Sales."
attribute instance
is an individual value. Ex: The salary $35,000.
Analysis steps
Discover entities, relationships, and attributes
Determine cardinality
Distinguish strong and weak entities
Create supertype and subtype entities
Logical design steps
Implement entities
Implement relationships
Implement attributes
Apply normal form
cardinality
refers to maxima and minima of relationships and attributes.
Attribute Minimum – The least number of values an attribute can have.
Attribute Maximum – The most number of values an attribute can have.
Phone_Number (1,3)
1 → Minimum value (must have at least one phone number).
3 → Maximum value (can have up to three phone numbers).
Relationship maximum
the greatest number of instances of one entity that can relate to a single instance of another entity.
A relationship has two maxima, one for each of the related entities.
Maxima are usually specified as one or many.
A related entity is singular when the maximum is one and plural when the maximum is many.
Relationship minimum
the least number of instances of one entity that can relate to a single instance of another entity.
A relationship has two minima, one for each of the related entities.
Minima are usually specified as zero or one.
A related entity is optional when the minimum is zero and required when the minimum is one.
appears in parentheses.
One-to-One (1:1)
each record in Table A is related to at most one record in Table B, and vice versa.
Example: A person and their passport; each person has only one passport, and each passport is issued to only one person.
One-to-Many (1:N)
one record in Table A can be related to multiple records in Table B, but each record in Table B is related to only one record in Table A.
Example: A department and its employees; one department can have many employees, but each employee belongs to only one department.
Many-to-One (N:1)
multiple records in Table A can be related to one record in Table B.
Example: Many employees can work in the same department, but each employee belongs to exactly one department.
Many-to-Many (M:N)
multiple records in Table A can be related to multiple records in Table B and vice versa.
Example: Students and courses; each student can enroll in multiple courses, and each course can have multiple students.
Supertype Entity
identifies its subtype entities. The identifying relationship is called an IsA relationship.
Example:
A Person
supertype might include common attributes such as name
, address
, and phone number
, which are shared by its subtypes like Employee
and Customer
.
A partition of a supertype entity is a group of mutually exclusive subtype entities.
Subtype Entity
a subset of another entity type, called the supertype entity.
Example: In the previous example, the Employee
and Customer
subtypes of Person
would inherit name
, address
, and phone number
from Person
, but each would also have specialized attributes such as employee_id
for Employee
or customer_since
for Customer
.
On ER diagrams, subtype entities are drawn within the supertype.
strong entity
has one or more identifying attributes.
weak entity
does not have an identifying attribute. Instead, a weak entity usually has a relationship, called an identifying relationship, to another entity, called an identifying entity. Cardinality of the identifying entity is 1(1).
crow's foot notation,
depicts cardinality as a circle (zero), a short line (one), or three short lines (many).
intangible entity
is documented in the data model, but not tracked with data in the database.
artificial key
a single-column primary key created by the database designer when no suitable single-column or composite primary key exists.
Usually artificial key values are integers, generated automatically by the database as new rows are inserted to the table.
are stable, simple, and meaningless.
functional dependence
dependence of one column to another
Normal forms
rules for designing tables with less redundancy.
Trivial dependencies
When the columns of A are a subset of the columns of B, A always depends on B.
Normalization
a process in database design that eliminates redundancy and improves data integrity by organizing data into a series of normal forms (NF).
Each normal form builds upon the previous one by addressing specific anomalies.
eliminates redundancy by decomposing a table into two or more tables in higher normal form.
Boyce-Codd normal form
is ideal for tables with frequent inserts, updates, and deletes.
Denormalization
means intentionally introducing redundancy by merging tables.