Data Management Foundations

0.0(0)
studied byStudied by 1 person
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/74

encourage image

There's no tags or description

Looks like no tags are added yet.

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

75 Terms

1
New cards

Data can vary by

scope, format, access

2
New cards

3 software tools to manage database (and what they do):

  • database management system (reads and writes data)

  • query (retrieves)

  • database application (for user interaction)

3
New cards

Requirements for large databases (5):

  • Performance 

  • Authorization  (managed by db admin)

  • Security 

  • Rules  

  • Recovery 

4
New cards

what is a transaction

a group of queries that must either be completed or rejected as a whole

5
New cards

3 rules for processing transactions

  • Ensure transactions are processed completely or not at all 

  • Prevent conflicts between concurrent transactions 

  • Ensure transactions results are never lost 

6
New cards

define the term architecture

describes the internal components and relationships between them

7
New cards

List and define the 3 main architecture components of databases (varies in real life)

  • Query processor interprets queries, creates a plan to modify the database or retrieve data, and returns query results to the application. Performs query optimization to ensure the most efficient instructions are executed on the data.

  • Storage manager translates the query processor instructions into low-level file-system commands that modify or retrieve data. Uses indexes to quickly locate data. (access to data must go through SM!)

  • Transaction manager ensures transactions are properly executed. Prevents conflicts between concurrent transactions and restores the database to a consistent state in the event of a transaction or system failure.

8
New cards

What is a catalog? What is a log?

  • Catalog - directory of tables, columns, indexes, and other objects. Describes the database, “data dictionary”

  • Log - a files consisting a complete record of all inserts, updates, and deletes. written by the transaction manager

9
New cards

What is a relational database?

stores data in tables, columns, and rows, similar to a spreadsheet

All relational databases support the Structured Query Language (SQL)  

10
New cards

What is NoSQL? Provide an example

newer non-relational systems that are optimized for big data

example: MongoDB - NoSQL- Open source - 5

11
New cards

What are the 3 phases for database design? Include diagram types for each

  1. Analysis phase specifics database requirements (as entities, relationships, and attributes ) without regard to a specific database system 

    • ER Diagram

  2. Logical design - This phase implements database requirements in a specific database system. Converts entities, relationships, and attributes into tables, keys, and columns

    • Table Diagram

  3. Physical design - This phase add indexes and specifies how tables are organized on storage media 

    • No diagram, specified with CREATE INDEX

NOTE: logical affects query results, but physical only affects processing speed. (data independence)

12
New cards

What is an API and how is it used in database programming

  • Since db programs are written with SQL and other object-oriented languages (like C++, Java, Python), an API is needed

  • API – a library of procedures or classes that links a host programming language to a database 

  • Host language calls library procedures, which do things like connect to database, execute queries, and returning results

13
New cards

Who sponsors MySQL

Oracle

14
New cards

What is the MySQL Command-Line Client

MySQL Command-Line Client – a text interface included in the MySQL Server download.  It allows developers to connect to the database server, perform admin functions, and execute SQL statements 

15
New cards

What is a database model (and its 3 parts)? What is the most common model?

  • Database model – a conceptual framework for database systems, with 3 parts : 

    • Data structures – prescribe how data is organized 

    • Operations – manipulate data structures 

    • Rules – govern valid data 

  • Relational model – a database model based on tabular structure  

    • Initially designed for transactional data, but has improved their support for big data 

16
New cards

Explain the difference between a set and a tuple

  • Set – an unordered collection of elements enclosed in braces 

    • Ie {a,b,c} is the same as {c,b,a} 

  • Tuple – an ordered collect of elements enclosed in braces  

    • Ie (a,b,c) and (c,b,a) are different 

17
New cards

List synonyms for Table, Column, and Row

  • Table, File, Relation

  • Row, Record, Tuple

  • Column, Field, Attribute

18
New cards

In SQL how do you insert a comment?

  • - - for single line comment

  • /* for multi- line */

19
New cards

What are the 5 SQL sub-languages?

  • Data Definition Language (DDL) defines the structure of the database. 

    • Creates, alters, and drops tables (ex CREATE) 

  • Data Query Language (DQL) retrieves data from the database. 

    • Selects data from a table (ex SELECT) 

  • Data Manipulation Language (DML) manipulates data stored in a database. 

    • Inserts, updates, and deletes data in a table (ex INSERT) 

  • Data Control Language (DCL) controls database user access. 

    • Grants and revokes permissions to and from users (ex GRANT) 

  • Data Transaction Language (DTL) manages database transactions. 

    • Commits data to database, rolls back data from db, and creates savepoints (ex COMMIT) 

20
New cards

What does SHOW do? Syntax?

  • SHOW provides users and admin with info about db, its contents, and server status 

    • You can SHOW DATABASES, columns, tables, show create table, etc

  • first need to use USE to select database

21
New cards

What are the 4 common table rules?

  1. Exactly one value per cell 

  2. No duplicate column names 

  3. No duplicate rows (often allowed for temp tables)

  4. No row order 

22
New cards

What is data independence? Why is it beneficial?

  • Data independence – row order does not affect query results 

  • This allows users/admin to improve query performance by changing organization of data on storage devices, without affecting query results 

23
New cards

List 3 ways to alter a table

  • ALTER TABLE tablename

    ADD columnname data type;

  • ALTER TABLE tablename

    CHANGE currentcol newcol newdatatype;

  • ALTER TABLE tablename

    DROP colname;

24
New cards

Integer data type ranges

  • TINYINT 1 byte

    • Signed range: -128 to 127

    • Unsigned range: 0 to 255

  • SMALLINT 2 bytes

    • Signed range: -32,768 to 32,767

    • Unsigned range: 0 to 65,535

  • MEDIUMINT 3 bytes

    • Signed range: -8,388,608 to 8,388,607

    • Unsigned range: 0 to 16,777,215

  • INTEGER or INT 4 bytes

    • Signed range: -2,147,483,648 to 2,147,483,647

    • Unsigned range: 0 to 4,294,967,295

  • BIGINT 8 bytes

    • Signed range: -2^63 to 2^63 -1

    • Unsigned range: 0 to 2^64 -1

25
New cards

Syntax for inserting rows into table

  • INSERT [INTO] tablename (columns,…)

    VALUES (values, …);

26
New cards

Syntax for Updating rows

  • UPDATE tablename

    SET col1=value1, col2=val2,…

    WHERE condition;

27
New cards

Syntax for deleting rows

  • DELETE FROM tablename

    WHERE condition;

28
New cards

What is a primary key, its two types, and its requirements? Column or table constraint?

  • PK - a column, or group of columns used to identify a row

  • types : single PK (one column) or composite PK (multiple columns)

  • Requirements:

    • unique

    • Not NULL

    • minimal (for composite)

  • Table constraint

29
New cards

What is a foreign key and its requirements? Syntax?

  • FK - a column or group of columns that refer to a primary key (must be same data type)

  • Requirements:

    • Referential integrity - foreign keys must either be NULL or match some value of the reference primary key 

  • (table constraint)

    FOREIGN KEY (FKcol) REFERENCES (PKcol)

30
New cards

What is referential integrity

all FK values must match primary or be fully NULL 

31
New cards

What are the 4 common constraints used for referential integrity violations

  • RESTRICT rejects an insert, update, or delete that violates referential integrity. 

  • SET NULL sets invalid foreign keys to NULL. 

  • SET DEFAULT sets invalid foreign keys to the foreign key default value. 

  • CASCADE propagates primary key changes to foreign keys. 

32
New cards

How do you update or delete foreign and primary keys?

ON UPDATE and ON DELETE clauses 

  • For foreign key updates and deletes, mySQL supports RESTRICT  

  • For primary, mySQL supports all (RESTRICT, SET NULL, SET DEFAULT, CASCADE)

  • Actions are specified in ON UPDATE and ON DELETE clauses of FORIEGN /PRIMARY KEY constraint  

33
New cards

What are the purpose of the UNIQUE and CHECK constraints?

  • UNIQUE ensures values in a column, or group of columns, are unique 

  • CHECK constraint that specifies an expression on one or more columns (either a col or table constraint) 

34
New cards

How do you add or drop constraints?

In ALTER TABLE, use ADD, DROP, or CHANGE

  • ALTER TABLE tablename

    DROP CHECK constraintname;

  • ALTER TABLE

    ADD CONSTRAINT constraintname PRIMARY KEY (column);

35
New cards

Does GROUP BY appear before or after ORDER BY

Before

Appears between WHERE and ORDER BY (if any)

36
New cards

What is the HAVING clause for?

Used with GROUP BY  to filter group results 

HAVING COUNT(*) > 1 : a having clause that selects only groups with more than one row count 

37
New cards

What is an INNER JOIN

  • INNER JOIN selects only matching left and right table rows.

  • INNER is optional.

appears between FROM and ON, followed by optional WHERE

38
New cards

What is ON used for in a join

specifies which columns to join on

39
New cards

What is a FULL JOIN

selects all left and right rows regardless of match

is not supported by mySQL

40
New cards

What is a LEFT JOIN? What is a RIGHT JOIN

LEFT JOIN selects all left table rows, but only matching right table rows.

RIGHT JOIN selects all right table rows, but only matching left table rows.  

41
New cards

Since MySQL doesn’t support FULL JOIN, what can be used instead?

UNION - combines the two results into one table

42
New cards

Whats the difference between an equijoin and a non-equijoin?

  • equijoin

    • An equijoin compares columns of two tables with the = operator.

  • non-equijoin

    • A non-equijoin compares columns with an operator other than =, such as < and >.

43
New cards

What’s a cross join?

A cross-join combines two tables without comparing columns, aka no ON clause

44
New cards

What is a materialized view?

  • A materialized view is a view for which view table data is stored at all times.

  • must be refreshed when data is changed

45
New cards

What is WITH CHECK OPTION used for?

used in CREATE VIEW, it rejects and generates error when inserts and updates do not satisfy the view query WHERE clause

46
New cards

What is the Entity-Relationship Model? What are the 3 objects included?

  • An entity-relationship model is a high-level representation of data requirements, ignoring implementation details.

  • Includes 3 kinds of objects: 

    • An entity is a person, place, product, concept, or activity. 

    • A relationship is a statement about two entities. 

    • An attribute is a descript­ive property of an entity. 


47
New cards

What are the two parts of an ER Model

  • ER Diagram (Entity Relationship Diagram) - schematic picture of entities, relationships, and attributes  

  • Glossary – also known as data dictionary or repository, documents additional details in text form  

48
New cards

What is a type? How are they usually implemented?

In entity-relationship modeling, a type is a set

  • An entity type is a set of things.  

  • A relationship type is a set of related things.  

  • An attribute type is a set of values.  

Entity, relationship, and attribute types usually become tables, foreign keys, and columns, respectively. 

49
New cards

What are instances? How are they implemented?

  • An instance is an element of a set

    • An entity instance is an individual thing.  

    • A relationship instance is a statement about entity instances.  

    • An attribute instance is an individual value.  

  • Entity, relationship, and attribute instances usually become rows, foreign key values, and column values, respectively 

50
New cards

What are the 3 phases of Database Design in relation to ER Models?

  1. Analysis develops an entity-relationship model, capturing data requirements while ignoring implementation details

  2. Logical design converts the entity-relationship model into tables, columns, and keys for a particular database system. 

    • Entities become tables, relationships become keys (sometimes tables), and attributes become columns 

  3. Physical design adds indexes and specifies how tables are organized on storage media

51
New cards

What are the first 4 steps in database design (analysis phase)

  1. discover entities, relationships, and attributes

  2. determine cardinality

  3. distinguish strong and weak entities

  4. create supertype and subtype entities

52
New cards

What are steps 5-9 of database design (logical)

  1. Implement entities

  2. Implement relationships

  3. Implement attributes

  4. Apply Normal form

53
New cards

What is cardinality?

Cardinality – refers to maxima and minima of relationships and attributes 

54
New cards

Explain cardinality syntax for relationships and attributes

  • Max (Min)

  • either in relationship line or by attribute

  • for mins - 1 is required, 0 optional

  • NOTE Unique notation appears before max ( U-Max(Min))

55
New cards

What is an identifying attribute?

An identifying attribute is unique, singular, and required. Identifying attribute values correspond one-to-one to, or identify, entity instances. 

example : ProjectNumber 1-1(1)

56
New cards

What is a strong entity?

A strong entity has one or more identifying attributes. When a strong entity is implemented as a table, one of the identifying attributes may become the primary key. 

57
New cards

What is a weak entity?

  • A weak entity does not have an identifying attribute. Instead, a weak entity usually has a relationship, called an identifying relationship, to another entity, called an identifying entity. Cardinality of the identifying entity is 1(1). 

  • A weak entity can also be identified by another (or multiple) weak entities 

58
New cards

What are Supertype and Subtype entities?

A subtype entity is a subset of another entity type, called the supertype entity 

A supertype entity identifies its subtype entities. The identifying relationship is called an IsA relationship 

59
New cards

What is Crow’s foot notation?

 crow's foot notation - depicts cardinality as a circle (zero), a short line (one), or three short lines (many) 

60
New cards

What are 3 best practices for selecting primary Keys?

  • stable: value should not change.

  • simple: easy to type and store.

  • meaningless: no descriptive information.

61
New cards

What is functional dependency?

Dependence of one column on another

  • Column A depends on Column B is denoted B->A (basically B is associated with 1 A) 

  • Each value of B relates to at most one value of A  

62
New cards

What is redundancy, when does it occur, and what design rules are used to lower redundancy?

  • Redundancy is the repetition of related values in a table. 

  • Redundancy occurs when a dependence is on a column that is not unique

  • Normal forms are rules for designing tables with less redundancy. 

63
New cards

What is first normal form (1NF)?

A table is in first normal form when each cell contains one value and the table has a primary key. This definition has two corollaries: 

  • In a first normal form table, every non-key column depends on the primary key.

  • A first normal form table has no duplicate rows.

64
New cards

What is second normal form?

A table is in second normal form when all non-key columns depend on the whole primary key. In other words, a non-key column cannot depend on part of a composite primary key or a candidate key 

65
New cards

What is third normal form?

Redundancy can occur in a second normal form table when a non-key column depends on another non-key column. Informally, a table is in third normal form when all non-key columns depend on the key, the whole key, and nothing but the key 

66
New cards

What is Boyce-Codd Normal Form?

Every attribute should be dependent on the key, the whole key, and nothing but the key 

Boyce-Codd normal form eliminates all redundancy arising from functional dependence

67
New cards

What is Normalization and what are the 3 steps to normalize tables?

  • Normalization eliminates redundancy by decomposing a table into two or more tables in higher normal form 

    1. List all unique columns. U

    2. Identify dependencies on non-unique columns. Non-unique columns are either external to all unique columns or contained within a composite unique column. 

    3. Eliminate dependencies on non-unique columns by creating new tables

68
New cards
69
New cards
70
New cards
71
New cards
72
New cards
73
New cards
74
New cards
75
New cards