Selection
Select rows that satisfy a condition
Projection
Select specific columns from a relation.
Union
Combine results from two relations, removing duplicates
Difference
Return rows in one relation but not in another
Cartesian Product (×)
Combine every row of one relation with
every row of another
Rename
rename the relation or attributes
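The six operators above can be sketched in plain Python over lists of dictionaries, with each dictionary standing in for a row; the EMPLOYEE relation and its attributes below are invented purely for illustration.

def select(relation, predicate):
    # Selection: keep the rows that satisfy the condition.
    return [row for row in relation if predicate(row)]

def project(relation, columns):
    # Projection: keep only the named columns, removing duplicate rows.
    out = []
    for row in relation:
        r = {c: row[c] for c in columns}
        if r not in out:
            out.append(r)
    return out

def union(r, s):
    # Union: every row from both relations, duplicates removed.
    return r + [row for row in s if row not in r]

def difference(r, s):
    # Difference: rows in r that are not in s.
    return [row for row in r if row not in s]

def product(r, s):
    # Cartesian product: every row of r combined with every row of s.
    return [{**a, **b} for a in r for b in s]

def rename(relation, mapping):
    # Rename: change attribute names according to {old_name: new_name}.
    return [{mapping.get(k, k): v for k, v in row.items()} for row in relation]

EMPLOYEE = [{"EMP_NUM": 101, "DEPT": "Sales"}, {"EMP_NUM": 102, "DEPT": "IT"}]
print(select(EMPLOYEE, lambda row: row["DEPT"] == "Sales"))   # selection
print(project(EMPLOYEE, ["DEPT"]))                            # projection
print(rename(EMPLOYEE, {"EMP_NUM": "EMPLOYEE_NUM"}))          # rename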
Normalization
evaluating and correcting table structures to minimize data
redundancies
• Reduces data anomalies
• Assigns attributes to tables based on determination
Denormalization
produces a lower normal form
• Results in increased performance and greater data redundancy
first normal form (1NF)
table format, no repeating groups, and PK identified
second normal form (2NF)
1NF and no partial dependencies
third normal form (3NF)
2NF and no transitive dependencies
Partial dependency
functional dependence in which the determinant is only
part of the primary key
• Assumption: one candidate key
• Straightforward
• Easy to identify
transitive dependency
attribute is dependent on another attribute that is not
part of the primary key
• More difficult to identify among a set of data
• Occurs only when a functional dependence exists among nonprime attributes
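A small hypothetical example (not part of the original cards) ties these definitions together: in STUDENT_CLASS (STU_ID, CLASS_CODE, STU_NAME, GRADE) with primary key (STU_ID, CLASS_CODE), the dependency STU_ID → STU_NAME is partial because its determinant is only part of the primary key, so the table is not in 2NF; moving STU_NAME into a separate STUDENT (STU_ID, STU_NAME) table removes it. If STUDENT also held STU_ZIP and STU_CITY with STU_ZIP → STU_CITY, that would be a transitive dependency between nonprime attributes, violating 3NF and calling for a further (STU_ZIP, STU_CITY) table.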
functional dependence
The attribute B is fully functionally dependent on the attribute A
if each value of A determines one and only one value of B.
Example: PROJ_NUM → PROJ_NAME (read as PROJ_NUM functionally determines PROJ_NAME)
In this case, the attribute PROJ_NUM is known as the
determinant attribute, and the attribute PROJ_NAME is known
as the dependent attribute
functional dependence (generalized definition)
Attribute A determines attribute B (that is, B is functionally
dependent on A) if all of the rows in the table that agree in value
for attribute A also agree in value for attribute B.
functional dependence (composite key)
If attribute B is functionally dependent on a composite key A but
not on any subset of that composite key, the attribute B is fully
functionally dependent on A.
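As a minimal sketch of the generalized definition, the function below (with hypothetical sample rows) reports whether A → B holds in a set of rows: it fails as soon as two rows agree on A but disagree on B.

def fd_holds(rows, a, b):
    # A -> B holds if rows that agree on attribute a also agree on attribute b.
    seen = {}
    for row in rows:
        if row[a] in seen and seen[row[a]] != row[b]:
            return False
        seen[row[a]] = row[b]
    return True

PROJECT = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen",  "EMP_NUM": 103},
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen",  "EMP_NUM": 101},
    {"PROJ_NUM": 18, "PROJ_NAME": "Amber Wave", "EMP_NUM": 101},
]
print(fd_holds(PROJECT, "PROJ_NUM", "PROJ_NAME"))  # True: PROJ_NUM -> PROJ_NAME
print(fd_holds(PROJECT, "EMP_NUM", "PROJ_NUM"))    # False: EMP_NUM appears with two PROJ_NUM values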
relational model
• store related data in tables
• require a schema which defines tables prior
to use
• encourage normalization to reduce data
redundancy
• support table JOINs to retrieve related data
from multiple tables in a single command
• implement data integrity rules
• provide transactions to guarantee two or
more updates succeed or fail as an atomic
unit
• can be scaled (with some effort)
• use a powerful declarative language for
querying
• offer plenty of support, expertise and tools.
document model
• store related data in JSON-like, name-value
documents
• can store data without specifying a schema
• must usually be denormalized so
information about an item is contained in a
single document
• should not require JOINs (presuming
denormalized documents are used)
• permit any data to be saved anywhere at any time without verification
• guarantee updates to a single document —
but not multiple documents
• provide excellent performance and
scalability
• use JSON data objects for querying
• are a newer, exciting technology.
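A small hypothetical sketch (Python/JSON, names invented for illustration) of the same order stored both ways makes the contrast concrete:

import json

# Relational model: normalized rows in separate tables, related by keys;
# reading the full order back requires JOINing the three tables.
customers   = [{"cust_id": 1, "name": "Ada"}]
orders      = [{"order_id": 10, "cust_id": 1}]
order_lines = [{"order_id": 10, "product": "Widget", "qty": 2}]

# Document model: one denormalized, JSON-like document holds everything
# about the order, so it can be read back without a JOIN.
order_doc = {
    "order_id": 10,
    "customer": {"name": "Ada"},
    "lines": [{"product": "Widget", "qty": 2}],
}
print(json.dumps(order_doc, indent=2))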
Conceptual Design Steps
1 Data analysis and requirements
2 Entity relationship modeling and normalization
3 Data model verification
4 Distributed database design
Conceptual data model:
describes main data entities, attributes, relationships, and constraints
- Designed to be software- and hardware-independent
Minimum data rule
All that is needed is there, and all that is there is needed
Conceptual Data Model
highly abstract
easily understood
easily enhanced
only “entities” visible
abstract relationships
no software tool is required to define a conceptual data model
Data analysis and requirements
• Designer's efforts are focused on:
- Information needs, users, sources, and constitution
• Answers obtained from a variety of sources
- Developing and gathering end-user data views
- Directly observing current system: existing and desired output
- Interfacing with the systems design group
Entity relationship modeling and normalization
All objects (entities, attributes, relations, views, and so on) are defined in a data
dictionary, which is used in tandem with the normalization process
Developing the Conceptual Model Using ER Diagrams
1 Identify, analyze, and refine the business rules
2 Identify the main entities, using the results of Step 1
3 Define the relationships among the entities, using the results of Steps 1 and 2
4 Define the attributes, primary keys, and foreign keys for each of the entities
5 Normalize the entities (remember that entities are implemented as tables in an RDBMS)
6 Complete the initial ER diagram
7 Validate the ER model against the end users’ information and processing requirements
8 Modify the ER model, using the results of Step 7
Data model verification
• Verified against proposed system processes
• Run through a series of tests
Module
information system component that handles specific business function
cohesivity
strength of the relationships among the module’s entities
module coupling
strength of the relationships between modules
low coupling decreases unnecessary intermodule dependencies
The ER Model verification Process
1 Identify the ER model’s central entity
2 Identify each module and its components
3 Identify each module’s transaction requirements:
• Internal: updates/inserts/deletes/queries/reports
• External: module interfaces
4 Verify all processes against the module’s processing and reporting
requirements
5 Make all necessary changes suggested in Step 4
6 Repeat Steps 2–5 for all modules
Distributed Database design
• Portions of database may reside in different physical locations
• Database fragment: subset of a database stored at a given location
• Ensures database integrity, security, and performance
Validates logical model
• Using normalization
• Integrity constraints
• Against user requirements
Logical Design Steps
1 Map the conceptual model to logical model components
2 Validate the logical model using normalization
3 Validate the logical model integrity constraints
4 Validate the logical model against user requirements
Logical Data Model
presence of attributes for each entity
key attributes
non key attributes
primary key - foreign key relationships
user friendly attribute names
more detailed than conceptual model
database agnostic
A bit more effort is required to enhance, in comparison to the conceptual model
Data modeling tools like ERWin or PowerDesigner can be used to create Logical Data Models, which these tools can then convert automatically into a Physical Data Model.
Physical Data Model
entities referred to as Tables
attributes referred to as Columns
Database compatible table names
Database compatible columns names
Database specific data types
Difficult for users to understand
Significantly more effort required to enhance in comparison to Logical Model
Will include indexes, constraints, triggers & other DB objects
Difficult to port to a different database, once design is finalized
Tools like ERWin and PowerDesigner can help in automatically porting over the Logical Data Model to Physical Data Models of different versions.
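A hypothetical side-by-side sketch (entity, column names, and data types invented for illustration) of the same entity at the two levels:

# Logical model: user-friendly attribute names, keys identified, no
# database-specific data types.
logical_entity = {
    "entity": "Customer",
    "attributes": ["Customer Number (PK)", "Customer Name", "Credit Limit"],
}

# Physical model: database-compatible names, database-specific data types,
# plus physical-only objects such as indexes.
physical_table = {
    "table": "CUSTOMER",
    "columns": {
        "CUS_NUM": "INTEGER PRIMARY KEY",
        "CUS_NAME": "VARCHAR(35)",
        "CUS_CREDIT": "DECIMAL(9,2)",
    },
    "indexes": ["CUSTOMER_CUS_NAME_IX"],
}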
Mapping the Conceptual Model to the Relational Model
1 Map strong entities
2 Map supertype/subtype relationships
3 Map weak entities
4 Map binary relationships
5 Map higher-degree relationships
Physical design
Process of determining the data storage organization and data access characteristics of the
database; ensures integrity, security, and performance
• Define data storage organization
• Define integrity and security measures
• Determine performance measures
Top-down design
starts by identifying the data sets and then defines the data elements for each of those sets
• Involves the identification of different entity types and the definition of each entity’s attributes
Bottom-up design
first identifies the data elements (items) and then groups them together in data sets
• First defines attributes, and then groups them to form entities
Centralized design
process by which all database design decisions are carried out centrally by a small group of people
• Suitable in a top-down design approach when the problem domain is relatively small, as in a single unit or department in an organization
Decentralized design
process in which conceptual design models subsets of an
organization’s database requirements, which are then aggregated into a complete design
• Such modular designs are typical of complex systems with a relatively large number of objects and procedures
Distributed database management system (DDBMS)
– Governs storage and processing of logically related data
– Interconnected computer systems
– Both data and processing functions are distributed among several sites
• Centralized database required that corporate data be stored in a single central site
DDBMS Advantages and Disadvantages
Advantages:
– Data are located near “greatest demand” site
– Faster data access
– Faster data processing
– Growth facilitation
– Improved communications
– Reduced operating costs
– User-friendly interface
– Less danger of a single-point failure
– Processor independence
Disadvantages:
– Complexity of management and control
– Security
– Lack of standards
– Increased storage requirements
– Increased training cost
Distributed processing
– Database’s logical processing is shared among two or more physically independent sites
– Connected through a network
Distributed Database
– Stores logically related database over two or more physically independent sites
– Database composed of database fragments
Characteristics of Distributed Management Systems
• Application interface
• Validation
• Transformation
• Query optimization
• Mapping
• I/O interface
• Formatting
• Security
• Backup and recovery
• DB administration
• Concurrency control
• Transaction management
• Must perform all the functions of centralized DBMS
• Must handle all necessary functions imposed by distribution of data and processing
– Must perform these additional functions transparently to the end user
DDBMS Components
• Must include (at least) the following
components:
– Computer workstations
– Network hardware and software
– Communications media
– Transaction processor (application
processor, transaction manager)
• Software component found in each computer that
requests data
– Data processor or data manager
• Software component residing on each computer that stores and retrieves data located at the site
• May be a centralized DBMS
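A toy sketch of the two components (site names and data are hypothetical): the transaction processor receives the request on the user's computer and forwards it to the data processor at the site that stores the data.

class DataProcessor:
    # Runs at each site that stores data; retrieves rows held locally.
    def __init__(self, site, rows):
        self.site, self.rows = site, rows

    def fetch(self, predicate):
        return [r for r in self.rows if predicate(r)]

class TransactionProcessor:
    # Runs on each computer that requests data; routes requests to sites.
    def __init__(self, data_processors):
        self.data_processors = data_processors  # site name -> DataProcessor

    def query(self, site, predicate):
        return self.data_processors[site].fetch(predicate)

tp = TransactionProcessor({
    "NEW_YORK": DataProcessor("NEW_YORK", [{"cust": "Ada", "balance": 100}]),
    "ATLANTA":  DataProcessor("ATLANTA",  [{"cust": "Bob", "balance": 250}]),
})
print(tp.query("ATLANTA", lambda r: r["balance"] > 200))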
Levels of Data and Process Distribution
Current systems classified by how process distribution and data distribution supported
Single-Site Processing, Single-Site Data (SPSD)
• All processing is done on single CPU or host computer (mainframe, midrange, or PC)
• All data are stored on host computer’s local disk
• Processing cannot be done on end user’s side of system
• Typical of most mainframe and midrange computer DBMSs
• DBMS is located on host computer, which is accessed by dumb terminals connected to it
Multiple-Site Processing, Single-Site Data (MPSD)
• Multiple processes run on different computers sharing single data repository
• MPSD scenario requires network file server running conventional applications
– Accessed through LAN
• Many multiuser accounting applications, running under personal computer network
Multiple-Site Processing, Multiple-Site Data (MPMD)
• Fully distributed database management system
• Support for multiple data processors and transaction processors at multiple sites
• Classified as either homogeneous or heterogeneous
• Homogeneous DDBMSs
– Integrate only one type of centralized DBMS over a network
• Heterogeneous DDBMSs
– Integrate different types of centralized DBMSs over a network
• Fully heterogeneous DDBMSs
– Support different DBMSs
– Support different data models (relational, hierarchical, or network)
– Different computer systems, such as mainframes and microcomputers
Distributed Database Transparency Features
• Allow end user to feel like database’s only user
• Features include:
– Distribution transparency
– Transaction transparency
– Failure transparency
– Performance transparency
– Heterogeneity transparency
Distribution transparency
• Allows management of physically dispersed database as if centralized
• Three levels of distribution transparency:
– Fragmentation transparency
– Location transparency
– Local mapping transparency
Transaction transparency
• Ensures database transactions will maintain distributed database's integrity and consistency
• Ensures transaction completed only when all database sites involved complete their part
• Distributed database systems require complex mechanisms to manage transactions
– To ensure consistency and integrity