Database Systems


52 Terms

1. Selection

Select rows that satisfy a condition

2. Projection

Select specific columns from a relation.

3. Union

Combine results from two relations, removing duplicates

4. Difference

Return rows in one relation but not in another

5. Cartesian Product (×)

Combine every row of one relation with every row of another

6. Rename

Rename the relation or its attributes
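The six relational-algebra operations above map directly onto SQL. A minimal sketch, using Python's built-in sqlite3 and two hypothetical tables (emp and mgr, invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE emp (emp_id INTEGER, name TEXT, dept TEXT)")
cur.execute("CREATE TABLE mgr (emp_id INTEGER, name TEXT, dept TEXT)")
cur.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                [(1, "Ann", "Sales"), (2, "Bob", "IT"), (3, "Cal", "Sales")])
cur.executemany("INSERT INTO mgr VALUES (?, ?, ?)", [(1, "Ann", "Sales")])

# Selection: rows that satisfy a condition
sales = cur.execute("SELECT * FROM emp WHERE dept = 'Sales'").fetchall()

# Projection: specific columns only
names = cur.execute("SELECT name FROM emp").fetchall()

# Union: results of two queries combined, duplicates removed
both = cur.execute("SELECT name FROM emp UNION SELECT name FROM mgr").fetchall()

# Difference: rows in emp but not in mgr (EXCEPT in SQL)
non_mgrs = cur.execute("SELECT name FROM emp EXCEPT SELECT name FROM mgr").fetchall()

# Cartesian product: every emp row paired with every mgr row
pairs = cur.execute("SELECT * FROM emp CROSS JOIN mgr").fetchall()

# Rename: AS renames a column (aliases can rename relations, too)
renamed = cur.execute("SELECT name AS employee_name FROM emp").fetchall()

print(len(sales), len(both), len(non_mgrs), len(pairs))  # 2 3 2 3
```

Note that UNION and EXCEPT remove duplicates, matching the set semantics of the algebra, whereas a plain SELECT returns a bag (multiset) of rows.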

7. Normalization

Evaluating and correcting table structures to minimize data redundancies

• Reduces data anomalies

• Assigns attributes to tables based on determination

8. Denormalization

Produces a lower normal form

• Results in increased performance and greater data redundancy

9. First normal form (1NF)

Table format, no repeating groups, and primary key identified

10. Second normal form (2NF)

1NF and no partial dependencies

11. Third normal form (3NF)

2NF and no transitive dependencies
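A small illustrative sketch of the 1NF-to-2NF step, using invented student/course data: in a relation with composite key (stu_id, course_id), the attribute course_name depends only on course_id — a partial dependency — so it is split out into its own relation.

```python
# Hypothetical 1NF relation, composite key (stu_id, course_id).
# course_name depends only on course_id: a partial dependency.
rows = [
    {"stu_id": 1, "course_id": "DB101", "course_name": "Databases", "grade": "A"},
    {"stu_id": 2, "course_id": "DB101", "course_name": "Databases", "grade": "B"},
    {"stu_id": 1, "course_id": "OS201", "course_name": "Op. Systems", "grade": "A"},
]

# 2NF decomposition: ENROLL keeps the full composite key and fully
# dependent attributes; COURSE keeps the partially dependent attribute.
enroll = [{"stu_id": r["stu_id"], "course_id": r["course_id"], "grade": r["grade"]}
          for r in rows]
course = {r["course_id"]: r["course_name"] for r in rows}  # duplicates collapse

print(course)  # each course name is now stored exactly once
```

After the split, a course can be renamed in one place instead of in every enrollment row, which is exactly the update anomaly normalization is meant to remove.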

12. Partial dependency

Functional dependence in which the determinant is only part of the primary key

• Assumption: one candidate key

• Straightforward and easy to identify

13. Transitive dependency

An attribute is dependent on another attribute that is not part of the primary key

• More difficult to identify among a set of data

• Occurs only when a functional dependence exists among nonprime attributes

14. Functional dependence

The attribute B is fully functionally dependent on the attribute A if each value of A determines one and only one value of B.

Example: PROJ_NUM → PROJ_NAME (read as "PROJ_NUM functionally determines PROJ_NAME")

In this case, the attribute PROJ_NUM is known as the determinant attribute, and the attribute PROJ_NAME is known as the dependent attribute.

15. Functional dependence (generalized definition)

Attribute A determines attribute B (that is, B is functionally dependent on A) if all of the rows in the table that agree in value for attribute A also agree in value for attribute B.
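The generalized definition translates directly into a check over the rows of a table. A sketch, with an invented helper fd_holds and invented project rows, following the PROJ_NUM → PROJ_NAME example:

```python
def fd_holds(rows, a, b):
    """A → B holds iff rows that agree on attribute a also agree on b."""
    seen = {}
    for row in rows:
        if row[a] in seen and seen[row[a]] != row[b]:
            return False  # two rows agree on a but disagree on b
        seen[row[a]] = row[b]
    return True

projects = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen"},
    {"PROJ_NUM": 18, "PROJ_NAME": "Amber Wave"},
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen"},
]

print(fd_holds(projects, "PROJ_NUM", "PROJ_NAME"))  # True
# The reverse need not hold, and a conflicting row breaks the dependency:
print(fd_holds([{"A": 1, "B": 2}, {"A": 1, "B": 3}], "A", "B"))  # False
```

Strictly, a functional dependency is a property of the schema (all legal states), not of one sample of rows, so a check like this can refute A → B but never prove it.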

16. Functional dependence (composite key)

If attribute B is functionally dependent on a composite key A but not on any subset of that composite key, then attribute B is fully functionally dependent on A.

17. Relational model

• store related data in tables

• require a schema which defines tables prior to use

• encourage normalization to reduce data redundancy

• support table JOINs to retrieve related data from multiple tables in a single command

• implement data integrity rules

• provide transactions to guarantee two or more updates succeed or fail as an atomic unit

• can be scaled (with some effort)

• use a powerful declarative language for querying

• offer plenty of support, expertise and tools.
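The atomic-unit guarantee in the list above can be shown concretely. A minimal sketch with sqlite3 and a hypothetical account table: a transfer touches two rows, and a failure between the two updates rolls both back.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
con.commit()

def transfer(con, amount, fail=False):
    """Move `amount` from account 1 to account 2 as one atomic unit."""
    try:
        con.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        if fail:
            raise RuntimeError("simulated crash between the two updates")
        con.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))
        con.commit()  # both updates become visible together
    except RuntimeError:
        con.rollback()  # the debit is undone too: no half-finished transfer

transfer(con, 50, fail=True)
print(dict(con.execute("SELECT id, balance FROM account")))  # {1: 100, 2: 0}

transfer(con, 50)
print(dict(con.execute("SELECT id, balance FROM account")))  # {1: 50, 2: 50}
```

sqlite3 opens an implicit transaction before the first UPDATE, which is why rollback() can undo the debit even though it had already executed.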

18. Document model

• store related data in JSON-like, name-value documents

• can store data without specifying a schema

• must usually be denormalized so information about an item is contained in a single document

• should not require JOINs (presuming denormalized documents are used)

• permit any data to be saved anywhere at any time without verification

• guarantee updates to a single document, but not multiple documents

• provide excellent performance and scalability

• use JSON data objects for querying

• are a newer, exciting technology.
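The denormalized, JSON-like shape described above can be sketched with plain Python dicts standing in for a document store (this is not any particular product's API; the order document is invented):

```python
import json

# One self-contained document: everything about the order lives here,
# so reading it requires no JOIN across customer/order/line-item tables.
order_doc = {
    "order_id": 1001,
    "customer": {"name": "Ann", "city": "Oslo"},
    "items": [
        {"sku": "A1", "qty": 2, "price": 9.5},
        {"sku": "B7", "qty": 1, "price": 20.0},
    ],
}

# A single "document read" answers a question that would span three
# tables in a normalized relational schema.
total = sum(i["qty"] * i["price"] for i in order_doc["items"])
print(json.dumps({"order_id": order_doc["order_id"], "total": total}))
```

The trade-off is the one the card lists: if Ann's city changes, every document embedding her data must be updated, whereas a normalized relational design would change one row.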

19. Conceptual Design Steps

1 Data analysis and requirements

2 Entity relationship modeling and normalization

3 Data model verification

4 Distributed database design

20. Conceptual data model

Describes the main data entities, attributes, relationships, and constraints

- Designed to be software- and hardware-independent

21. Minimum data rule

All that is needed is there, and all that is there is needed

22. Conceptual Data Model

  • highly abstract

  • easily understood

  • easily enhanced

  • only “entities” visible

  • abstract relationships

  • no software tool is required to define a conceptual data model

23. Data analysis and requirements

• Designers' efforts are focused on:

- Information needs, users, sources, and constitution

• Answers obtained from a variety of sources:

- Developing and gathering end-user data views

- Directly observing the current system: existing and desired output

- Interfacing with the systems design group

24. Entity relationship modeling and normalization

All objects (entities, attributes, relations, views, and so on) are defined in a data dictionary, which is used in tandem with the normalization process.

25. Developing the Conceptual Model Using ER Diagrams

1 Identify, analyze, and refine the business rules

2 Identify the main entities, using the results of Step 1

3 Define the relationships among the entities, using the results of Steps 1 and 2

4 Define the attributes, primary keys, and foreign keys for each of the entities

5 Normalize the entities (remember that entities are implemented as tables in an RDBMS)

6 Complete the initial ER diagram

7 Validate the ER model against the end users’ information and processing requirements

8 Modify the ER model, using the results of Step 7

26. Data model verification

• Verified against proposed system processes

• Run through a series of tests

27. Module

Information system component that handles a specific business function

28. Cohesivity

strength of the relationships among the module’s entities

29. Module coupling

Extent to which modules are independent of one another

  • low coupling decreases unnecessary intermodule dependencies

30. The ER Model Verification Process

1 Identify the ER model’s central entity

2 Identify each module and its components

3 Identify each module’s transaction requirements:

• Internal: updates/inserts/deletes/queries/reports

• External: module interfaces

4 Verify all processes against the module's processing and reporting requirements

5 Make all necessary changes suggested in Step 4

6 Repeat Steps 2–5 for all modules

31. Distributed database design

• Portions of database may reside in different physical locations

• Database fragment: subset of a database stored at a given location

• Ensures database integrity, security, and performance

32. Validates logical model

• Using normalization

• Integrity constraints

• Against user requirements

33. Logical Design Steps

1 Map the conceptual model to logical model components

2 Validate the logical model using normalization

3 Validate the logical model integrity constraints

4 Validate the logical model against user requirements

34. Logical Data Model

  • presence of attributes for each entity

  • key attributes

  • non key attributes

  • primary key - foreign key relationships

  • user-friendly attribute names

  • more detailed than conceptual model

  • database agnostic

  • a bit more effort required to enhance, in comparison to the conceptual model

  • data modeling tools like ERWin or PowerDesigner can be used to create Logical Data Models. This can be automatically converted to a Physical Data Model with the help of these tools.

35. Physical Data Model

  • entities referred to as Tables

  • attributes referred to as Columns

  • Database compatible table names

  • Database compatible column names

  • Database specific data types

  • Difficult for users to understand

  • Significantly more effort required to enhance in comparison to Logical Model

  • Will include indexes, constraints, triggers & other DB objects

  • Difficult to port to a different database, once design is finalized

  • Tools like ERWin and PowerDesigner can help in automatically porting over the Logical Data Model to Physical Data Models of different versions.

36. Mapping the Conceptual Model to the Relational Model

1 Map strong entities

2 Map supertype/subtype relationships

3 Map weak entities

4 Map binary relationships

5 Map higher-degree relationships

37. Physical design

Process of defining the data storage organization and data access characteristics of the database; ensures integrity, security, and performance

• Define data storage organization

• Define integrity and security measures

• Determine performance measures

38. Top-down design

starts by identifying the data sets and then defines the data elements for each of those sets

• Involves the identification of different entity types and the definition of each entity’s attributes

39. Bottom-up design

first identifies the data elements (items) and then groups them together in data sets

• First defines attributes, and then groups them to form entities

40. Centralized design

process by which all database design decisions are carried out centrally by a small group of people

• Suitable in a top-down design approach when the problem domain is relatively small, as in a single unit or department in an organization

41. Decentralized design

Process in which the conceptual design models subsets of an organization's database requirements, which are then aggregated into a complete design

• Such modular designs are typical of complex systems with a relatively large number of objects and procedures

42. Distributed database management system (DDBMS)

– Governs storage and processing of logically related data

– Interconnected computer systems

– Both data and processing functions are distributed among several sites

• A centralized database, by contrast, requires that corporate data be stored in a single central site

43. DDBMS Advantages and Disadvantages

Advantages:

– Data are located near “greatest demand” site

– Faster data access

– Faster data processing

– Growth facilitation

– Improved communications

– Reduced operating costs

– User-friendly interface

– Less danger of a single-point failure

– Processor independence

Disadvantages:

– Complexity of management and control

– Security

– Lack of standards

– Increased storage requirements

– Increased training cost

44. Distributed processing

– Database’s logical processing is shared among two or more physically independent sites

– Connected through a network

45. Distributed Database

– Stores a logically related database over two or more physically independent sites

– Database composed of database fragments

46. Characteristics of Distributed Database Management Systems

• Application interface

• Validation

• Transformation

• Query optimization

• Mapping

• I/O interface

• Formatting

• Security

• Backup and recovery

• DB administration

• Concurrency control

• Transaction management

• Must perform all the functions of centralized DBMS

• Must handle all necessary functions imposed by distribution of data and processing

– Must perform these additional functions transparently to the end user

47. DDBMS Components

• Must include (at least) the following components:

– Computer workstations

– Network hardware and software

– Communications media

– Transaction processor (application processor, transaction manager)

• Software component found in each computer that requests data

– Data processor or data manager

• Software component residing on each computer that stores and retrieves data located at the site

48. Levels of Data and Process Distribution

Current systems are classified by how they support process distribution and data distribution

49. Single-Site Processing, Single-Site Data (SPSD)

• All processing is done on single CPU or host computer (mainframe, midrange, or PC)

• All data are stored on host computer’s local disk

• Processing cannot be done on end user’s side of system

• Typical of most mainframe and midrange computer DBMSs

• DBMS is located on host computer, which is accessed by dumb terminals connected to it

50. Multiple-Site Processing, Single-Site Data (MPSD)

• Multiple processes run on different computers sharing single data repository

• MPSD scenario requires network file server running conventional applications

– Accessed through LAN

• Typical of many multiuser accounting applications running under a personal computer network

51. Multiple-Site Processing, Multiple-Site Data (MPMD)

• Fully distributed database management system

• Support for multiple data processors and transaction processors at multiple sites

• Classified as either homogeneous or heterogeneous

• Homogeneous DDBMSs

– Integrate only one type of centralized DBMS over a network

• Heterogeneous DDBMSs

– Integrate different types of centralized DBMSs over a network

• Fully heterogeneous DDBMSs

– Support different DBMSs

– Support different data models (relational, hierarchical, or network)

– Different computer systems, such as mainframes and microcomputers

52. Distributed Database Transparency Features

• Allow end user to feel like database’s only user

• Features include:

– Distribution transparency

– Transaction transparency

– Failure transparency

– Performance transparency

– Heterogeneity transparency

• Allows management of a physically dispersed database as if it were centralized

• Three levels of distribution transparency:

– Fragmentation transparency

– Location transparency

– Local mapping transparency

• Ensures database transactions will maintain distributed database’s integrity and consistency

• Ensures transaction completed only when all database sites involved complete their part

• Distributed database systems require complex mechanisms to manage transactions

– To ensure consistency and integrity