Chapter 2_Data Modelling and Design (Part 3)


59 Terms

1

Diagram,

Definitions,

Issues and outstanding questions,

Lineage

Plan for Data Modelling (4)

2

Diagram

  • A data model contains one or more _____.

  • It is the visual that captures the requirements in a precise form.

3

Definitions

_____ for entities, attributes, and relationships are essential to maintaining the precision of a data model.

4

Issues and outstanding questions

Frequently the data modeling process raises issues and questions that may not be addressed during the data modeling phase. ​

5

Lineage

For physical and sometimes logical data models, it is important to know the data _____, that is, where the data comes from.​

6

data modeler,

accuracy

Two reasons why lineage is important to capture during data modeling:

  1. the _____ will obtain a very strong understanding of the data requirements and therefore is in the best position to determine the source attributes.

  2. determining the source attributes can be an effective tool to validate the _____ of the model and the mapping (i.e., a reality check)

7

analysis and modeling work

  • To build the models, modelers often rely heavily on previous ___.

  • They may study existing data models and databases, refer to published standards, and incorporate any data requirements. 

8

Forward engineering

is the process of building a new application beginning with the requirements

9

CDM

is completed first to understand the scope of the initiative and the key terminology within that scope

10

Conceptual Data Model

CDM stands for _____

11

LDM

Then the ___ is completed to document the business solution

12

Logical Data Model

LDM stands for _____

13

PDM

followed by the _____ to document the technical solution

14

Select Scheme,

Select Notation,

Complete Initial CDM,

Incorporate Enterprise Terminology,

Obtain Sign-off

Creating the CDM involves the following steps:​ (5)

15

Complete Initial CDM

  • Collect the highest-level concepts (nouns) that exist for the organization.

  • Then collect the activities (verbs) that connect these concepts.

  • Relationships can go both ways, or involve more than two concepts

16

Analyze Information Requirements,

Analyze Existing Documentation​,

Add Associative Entities,

Add Attributes​,

Assign Domains,

Assign Keys ​

Logical Modelling consists of the following (6)

17

Analyze Information Requirements

  • Business processes require information products that are themselves the output from other business processes.

  • The names of these information products often identify an essential business vocabulary that serves as the basis for data modeling.

18

Analyze Existing Documentation

It can often be a great jump-start to use pre-existing data artifacts, including already built data models and databases

19

Associative entities

  • are used to describe Many-to-Many (or Many-to-Many-to-Many, etc.) relationships.

  • They take the identifying attributes from the entities involved in the relationship and put them into a new entity that just describes the relationship between the entities.
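The many-to-many resolution described above can be sketched in SQLite with a hypothetical pair of entities, Student and Course (not from the source), and an associative Enrollment entity whose key combines their identifying attributes:

```python
import sqlite3

# Hypothetical Student/Course example showing how an associative entity
# resolves a many-to-many relationship.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Course  (course_id  INTEGER PRIMARY KEY, title TEXT);

-- Associative entity: its key is built from the identifying attributes
-- of the two entities it relates.
CREATE TABLE Enrollment (
    student_id INTEGER REFERENCES Student(student_id),
    course_id  INTEGER REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")
conn.execute("INSERT INTO Student VALUES (1, 'Ana'), (2, 'Ben')")
conn.execute("INSERT INTO Course VALUES (10, 'Data Modelling'), (20, 'SQL')")
# One student takes many courses; one course has many students.
conn.executemany("INSERT INTO Enrollment VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 10)])
rows = conn.execute("""
    SELECT s.name, c.title FROM Enrollment e
    JOIN Student s USING (student_id)
    JOIN Course  c USING (course_id)
    ORDER BY s.name, c.title
""").fetchall()
print(rows)
```

Each Enrollment row records one pairing, so a student can relate to many courses and a course to many students without duplicating either entity's data.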

20

Add Attributes

  • An attribute in a logical data model should be atomic.

  • It should contain one and only one piece of data (fact) that cannot be divided into smaller pieces.

21

atomic

  • An attribute in a logical data model should be _____.

    • It should contain one and only one piece of data (fact) that cannot be divided into smaller pieces.
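As a sketch of the atomicity rule, assuming a hypothetical Customer entity (not from the source): modelling first and last name as two attributes keeps each attribute a single indivisible fact, so queries never need to split strings apart:

```python
import sqlite3

# Hypothetical illustration: a single 'name' attribute would not be atomic
# if the business needs first and last name separately; two attributes keep
# each one a single fact.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Customer (
    customer_id INTEGER PRIMARY KEY,
    first_name  TEXT NOT NULL,   -- one fact
    last_name   TEXT NOT NULL    -- one fact
)""")
conn.execute("INSERT INTO Customer VALUES (1, 'Grace', 'Hopper')")
# Sorting or formatting by last name needs no string surgery,
# because the attribute is already atomic.
row = conn.execute(
    "SELECT last_name || ', ' || first_name FROM Customer").fetchone()
print(row[0])
```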

22

Assign Domains

Allow for consistency in format and value sets within and across projects.

23

key attribute

helps identify one unique entity instance from all others, either fully (by itself) or partially (in combination with other key elements)

24

Non-key attributes

describe the entity instance but do not help uniquely identify it.

25

Subtype absorption,

Supertype partition

Logical abstraction entities (supertypes and subtypes) become separate objects in the physical database design using one of two methods.

26

Add Attribute Details

Add details to the physical model, such as the

  • technical name of each table and column (relational databases),

  • or file and field (non-relational databases),

  • or schema and element (XML databases).

27

Add Reference Data Objects

Small reference data value sets in the logical data model can be implemented in a physical model in three common ways:

  • Create a matching separate code table​

  • Create a master shared code table​

  • Embed rules or valid codes into the appropriate object’s definition
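Two of the three options above (a matching separate code table, and valid codes embedded in the object's definition) can be sketched in SQLite using hypothetical order-status reference data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Option 1: a matching separate code table
CREATE TABLE OrderStatusCode (
    status_code TEXT PRIMARY KEY,
    description TEXT NOT NULL
);
INSERT INTO OrderStatusCode VALUES ('O','Open'),('S','Shipped'),('C','Closed');

-- Option 3: valid codes embedded in the object's definition via CHECK
CREATE TABLE CustomerOrder (
    order_id    INTEGER PRIMARY KEY,
    status_code TEXT NOT NULL
        REFERENCES OrderStatusCode(status_code)
        CHECK (status_code IN ('O','S','C'))
);
""")
conn.execute("INSERT INTO CustomerOrder VALUES (1, 'O')")
try:
    conn.execute("INSERT INTO CustomerOrder VALUES (2, 'X')")  # invalid code
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The code table keeps descriptions queryable and extensible; the embedded CHECK is simpler but requires a schema change whenever the value set changes.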

28

Assign Surrogate Keys

  • Assign unique key values that are not visible to the business and have no meaning or relationship with the data with which they are matched.

  • This is an optional step and depends primarily on whether the natural key is large and composite and whether its attributes are assigned values that could change over time.
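A minimal SQLite sketch of a surrogate key, assuming a hypothetical Employee entity (not from the source) whose natural key is large and composite: the surrogate is a meaningless integer the business never sees, while the natural key is retained as a unique constraint for integrity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Employee (
    employee_sk  INTEGER PRIMARY KEY,  -- surrogate: no business meaning
    natl_id      TEXT NOT NULL,
    employer_tax TEXT NOT NULL,
    hire_date    TEXT NOT NULL,
    name         TEXT,
    UNIQUE (natl_id, employer_tax, hire_date)  -- natural key kept for integrity
)""")
# The surrogate is not supplied by the caller; SQLite assigns it.
conn.execute("INSERT INTO Employee (natl_id, employer_tax, hire_date, name) "
             "VALUES ('AB123', 'T-99', '2024-01-15', 'Kai')")
sk = conn.execute("SELECT employee_sk FROM Employee").fetchone()[0]
print(sk)
```

Foreign keys elsewhere can then reference the compact, stable surrogate instead of repeating the three-column natural key.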

29

Denormalizing or adding redundancy

can improve performance so much that it outweighs the cost of the duplicate storage and synchronization processing.

30

index

is an alternate path for accessing data in the database to optimize query (data retrieval) performance
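As an illustration with a hypothetical Sale table (not from the source): creating an index gives the optimizer an alternate access path, which SQLite's EXPLAIN QUERY PLAN makes visible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sale (sale_id INTEGER PRIMARY KEY, "
             "region TEXT, amount REAL)")
conn.executemany("INSERT INTO Sale (region, amount) VALUES (?, ?)",
                 [("north" if i % 2 else "south", i * 1.0)
                  for i in range(1000)])
# Without the index, the region filter would scan the whole table.
conn.execute("CREATE INDEX idx_sale_region ON Sale(region)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM Sale WHERE region = 'north'"
).fetchall()
print(plan)  # the plan now searches via idx_sale_region
```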

31

Partition for Performance

Great consideration must be given to the partitioning strategy of the overall data model (dimensional), especially when facts contain many optional dimensional keys (sparse).

32

Views

can be used to control access to certain data elements or to embed common join conditions or filters to standardize common objects or queries.
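A small SQLite sketch with a hypothetical Account table (not from the source): the view embeds a common filter so every consumer sees only open accounts, and it also restricts which columns are exposed:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Account (acct_id INTEGER PRIMARY KEY, owner TEXT, balance REAL,
                      is_closed INTEGER DEFAULT 0);
INSERT INTO Account VALUES (1,'Ana',100.0,0),(2,'Ben',50.0,1),(3,'Cy',75.0,0);

-- The view standardizes the common filter; consumers never repeat it.
CREATE VIEW OpenAccount AS
    SELECT acct_id, owner, balance FROM Account WHERE is_closed = 0;
""")
rows = conn.execute("SELECT owner FROM OpenAccount ORDER BY owner").fetchall()
print([r[0] for r in rows])  # closed accounts are hidden from consumers
```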

33

Reverse engineering

is the process of documenting an existing database.

34

PDM

is completed first to understand the technical design of an existing system

35

LDM

to document the business solution that the existing system meets

36

CDM

to document the scope and key terminology within the existing system.

37

Review the Data Models

  • Models require quality control.

  • Continuous improvement practices should be employed.

  • Techniques such as time-to-value, support costs, and data model quality validators such as the Data Model Scorecard® (Hoberman, 2009) can all be used to evaluate the model for correctness, completeness, and consistency.

38

Maintain the Data Models

  • Updates to the data model need to be made when requirements change and frequently when business processes change.

  • Within a specific project, often when one model level needs to change, a corresponding higher level of model needs to change.

  • Many data modeling tools help automate this process of comparing physical with logical.

39

Data modeling tools

are software that automate many of the tasks the data modeler performs.

40

Entry-level data modeling tools

provide basic drawing functionality, including a data modeling palette, so that the user can easily create entities and relationships.

41

database data definition language (DDL).

More sophisticated data modeling tools support forward engineering from conceptual to logical to physical to database structures, allowing the generation of _____

42

Lineage Tools

  • are software tools that allow the capture and maintenance of the source structures for each attribute on the data model.

  • These tools enable impact analysis; that is, one can use them to see if a change in one system or part of a system has effects in another system.

43

Data Profiling Tool

can help explore the data content, validate it against existing metadata, and identify data quality gaps/deficiencies, as well as deficiencies in existing data artifacts, such as logical and physical models, DDL, and model descriptions.

44

Metadata Repository

is a software tool that stores descriptive information about the data model, including the diagram and accompanying text such as definitions, along with Metadata imported from other tools and processes.

45

Data Model Patterns

  • are reusable modeling structures that can be applied to a wide class of situations.

  • There are elementary, assembly, and integration data model patterns.

  • Elementary patterns are the ‘nuts and bolts’ of data modeling.

46

Industry data models

  • are data models pre-built for an entire industry, such as healthcare, telecom, insurance, banking, or manufacturing.

  • These models are often both broad in scope and very detailed.

  • Some contain thousands of entities and attributes.

  • can be purchased through vendors or obtained through industry groups such as ARTS (for retail), SID (for communications), or ACORD (for insurance).

47

ISO 11179 Metadata Registry

  • an international standard for representing Metadata in an organization

  • contains several sections related to data standards, including naming attributes and writing definitions.

48

Best Practices in Naming Conventions

  • Publish data model and database naming standards for each type of modeling object and database object.

  • Naming standards are particularly important for entities, tables, attributes, keys, views, and indexes.

  • Names should be unique and as descriptive as possible.

  • Naming standards should minimize name changes across environments.

  • Names should not reflect a specific environment.

49

Performance and ease of use,

Reusability,

Integrity,

Security,

Maintainability

Best Practices in Database Design (5)

50

Performance and ease of use

Ensure quick and easy access to data by approved users in a usable and business-relevant form, maximizing the business value of both applications and data.

51

Reusability

  • The database structure should ensure that, where appropriate, multiple applications can use the data and that the data can serve multiple purposes.

  • Avoid coupling a database, data structure, or data object to a single application.

52

Integrity

The data should always have a valid business meaning and value, regardless of context, and should always reflect a valid state of the business.

53

Security

True and accurate data should always be immediately available to authorized users, but only to authorized users.

54

Maintainability

Perform all data work at a cost that yields value by ensuring that the cost of creating, storing, maintaining, using, and disposing of data does not exceed its value to the organization​

55

Data Model and Design Quality Management

  • Data analysts and designers act as intermediaries between information consumers (the people with business requirements for data) and the data producers who capture the data in usable form.

  • Data professionals must balance the data requirements of the information consumers and the application requirements of data producers. ​

  • Data models and database designs should strike a reasonable balance between the short-term needs and the long-term needs of the enterprise.

56

standard data modeling and database design deliverables,

standard names, acceptable abbreviations, and abbreviation rules for uncommon words,

standard naming formats,

standard methods,

metadata properties captured in data modeling and database design,

expectations and requirements,

data modeling tools,

preparing for and leading design reviews,

versioning,

discouraged

Data modeling and database design standards should include the following: (11)

  1. A list and description of _____

  2. A list of _____ that apply to all data model objects ​

  3. A list of ______ for all data model objects, including attribute and column words

  4. A list and description of _____ for creating and maintaining these deliverables

  5. A list and description of data modeling and database design roles and responsibilities

  6. A list and description of all _____, including both business metadata and technical metadata.

  7. Metadata quality _____

  8. Guidelines for how to use _____

  9. Guidelines for _____

  10. Guidelines for _____​ of data models

  11. Practices that are _____

57

Review Data Model and Database Design Quality

  • Project teams should conduct requirements reviews and design reviews of the conceptual data model, logical data model, and physical database design.

  • The agenda for review meetings should include items for reviewing the starting model (if any), the changes made to the model and any other options that were considered and rejected, and how well the new model conforms to any modeling or architecture standards in place.

  • Conduct design reviews with a group of subject matter experts representing different backgrounds, skills, expectations, and opinions

  • In reviews where there is no approval, the modeler must rework the design to resolve the issues.

58

Manage Data Model Versioning and Integration

  • Data models and other design specifications require careful change control, just like requirements specifications and other SDLC deliverables.

  • Each change should note:

    1. Why the project or situation required the change

    2. What and How the object(s) changed, including which tables had columns added, modified, or removed, etc.

    3. When the change was approved and when the change was made to the model

    4. Who made the change

    5. Where the change was made

59

Data Model Scorecard

provides 11 data model quality metrics: one for each of the ten categories that make up the Scorecard, and an overall score across all ten categories