AA

Module 6 - Data Modeling

Data Modeling Overview

  • Definition:

    • A technique for modeling data determining what data and relationships should be stored in a database.

    • Graphical representation of a database, effectively communicating a database design.

  • Goals:

    • Identify the facts to be stored in the database.

    • Requires collaboration between client and analyst.

  • Process:

    • Iterative, involving trial and revision.

    • The data model serves as a working document.

Building Blocks of Data Modeling

  • Entity:

    • A thing about which data is stored; the basic building block.

  • Attribute:

    • Describes an entity; singular and unique within the model.

  • Relationship:

    • Describes the linkage between two entities, defined by relationship descriptors:

      • 1:1 (one-to-one)

      • 1:m (one-to-many)

      • m:m (many-to-many)

  • Identifier:

    • Uniquely distinguishes an instance of an entity.

Quality of Data Models

  • Well-formed Data Model:

    • All construction rules are obeyed.

    • No ambiguity is present.

    • All entities, attributes, and relationships are defined.

    • Names are meaningful and understood by the client.

  • High Fidelity Image:

    • Accurately describes the real-world it represents.

    • Relationships are correctly established.

    • Completeness and understandability are key.

    • The model must make sense to the client.

Quality Improvement Considerations

  • Assess:

    • Level of detail of the model.

    • All exceptions must be handled.

    • Overall accuracy of the model.

Modality & Cardinality

  • Modality:

    • Also known as optionality; defines the minimum number of instances in a relationship.

  • Cardinality:

    • Indicates the range of instances in a relationship.

    • Links with modality.

  • Cardinality Examples:

    • 0,1: Optional (zero or one instances)

    • 0,n: Zero or many instances

    • 1,1: Mandatory (exactly one instance)

Types of Entities

  • Independent:

    • Can stand alone; often prominent in the client's mind.

  • Dependent:

    • Relies on another entity for existence and identification.

    • Can become independent with an arbitrary identifier.

  • Associative:

    • Results from m:m relationships; holds current or historical data.

  • Aggregate:

    • Formed from multiple entities with a common prefix or suffix.

  • Subordinate:

    • Entity data that varies among instances.

Generalization

  • Describes a relationship between more general and more specific elements.

    • For each subtype, the primary key must represent the supertype's key as a foreign key.

UML Aggregation Concepts

  • Aggregation:

    • A part-whole relationship.

    • Shared Aggregation:

      • Enables multiple entities to own the same entity.

    • Composite Aggregation:

      • One entity exclusively owns the other.

Hints on Data Modeling

  • The model may expand and contract; include identifiers when necessary.

  • Ensure identifiers have a singular purpose—identification.

  • Create attributes when instance ordering is required.

  • Choose names carefully; be aware of synonyms and homonyms.

  • Maintain clarity by labeling relationships and ensuring data model accuracy.

Meaningful vs Non-Meaningful Identifiers

  • Meaningful Identifiers:

    • Draw inferences about entity attributes; can be recognizable.

    • Advantages: Easy to remember;

    • Disadvantages: Can lead to identifier exhaustion as realities change.

  • Non-Meaningful Identifiers:

    • Serve the sole purpose of uniquely identifying the entity; avoid data management issues.

    • Attributes describe characteristics of the entity.

Key Takeaways

  • A high-fidelity data model handles all exceptions.

  • Identifiers only need to distinguish an instance.

  • Data modeling skills develop over time.