Data Modeling Using the Entity Relationship Model

Database Design Process

  • Two main activities:
    • Database design.
    • Application design (programs and user interfaces).
  • Application design focuses on programs and user interfaces.
  • Database design focuses on data requirements.

Conceptual Database Design

  • Understanding of the mini-world is vital for conceptual design.
  • Mini-world: The entities and things about which data is stored.
  • Steps:
    • Requirements analysis: Understand data requirements, constraints, and usage.
    • Conceptual design: High-level understanding of entities and interactions.
    • Logical design: Mapping to a data model (e.g., relational data model).
    • Physical design: Actual database creation (tables, fields, etc.).
  • Application design and database design happen in parallel.

DBMS Independence

  • Activities until the logical design are DBMS independent.
  • From logical design onwards, the DBMS is considered.
  • Relational data model choices: PostgreSQL, Oracle Database, Microsoft SQL Server, etc.
  • Graph data model: Graph databases and graph data management systems.

ER Modeling

  • ER modeling is one way to create conceptual designs.
  • Example: Company database.
  • Requirements:
    • The company is organized into departments.
    • Each department has a name, a number, and a manager (employee).
    • The start date of the department manager is tracked.
    • Departments can have multiple locations.
    • Each department controls multiple projects.
    • Each project has a name, a number, and a single location.
    • Employee information is stored (SSN, address, salary, date of birth, etc.).
    • Each employee works for one department.
    • Employees can work on multiple projects.
    • The number of hours each employee works per week on each project is tracked.
    • The direct supervisor of each employee is tracked.
    • Employees may have dependents (name, sex, date of birth, relationship).

Entities and Attributes

  • Entities: Specific things or objects about which information is stored.
    • Examples: Employees, departments, projects.
  • Attributes: Properties for each of these entities.
    • Examples: Employee attributes: name, SSN, address, sex, date of birth.
  • Each attribute has a value set (data type).
    • Examples: Integer, string, date.

Types of Attributes

  • Simple attributes: Single atomic value (e.g., SSN).
  • Composite attributes: Multiple components (e.g., name: first name, middle name, last name; address: apartment number, house number, etc.).
  • Multi-valued attributes: Multiple values (e.g., car color: red and gray).
  • Notation:
    • Composite attributes: Parentheses and comma-separated list: (component1, component2).
    • Multi-valued attributes: Curly braces: {value1, value2}.

Nested Attributes

  • Composite and multi-valued attributes can be nested.
  • Example: Storing previous degrees of a student.
    • Degree information: college, year, degree, and field.
    • A student can have multiple degrees.
    • Example Value: {(ASU, 2012, BS, Software Engineering), (ASU, 2016, Masters, IT)}

Entity Types and Key Attributes

  • Entity type: A grouping of similar objects sharing the same attributes.
    • Examples: Employee (collection of all employees), project (collection of all projects).
  • Key attributes: Attributes that must have a unique value for each entity.
    • Example: SSN of an employee.
  • Notation: Underlining the key attributes.
  • Multiple key attributes are possible.
    • Example: Car entity type: VIN number, vehicle tag number (license plate number, which is composite: number, state).
  • In conceptual design, focus on identifying key attributes; the primary key is chosen later in the relational data model.

ER Diagram Notation

  • Entities: Rectangle.
  • Attributes: Oval.
  • Multi-valued attributes: Concentric oval symbol.
  • Composite attributes: Hierarchy of attributes.
  • Key attributes: Underlined.

Entity Sets

  • Entity set: Instance data - all instances of a particular entity type.
  • Example: Car entity set with car1, car2, car3. Each with values for all of the attributes.
  • Parallel to object-oriented programming: Entity type is like a class, and entity set is like the instances of the class.
  • Value sets are the domains of the attributes.
    • All possible values that an attribute can contain. Example: last name as a string with up to 15 characters.

Formal Definition of Attributes

  • Attribute AA for an entity type EE, whose value set is VV, is defined as:
  • A:EP(V)A: E \rightarrow P(V)
  • P(V)P(V) is the power set of V.

ER Diagram Symbols

  • Entities: Rectangle.
  • Weak Entities: Concentric rectangle symbol.
  • Attributes: Oval.
  • Key Attributes: Underlined.
  • Multi-valued Attributes: Concentric oval symbol.
  • Composite Attributes: Attribute hierarchy.
  • Derived Attribute: Dotted oval symbol (can be derived from other data).
  • Relationships: Diamond symbol.

Derived Attributes

  • Derived attributes are not necessary to store in the database; they can be derived from other data.
  • Example: Number of employees in a department.
  • Notation: Dotted oval symbol.

Company Database - Initial Entity Types and Attributes

  • Initial entity types: Department, project, employee, and dependent.
  • Department: Attributes: department name, department number (key attributes), the department manager and manager start date, and locations (multi-valued attribute).
  • Project: Attributes: project name, number, the controlling department, and location.
  • Employee: Attributes: SSN (key attribute), name (composite - first name, middle initial, last name), department, date of birth, address, supervisor, works on (multi-valued attribute with components Project and Hours).
  • Dependent: Attributes: dependent name, employee, sex, date of birth, and relationship.

Library Database System Example

  • Requirements:
    • The library is organized into multiple sections (children, computer science, science, etc.).
    • Every section has a name and a number, headed by a librarian.
    • Each book title belongs to a section.
    • Books have a title, authors, ISBN, call number, year, and publisher.
    • For each copy of the book, track the current borrower, due date, and librarian who checked it out.
    • Members have a membership number or driver's license, address, phone number, and date of birth.
    • Members can borrow up to five books and put a hold request on a book.
    • Information about librarians: name, SSN, address, and phone number.
  • Key entities: book titles, members, copies of books, librarian, section.

Library Database System - Entities and Attributes

  • Title:
    • call number (KEY).
    • ISBN (KEY).
    • title.
    • author (multi-valued & composite -> name, order. The name itself has first name, middle name, and last name as attributes.
  • Member:
    • name.
    • driver's license.
    • date of birth.
    • membership number (KEY) or driver's license (KEY).
    • address.
    • phone number.
  • Book:
    • book ID.
  • Librarian:
    • name.
    • SSN (KEY).
    • address.
  • Section:
    • name.
    • section number (KEY).

Assumptions and Clarifications for Library Example

  • One author writes one or more titles.
  • Several co-authors.
  • Co-authors write one or more titles.
  • A book is a copy of a title, where a library holds multiple copies of a particular title.
  • A title can have one or more copies of the book.
  • A member places a hold on a particular title.
  • It's not necessary for all members to borrow books at the same time.
  • It's not necessary that all books are always borrowed.

ER Diagram for Library Database

  • Using notation, represents entities (rectangle), attributes (oval, key attributes underlined).
  • ER Diagram will be expanded to show relationships.

Relationships

  • A relationship relates two or more distinct entities with a specific meaning.
  • Example 1: Employee John Smith works on product X project. Employees work on projects.
  • Example 2: Employee Franklin Wong manages the research department.
  • Binary relationships: most common; two entities are involved. However, relationships can have more than two participating entity types as well.

Relationship Instances

  • Instances and how the relationship works between instances are important:
    • Employee (entity set of employee (e7 etc)) --works_for--> Department (entity set department (d1 etc)).
    • It connects a distinct employee with a distinct department, such as $e1$ works for department $d1$ etc.
    • N:1 (many-to-one): Many employees can be working in one department, and each department can have $N$ number of or many employees.
    • E.g., $e1$ works only for $d1$, $e2$ works only for $d2$, while $d1$ can have $3$ employees, etc.
  • M:N (Many-to-Many):
    • Employee (entity set) --works_on--> Project (entity set).
    • Employee can work on multiple projects, and projects can have multiple employees.
    • E.g., $e1$ works on $p1$, $e2$ works on $p1$ and $p4$ etc, while $p3$ has two employees, etc.

Relationship Types and Sets

  • Relationship type: schema description.
  • Relationship set: actual current state of the database.

ER Diagram Notation for Relationships

  • Rectangle for entities, oval for attributes, diamond for relationship type.
  • The diamond symbol gives the name of the relationship type between the participating entities.
  • Participation constraints can be shown as well; for e.g.,