Conceptual Modeling of Data and ER Model
Database Design Process
- Data Modeling: The database design process starts with data modeling.
- RDBMS Relations Translator: Transforms the high-level model into RDBMS relations.
- High-Level Model: A conceptual representation of the database.
Database Design: A 6-Step Program
- Requirements Analysis:
- Data requirements and critical operations are identified.
- Conceptual DB Design:
- A high-level description of data and constraints using the ER model.
- Logical DB Design:
- Conversion into a schema.
- Selection of a DBMS (Relational DBMS is most popular).
- Schema Refinement:
- Normalization to eliminate redundancy.
- Physical DB Design:
- Consideration of workloads, indexes, and data clustering.
- Application/Security Design
Overview of Entity-Relationship (ER) Model
- A visual model to specify:
- What information the database must hold.
- Relationships among components of that information.
- Proposed by Peter Chen in 1976.
- Topics Covered:
- Basic elements (entities, relationships, roles).
- Constraints.
- Weak entity sets.
- Multi-way relationships.
- Subclass/superclass relationships.
- Aggregation.
ER Model
Entity: A real-world object distinguishable from other objects (e.g., an employee, John Yates).
- Described using a set of attributes.
Entity Set: A collection of similar entities (e.g., all employees).
- Entities in an entity set have the same set of attributes.
- Each entity set has a key (underlined).
- Each attribute has a domain.
Illustration: An entity set
Employeeswith attributesssn,name, andsalary.ssnis the key.Analogy to Object-Oriented (OO) Terminology:
- Entity set ≈ Class
- Entity ≈ Instance
Entity Set - Attributes
Single-Valued vs. Multi-Valued:
- Single-valued Example: salary of an employee.
- Multi-valued Example: phone number of an employee (can have multiple).
Atomic vs. Composite:
- Atomic Example: salary of a person.
- Composite Example: address of a person (street, city, etc.).
Stored vs. Derived:
- Derived Attributes: Can be derived from other attributes or entities.
- Example: age can be derived from date of birth.
- Stored Attributes: All other attributes that are not derived.
Illustration:
Employeesentity set with attributesphone(multi-valued),salary(single-valued),age(derived),number(atomic), andaddress(composite, withcityandstreetcomponents).- Derived Attributes: Can be derived from other attributes or entities.
Entity Set: Keys and Domains
Key Attributes of an Entity:
- A key is a set of attributes that uniquely identifies an entity in an entity set.
- Denoted in ER diagrams by underlining the attributes that form a key.
- Examples:
{ssno}as a key for theEmployeesentity set.{emp_id}as another key for theEmployeesentity set.
- Multiple keys may exist; one is chosen as the primary key (underlined), and others are candidate keys.
*Illustration:
Employeesentity set with attributesname,emp_id, andssn(underlined, indicating it's a key).Departmentsentity set with attributesdname,budget, anddept_id(underlined, indicating it's a key).Classroomentity set with attributesbuilding,capacity, androomNumber(composite key).Domains of Attributes:
- A domain is associated with each simple attribute.
- The value of the attribute for the entity is constrained to be within the domain.
ER Model - Relationships
Relationship: Association among two or more entities (e.g., John Yates works in the EECS department).
Relationship Set: Collection of similar relationships.
- Relationships may have their own attributes.
- Can be Binary, Ternary, 4-nary, … relationship sets
Illustration:
Employeesentity set related toDepartmentsentity set via theWorks_Inrelationship, with attributesince. Attributes lot, name, ssn forEmployees, Works_In hassince, dname, budget, did forDepartments
ER Model – Relationships (cont.)
A binary relationship involves two entities.
A ternary relationship involves three entities.
An n-ary relationship set R relates n entity sets E1 … En; each relationship in R involves entities e1 ∈ E1, …, en ∈ En.
Illustrations of Binary (Employees WorksIn Departments) and Ternary (Employees WorksIn Departments Division) Relationship Sets.
ER Model – Relationships (cont.)
The same entity set can participate in different relationship sets, or in different “roles” in the same relationship set.
The function that an entity plays in a relationship is called its role.
- Roles are normally not explicitly specified unless the meaning of the relationship needs clarification.
- Roles are needed when an entity set is related to itself via a relationship.
*Illustration:
Employeessupervises otherEmployees(roles: subordinate, supervisor) through thesupervisesrelationship with attributesince. EmployeesWorks_InDepartments includesdname,budgetanddid
Constraints on Relationship Sets
Limit the possible combinations of entities that may participate in the corresponding relationship set.
Key Constraints (or Multiplicity):
- Example: An employee is allowed to manage more than one department, but each department has at most one manager (one-to-many).
Participation Constraints:
- Not all employees are managers (partial participation).
- Each department needs to have a manager (total participation).
Illustration:
EMPLOYEESMANAGESDEPARTMENTSwith dates. Partial and total participation.
Key Constraints (Multiplicity)
Consider binary relationship set R between entity sets A and B.
- One-to-One: An entity in A is associated with at most one entity in B, and vice versa.
- Many-to-One: An entity in A is associated with at most one entity in B, and an entity in B is associated with many entities in A.
- Many-to-Many: An entity in A is associated with many entities in B, and vice versa.
Examples: married-to (one-to-one), works-in (many-to-one), customer-accounts (many-to-many).
Key Constraints (Multiplicity)(cont.)
- Many-to-many: An employee can work in many departments; a department can have many employees.
- One-to-many: Each department has at most one manager, but one employee can manage multiple departments.
- Multiplicity in ER diagram is represented by an arrow pointing from “many” to “one.”
- worksin relationship: arrow pointing from worksin to departments.
- manages relationship: arrow pointing from manages to department.
Relationship Attribute in a Many to One Relationship
- In a Many-to-One relationship, relationship attributes can be repositioned to the entity set on the many side.
Illustrations on movingsincebetween relations betweenemployee,manages, anddepartment
Relationship Attribute in a One to One Relationship
- In a One-to-one relationship, relationship attributes can be shifted to either of the entity sets.
Illustrations on movingsincebetween relations betweenemployee,manages, anddepartment
Participation Constraints
- Total vs. Partial participation.
- Example: Suppose each professor must teach at least one course. Then, participation of entity set professor in the relationship teaches with the entity set courses is total.
- Total Participation: Each entity in entity set A is constrained to be related to other entities via relationship R1.
- Partial Participation: Not every entity in entity set B is constrained to be related to other entities via relationship R1.
- In ER diagram, total participation represented using a double line between the relationship and the entity set that totally participates in the relationship.
Illustrations using entity sets A B with relation R1
Example-1
- Keys: ssn, dID
- Key constraint: Each department has a single manager.
- Participation constraints:
- Each employee works in at least one department.
- Each department has at least one employee.
- Each department has a manager.
Illustrations using entity setsDepartmentsandEmployeesand relationManages
Example-2
- Keys: ssn#, loanid, branchid
- Key constraint: Each loan belongs to a single branch.
- Participation constraints:
- Each customer must be a customer of at least one branch.
- Each loan must belong to some branch.
Weak Entity Sets
- Entity sets that do not have sufficient attributes to form a key are called weak entity sets.
- A weak entity can be identified uniquely only with the primary key of another (owner) entity.
- Owner entity set and weak entity set must participate in a one-to-many relationship set (one owner, many weak entities).
- Weak entity set must have total participation in this identifying relationship set.
- A weak entity set may have a partial key that distinguishes between weak entities related to the same strong entity.
- key of weak entity set = key of owner entity set(s) + partial key
Weak Entity Sets - Example
- Transaction is a weak entity set related to accounts (owner) via “log” identifying relationship.
- trans# is the partial key that distinguishes different transactions on the same account.
- In ER diagram:
- Weak entity is identified with double rectangle.
- The relationships set connecting the weak entity set to the owner entity is depicted by a double diamond.
- “partial key” is identified with dashed underline.
- Key of “transaction” = (acctnumber, trans#)
Illustrations using entity setscustomer,custacct,account, andtransaction
Weak Entity Sets (cont.)
- What if we add acctnumber to “transaction”?
- The acctnumber information in “transaction” will be redundant.
- What if we add acctnumber to “transaction” and remove relationship “log”?
- The relationship between “account” and “transaction” becomes implicit in an attribute, which is not desirable.
A Chain of Weak Entity Sets
- Names of states are unique and form the key.
- Names of cities are unique only within a state (e.g., 24 Springfields within United States).
- Names of streets are unique within a city. Multiple cities could have streets with the same name.
- Example: a weak entity set might itself participate as owner in an identifying relationship with another weak entity set.
Illustrations using entity setsstate,city,street, and relationslocated_in
A Weak Entity Set with Multiple Owner Entity Sets
- Reviewers review movies and assign a rating for each movie – thumbs up or thumbs down.
- Review is a weak entity set whose owner sets correspond to both the movie and the reviewer entity sets.
- Key for the review entity set = key of movie + key of reviewer
Illustrations about relationships betweenmovie,reviewer, andreview
Weak Entity Sets
- Caution! Don’t Overuse Weak Entity Sets
- Beginning database designers often tend to make all entity sets weak, supported by all other entity sets to which they are linked.
- In reality, we usually create unique IDs for entity sets.
- Examples include Social Security numbers, automobile VINs, registration numbers, etc.
Multiway Relationships
- Usually binary relationships (connecting two entity sets) suffice.
- However, there are some cases where three or more entity sets must be connected by one relationship.
- Similar to binary relationships, key and participation constraints can be defined over multiway relationships.
Illustration - Relations betweenbranch,customer, andaccount
Key Constraint over Ternary Relationships
- Interpretation:
- Each customer is associated to a single account and a single branch.
Representing Ternary Relationship Using Binary Relationships
- A multiway relationship can be converted to a collection of binary, one-to-many relationships.
- First Attempt: Can the CAB relationship set be represented using the ER diagram on the right?
- These binary relationships AB, CA, and CB do not correctly capture the information represented by the ternary relationship.
*Illustration - Attempts to represent the ternary relationship *
- These binary relationships AB, CA, and CB do not correctly capture the information represented by the ternary relationship.
Representing Ternary Relationship Using Binary Relationships
- Introduce a new entity set whose entities we may think of as the tuples of the multiway relationship.
- The CAB relationship is represented as a weak entity set that depends upon the customer, branch, and account entity sets.
- This schema using binary relationship fully captures the ternary relationship.
*Illustration - Attempts to represent the ternary relationship *
Representing Ternary Relationship Using Binary Relationships
- Previous mapping technique works for many-many-many relationship.
- How to convert the many-many-1, many-1-1, 1-1-1 ternary relationships into binary relationships?
- In general, it is always possible to convert any ternary (or multiway relationship) into a collection of binary relationships without losing information!!
- However, the conversions can be quite complex and resulting unnatural schemas.
Limitations of the Basic ER Model Studied So Far
- How to represent different account types in the ER model:
- Possible Approach:
- Associate an attribute -- account-type with the accounts entity set.
- Problems:
- Different attributes may be associated with the account depending on its type:
- checking: overdraft amount
- savings: interest rate
- Depending upon its type, savings and checking accounts may participate in different relationships.
- Different attributes may be associated with the account depending on its type:
Limitations of the Basic ER Model Studied So Far
- How to represent this in the ER model:
- Approach 2:
- entity sets: checking, savings, and accounts.
- relationships: 1-1 between checking and accounts, and 1-1 between savings and accounts
- Problems:
- Not intuitive: checking and savings are represented as entities different from accounts, even though they are accounts.
- Redundancy of information: info about accounts represented both in checking / savings as well as account entity set
- Potential Errors: Same account could be erroneously associated with both checking as well as savings.
Subclass/Superclass Relationships
- “savings” and “checking” are subclasses of the account entity set.
- “account” is a superclass of savings and checking entity sets.
- An entity in a subclass has to belong to the superclass as well
- Every savings account is also an account
- Every checking account is also an account
- Attribute and Relationship Inheritance:
- As in OO languages, subclasses inherit all the attributes of the superclass. Similarly, subclasses inherit all relationships in which the superclass participates.
- The key of the entity set corresponding to the subclass is the same as the key for the superclass.
- ISA relationship is one-to-one, although we don’t draw two arrows on the two sides.
Specialization and Generalization
- Specialization: process of classifying a class of objects into more specialized subclasses
- E.g., during design, we begin with an account entity set. We then specialize the account set into different types of accounts.
- Generalization: Reverse of specialization -- it is a process of synthesis of two or more (lower level) entity sets to produce a higher-level entity set.
- E.g., during design, we have identified a car, a sports utility vehicle, and a truck. We generalize these classes to create an automobile entity set.
Types of Class/Subclass Relationships
- Disjoint vs. Overlapping:
- If the subclasses of the entity set do not overlap, then it is disjoint.
- default
- Else, overlapping.
- denoted by a ‘o’ next to ISA triangle
- If the subclasses of the entity set do not overlap, then it is disjoint.
- Total vs. Partial (also called “covering”):
- If each entity in a superclass belongs to at least one of the subclasses, then total.
- denoted by a double line from superclass to ISA triangle
- Else, partial
- If each entity in a superclass belongs to at least one of the subclasses, then total.
Superclass/Subclass Lattice
- Class/Subclass relationships might form a hierarchy (tree) or a lattice
Back to Limitations of Basic ER Model
- Suppose projects are sponsored by one or more departments.
- A department that sponsors a project might assign an employee to monitor the sponsorship.
Illustrations of bad designs.
Limitations of Basic ER Model
- The “monitors” should be a relationship that associates “Sponsorship” to “employees.”
- Treat the relationship set sponsors and the entity sets projects and departments as a higher-level entity set-- an aggregate entity set.
- Permit relationships between aggregate entity sets and other entity sets.
Aggregation
- Without introducing redundancy, the following diagram represents:
- A project is sponsored by a particular department.
- A sponsorship (project, department combination) may have an associated monitoring employee.
Review of ER Model
- Basic Model:
- Entities: strong, weak
- Attributes associated with entity sets and relationships
- Relationships: binary, ternary, …
- Role of entity sets in a relationship
- Constraints on the entity set: domain constraints, key constraint
- Constraint on relationships: key constraints, participation --total vs partial.
- Extended Model:
- Notion of superclass and subclass
- Superclass/subclass relationships: disjoint vs overlapping, total vs partial
- Notion of aggregation
Schema Design Issues
- Observation: There may be multiple ER schemas describing the same target database.
- Decisions that need to be made:
- whether to use an attribute or entity set to represent an object
- whether to model a concept as a relationship or an entity set
- whether to use a ternary relationship or a set of binary ones
- whether to use a strong entity set or a weak entity set
- whether using generalization/specializations is appropriate
- whether using aggregates is appropriate
- Unfortunately, there are no straightforward answers to these questions.
- No two design teams will come up with the same design.
- However, there are some simple design principles that should be followed during ER design.
Entity vs. Attribute Example-1
- Should “dependent” be an attribute of Employees or an entity (connected to Employees by a relationship)?
- Depends upon the use we want to make of dependent information, and the semantics of the data:
- If we have several dependents per employee, “dependent” must be an entity (since attributes cannot be set-valued).
- If “dependent” has its own attributes, “dependent” must be modeled as an entity (since attribute values are atomic).
Entity vs. Attribute Example-2
- Works_In4 does not allow an employee to work in a department for more than one period.
- We want to record several values of the descriptive attributes for each instance of this relationship.
- Accomplished by introducing new entity set, “Duration.”
Subclass Relationship Example-3
- Assume budgets are associated with managers, not departments.
- 1st model: Not all employees are managers, so most employees won’t have budgets.
ER Design Example-1
- We wish to design a database consistent with the following facts:
- A train has a unique number and has an engineer.
- Trains are either local trains or express trains, but never both.
- Stations are either express stops or local stops, but never both.
- A station has a name (assumed unique) and an address.
- All local trains stop at all stations.
- For each train and each station the train stops at, there is a time.
- Express trains stop only at express stations.
Does not capture all the constraints.
ER Design Example-1 Alternative
- Better Design. It captures express trains and local stations.
ER Design Example-2
- Notown Records would like to store information about musicians who perform on its albums (as well as other company data) in a database.
- Each musician that records at Notown has an SSN, a name, an address, and a phone number. Poorly paid musicians often share the same address. No phone has more than one address, but musicians that share an address might have different phones.
- Each instrument used in songs recorded at Notown has a name (e.g., guitar, synthesizer, flute), a musical key (e.g., C, B-flat, E-flat), and an instrument number which is unique for each instrument name.
- Each album recorded on the Notown label has a unique identification number, a title, and a copyright date.
- Each song recorded at Notown has a title and an author.
- Each musician may play several instruments, and a given instrument may be played by several musicians.
- Each album has a number of songs on it, but no song may appear on more than one album. The song titles in an album are assumed to be unique.
- Each song is performed by one or more musicians, and a musician may perform a number of songs.
- Each album has exactly one musician who acts as its producer. A musician may produce several albums, of course.
Textbook notation
- Symbols used in ER diagrams
Alternative ER notations
- Alternative symbols used in ER diagrams