Notes on Modeling Historical Data

Overview of Historical Data Modeling

Purpose of Modeling Historical Data
  • Definition: Historical data refers to data that records changes over time.
  • Example: Tracking a person's height at different ages is a simple form of historical data recording. Similarly, businesses track historical data to identify trends and make process improvements.
  • Example in Business: A video store tracks movie rental history to determine popular movies and adjust inventory accordingly.
Importance of Historical Data
  • Trend Analysis: Understanding historical data helps in recognizing patterns that can lead to business innovations.
  • Cost Management: Storing unnecessary historical data can incur costs related to storage and resource allocation.
    • E.g., database space requirements and the need for employee resources to manage the data.

Audit Trail and Data Integrity

  • Audit Trail: A step-by-step record of changes made to data, crucial for maintaining data integrity.
  • Validation Requirements: It is essential to validate the requirements for storing historical data with users, ensuring data relevance and necessity.

Designing Models for Historical Data

Challenges in Modeling
  • Normal Form Violation: Simply storing previous salary amounts can lead to multivalued attributes, which violates First Normal Form (1NF).
  • Employee Salary Tracking: One approach involves storing current salary and optional end dates for salaries, ensuring an employee has only one current salary.
Modeling Rental History
  • Jewelry Rental Example: A jewelry store rents items (e.g., necklaces) to movie stars. To effectively track rental history, a many-to-many (M:M) relationship between the entities JEWELRY PIECE and MOVIE STAR should be established.
  • Intersection Entity: This M:M relationship must be resolved with an intersection entity, such as RENTAL HISTORY, which tracks attributes like rental date and date returned.
Unique Identifiers (UIDs)
  • UID Importance: The UID for RENTAL HISTORY must ensure unique combinations of movie star IDs, jewelry piece codes, and rental dates to prevent duplicate entries.
    • Example: If Scarlet Johansson rents the Princess necklace, she cannot rent it on another date simultaneously.
    • Example 2: Different movie stars can rent the same jewelry piece on different dates, as long as the combinations of IDs and dates are unique.

Conclusion and Key Learning Points

  • Historical Data Necessity: Understand the need to track data that changes over time.
  • Constructing ERD Models: Learn to construct Entity-Relationship Diagrams (ERDs) that incorporate elements reflecting data changes over time.
  • Choice of UID: Identify and justify the selection of UIDs for entities storing historical data.

Key Terminology

  • Audit Trail: A detailed log of data changes to maintain a clear path of historical modifications.
  • Historical Data: Data that reflects changes over a certain period, crucial for analysis and reporting.