Notes on Modeling Historical Data
Overview of Historical Data Modeling
Purpose of Modeling Historical Data
- Definition: Historical data refers to data that records changes over time.
- Example: Tracking a person's height at different ages is a simple form of historical data recording. Similarly, businesses track historical data to identify trends and make process improvements.
- Example in Business: A video store tracks movie rental history to determine popular movies and adjust inventory accordingly.
Importance of Historical Data
- Trend Analysis: Understanding historical data helps in recognizing patterns that can lead to business innovations.
- Cost Management: Storing unnecessary historical data can incur costs related to storage and resource allocation.
- E.g., database space requirements and the need for employee resources to manage the data.
Audit Trail and Data Integrity
- Audit Trail: A step-by-step record of changes made to data, crucial for maintaining data integrity.
- Validation Requirements: It is essential to validate the requirements for storing historical data with users, ensuring data relevance and necessity.
Designing Models for Historical Data
Challenges in Modeling
- Normal Form Violation: Simply storing previous salary amounts can lead to multivalued attributes, which violates First Normal Form (1NF).
- Employee Salary Tracking: One approach involves storing current salary and optional end dates for salaries, ensuring an employee has only one current salary.
Modeling Rental History
- Jewelry Rental Example: A jewelry store rents items (e.g., necklaces) to movie stars. To effectively track rental history, a many-to-many (M:M) relationship between the entities JEWELRY PIECE and MOVIE STAR should be established.
- Intersection Entity: This M:M relationship must be resolved with an intersection entity, such as RENTAL HISTORY, which tracks attributes like rental date and date returned.
Unique Identifiers (UIDs)
- UID Importance: The UID for RENTAL HISTORY must ensure unique combinations of movie star IDs, jewelry piece codes, and rental dates to prevent duplicate entries.
- Example: If Scarlet Johansson rents the Princess necklace, she cannot rent it on another date simultaneously.
- Example 2: Different movie stars can rent the same jewelry piece on different dates, as long as the combinations of IDs and dates are unique.
Conclusion and Key Learning Points
- Historical Data Necessity: Understand the need to track data that changes over time.
- Constructing ERD Models: Learn to construct Entity-Relationship Diagrams (ERDs) that incorporate elements reflecting data changes over time.
- Choice of UID: Identify and justify the selection of UIDs for entities storing historical data.
Key Terminology
- Audit Trail: A detailed log of data changes to maintain a clear path of historical modifications.
- Historical Data: Data that reflects changes over a certain period, crucial for analysis and reporting.