SI - Part 2 - DW-2
Overview of Data Warehousing
Definition: A Data Warehouse (DW) is a subject-oriented, integrated, non-volatile, and time-variant collection of data that supports managerial decision-making processes.
Inmon: "A Data Warehouse is a subject-oriented collection of data... for management decision support."
Devlin: "A Data Warehouse is a single, complete, and consistent repository of data from various sources that is available to end users in a comprehensible form."
Characteristics of Data Warehousing
Integration of Techniques and Technologies:
Focused on providing systematic access to information distributed across multiple organizational systems.
Gathers data from heterogeneous sources, including historical data, for reporting and decision support.
Ongoing Process:
Collects data over time, giving users the ability to perform structured queries, ad-hoc reporting, and analytical support.
Purpose of a Data Warehouse
Enhancement of Information Utilization:
A DW is not an end in itself; it is a means for companies to analyze their data, supporting both current processes and future improvements.
Centralized View:
Integrates corporate data into a single repository for holistic access.
Provides a unified and centralized view of data that may be dispersed across various databases.
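As a minimal sketch of what a unified view involves, the Python below maps records from two operational systems onto one common record type. The source names (`crm`, `erp`), the field names, and the `CustomerSale` schema are all hypothetical, chosen only to illustrate that each source may use different identifiers, date formats, and units.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CustomerSale:
    """Common warehouse-side record (hypothetical schema)."""
    source: str
    customer_id: str
    sale_date: date
    amount_eur: float


# Hypothetical extracts from two operational systems with different conventions.
crm_rows = [{"cust": "C-001", "date": "2024-01-15", "total": "120.50"}]
erp_rows = [{"customer_no": 1, "day": date(2024, 1, 16), "amount_cents": 9900}]


def from_crm(row: dict) -> CustomerSale:
    # The assumed CRM stores dates as ISO strings and totals as text.
    return CustomerSale(
        "crm", row["cust"], date.fromisoformat(row["date"]), float(row["total"])
    )


def from_erp(row: dict) -> CustomerSale:
    # The assumed ERP uses numeric customer keys and amounts in cents.
    return CustomerSale(
        "erp", f"C-{row['customer_no']:03d}", row["day"], row["amount_cents"] / 100
    )


def build_unified_view(crm: list, erp: list) -> list:
    """Convert both extracts to the common schema and order them by customer and date."""
    rows = [from_crm(r) for r in crm] + [from_erp(r) for r in erp]
    return sorted(rows, key=lambda r: (r.customer_id, r.sale_date))


unified = build_unified_view(crm_rows, erp_rows)
```

After conversion, both rows refer to the same customer key and the same currency unit, so they can be queried together regardless of which system produced them.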
User Empowerment:
Enables end-users to run queries, create reports, and perform analyses without reliance on IT or technical resources.
Importance of Data Warehousing
Strategic Resource:
Information derived from historical data on sales, production, customers, etc., becomes crucial for strategic decision-making.
Enhances the company's competitive edge through data-driven insights.
Addressing Information Challenges:
Companies find it increasingly difficult to extract valuable information as data volumes grow and sources remain inconsistent with one another.
A DW is critical for managing the complexity that arises from the growing volume and variety of data.
Essential Features of Data Warehousing
Integration of Multiple Sources:
Combines data from diverse systems while ensuring high-quality information that meets various user needs.
Facilitates analyses without affecting operational data environments.
Flexibility and Agility:
Adapts easily to new requirements for analysis, ensuring continuous support for decision-making processes.
Evolution of Data Warehousing
Historical Context:
Evolution through four generations of computing environments:
Formation: Initial applications tailored to immediate business needs.
Proliferation: Emphasis on effectiveness and efficiency, leading to interconnected systems.
Dispersion: Emergence of end-users using personal tools for information extraction.
Unification: Adoption of a new organizational approach through Data Warehousing.
Data Warehouse Structure and Analysis
Operational vs. Informational Databases:
Operational databases provide timely access to data, but usually only to a snapshot of the current state.
Informational databases, by contrast, retain history, which allows historical analysis and supports complex queries and decision-making.
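The contrast can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a description of any particular product: the operational store is modelled as a dictionary that is overwritten in place, the informational store as an append-only list of dated rows, and the names `record_stock` and `stock_as_of` are hypothetical.

```python
from datetime import date

# Operational store: one value per product, overwritten in place (current snapshot only).
operational_stock = {}

# Informational store: append-only; every load is stamped with its date.
warehouse_stock_history = []


def record_stock(product: str, quantity: int, as_of: date) -> None:
    """Update the operational value and append a dated row to the history."""
    operational_stock[product] = quantity
    warehouse_stock_history.append((as_of, product, quantity))


def stock_as_of(product: str, when: date):
    """Return the most recent quantity loaded on or before `when`, or None."""
    candidates = [
        (d, q) for d, p, q in warehouse_stock_history if p == product and d <= when
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])[1]


record_stock("widget", 100, date(2024, 1, 1))
record_stock("widget", 80, date(2024, 2, 1))
record_stock("widget", 65, date(2024, 3, 1))
```

Here `operational_stock["widget"]` can only answer "what is the stock now?", while `stock_as_of("widget", date(2024, 2, 15))` answers a question about the past because earlier rows were never overwritten.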
Data Warehouse Objectives:
To make organizational information more accessible and to present data consistently, while maintaining quality and adapting to inevitable changes in business requirements.
Benefits of Implementing a Data Warehouse
Subject Orientation: Data organized by subject rather than by application, making navigation intuitive.
Integration of Diverse Data: Combines information from many sources, thereby enhancing analytic capability.
Temporal Analysis: Enables analysis over different time horizons, essential for trend analysis.
Ad-Hoc Reporting: Users can generate reports flexibly without heavy reliance on IT support.
Enhanced Analytical Capability: Provides advanced tools for multidimensional analysis, assisting decision-makers.
Reduced IT Development Overload: Allows users to independently obtain information that was previously inaccessible, which lessens the IT burden.
Improved Performance: Streamlined queries lead to better performance during complex analytic tasks, allowing repeatable, predictable analysis.
Non-intrusive to Operational Systems: Minimizes the impact on transaction processing in operational systems.
Transformation of Data to Strategic Information: Empowers companies to effectively leverage information for competitive advantage.
Facilitates Business Process Reengineering: Provides access to underlying data, fostering opportunities for innovation.
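The Temporal Analysis and Enhanced Analytical Capability benefits listed above can be illustrated with a small roll-up over fact rows. The sketch below is a simplification in plain Python; the `sales_facts` rows, the dimension names, and the `roll_up` helper are made up for the example, and a real DW would normally do this through SQL or an OLAP tool.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, region, product, revenue)
sales_facts = [
    (2023, "Q1", "North", "A", 100.0),
    (2023, "Q2", "North", "A", 150.0),
    (2023, "Q1", "South", "B", 80.0),
    (2024, "Q1", "North", "A", 130.0),
    (2024, "Q1", "South", "B", 120.0),
]
DIMENSIONS = ("year", "quarter", "region", "product")


def roll_up(facts, by):
    """Sum revenue grouped by the chosen dimension names."""
    idx = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]  # revenue is the last field of each row
    return dict(totals)


# Temporal analysis: the same measure compared across years.
revenue_by_year = roll_up(sales_facts, by=("year",))

# Multidimensional analysis: the same facts viewed by region and year.
revenue_by_region_year = roll_up(sales_facts, by=("region", "year"))
```

Changing the `by` argument is enough to look at the same facts along a different combination of dimensions, which is the kind of flexibility ad-hoc reporting relies on.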