pert 0708 2021

BINUS UNIVERSITY OLAP Overview

  • Session 07-08: Innovation Excellence

Learning Outcomes

  • LO 1: Define the basic concepts, architecture, and techniques of data warehouse and data mining.

  • LO 3: Demonstrate how to design a data warehouse.

  • LO 4: Analyze the strategy of design and implementation of data warehouses and their suitability to needs.

Acknowledgments

  • Content adapted from Parteek Bhatia. (2019). Data Mining and Data Warehousing: Principles and Practical Techniques. Cambridge University Press, UK, Chapter 14.

Introduction to OLAP

  • Key Topics:

    • Introduction to OLAP

    • Representation of Multi-dimensional Data

    • Types of OLAP Servers

    • OLAP Operations

What is OLAP?

  • Definition: OLAP (Online Analytical Processing) is a design paradigm providing methods to extract useful information from physical data stores.

  • Functionality: It aggregates information from multiple systems, offering summarized views for management.

  • Comparison to Data Mining: While OLAP focuses on summarizing data, data mining analyzes hidden patterns within data.

  • Use in Decision Making: It helps forecast and answer operational questions, e.g., average regional sales of cycles by year.

Characteristics of OLAP

  • Definition by Dr. E.F. Codd (1993): OLAP is the dynamic synthesis, analysis, and consolidation of large volumes of multi-dimensional data.

  • Complex Queries: OLAP queries can be complex, as seen in examples involving hypothetical changes in property sales due to tax and legal cost adjustments.

Applications of OLAP

  • Functional Areas:

    • Marketing: Market research analysis, sales forecasting, customer and promotion analysis.

    • Finance: Budgeting, financial modeling, performance analysis.

    • Sales: Sales forecasting and analysis.

    • Manufacturing: Production planning and defect analysis.

Features of OLAP

  • Multi-dimensional Data Representation: Support for viewing data across multiple dimensions (e.g., sales by region, product category, and time).

  • Time Intelligence: Allows for performance analysis over time, enabling comparisons like current month vs last month.

  • Complex Calculations Support: Provides capabilities for advanced computations such as percentage growth and moving averages.

Benefits of OLAP

  • Increased Productivity: Enhances end-user productivity and organizational data integrity.

  • Reduced IT Backlog: Eases the development backlog for IT staff.

  • Profitability Improvements: Potential revenue increases through better decision-making.

  • Reduced Network Traffic: Lowers query load on data warehouses or OLTP systems.

Strengths of OLAP

  • Visualization Tool: Acts as a powerful tool for visualizing data and identifying trends.

  • Speed: Offers fast, interactive responses to user queries.

  • Vendor Availability: Many vendors provide OLAP tools, enhancing accessibility.

  • Outlier Detection: Useful in identifying outliers and clusters in data.

Comparison of OLTP and OLAP

Characteristics

  • Type of Users: OLTP for office workers; OLAP for decision-makers.

  • Number of Users: OLTP supports thousands; OLAP typically supports fewer users, often managers.

  • Functions: OLTP supports daily activities; OLAP supports decision-making through analytical investigations.

  • Processing Types: OLTP involves repetitive processing; OLAP uses ad hoc, heuristic processing.

  • Query Nature: OLTP processes one record at a time; OLAP processes multiple records for aggregation.

  • Design Focus: OLTP is application-oriented; OLAP is subject-oriented.

  • Database View: OLTP is relational; OLAP is multi-dimensional.

  • Transaction Throughput: OLTP has high transaction throughput; OLAP has lower transaction throughput.

Representation of Multi-Dimensional Data

  • Queries can efficiently retrieve single values (e.g., revenue for a specific city).

  • More complex queries require aggregation and can return multiple values (e.g., annual revenue for each city).

Data Cubes in OLAP

  • Definition: Data cubes represent data across dimensions and facts, utilized for slicing, dicing, and aggregating data.

  • Measures: Numeric values that users want to analyze; common functions include Min, Max, and Count.

OLAP Server Types

Relational OLAP (ROLAP)

  • Characteristics:

    • Based on the relational model.

    • Supports existing RDBMS.

    • Efficient storage without returning zero facts from SQL queries.

    • Needs real-time processing but may have poor query performance.

Multi-dimensional OLAP (MOLAP)

  • Characteristics:

    • Uses multi-dimensional DBMS for better storage and access.

    • Offers rapid indexing of pre-computed data.

    • Not designed for detailed data storage; easier for inexperienced users.

OLAP Operations

  • Roll-up: Zooming out for abstract details.

  • Drill-down: Zooming in for detailed data.

  • Slice: Extracts sub-cubes by selecting one dimension.

  • Dice: Extracts sub-cubes by selecting multiple dimensions.

  • Pivot (Rotate): Changes data presentation by rotating axes.

Conclusion

  • OLAP is an essential component of business intelligence, helping organizations transform data into actionable insights through multi-dimensional analysis.