pert 0708 2021
BINUS UNIVERSITY OLAP Overview
Session 07-08: Innovation Excellence
Learning Outcomes
LO 1: Define the basic concepts, architecture, and techniques of data warehouse and data mining.
LO 3: Demonstrate how to design a data warehouse.
LO 4: Analyze the strategy of design and implementation of data warehouses and their suitability to needs.
Acknowledgments
Content adapted from Parteek Bhatia. (2019). Data Mining and Data Warehousing: Principles and Practical Techniques. Cambridge University Press, UK, Chapter 14.
Introduction to OLAP
Key Topics:
Introduction to OLAP
Representation of Multi-dimensional Data
Types of OLAP Servers
OLAP Operations
What is OLAP?
Definition: OLAP (Online Analytical Processing) is a design paradigm providing methods to extract useful information from physical data stores.
Functionality: It aggregates information from multiple systems, offering summarized views for management.
Comparison to Data Mining: While OLAP focuses on summarizing data, data mining analyzes hidden patterns within data.
Use in Decision Making: It helps forecast and answer operational questions, e.g., average regional sales of cycles by year.
Characteristics of OLAP
Definition by Dr. E.F. Codd (1993): OLAP is the dynamic synthesis, analysis, and consolidation of large volumes of multi-dimensional data.
Complex Queries: OLAP queries can be complex, as seen in examples involving hypothetical changes in property sales due to tax and legal cost adjustments.
Applications of OLAP
Functional Areas:
Marketing: Market research analysis, sales forecasting, customer and promotion analysis.
Finance: Budgeting, financial modeling, performance analysis.
Sales: Sales forecasting and analysis.
Manufacturing: Production planning and defect analysis.
Features of OLAP
Multi-dimensional Data Representation: Support for viewing data across multiple dimensions (e.g., sales by region, product category, and time).
Time Intelligence: Allows for performance analysis over time, enabling comparisons like current month vs last month.
Complex Calculations Support: Provides capabilities for advanced computations such as percentage growth and moving averages.
Benefits of OLAP
Increased Productivity: Enhances end-user productivity and organizational data integrity.
Reduced IT Backlog: Eases the development backlog for IT staff.
Profitability Improvements: Potential revenue increases through better decision-making.
Reduced Network Traffic: Lowers query load on data warehouses or OLTP systems.
Strengths of OLAP
Visualization Tool: Acts as a powerful tool for visualizing data and identifying trends.
Speed: Offers fast, interactive responses to user queries.
Vendor Availability: Many vendors provide OLAP tools, enhancing accessibility.
Outlier Detection: Useful in identifying outliers and clusters in data.
Comparison of OLTP and OLAP
Characteristics
Type of Users: OLTP for office workers; OLAP for decision-makers.
Number of Users: OLTP supports thousands; OLAP typically supports fewer users, often managers.
Functions: OLTP supports daily activities; OLAP supports decision-making through analytical investigations.
Processing Types: OLTP involves repetitive processing; OLAP uses ad hoc, heuristic processing.
Query Nature: OLTP processes one record at a time; OLAP processes multiple records for aggregation.
Design Focus: OLTP is application-oriented; OLAP is subject-oriented.
Database View: OLTP is relational; OLAP is multi-dimensional.
Transaction Throughput: OLTP has high transaction throughput; OLAP has lower transaction throughput.
Representation of Multi-Dimensional Data
Queries can efficiently retrieve single values (e.g., revenue for a specific city).
More complex queries require aggregation and can return multiple values (e.g., annual revenue for each city).
Data Cubes in OLAP
Definition: Data cubes represent data across dimensions and facts, utilized for slicing, dicing, and aggregating data.
Measures: Numeric values that users want to analyze; common functions include Min, Max, and Count.
OLAP Server Types
Relational OLAP (ROLAP)
Characteristics:
Based on the relational model.
Supports existing RDBMS.
Efficient storage without returning zero facts from SQL queries.
Needs real-time processing but may have poor query performance.
Multi-dimensional OLAP (MOLAP)
Characteristics:
Uses multi-dimensional DBMS for better storage and access.
Offers rapid indexing of pre-computed data.
Not designed for detailed data storage; easier for inexperienced users.
OLAP Operations
Roll-up: Zooming out for abstract details.
Drill-down: Zooming in for detailed data.
Slice: Extracts sub-cubes by selecting one dimension.
Dice: Extracts sub-cubes by selecting multiple dimensions.
Pivot (Rotate): Changes data presentation by rotating axes.
Conclusion
OLAP is an essential component of business intelligence, helping organizations transform data into actionable insights through multi-dimensional analysis.