Lecture recording on 08 January 2025 at 14.10.18 PM

Requirements

Business requirements (broken down into 4 sections):

  1. Data requirements

  2. Functional requirements 

  3. Technical requirements 

  4. Regulatory & Compliance requirements (non-negotiable)

Technical Requirements

  • Importance of aligning new technology with existing infrastructure.

  • Considerations involve current programs and compatibility with new solutions.

  • Need for technology to integrate efficiently into existing systems.

Data Governance Regulatory Requirements

  • Laws and regulations to adhere to, such as:

    • USA Patriot Act

    • EPA regulations

    • GDPR (General Data Protection Regulation)

      • Regulatory framework governing data privacy in the EU.

      • Requires organizations to protect personal data and privacy.

      • Individuals have the right to request their data and demand deletion.
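
The two individual rights noted above (requesting one's data and demanding its deletion) could be sketched roughly as follows. This is only an illustration: `PersonalDataStore`, its method names, and the sample record are hypothetical, and a real system would also have to cover backups, logs, and any legal grounds for retaining data.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical in-memory store keyed by a data subject's ID."""
    records: dict = field(default_factory=dict)

    def export_subject_data(self, subject_id: str) -> dict:
        """Access request: return a copy of everything held on the subject."""
        return dict(self.records.get(subject_id, {}))

    def erase_subject_data(self, subject_id: str) -> bool:
        """Deletion request: remove the subject's data; True if anything was removed."""
        return self.records.pop(subject_id, None) is not None


store = PersonalDataStore()
store.records["cust-001"] = {"name": "Example Person", "email": "person@example.com"}

exported = store.export_subject_data("cust-001")  # what the individual would receive
erased = store.erase_subject_data("cust-001")     # afterwards nothing is held on them
```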

Project Management Methodologies

  • MoSCoW method: a prioritization technique with four categories:

    • Must Have

    • Should Have

    • Could Have

    • Won't Have (this time)

  • Use this framework to categorize and prioritize project requirements.

  • Discussing with clients and stakeholders to align on priorities is essential.

  • Replace outdated systems with new solutions, starting with reverse engineering to understand current systems.
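
As a rough sketch of how the four prioritization categories can be used to order a requirements list, assuming made-up requirement names and a hypothetical `plan_scope` helper (none of these come from the lecture):

```python
from enum import IntEnum


class Priority(IntEnum):
    MUST = 1
    SHOULD = 2
    COULD = 3
    WONT = 4


# Hypothetical requirements, as might be agreed with clients and stakeholders.
requirements = [
    ("Export report to PDF", Priority.COULD),
    ("Handle data deletion requests", Priority.MUST),
    ("Mobile dashboard layout", Priority.SHOULD),
    ("Replace legacy fax integration", Priority.WONT),
    ("Load sales data from ERP", Priority.MUST),
]


def plan_scope(items):
    """Return in-scope requirements ordered by priority; lowest-category items are excluded."""
    in_scope = [item for item in items if item[1] is not Priority.WONT]
    # sorted() is stable, so items keep their input order within a category.
    return sorted(in_scope, key=lambda item: item[1])


scope = plan_scope(requirements)
```

The excluded items are not lost: they stay in `requirements`, so they can be revisited in a later phase.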

Reverse Engineering

  • Process of examining existing reports to identify necessary components for replacement.

  • Analyze the required data and technologies needed for effective report generation.
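
One way to picture this step: list the columns of an existing report and check which source system can supply each one. The column names, source names, and the `map_columns` helper below are all hypothetical, chosen only to illustrate the idea.

```python
# Columns found on a (hypothetical) legacy report that is being replaced.
legacy_report_columns = ["customer_name", "order_total", "region", "sales_rep"]

# Fields that each (hypothetical) source system is known to provide.
available_sources = {
    "erp.orders": {"order_total", "region"},
    "crm.customers": {"customer_name"},
}


def map_columns(columns, sources):
    """Map each report column to the first source providing it; collect unmatched columns."""
    mapping, missing = {}, []
    for column in columns:
        source = next((name for name, fields in sources.items() if column in fields), None)
        if source is None:
            missing.append(column)
        else:
            mapping[column] = source
    return mapping, missing


mapping, missing = map_columns(legacy_report_columns, available_sources)
```

Columns that end up in `missing` are the ones to raise with the client: either a further data source is needed or the column is dropped from the new report.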

Client Engagement

  • Necessary to align outcomes with client expectations regarding reports to be generated.

  • Confirm understanding with client about what components should be retained or revised in new reports.

Warehouse And Data

  • Data Warehouse Functions

    • Data warehouses store integrated information over time for analytics and business intelligence.

    • They hold historical data for trend analysis, allowing organizations to track changes over time.

    • These systems create a single source of truth for organizational data, essential for accurate reporting and decision-making.

  • OLTP vs OLAP

    • Understand the difference between Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) systems:

      • OLTP: Used for daily transactions and real-time processing.

      • OLAP: Used for complex queries and data analysis.
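
The contrast can be shown with Python's built-in `sqlite3` module. This is a minimal sketch only: in practice OLTP and OLAP workloads usually run on separate systems, and the `sales` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, product TEXT, region TEXT, amount REAL)"
)

# OLTP-style work: many small writes, each committed as its own short transaction.
orders = [("widget", "north", 10.0), ("widget", "south", 15.0), ("gadget", "north", 40.0)]
for product, region, amount in orders:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sales (product, region, amount) VALUES (?, ?, ?)",
            (product, region, amount),
        )

# OLAP-style work: one read-only query that aggregates across many rows.
totals = conn.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()
```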

Data Sources and Integration

    • Importance of integrating data from multiple sources, including ERP systems.

    • Data integration helps facilitate comprehensive reporting and business intelligence.

    • Data warehouses act as a central repository for combined data.
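
A small sketch of what integration means in practice: records from two source systems are joined on a shared key and combined into one set of rows. The ERP and CRM extracts, their field names, and the `integrate` function are assumptions made for illustration.

```python
# Hypothetical extracts from two source systems; field names are illustrative only.
erp_orders = [
    {"customer_id": "C1", "order_total": 120.0},
    {"customer_id": "C2", "order_total": 80.0},
    {"customer_id": "C1", "order_total": 30.0},
]
crm_customers = [
    {"customer_id": "C1", "name": "Acme Ltd"},
    {"customer_id": "C2", "name": "Globex"},
]


def integrate(orders, customers):
    """Join orders to customer names and total the order value per customer."""
    names = {c["customer_id"]: c["name"] for c in customers}
    combined = {}
    for order in orders:
        cid = order["customer_id"]
        row = combined.setdefault(
            cid, {"customer_id": cid, "name": names.get(cid), "total": 0.0}
        )
        row["total"] += order["order_total"]
    return list(combined.values())


warehouse_rows = integrate(erp_orders, crm_customers)
```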

Master Data

  • Master Data Management

    • Master data is a key aspect of data management focusing on the core entities within the business (customers, products, etc.).

    • Effective master data management ensures consistency and accuracy across records.

    • Understanding relationships between different data records is crucial for maintaining data integrity.
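
Consistency across master records often comes down to spotting entries that describe the same entity. The sketch below uses a deliberately simple rule (same name after lower-casing and trimming whitespace); the vendor records and function names are hypothetical, and real matching rules are usually more involved.

```python
def normalise(name: str) -> str:
    """Lower-case and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(name.lower().split())


def deduplicate(records):
    """Keep the first record seen for each normalised name; collect the rest as duplicates."""
    golden, duplicates = {}, []
    for record in records:
        key = normalise(record["name"])
        if key in golden:
            duplicates.append(record)
        else:
            golden[key] = record
    return list(golden.values()), duplicates


# Hypothetical vendor master records; IDs 1 and 2 describe the same vendor.
vendors = [
    {"id": 1, "name": "Acme Ltd"},
    {"id": 2, "name": "  acme  LTD "},
    {"id": 3, "name": "Globex"},
]
masters, dupes = deduplicate(vendors)
```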

Master Data Management

  • Data Management Objectives

    • Focus on maintaining accurate records, correcting errors, and ensuring quality data across systems.

    • Master data management is essential for vendor relations and organizational efficiency.

  • Reporting and Visualization

    • The distinction between reports (tables of data) and dashboards (visual representations of key metrics) is critical.

    • Self-service reporting options enable users to access needed information more effectively.
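
To make the report/dashboard distinction concrete, the sketch below produces both from the same data: a report listing every row as a table, and a small set of headline metrics of the kind a dashboard tile or chart would show. The order data and function names are made up for the example.

```python
# Hypothetical order data shared by the report and the dashboard.
orders = [
    {"order_id": 1, "region": "north", "amount": 100.0},
    {"order_id": 2, "region": "south", "amount": 50.0},
    {"order_id": 3, "region": "north", "amount": 25.0},
]


def build_report(rows):
    """Report: every row, laid out as a plain-text table."""
    lines = [f"{'order_id':<10}{'region':<10}{'amount':>10}"]
    for row in rows:
        lines.append(f"{row['order_id']:<10}{row['region']:<10}{row['amount']:>10.2f}")
    return "\n".join(lines)


def build_dashboard_metrics(rows):
    """Dashboard: a few key metrics summarising the same rows."""
    total = sum(row["amount"] for row in rows)
    return {
        "order_count": len(rows),
        "total_sales": total,
        "average_order": total / len(rows),
    }


report = build_report(orders)
metrics = build_dashboard_metrics(orders)
```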

Integration and Predictive Features

  • Power BI's integration capabilities with Microsoft applications enhance usability and efficiency.

  • Predictive analytics allow organizations to forecast needs (e.g., stock levels) and guide decision-making.
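
As a very simple illustration of forecasting stock needs, the sketch below predicts next period's demand as the average of the most recent periods. This is a generic moving-average technique, not a description of how Power BI forecasts; the sales figures and the stock level of 100 units are assumptions.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    recent = history[-window:]
    return sum(recent) / window


weekly_units_sold = [120, 130, 125, 140, 150]  # hypothetical sales history
forecast = moving_average_forecast(weekly_units_sold)

units_in_stock = 100  # hypothetical current stock level
reorder_needed = forecast > units_in_stock
```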

User Experience and Accessibility

  • Focus on creating intuitive user interfaces that enable mobile access to data.

  • Encourage consistent layouts for reports and dashboards to improve user familiarity and efficiency.

  • Implement responsive design principles to ensure that all features are easily accessible on various devices, enhancing the overall user experience.

Data Warehouse + Data Lake

  • Data sources

  • Staging area

  • Warehouse

  • Data marts

  • Users

Why is it called the Warehouse Layer?

  • It’s where the cleaned and organized data is stored permanently.

  • Just like a physical warehouse stores goods in an orderly way, this layer stores structured data, ready to be accessed for analysis.
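
The flow through the layers listed above (sources, staging area, warehouse, data marts) can be sketched as three small functions. The rows, the cleaning rules, and the function names are assumptions chosen to keep the example short.

```python
# Hypothetical source rows; some are inconsistent or incomplete.
source_rows = [
    {"region": " North ", "amount": "100"},
    {"region": "south", "amount": "50"},
    {"region": "North", "amount": None},  # missing amount, dropped in staging
    {"region": "SOUTH", "amount": "25"},
]


def stage(rows):
    """Staging area: clean and standardise raw rows, dropping incomplete ones."""
    return [
        {"region": r["region"].strip().lower(), "amount": float(r["amount"])}
        for r in rows
        if r["amount"] is not None
    ]


def load_warehouse(staged):
    """Warehouse layer: keep the cleaned, structured rows."""
    return list(staged)


def build_mart(warehouse):
    """Data mart: a subject-specific view, here total sales per region."""
    mart = {}
    for row in warehouse:
        mart[row["region"]] = mart.get(row["region"], 0.0) + row["amount"]
    return mart


sales_mart = build_mart(load_warehouse(stage(source_rows)))
```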

Data lake: A data lake is like a giant storage pond for data. It holds raw, unprocessed data from various sources in its original format. Think of it as a "dump everything in one place" strategy for data storage.

Key Features of a Data Lake

  1. Stores All Types of Data:

    • Structured data (like tables in a database).

    • Semi-structured data (like JSON or XML files).

    • Unstructured data (like videos, images, PDFs, or text files).

  2. Schema-on-Read:

    • You don’t need to organize or process the data when it’s stored.

    • The structure is applied only when you access or analyze the data.

  3. Scalable and Flexible:

    • Data lakes can handle huge amounts of data from many sources.

    • Great for future-proofing since you can store data without knowing its purpose yet.
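
Schema-on-read can be illustrated with a few raw JSON records that do not share a common shape. Nothing is validated when they are stored; a schema is applied only when a particular analysis reads them. The records and the `read_temperatures` function are hypothetical.

```python
import json

# Raw events stored as-is; the records deliberately differ in structure.
raw_lake = [
    '{"sensor": "t1", "temp_c": "21.5", "ts": "2025-01-08T14:10:00"}',
    '{"sensor": "t2", "temp_c": 19, "note": "recalibrated"}',
    '{"photo": "sky.jpg"}',
]


def read_temperatures(raw_records):
    """Apply a schema (sensor: str, temp_c: float) at read time, skipping records that do not fit."""
    rows = []
    for line in raw_records:
        record = json.loads(line)
        if "sensor" in record and "temp_c" in record:
            rows.append({"sensor": str(record["sensor"]), "temp_c": float(record["temp_c"])})
    return rows


temperatures = read_temperatures(raw_lake)
```

The photo record stays in the lake untouched; a different reader with a different schema could pick it up later.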

How It’s Different from a Data Warehouse

Feature   | Data Lake                           | Data Warehouse
Data Type | Raw, unprocessed                    | Cleaned, structured
Purpose   | Store everything for later use      | Store data for specific analysis
Cost      | Cheaper for storage                 | More expensive for processing
Use Case  | Data scientists & advanced analysis | Business reporting & decision-making

Example to Understand It

Imagine you’re a scientist collecting data about weather:

  • Data Lake: Like throwing all your observations (photos, logs, temperature readings) into a big box. You don’t organize it yet—you’ll figure it out later when you analyze it.

  • Data Warehouse: Like carefully organizing your data into tables, graphs, and summaries for a weather report.

When Do You Use a Data Lake?

  • When you want to store a lot of data cheaply for potential future use.

  • For big data analytics, machine learning, or AI projects.

  • When your data is messy or comes from many different sources.

Data lakehouse: a unified platform that combines the benefits of data lakes and data warehouses, allowing organizations to store structured and unstructured data while providing robust analytics capabilities. This approach enables users to efficiently query and analyze large volumes of data, facilitating better decision-making and insights.

  • Key advantages include scalability, flexibility, and cost-effectiveness, making it an ideal choice for organizations looking to harness the full potential of their data.

ESSA

  • ESSA is a four-step sequence for improving a process:

  1. Eliminate: Remove unnecessary steps or processes that don’t add value to make things more efficient.

  2. Simplify: Make the remaining processes easier to understand and execute by reducing complexity.

  3. Standardize: Create consistent methods or practices to ensure uniformity and efficiency across similar tasks.

  4. Automate: Use technology to perform repetitive tasks automatically, reducing manual effort and the chance for errors.

Version 2

Technical Requirements

  • It is crucial to align new technology initiatives with existing infrastructure to ensure that organizations can reap the full benefits of both the old and new systems. This involves a thorough analysis of current programs and their compatibility with proposed solutions. The challenge lies in integrating new technology efficiently into existing systems without jeopardizing performance, security, or compliance.

Data Governance Regulatory Requirements

  • Organizations are obligated to adhere to a variety of laws and regulations regarding data governance. Key regulations include:

    • USA PATRIOT Act: This act expands the US government's authority to collect data for national security purposes and imposes customer-identification and record-keeping obligations on financial institutions.

    • EPA Regulations: Rules issued by the US Environmental Protection Agency. For data governance, the relevant part is the environmental record-keeping and reporting they require of regulated organizations.

    • GDPR (General Data Protection Regulation): This regulation is comprehensive in its requirements, imposing strict rules on how the personal data of individuals in the EU can be collected, processed, and stored. Crucial aspects include:

      • Organizations must implement appropriate security measures to protect personal data.

      • Individuals have the right to access their data and request its deletion if appropriate.

Project Management Methodologies

  • The MoSCoW method is a widely used prioritization technique that helps teams rank project requirements to support successful delivery. It categorizes requirements into four key areas:

    • Must Have: Critical requirements that are essential for project success.

    • Should Have: Important requirements that add significant value but are not critical.

    • Could Have: Desirable requirements that can enhance the project if time and resources permit.

    • Won't Have (this time): Requirements that are agreed to be out of scope for the current project phase.

  • Collaborating with clients and stakeholders to continuously align on priorities is essential to maintain clarity and scope. This process often involves replacing outdated systems by first reverse engineering to understand the current systems thoroughly.

Chapter 2: Need A Report

Reverse Engineering

  • Reverse engineering is an analytical process aimed at examining existing reports and data systems to identify components that require replacement or upgrading. This method helps determine:

    • The necessary data elements and technologies for effective report generation.

    • A full audit of current reporting structures to identify inefficiencies or opportunities for enhancement.

Client Engagement

  • Engaging with clients is critical to achieving the desired outcomes in report generation. It is necessary to:

    • Confirm with the client their understanding of which components of existing reports should be retained, revised, or replaced in the new reports.

    • Establish clear communication channels to address client expectations and concerns throughout the reporting process.

Chapter 3: Warehouse And Data

Data Warehouse Functions

  • Data warehouses serve as centralized repositories that store integrated information over extended periods, facilitating analytics and business intelligence. They:

    • Preserve historical data necessary for trend analysis, allowing organizations to observe changes and patterns over time.

    • Create a single source of truth, which is vital for accurate reporting, compliance, and sound decision-making.
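
Keeping history is what makes trend analysis possible: instead of overwriting a value, each dated snapshot is appended and kept. A minimal sketch, with invented customer counts and hypothetical function names:

```python
from datetime import date

history = []  # append-only: earlier snapshots are never overwritten


def record_snapshot(snapshot_date, customers):
    """Store the customer count as it stood on a given date."""
    history.append({"date": snapshot_date, "customers": customers})


def change_between(start, end):
    """Return the change in customer count between two recorded snapshot dates."""
    by_date = {row["date"]: row["customers"] for row in history}
    return by_date[end] - by_date[start]


# Hypothetical half-yearly snapshots.
record_snapshot(date(2024, 1, 1), 200)
record_snapshot(date(2024, 7, 1), 230)
record_snapshot(date(2025, 1, 1), 260)

growth = change_between(date(2024, 1, 1), date(2025, 1, 1))
```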

OLTP vs OLAP

  • Understanding the differences between Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) systems is fundamental to data management.

    • OLTP: Designed for managing day-to-day transactions, it focuses on real-time processing of data, engaging in numerous short online transactions.

    • OLAP: Tailored for complex queries and in-depth data analysis, OLAP enables users to execute multi-dimensional queries and generate meaningful reports based on historical and aggregated data.

Chapter 4: Business And Data

Data Sources and Integration

  • The integration of data from various sources, including Enterprise Resource Planning (ERP) systems, is critical for comprehensive reporting and analytics. Data integration processes help to:

    • Eliminate data silos, ensuring a holistic view of the business.

    • Enhance business intelligence capabilities by providing accurate and timely information from a centralized data warehouse.

Chapter 5: Master Data

Master Data Management

  • Master data management focuses on the primary entities within a business—such as customers, products, and suppliers. Effective management of master data ensures:

    • Consistency and accuracy across all records, preventing data quality issues that can lead to erroneous reports and analysis.

    • A comprehensive understanding of the relationships between different data records, crucial for maintaining data integrity and compliance.

Chapter 6: Master Data Management

Data Management Objectives

  • Data management objectives should focus on maintaining accurate records, correcting data entry errors, and ensuring high-quality data across all systems. This includes:

    • Regular audits and reviews of data for accuracy and consistency.

    • Establishing processes for data cleansing and deduplication to enhance operational efficiency.

Reporting and Visualization

  • Understanding the distinctions between reports (which are collections of data presented typically in tables) and dashboards (which are visual representations of key metrics) is critical for effective data communication.

    • Self-service reporting options empower users to access required information independently, enhancing overall productivity and decision-making.

Chapter 7: Conclusion

Integration and Predictive Features

  • Power BI's integration capabilities with Microsoft applications provide enhanced usability and operational efficiency. The ability to connect various data sources allows for streamlined report generation and insights extraction.

  • Predictive analytics play a significant role in enabling organizations to forecast needs—such as inventory stock levels—thereby facilitating informed decision-making and strategic planning.

User Experience and Accessibility

  • A primary focus should be on developing intuitive user interfaces that promote mobile access to data, catering to the modern workforce's dynamic needs.

  • Ensuring consistent layouts for reports and dashboards is essential for improving user familiarity and efficiency, thus enhancing overall user experience with the analytics tools.

