Overview of Cloud and Big Data presentation
Presenter: Ivika Jäger
Date: November 3, 2024
Definition (cloud data warehouse): A data warehouse accessed over the internet.
Private Cloud:
Dedicated to a single organization.
Highest degree of control over security.
High cost.
Public Cloud:
Shared services offered to the general public.
Affordable and accessible.
Hybrid Cloud:
Combines private and public cloud elements.
Sensitive data stored on private servers; public platforms used for analytics.
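The hybrid split described above can be sketched as a simple routing rule. This is only an illustration: the `Record` class, the `sensitive` flag, and the tier names are hypothetical and stand in for whatever classification policy an organization actually uses.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: int
    payload: str
    sensitive: bool  # hypothetical flag, set by the organization's own policy


def route(record: Record) -> str:
    """Return the storage tier for a record under a simple hybrid policy."""
    return "private" if record.sensitive else "public"


def split_by_tier(records):
    """Group records into the private and public tiers."""
    tiers = {"private": [], "public": []}
    for record in records:
        tiers[route(record)].append(record)
    return tiers


records = [
    Record(1, "customer national ID", sensitive=True),
    Record(2, "anonymized page views", sensitive=False),
    Record(3, "payroll details", sensitive=True),
]
tiers = split_by_tier(records)
print({tier: [r.record_id for r in items] for tier, items in tiers.items()})
```

In this sketch, records 1 and 3 stay on private servers, while record 2 can be sent to a public platform for analytics.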
A traditional data warehouse (DW) is accessed over a network connection.
That network can be closed, with no connection to the internet.
The cloud, by contrast, is always accessed over the internet.
Definition (cloud computing): Computing services delivered over the internet.
Resources: Storage, servers, networking, databases, and software.
Model: Pay-as-you-go; businesses and individuals rent only the resources they need.
Example: Amazon Web Services (AWS).
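To make the pay-as-you-go model concrete, here is a small worked comparison. All prices and usage figures are hypothetical, chosen only to show the arithmetic; they are not real provider rates.

```python
def pay_as_you_go_cost(hours_used: float, hourly_rate: float) -> float:
    """Cost when paying only for the hours actually used."""
    return hours_used * hourly_rate


def upfront_cost(purchase_price: float, monthly_maintenance: float, months: int) -> float:
    """Cost of buying hardware and maintaining it for a number of months."""
    return purchase_price + monthly_maintenance * months


# Hypothetical figures, for illustration only.
HOURLY_RATE = 0.5            # rented server, price per hour
PURCHASE_PRICE = 6000.0      # owned server, one-time purchase
MONTHLY_MAINTENANCE = 100.0  # owned server, upkeep per month
MONTHS = 12

light_usage_hours = 200 * MONTHS  # server needed about 200 hours per month
full_usage_hours = 730 * MONTHS   # server running around the clock

print("Rented, light use:", pay_as_you_go_cost(light_usage_hours, HOURLY_RATE))
print("Rented, full use: ", pay_as_you_go_cost(full_usage_hours, HOURLY_RATE))
print("Owned for a year: ", upfront_cost(PURCHASE_PRICE, MONTHLY_MAINTENANCE, MONTHS))
```

Under these assumed numbers, renting is cheaper in the first year in both cases, and the gap is largest when the server is used only part of the time.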
| Feature | Traditional Data Warehousing | Cloud Data Warehousing |
| --- | --- | --- |
| Location | On-premises, within the organization's data centers | Hosted on cloud platforms (e.g., Amazon Redshift, Snowflake) |
| Control | Fully controlled by the organization | Managed by the cloud provider, reducing admin overhead |
| Scalability | Limited by hardware investments | Easily scalable on demand |
| Costs | High upfront costs for infrastructure and maintenance | Pay-as-you-go model |
| Access | Limited to internal networks | Accessible globally via the internet |
Regulations and Compliance:
Laws may require sensitive information to be stored on-premises.
High Sensitivity:
Highly sensitive proprietary data may be considered safer under in-house control.
Risk Factors:
Data breaches, insider threats, dependency on third-party providers.
Performance Requirements:
Traditional DW systems may outperform cloud-based systems.
Infrastructure as a Service (IaaS):
Like renting land: you have complete control over everything built on it.
Platform as a Service (PaaS):
Like renting a workshop: the tools and space are pre-configured.
Software as a Service (SaaS):
Like renting a fully furnished hotel room: everything is ready to use.
| Feature | IaaS | PaaS | SaaS |
| --- | --- | --- | --- |
| What's Provided | Virtual machines, storage, network | Development tools, frameworks | Fully functional applications hosted by the provider |
| User Control | Full control over OS and software | Focus on application development | Limited to using the application's features |
| Managed by Provider | Hardware, virtualization | Hardware, OS, middleware | Everything (hardware, software, updates) |
| Best For | IT teams managing environments | Developers creating apps quickly | End users needing ready-to-use applications |
Use Case:
Useful for startups needing scalable server capacity.
Example: Netflix runs on AWS, renting servers and adjusting capacity to match demand.
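The idea of renting only as much server capacity as demand requires can be sketched with a simple sizing rule. This is a hypothetical illustration, not how any particular company or provider scales; the per-server capacity and the load figures are made up.

```python
import math


def servers_needed(requests_per_second: float,
                   capacity_per_server: float,
                   min_servers: int = 1) -> int:
    """Return how many servers to rent so capacity covers the current load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    required = math.ceil(requests_per_second / capacity_per_server)
    return max(min_servers, required)


# Hypothetical load over five hours (requests per second) and
# an assumed capacity of 100 requests per second per server.
hourly_load = [120, 80, 450, 900, 300]
plan = [servers_needed(load, capacity_per_server=100) for load in hourly_load]
print(plan)  # [2, 1, 5, 9, 3]
```

With fixed on-premises hardware, the organization would have to own nine servers to cover the peak; with rented capacity, it pays for nine only during the busiest hour.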
Case Study:
Linas Matkasse builds on Microsoft Azure App Service.
It uses several Azure products, including App Service and analytics services.
Popular SaaS Applications:
iCloud, Microsoft 365, Zoom, Dropbox, Salesforce, Google Docs.
Cloud-based and sold on a subscription model.
Centralized updates; no installation required.
Examples: Gmail, ChatGPT, Microsoft 365, Dropbox.
Benefits:
Accessibility: available anytime, anywhere.
Scalability: scale resources based on needs.
Cost-efficiency: no upfront investment, pay-as-you-go.
Inclusivity: affordable AI, big data, and analytics for small businesses.
Provides advanced analytics capabilities without infrastructure investment.
Examples: Amazon QuickSight, Tableau Cloud, Power BI (cloud service).
Unit Measurements:
Data storage units range from the kilobyte (KB) up to the yottabyte (YB).
Data types and sizes vary considerably, emphasizing the sheer volume of big data.
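The unit ladder can be illustrated with a small converter. This sketch assumes decimal (SI) units, where each step is a factor of 1,000; some systems instead use binary steps of 1,024, which this example does not cover. The function name `human_readable` is just an illustrative choice.

```python
# Decimal (SI) storage units, each 1,000 times larger than the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def human_readable(num_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value >= 1."""
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    exponent = 0
    threshold = 1000
    # Integer comparisons avoid floating-point drift for very large values.
    while num_bytes >= threshold and exponent < len(UNITS) - 1:
        threshold *= 1000
        exponent += 1
    value = num_bytes / 1000 ** exponent
    return f"{value:.2f} {UNITS[exponent]}"


print(human_readable(1_500_000))  # 1.50 MB
print(human_readable(10 ** 24))   # 1.00 YB
```

One yottabyte is 1000^8 bytes, that is, a 1 followed by 24 zeros.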
Yearly data creation visualization across regions.
Volume: Rapid data accumulation.
Variety: structured, semi-structured, and unstructured data.
Velocity: Speed of data creation and processing.
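The three kinds of data named under Variety can be shown side by side. The sample data below is invented for illustration; the point is only that each kind needs a different way of reading it.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (here, CSV).
structured = "order_id,amount\n1,19.99\n2,5.00\n"

# Semi-structured: self-describing, but fields can vary per record (here, JSON).
semi_structured = '{"order_id": 3, "amount": 7.5, "tags": ["gift"]}'

# Unstructured: free text with no predefined fields.
unstructured = "Customer said the delivery was late but the food was great."

rows = list(csv.DictReader(io.StringIO(structured)))
doc = json.loads(semi_structured)
words = unstructured.lower().split()

print(rows[0]["amount"])  # read by column name
print(doc["tags"])        # read by key; nested values are allowed
print(len(words))         # free text must be processed before analysis
```

Structured data maps directly onto tables, semi-structured data carries its own field names, and unstructured data has to be tokenized or otherwise interpreted before it can be analyzed.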
Traditional methods struggle with big data because of its volume and the need for real-time processing.
Tools: Apache Hadoop and Apache Spark, used for distributed storage and analytics.
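The processing style behind such tools can be sketched in plain Python: split the data into partitions, process each partition independently (map), then merge the partial results (reduce). This is a single-machine illustration of the idea only; it does not use the Hadoop or Spark APIs, and the function names are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words in one partition, independently of the others."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts."""
    return left + right


# Each inner list stands in for a data block stored on a different machine.
partitions = [
    ["big data needs big storage"],
    ["spark processes data in memory", "hadoop stores data in blocks"],
]

partial_counts = [map_partition(p) for p in partitions]
total = reduce(reduce_counts, partial_counts, Counter())
print(total["data"], total["big"])
```

Because each partition is processed without reference to the others, the map step could run on many machines at once, which is what makes this pattern suitable for data too large for one server.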
Insights into consumer behavior, retail trends, agriculture, and crime analysis.
Analyzing location data for market trends and customer experiences.
Using remote sensing data for monitoring and analysis.
Analyzing user interactions to inform business strategies.
Processing large volumes of financial transactions for trend forecasting and risk management.
Devices tracking various physical phenomena for monitoring and analysis.
IoT creates vast amounts of data through connected devices and sensors.
Components include sensors, connectivity, and software for data management.
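The three components can be sketched as a tiny pipeline: a simulated sensor produces readings, a serialization step stands in for connectivity, and an aggregation step stands in for data management software. Everything here is hypothetical, including the sensor names and the temperature range; no real device or network is involved.

```python
import json
import random
from statistics import mean


def read_sensor(sensor_id, rng):
    """Sensor: produce one simulated temperature reading."""
    return {"sensor_id": sensor_id,
            "temperature_c": round(rng.uniform(18.0, 26.0), 2)}


def transmit(reading):
    """Connectivity: serialize the reading as it might be sent over a network."""
    return json.dumps(reading)


def aggregate(messages):
    """Software: decode messages and compute the average per sensor."""
    by_sensor = {}
    for message in messages:
        reading = json.loads(message)
        by_sensor.setdefault(reading["sensor_id"], []).append(reading["temperature_c"])
    return {sensor_id: round(mean(values), 2)
            for sensor_id, values in by_sensor.items()}


rng = random.Random(42)  # fixed seed so the simulation is repeatable
messages = [transmit(read_sensor(sensor_id, rng))
            for sensor_id in ("kitchen", "garage")
            for _ in range(5)]
averages = aggregate(messages)
print(averages)
```

Even this toy setup produces ten messages from two sensors in one pass; real deployments with thousands of devices reporting continuously are one source of the data volumes discussed earlier.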
Automating everyday activities through sensors.
Enhancing efficiency and decision-making across various sectors.
Complexity of data analysis and security concerns.
Accelerating IoT capabilities through improved connectivity.
Combining smart sensors with AI for enhanced analysis.
Cloud computing enables scalable storage and advanced data analytics through AI.
Outlook: A future of cloud-based, AI-driven data management and decision-making.