Lecturer: Dr. Syed Aun Irtaza
Lecture Number: 01
Topics Covered:
Data Warehousing
Big Data Analytics
Rapid Growth of Data:
Total internet data: 120 zettabytes (ZB) in 2023, projected to reach 181 ZB by 2025.
Daily data creation: Approximately 2.5 quintillion bytes, including social media posts, emails, and transactions.
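The implied yearly growth rate behind the 2023 and 2025 figures above can be estimated with a compound annual growth rate. This is a minimal sketch: the function name `cagr` is an illustrative choice, and it assumes smooth compound growth between the two quoted figures, which the notes do not state.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1

# Figures quoted in the notes: 120 ZB (2023) and 181 ZB (2025).
rate = cagr(120, 181, 2)
print(f"Implied growth: {rate:.1%} per year")
```

Under that assumption the result is a little over 20% per year.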
Sources of Internet Data:
Social Media: Platforms such as Facebook and Twitter generate petabytes of data daily.
Videos: YouTube receives over 500 hours of video uploads per minute.
Emails: Over 300 billion emails sent daily.
Search Engines: Google processes more than 8.5 billion searches a day.
Cloud Storage and Data Centers:
Major providers: AWS, Google Cloud, Microsoft Azure.
Largest data centers can host over 100 petabytes of information.
Data Growth Trends:
Influenced by IoT devices, 5G networks, and AI advancements.
Definition:
Massive volumes of data from sensors; e.g., the LSST (Large Synoptic Survey Telescope) generates 40 TB/day.
At that rate, it is expected to accumulate over 100 PB in a decade.
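A quick back-of-the-envelope check of the LSST figures: the sketch below assumes the telescope produces 40 TB every day of the year and uses decimal units (1 PB = 1000 TB). Both are simplifying assumptions, not details from the lecture.

```python
TB_PER_DAY = 40        # figure quoted in the notes
DAYS_PER_YEAR = 365    # assumption: ignores leap days and downtime
TB_PER_PB = 1000       # assumption: decimal (SI) units

def accumulated_pb(years: int) -> float:
    """Total data volume in petabytes after `years` of operation."""
    return TB_PER_DAY * DAYS_PER_YEAR * years / TB_PER_PB

decade_pb = accumulated_pb(10)
print(decade_pb)  # 146.0
```

Under these assumptions a decade yields about 146 PB, which is consistent with the "over 100 PB" expectation.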
Knowledge-Driven Economy:
Importance of harnessing data effectively.
Consideration of first-mover advantages in the tech industry.
Industry Change:
Businesses must adapt or risk being left behind.
Missed opportunities can hinder growth.
Business Overview:
Airlines fundamentally sell seats but face complexities in operations and profitability.
Revenue Generation:
Major income from passenger services and ticket sales.
Pricing strategies include dynamic pricing based on demand.
Ancillary revenues such as baggage fees and loyalty programs.
Cargo transport as an additional revenue source.
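Dynamic pricing can be illustrated with a toy rule in which the fare rises as the flight fills up. The sketch below is hypothetical: the function `dynamic_fare`, the linear markup, and the `max_markup` parameter are illustrative assumptions, not a description of how any airline actually prices seats.

```python
def dynamic_fare(base_fare: float, seats_sold: int, capacity: int,
                 max_markup: float = 1.0) -> float:
    """Hypothetical rule: fare grows linearly with the load factor.

    With max_markup=1.0, a full flight costs twice the base fare.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # Load factor = share of seats already sold, clamped to [0, 1].
    load_factor = min(max(seats_sold / capacity, 0.0), 1.0)
    return round(base_fare * (1.0 + max_markup * load_factor), 2)

print(dynamic_fare(100.0, 0, 180))    # empty flight: base fare
print(dynamic_fare(100.0, 90, 180))   # half full
print(dynamic_fare(100.0, 180, 180))  # sold out
```

Real revenue-management systems also account for time to departure, competitor fares, and booking class, which this sketch deliberately leaves out.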
Expenses Breakdown:
Fixed Costs: Aircraft leasing, crew salaries, and maintenance.
Variable Costs: Fuel, airport fees, and services.
Uncontrollable Costs: Impacted by fuel prices, regulations, and exchange rates.
Challenges in the Industry:
High operational costs and thin profit margins.
Market volatility and intense competition.
Regulatory and environmental challenges.
Disruptions:
Weather, cybersecurity issues, and pandemics.
Changing customer expectations and workforce management.
Strategies for Improvement:
Focus on cost optimization and revenue diversification.
Banking Sector Insights:
Many individual customers turn out to be unprofitable even when the bank is profitable overall.
Importance of transactional behavior analysis.
Need for product restructuring for effective profitability analysis over time.
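The point about unprofitable customers can be made concrete by aggregating transactions per customer. The records below are invented purely for illustration, and the `(customer_id, revenue, cost_to_serve)` layout is an assumed schema rather than one given in the lecture.

```python
from collections import defaultdict

# Hypothetical records: (customer_id, revenue, cost_to_serve)
transactions = [
    ("C1", 120.0, 30.0),
    ("C1", 80.0, 20.0),
    ("C2", 10.0, 25.0),
    ("C2", 5.0, 25.0),
    ("C3", 15.0, 20.0),
]

def profit_by_customer(rows):
    """Sum (revenue - cost) per customer."""
    totals = defaultdict(float)
    for customer_id, revenue, cost in rows:
        totals[customer_id] += revenue - cost
    return dict(totals)

profits = profit_by_customer(transactions)
unprofitable = sorted(c for c, p in profits.items() if p < 0)
total_profit = sum(profits.values())

print(profits)
print("Unprofitable:", unprofitable)
print("Overall profit:", total_profit)
```

In this made-up data, two of the three customers lose money, yet the overall profit is positive because one customer carries the rest. This kind of per-customer roll-up is a typical decision-support query.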
Hacked Credit Card Patterns:
Deviations from typical purchasing habits signal potential fraud.
Further signals: transactions in cities notorious for fraud, and purchases of unusual items commonly associated with stolen cards.
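One simple way to flag a deviation from typical purchasing habits is a z-score test on the purchase amount. This is a minimal sketch, not the method used by any particular bank: the function name `is_suspicious`, the threshold of 3 standard deviations, and the sample history are all assumptions made for illustration.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, threshold=3.0):
    """Flag `amount` if it lies more than `threshold` sample standard
    deviations from the mean of the customer's past purchases."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > threshold

# Hypothetical past purchase amounts for one cardholder.
history = [22.5, 18.0, 25.0, 30.0, 21.0, 19.5]

print(is_suspicious(history, 24.0))   # typical purchase
print(is_suspicious(history, 950.0))  # far outside the usual range
```

A production system would combine many more signals, such as location, merchant category, and time of day. The amount-only test is just the simplest instance of the idea.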
Develop an understanding of RDBMS concepts and their application in decision support systems.
Analyze differences between RDBMS and Data Warehouse.
Learn big data analysis and emerging technologies in the field.
Topics include:
Introduction to Data Warehousing
RDBMS Basics and SQL
Python and ETL processes
Data Mining and Machine Learning topics
Information Visualization Techniques
"I cannot teach anything to anyone, I can just make them think." - Socrates