07 - Analytics

Introduction to Analytics and Data Mining

  • Teacher: Ivika Jäger, Mittuniversitetet, November 2024

Types of Analytics

  • Descriptive Analytics: Analyzes historical data to understand patterns and trends.

  • Predictive Analytics: Uses historical data and statistical algorithms to anticipate future outcomes.

  • Prescriptive Analytics: Provides recommendations for actions based on data analysis.

Predictive Analytics

  • Purpose: To predict future trends and behaviors.

  • Methods Used:

    • Regression analysis

    • Time series analysis

    • Machine learning algorithms

    • Classification models

    • Data mining
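
As a rough illustration of the first method in the list, the sketch below fits a straight line to past sales and extrapolates one period ahead. The function name fit_line and the monthly sales figures are made up for this example and do not come from the lecture.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative (invented) history: month number and units sold.
months = [1, 2, 3, 4, 5]
sales = [100.0, 120.0, 140.0, 160.0, 180.0]

slope, intercept = fit_line(months, sales)

# Predict the next, not yet observed, month from the fitted trend.
forecast_month_6 = slope * 6 + intercept
print(f"trend: {slope:.1f} units/month, forecast for month 6: {forecast_month_6:.1f}")
```

Real sales data is rarely this linear, so a time series or machine learning method from the same list may fit better.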

Examples of Analytics in Practice

  • Retail Company: Uses descriptive analytics to evaluate past sales data and customer behavior to inform new product strategies.

  • E-commerce Company: Applies predictive analytics to forecast product demand for holiday seasons based on historical sales data.

  • Logistics Company: Utilizes prescriptive analytics to determine optimal delivery routes by analyzing historical and real-time data.

Data Mining Overview

  • Definition: The process of extracting knowledge from large datasets.

  • Key Techniques:

    • Prediction: Forecasting future events, for example, sales forecasting.

    • Classification: Assigning data to predefined categories.

    • Clustering: Grouping data points without predefined labels.

    • Association: Identifying relationships between variables, e.g., co-purchase behavior.
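
The association technique can be sketched by counting how often two products appear in the same basket. The baskets below are invented, and "support" here simply means the share of baskets that contain a given pair; this is a minimal sketch, not a full association rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Illustrative (invented) shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "coffee"},
    {"bread", "butter", "coffee"},
]

# Count every unordered product pair that occurs together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support: fraction of all baskets that contain the pair.
support = {pair: count / len(baskets) for pair, count in pair_counts.items()}

for pair, value in sorted(support.items(), key=lambda item: -item[1]):
    print(pair, value)
```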

Data Mining vs. Statistics

  • Statistics: Starts with a hypothesis and tests it on sample data (e.g., testing whether ice cream sales rise with temperature).

  • Data Mining: Explores the available data without a fixed hypothesis to discover patterns (e.g., finding that sales are higher on weekends).

Common Myths about Data Mining

  • Myth: Data mining provides immediate clear predictions.

    • Reality: Requires domain knowledge and time.

  • Myth: An advanced degree is necessary.

    • Reality: There are accessible tools available for anyone.

  • Myth: Only large firms can utilize data mining.

    • Reality: Any company whose data accurately reflects its business or customers can use it.

Data Mining Process

  1. Business Understanding: Define objectives and problems.

  2. Data Understanding: Evaluate data quality.

  3. Data Preparation: Clean and preprocess data.

  4. Model Building: Utilize algorithms to find patterns.

  5. Testing and Evaluation: Check model performance.

  6. Deployment: Integrate the model and its insights into day-to-day business operations.
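
Steps 3 to 5 of the process can be sketched on a toy dataset. Everything below is an assumption made for illustration: the data (weekly visits and a loyalty label) is invented, the "model" is a simple threshold halfway between the two class means, and the split is a fixed slice rather than a random one.

```python
# Hypothetical raw data: (weekly_visits, became_loyal_customer).
raw = [
    (2.0, 0), (3.0, 0), (None, 0), (2.5, 0), (8.0, 1),
    (9.0, 1), (7.5, 1), (3.5, 0), (8.5, 1), (None, 1),
]

# Step 3, data preparation: drop rows with a missing value.
clean = [(x, y) for x, y in raw if x is not None]

# Hold out the last rows for testing (a real project would shuffle first).
train, test = clean[:6], clean[6:]


def class_mean(rows, label):
    """Mean feature value of the rows carrying the given label."""
    values = [x for x, y in rows if y == label]
    return sum(values) / len(values)


# Step 4, model building: threshold halfway between the two class means.
threshold = (class_mean(train, 0) + class_mean(train, 1)) / 2


def predict(x):
    return 1 if x > threshold else 0


# Step 5, testing and evaluation: accuracy on data the model has not seen.
correct = sum(predict(x) == y for x, y in test)
accuracy = correct / len(test)
print(f"threshold={threshold:.2f}, test accuracy={accuracy:.2f}")
```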

Text Mining

  • Purpose: Converts unstructured text into structured data.

  • Challenges:

    • Correctly tagging words (nouns vs. verbs).

    • Language ambiguity.

  • Solution: Modern AI tools like LLMs enhance contextual understanding.
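
One simple way to turn unstructured text into structured data is a bag-of-words table: one row per document, one column per word, each cell a word count. The sketch below assumes a naive regex tokenizer and two invented review sentences. It does not address tagging or ambiguity, which is where the tools mentioned above come in.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# Illustrative (invented) documents.
documents = [
    "The coffee was great, great service!",
    "Service was slow.",
]

# Vocabulary: every distinct token, in alphabetical order (the columns).
vocab = sorted({token for doc in documents for token in tokenize(doc)})

# One row of word counts per document.
rows = []
for doc in documents:
    counts = Counter(tokenize(doc))
    rows.append([counts[word] for word in vocab])

print(vocab)
for row in rows:
    print(row)
```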

Web Mining

  • Purpose: Analyzes web content, structure, and usage.

  • Components:

    • Web Content: Extracting information from web pages.

    • Web Structure: Understanding website links.

    • Web Usage: User behavior analysis.

  • SEO: Strategies to increase website visibility through keywords, tags, backlinks.
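
Web structure mining starts from the links on a page. The sketch below collects them with Python's standard html.parser module. The class name LinkCollector and the HTML snippet are made up for this example, and a real crawler would also need to fetch pages and resolve relative URLs.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Illustrative (invented) page content.
page = (
    '<html><body><a href="/menu">Menu</a> '
    '<a href="https://example.com/about">About</a>'
    "<p>No link here</p></body></html>"
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```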

Deep Learning vs. Machine Learning

  • Machine Learning (ML): Requires feature definitions.

  • Deep Learning (DL): Automatically learns features, reducing manual work.

  • Comparison: DL typically handles complex, unstructured data (images, audio, text) better than traditional ML, at the cost of more data and computing power.

Features in Deep Learning

  • Definition: Features (columns in datasets) are explanatory variables.

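  • Role of Neurons: Each input neuron corresponds to a feature (a column), not to an observation (a row); every observation is passed through the same neurons.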
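
The feature/observation distinction can be sketched with a single artificial neuron. The weights, bias, and data values below are arbitrary illustrative numbers. The point is that the neuron has one weight per feature, and the same weights are applied to every observation.

```python
import math

# Two observations (rows), each with three features (columns).
observations = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, 0.0],
]

# One weight per feature, shared across all observations (arbitrary values).
weights = [0.2, -0.1, 0.4]
bias = 0.1


def neuron(features):
    """Weighted sum of the features followed by a sigmoid activation."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


outputs = [neuron(row) for row in observations]
print(outputs)
```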

Challenges in Deep Learning

  • Requires advanced hardware (e.g., GPUs).

  • Needs large, high-quality datasets.

  • Manual data labeling is time- and cost-intensive.

  • Overcoming these challenges enhances predictive capabilities.

Deep Neural Networks

  • Evolution: Early networks had few layers and neurons; modern networks can have many layers and millions of neurons.

  • Input Data: Processes multidimensional inputs (e.g., image pixels).

Types of Neural Networks

  • Multilayer Perceptron (MLP): Simple feedforward network for basic tasks.

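  • Recurrent Neural Network (RNN): Uses feedback connections to keep a memory of earlier inputs, which suits sequential data such as text or time series.

  • Long Short-Term Memory (LSTM): A type of RNN whose gated memory cells retain information over long sequences.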
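
The memory idea behind recurrent networks can be sketched with a one-neuron recurrence in which each new state depends on the current input and the previous state. The weights w_x and w_h are arbitrary illustrative values. This is a plain recurrence, not an LSTM, which adds gates on top of the same idea.

```python
import math


def rnn_states(inputs, w_x=0.5, w_h=0.8):
    """Return the hidden state after each input of a one-neuron RNN."""
    h = 0.0  # the memory starts empty
    states = []
    for x in inputs:
        # New state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h)
        states.append(h)
    return states


# Same zero inputs at steps 2 and 3, but a different first input.
with_signal = rnn_states([1.0, 0.0, 0.0])
without_signal = rnn_states([0.0, 0.0, 0.0])

print(with_signal)     # the first input still echoes in later states
print(without_signal)  # nothing to remember, so the state stays at zero
```

A feedforward network such as an MLP would give the same output for the two zero inputs in both sequences, since it has no state carried between steps.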

Iterative Optimization in Deep Learning

  • Models fine-tune weights iteratively to reduce prediction errors.

  • Requires repeated evaluations of input-output relations to improve model performance.
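
The iterative weight adjustment can be sketched with gradient descent on a model that has a single weight. The data, learning rate, and number of steps are illustrative assumptions. Real networks repeat the same loop over millions of weights.

```python
# Illustrative data generated by y = 2 * x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # initial guess for the weight
learning_rate = 0.01  # assumed step size
losses = []

for step in range(200):
    predictions = [w * x for x in xs]
    errors = [p - y for p, y in zip(predictions, ys)]

    # Mean squared error for the current weight.
    loss = sum(e * e for e in errors) / len(xs)
    losses.append(loss)

    # Gradient of the loss with respect to w.
    gradient = sum(2 * e * x for e, x in zip(errors, xs)) / len(xs)

    # Move the weight a small step against the gradient.
    w -= learning_rate * gradient

print(f"learned weight: {w:.4f}, first loss: {losses[0]:.2f}, last loss: {losses[-1]:.6f}")
```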

AI as a Service (AIaaS)

  • Offers pre-configured AI solutions via cloud providers.

  • Streamlines routine tasks, allowing businesses to focus on innovation.

ChatGPT Data Analysis Capabilities

  • Ability to upload and analyze data files (e.g., CSV).

  • Can interpret images, take real-time screenshots, and engage in interactive tasks.

Practical Analysis of Coffee Sales Data

  • Dataset Overview: Comprises 2,341 entries; includes date, time, payment type, amount, and coffee type.

  • Analysis Focus: Identifying trends in coffee purchases, payment methods, and customer preferences.
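
An analysis along these lines can be sketched with Python's standard csv module. The four rows below are invented for illustration and are not taken from the 2,341-entry dataset. The column names (date, time, payment_type, amount, coffee_type) are assumed from the description above and may differ in the real file.

```python
import csv
import io
from collections import Counter, defaultdict

# Invented sample rows; the real dataset is not reproduced here.
sample_csv = """date,time,payment_type,amount,coffee_type
2024-03-01,08:15,card,3.50,Latte
2024-03-01,09:02,cash,2.80,Espresso
2024-03-02,08:47,card,3.50,Latte
2024-03-02,14:30,card,3.90,Cappuccino
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Payment methods: how often each one is used.
payment_counts = Counter(row["payment_type"] for row in rows)

# Customer preferences: revenue per coffee type.
revenue_by_coffee = defaultdict(float)
for row in rows:
    revenue_by_coffee[row["coffee_type"]] += float(row["amount"])

top_coffee = max(revenue_by_coffee, key=revenue_by_coffee.get)

print(payment_counts)
print(dict(revenue_by_coffee))
print("top coffee by revenue:", top_coffee)
```

To run it on an actual export, replace io.StringIO(sample_csv) with an opened file handle.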

Conclusion

  • These notes cover the types of analytics, data mining and its text and web variants, deep learning, and AI tools that support business decision-making and data interpretation.
