What is Big Data?
Big Data refers to datasets that are too large, complex, or fast-changing for traditional databases to handle.
What are the 3Vs of Big Data?
Volume – Huge amounts of data generated continuously
Variety – Different types of data (structured, unstructured, semi-structured)
Velocity – The speed at which data is generated and processed
Why is Volume important in Big Data?
Refers to the size of data being generated
The amount of data produced every second is increasing exponentially
Examples: Social media posts, IoT sensors, financial transactions
Why is Variety important in Big Data?
Data comes in different formats:
Structured (e.g., databases, spreadsheets)
Semi-structured (e.g., JSON, XML)
Unstructured (e.g., videos, images, emails)
Traditional databases struggle to handle all types of data
Why is Velocity important in Big Data?
Refers to the speed at which data is generated, stored, processed, and analyzed
Real-time data processing is crucial for fraud detection, stock trading, and personalized ads
What are the benefits of Big Data?
Uncover hidden patterns in data
Make real-time decisions based on fast analysis
Gain competitive advantage through data insights
Improve business agility by responding quickly to trends
Support data-driven decisions
What is Big Data processing?
Uses parallel computing or distributed computing
Splits large tasks into smaller ones that run simultaneously
Improves performance, speed, and scalability
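The split-and-run-simultaneously idea can be sketched as follows. This is a minimal illustration, not a real Big Data engine: the function names `chunk` and `parallel_sum` are hypothetical, and threads on one machine stand in for the separate processes or cluster nodes a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(values, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(values, chunk_size=1000, workers=4):
    """Sum each chunk in its own worker, then combine the partial sums."""
    pieces = chunk(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker handles one small task; the tasks run concurrently.
        partial_sums = list(pool.map(sum, pieces))
    return sum(partial_sums)


total = parallel_sum(list(range(10_000)))
```

The large task (summing 10,000 numbers) becomes ten small tasks whose results are combined at the end, which is the same pattern distributed frameworks apply across many machines.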
What is MapReduce?
A parallel processing framework for handling large datasets
Map task: Converts input data into key-value pairs
Reduce task: Aggregates results from the map task
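The map and reduce tasks can be illustrated with a word count, the usual teaching example. This is a single-machine sketch with hypothetical function names (`map_task`, `shuffle`, `reduce_task`), not Hadoop's API; the `shuffle` step, which groups values by key between the two tasks, is added here because the reduce task needs grouped input.

```python
from collections import defaultdict


def map_task(line):
    """Map: turn one line of text into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce: aggregate the values collected for one key."""
    return key, sum(values)


lines = ["big data is big", "data is everywhere"]
mapped = [pair for line in lines for pair in map_task(line)]
word_counts = dict(reduce_task(k, v) for k, v in shuffle(mapped).items())
```

In a real cluster, each line (or file block) would be mapped on a different machine, and each key would be reduced independently.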
What is Hadoop?
An open-source Big Data framework by Apache
Uses Hadoop Distributed File System (HDFS) for efficient storage
Allows clustering multiple computers to analyze massive datasets
What is HDFS (Hadoop Distributed File System)?
Splits data into smaller blocks and stores three replicas of each block (the default) across multiple servers
Ensures fault tolerance and reliability
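A rough sketch of why replication gives fault tolerance. The round-robin placement below is an assumption made for illustration only; it is not HDFS's actual placement policy, and the server names and tiny block size are hypothetical.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut the data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, servers, replicas=3):
    """Assign each block to `replicas` distinct servers, round-robin."""
    if replicas > len(servers):
        raise ValueError("need at least as many servers as replicas")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            servers[(block_id + r) % len(servers)] for r in range(replicas)
        ]
    return placement


blocks = split_into_blocks(b"x" * 10, block_size=4)  # blocks of 4, 4, 2 bytes
servers = ["node1", "node2", "node3", "node4"]
placement = place_replicas(len(blocks), servers)
```

Because every block lives on three different servers, losing any single server still leaves two copies of each block available.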
What is Business Analytics (BA)?
Business Analytics is the process of developing actionable decisions based on insights generated from data. It examines data using statistical tools and creates descriptive, predictive, and prescriptive models.
What are the three phases of decision-making?
Intelligence phase – Identifying the problem
Design phase – Exploring possible solutions
Choice phase – Selecting and implementing a solution
What is Descriptive Analytics?
Descriptive Analytics summarizes past data to help decision-makers learn from past behaviours.
What is a Data Warehouse?
A repository of historical data, stored in a multi-dimensional format (data cubes), used to support decision-making.
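The multi-dimensional idea can be sketched with a tiny in-memory "cube". The sales figures, dimension names, and the `roll_up` helper are all made up for illustration; a real data warehouse would use a database or OLAP engine.

```python
from collections import defaultdict

# Each fact: (region, product, quarter, units_sold). Invented sample data.
DIMENSIONS = ("region", "product", "quarter")
sales_facts = [
    ("North", "Laptop", "Q1", 120),
    ("North", "Phone", "Q1", 80),
    ("South", "Laptop", "Q1", 50),
    ("South", "Laptop", "Q2", 70),
]


def roll_up(facts, keep):
    """Total the measure over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[-1]
    return dict(totals)


by_region = roll_up(sales_facts, ["region"])
by_product_quarter = roll_up(sales_facts, ["product", "quarter"])
```

The same historical facts can be summarized along any combination of dimensions, which is what makes the cube format useful for decision support.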
What are examples of Descriptive Analytics tools?
Online Analytical Processing (OLAP)
Decision Support Systems
Data Mining
Excel (Pivot Tables), Google Analytics, Tableau, Power BI
What is Predictive Analytics?
Predictive Analytics examines historical and recent data to detect patterns and forecast future outcomes based on probabilities.
What are examples of Predictive Analytics applications?
Credit card fraud detection
Sales forecasting
Illness prediction
Targeted marketing
What BA tools are used for Predictive Analytics?
Data Mining – Extracts patterns from large datasets
Regression Analysis – Models relationships between variables to make predictions
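A minimal sketch of regression analysis, assuming the simplest case: one input variable and an ordinary least-squares straight line. The advertising and sales numbers are invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented historical data: advertising spend vs. sales.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
forecast = slope * 5.0 + intercept  # predicted sales for a spend of 5.0
```

The fitted relationship between the two variables is then used to forecast an outcome that has not been observed yet.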
What is Prescriptive Analytics?
Prescriptive Analytics recommends one or more courses of action and predicts the outcome of each decision. It focuses on finding the optimal solution.
What BA tools are used for Prescriptive Analytics?
Optimization – Finds the best solution given constraints
Simulation tools – Models different scenarios to predict outcomes
Decision trees – Help visualize possible decisions and their consequences
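Optimization under constraints can be sketched with a small production-planning problem. Everything here is hypothetical (two products, a labour-hours limit, a demand cap on product A), and the brute-force search only works because the example is tiny; real prescriptive tools use dedicated solvers.

```python
def best_production_plan(profit_a, profit_b, hours_a, hours_b,
                         hours_available, max_units_a):
    """Try every feasible quantity of A and keep the most profitable plan.

    Returns (profit, units_a, units_b).
    """
    best = (0, 0, 0)
    most_a = min(max_units_a, hours_available // hours_a)
    for units_a in range(most_a + 1):
        remaining = hours_available - units_a * hours_a
        # With A fixed, making as many B as the hours allow is best.
        units_b = remaining // hours_b
        profit = units_a * profit_a + units_b * profit_b
        if profit > best[0]:
            best = (profit, units_a, units_b)
    return best


plan = best_production_plan(
    profit_a=30, profit_b=50, hours_a=2, hours_b=4,
    hours_available=10, max_units_a=3,
)
```

The result recommends a specific course of action (how many units of each product to make) along with its predicted outcome (the profit), which is what distinguishes prescriptive from predictive analytics.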