Big Data & Analytics

20 Terms

1

What is Big Data?

Big Data refers to datasets that are too large, complex, or fast-changing for traditional databases to handle.

2

What are the 3Vs of Big Data?

  • Volume – Huge amounts of data generated continuously

  • Variety – Different types of data (structured, unstructured, semi-structured)

  • Velocity – The speed at which data is generated and processed

3

Why is Volume important in Big Data?

  • Refers to the size of data being generated

  • The amount of data produced every second is increasing exponentially

  • Examples: Social media posts, IoT sensors, financial transactions

4

Why is Variety important in Big Data?

  • Data comes in different formats:

    • Structured (e.g., databases, spreadsheets)

    • Semi-structured (e.g., JSON, XML)

    • Unstructured (e.g., videos, images, emails)

  • Traditional databases struggle to handle all types of data

5

Why is Velocity important in Big Data?

  • Refers to the speed at which data is generated, stored, processed, and analyzed

  • Real-time data processing is crucial for fraud detection, stock trading, and personalized ads

6

What are the benefits of Big Data?

  • Uncover hidden patterns in data

  • Make real-time decisions based on fast analysis

  • Gain competitive advantage through data insights

  • Improve business agility by responding quickly to trends

  • Support data-driven decisions

7

What is Big Data processing?

  • Uses parallel computing or distributed computing

  • Splits large tasks into smaller ones that run simultaneously

  • Improves performance, speed, and scalability
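
The split-and-combine idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `chunked` and `parallel_sum` are hypothetical, and threads on a single machine stand in for the many machines a real distributed system would use (for CPU-heavy work in CPython, threads show the structure rather than a real speed-up).

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum(numbers, chunk_size=1000, workers=4):
    """Sum each chunk in a separate worker, then combine the partial sums."""
    chunks = chunked(numbers, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is a smaller task that can run at the same time as the others.
        partial_sums = list(pool.map(sum, chunks))
    # Combining the partial results gives the same answer as one big sum.
    return sum(partial_sums)


total = parallel_sum(list(range(10_000)))
```

The final result matches a plain `sum` over the whole list; only the way the work is divided changes.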

8

What is MapReduce?

  • A parallel processing framework for handling large datasets

  • Map task: Converts input data into key-value pairs

  • Reduce task: Aggregates results from the map task
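
A word count is the usual way to picture the map and reduce tasks. The sketch below runs in a single Python process and is not the Hadoop API; the function names and the sample lines are made up for illustration, and the `shuffle` step (grouping values by key between map and reduce) is added here because reduce needs all values for a key together.

```python
from collections import defaultdict


def map_task(line):
    """Map: turn one line of input into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all values that share a key, so each key is reduced once."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_task(key, values):
    """Reduce: aggregate the values collected for one key."""
    return key, sum(values)


lines = ["big data is big", "data is everywhere"]  # hypothetical input
mapped = [pair for line in lines for pair in map_task(line)]
counts = dict(reduce_task(key, values) for key, values in shuffle(mapped).items())
```

In a real cluster, many map tasks and reduce tasks would run on different machines at once; here they run one after another to keep the example small.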

9

What is Hadoop?

  • An open-source Big Data framework by Apache

  • Uses Hadoop Distributed File System (HDFS) for efficient storage

  • Allows clustering multiple computers to analyze massive datasets

10

What is HDFS (Hadoop Distributed File System)?

  • Splits files into blocks and stores replicas of each block (three by default) on different servers

  • Ensures fault tolerance and reliability
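
To see why replicas give fault tolerance, here is a toy simulation. It is a simplified sketch, not how HDFS is implemented: the round-robin placement, the tiny block size, and the server names are assumptions for illustration (real HDFS uses much larger blocks and a rack-aware placement policy).

```python
def split_into_blocks(data, block_size):
    """Split a byte string into blocks of at most `block_size` bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, servers, replication=3):
    """Assign each block to `replication` distinct servers (simple round-robin)."""
    if replication > len(servers):
        raise ValueError("need at least as many servers as replicas")
    return {
        block_id: [servers[(block_id + r) % len(servers)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_blocks(placement, failed_servers):
    """Return the blocks that still have at least one replica on a live server."""
    return [
        block_id
        for block_id, nodes in placement.items()
        if any(node not in failed_servers for node in nodes)
    ]


blocks = split_into_blocks(b"abcdefghij", block_size=4)  # 3 blocks
servers = ["node1", "node2", "node3", "node4"]           # hypothetical names
placement = place_replicas(len(blocks), servers)
still_readable = readable_blocks(placement, failed_servers={"node1", "node2"})
```

With three replicas per block, every block in this example is still readable after two of the four servers fail.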

11

What is Business Analytics (BA)?

Business Analytics is the process of developing actionable decisions based on insights generated from data. It examines data using statistical tools and creates descriptive, predictive, and prescriptive models.

12

What are the three phases of decision-making?

  1. Intelligence phase – Identifying the problem

  2. Design phase – Exploring possible solutions

  3. Choice phase – Selecting and implementing a solution

13

What is Descriptive Analytics?

Descriptive Analytics summarizes past data to help decision-makers learn from past behaviours.
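
As a small illustration, summarizing past data can be as simple as computing totals and averages. The monthly sales figures below are invented for the example.

```python
from statistics import mean, median

# Hypothetical past sales, in units sold per month.
monthly_sales = {"Jan": 120, "Feb": 95, "Mar": 140, "Apr": 110}

summary = {
    "total": sum(monthly_sales.values()),
    "mean": mean(monthly_sales.values()),
    "median": median(monthly_sales.values()),
    # The month with the highest sales figure.
    "best_month": max(monthly_sales, key=monthly_sales.get),
}
```

The summary describes what already happened (for example, which month sold best); it does not forecast anything.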

14

What is a Data Warehouse?

A Data Warehouse is a repository of historical data, stored in a multi-dimensional format (data cubes), used to support decision-making.

15

What are examples of Descriptive Analytics tools?

  • Online Analytical Processing (OLAP)

  • Decision Support Systems

  • Data Mining

  • Excel (Pivot Tables), Google Analytics, Tableau, Power BI

16

What is Predictive Analytics?

Predictive Analytics examines historical and recent data to detect patterns and forecast future outcomes based on probabilities.

17

What are examples of Predictive Analytics applications?

  • Credit card fraud detection

  • Sales forecasting

  • Illness prediction

  • Targeted marketing

18

What BA tools are used for Predictive Analytics?

  • Data Mining – Extracts patterns from large datasets

  • Regression Analysis – Models relationships between variables to make predictions
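
Regression analysis can be illustrated with a simple least-squares line fit. This is a minimal sketch with one input variable; `fit_line` is a hypothetical helper, and the advertising and sales numbers are invented so that they fall exactly on the line y = 2x + 1.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


ad_spend = [1, 2, 3, 4, 5]   # hypothetical historical data
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
forecast = slope * 6 + intercept  # predicted sales for an ad spend of 6
```

The fitted relationship from past data is then used to estimate an outcome that has not been observed yet, which is the core of predictive analytics.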

19

What is Prescriptive Analytics?

Prescriptive Analytics recommends one or more courses of action and predicts the outcome of each decision. It focuses on finding the optimal solution.

20

What BA tools are used for Prescriptive Analytics?

  • Optimization – Finds the best solution given constraints

  • Simulation tools – Model different scenarios to predict outcomes

  • Decision trees – Help visualize possible decisions and their consequences
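
Optimization under constraints can be shown with a tiny product-mix problem. Everything here is an assumption made for illustration: the two products, their profits (30 and 20 per unit), and the labour and material limits. The search simply tries every combination, which only works for very small problems; real tools use dedicated solvers.

```python
def best_product_mix(max_units=10):
    """Try every (a, b) production plan and keep the most profitable feasible one."""
    best = None
    for a in range(max_units + 1):
        for b in range(max_units + 1):
            labour_ok = 2 * a + b <= 10   # hypothetical labour-hours limit
            material_ok = a + b <= 8      # hypothetical materials limit
            if labour_ok and material_ok:
                profit = 30 * a + 20 * b
                if best is None or profit > best[0]:
                    best = (profit, a, b)
    return best


profit, units_a, units_b = best_product_mix()
```

The result is a recommended action (how many units of each product to make) together with its expected outcome (the profit), which is what separates prescriptive from predictive analytics.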