Big Data, Analytics, BI, and Data Concepts (Lecture Notes)

Description and Tags

Flashcards cover the rationale for Big Data, data sources, data types, analytics levels (Descriptive, Diagnostic, Predictive, Prescriptive), BI, KPIs, and the Vs of Big Data as described in the lecture notes.

24 Terms

1

What is Big Data?

Extremely large and complex data sets that exceed the capabilities of traditional data processing tools.

2

Why are traditional data tools not enough for Big Data?

They crash or run too slowly when handling huge, varied, and continuously arriving data; such data requires specialized Big Data tools and methods.

3

From what sources does Big Data typically come?

Internal company apps, sensors (e.g., in smart devices), and outside sources (like weather data).

4

What is a data warehouse?

A large central repository of organized company data, used to feed applications and analyses.

5

Name some benefits of Big Data.

Smoother operations, smart actionable insights, new markets for products, more accurate predictions, fraud detection, detailed records, and better decision-making.

6

What is data provenance (pedigree)?

The origin and history of data, including what happened to it during processing.

7

What is metadata?

Data about data; information describing data such as origin, context, content, structure, and history.
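
As a rough illustration of the two previous cards, the sketch below attaches metadata to a small dataset and records provenance as each processing step runs. The `Dataset` class, the `apply` method, and the sensor readings are all hypothetical names and values invented for this example; they do not come from the lecture notes.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A toy dataset that carries its own metadata and provenance log."""
    values: list
    metadata: dict                                   # data about the data
    provenance: list = field(default_factory=list)   # history of processing steps

    def apply(self, step_name, func):
        """Run a processing step and record its name in the provenance log."""
        self.values = func(self.values)
        self.provenance.append(step_name)
        return self


# Hypothetical temperature readings; None marks a missing reading.
readings = Dataset(
    values=[21.4, None, 22.1, 21.9],
    metadata={"origin": "sensor-A", "unit": "celsius", "structure": "list of floats"},
)
readings.apply("drop missing readings", lambda vs: [v for v in vs if v is not None])
readings.apply("round to whole degrees", lambda vs: [round(v) for v in vs])

print(readings.metadata["origin"])  # sensor-A
print(readings.values)              # [21, 22, 22]
print(readings.provenance)          # ['drop missing readings', 'round to whole degrees']
```

The metadata dictionary describes the data (origin, unit, structure), while the provenance list answers "what happened to it during processing".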

8

What is the difference between datum and data?

A datum is a single piece of information; data is the plural, a collection of many such pieces.

9

What is data analysis?

Careful examination of data to discover facts, connections, patterns, hidden insights, and trends to help decision-making.

10

What is data analytics?

A broader field than data analysis that covers the entire data lifecycle, governance, and development of methods and tools for analysis.

11

What are the stages of the data lifecycle?

Collecting data, cleaning it (removing errors), organizing it, storing it, analyzing it, and governing it.
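
The lifecycle stages above can be sketched as a chain of small functions. This is a minimal illustration under stated assumptions: the order records are made up, JSON text stands in for a real database or data warehouse, and governance (policies, access control, retention) is not modeled because it is an organizational process rather than a single function.

```python
import json
import statistics


def collect():
    """Collect: hypothetical raw order records, some with errors."""
    return [
        {"order_id": 3, "amount": "80.00"},
        {"order_id": 1, "amount": "120.50"},
        {"order_id": 2, "amount": "n/a"},     # invalid amount
        {"order_id": 1, "amount": "120.50"},  # duplicate of order 1
    ]


def clean(records):
    """Clean: drop rows with a non-numeric amount and duplicate order IDs."""
    seen, cleaned = set(), []
    for rec in records:
        try:
            amount = float(rec["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        if rec["order_id"] in seen:
            continue  # skip duplicates
        seen.add(rec["order_id"])
        cleaned.append({"order_id": rec["order_id"], "amount": amount})
    return cleaned


def organize(records):
    """Organize: put the records in a consistent order."""
    return sorted(records, key=lambda r: r["order_id"])


def store(records):
    """Store: serialize to JSON text (a stand-in for a real data store)."""
    return json.dumps(records)


def analyze(stored):
    """Analyze: compute the average order amount from the stored data."""
    records = json.loads(stored)
    return statistics.mean(r["amount"] for r in records)


average = analyze(store(organize(clean(collect()))))
print(average)  # 100.25
```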

12

What is Descriptive Analytics?

Describes what happened in the past by summarizing data (e.g., last year's sales).
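
A minimal sketch of a descriptive summary, assuming a made-up list of monthly sales figures (`monthly_sales` is hypothetical data, not from the notes):

```python
import statistics

# Hypothetical units sold in each of six past months.
monthly_sales = [120, 135, 150, 110, 160, 165]

# Descriptive analytics only summarizes what already happened.
summary = {
    "total": sum(monthly_sales),
    "mean": statistics.mean(monthly_sales),
    "min": min(monthly_sales),
    "max": max(monthly_sales),
}

print(summary["total"], summary["mean"], summary["min"], summary["max"])  # 840 140 110 165
```

Nothing here explains causes or forecasts the future; those belong to the diagnostic and predictive levels.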

13

What is Diagnostic Analytics?

Tries to explain why something happened by digging into the causes of past events.

14

What is Predictive Analytics?

Tries to predict future outcomes based on past patterns and trends.
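
One simple way to predict from past trends is to fit a straight line to historical values and extend it. The sketch below uses ordinary least squares on invented sales figures; `fit_line` is a hypothetical helper written for this example, and real predictive work typically uses richer models and validation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past data: sales in months 1 through 5.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]

slope, intercept = fit_line(months, sales)
forecast = slope * 6 + intercept  # extend the trend to month 6

print(slope, intercept, forecast)  # 10.0 90.0 150.0
```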

15

What is Prescriptive Analytics?

Builds on predictions to suggest specific actions and the reasons for them.

16

What is Business Intelligence (BI)?

A system that helps a company understand its performance across activities, typically using a data warehouse.

17

What is a KPI?

A Key Performance Indicator; a metric that shows whether a business objective is being met, often shown in BI dashboards.
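
As a sketch of how a dashboard might decide whether an objective is being met, the example below compares each KPI's actual value with its target. The `KPI` class, the metric names, and all numbers are hypothetical; the `higher_is_better` flag is an assumption added so that metrics such as churn, where lower is better, can be handled too.

```python
from dataclasses import dataclass


@dataclass
class KPI:
    """A metric with a target, used to check a business objective."""
    name: str
    actual: float
    target: float
    higher_is_better: bool = True

    def is_met(self):
        """Return True when the actual value satisfies the target."""
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


# Hypothetical KPIs and values.
kpis = [
    KPI("Monthly revenue (USD)", actual=105_000, target=100_000),
    KPI("Customer churn rate (%)", actual=6.2, target=5.0, higher_is_better=False),
]

for kpi in kpis:
    status = "met" if kpi.is_met() else "not met"
    print(f"{kpi.name}: {kpi.actual} vs target {kpi.target} -> {status}")
```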

18

What are the five Big Data Vs mentioned in the notes?

Volume, Velocity, Variety, Veracity, and Value.

19

What does Volume mean in Big Data?

A huge amount of data requiring special storage and processing solutions.

20

What does Velocity mean in Big Data?

Data arriving very fast, requiring quick or real-time processing.

21

What does Variety mean in Big Data?

Different formats and types of data (text, images, videos, numbers) to be integrated.

22

What does Veracity mean in Big Data?

Data quality and trustworthiness; distinguishing signal from noise and managing data from controlled vs. uncontrolled sources.

23

What does Value mean in Big Data?

The usefulness of data; how quickly it can be turned into insights, aided by good storage and clean, well-communicated results.

24

What are the main data types described in the notes?

Structured (organized in tables), Unstructured (free-form, like text or images; often estimated at ~80% of data; hard to process), and Semi-structured (partially organized, e.g., XML/JSON).
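
To make the three types concrete, the sketch below loads one tiny sample of each using Python's standard library. All sample values are invented for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (like a table).
structured_csv = "id,name,city\n1,Ana,Lisbon\n2,Ben,Oslo\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
print(rows[1]["city"])  # Oslo

# Semi-structured: self-describing keys, but fields may nest or vary per record.
semi_structured = '{"id": 3, "name": "Cy", "tags": ["vip"], "address": {"city": "Rome"}}'
record = json.loads(semi_structured)
print(record["address"]["city"])  # Rome

# Unstructured: free text with no predefined fields; extracting facts
# such as "who" or "when" needs extra processing (e.g., text analysis).
unstructured = "Ana called on Monday to say the delivery arrived late."
print(len(unstructured.split()))  # 10
```

The structured and semi-structured samples can be queried by field name right away, while the unstructured sentence only yields a word count without further processing.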