Flashcards cover Big Data rationale, data sources, data types, analytics levels (Descriptive, Diagnostic, Predictive, Prescriptive), BI, KPIs, and the Vs of Big Data as described in the lecture notes.
What is Big Data?
Extremely large and complex data sets that exceed the capabilities of traditional data processing tools.
Why are traditional data tools not enough for Big Data?
They crash or become too slow when handling huge, varied, and continuously arriving data; such data requires specialized Big Data tools and methods.
From what sources does Big Data typically come?
Internal company apps, sensors (e.g., in smart devices), and outside sources (like weather data).
What is a data warehouse?
A large storage place for organized company data used to feed applications and analyses.
Name some benefits of Big Data.
Smoother operations, smart actionable insights, new markets for products, more accurate predictions, fraud detection, detailed records, and better decision-making.
What is data provenance (pedigree)?
The origin and history of data, including what happened to it during processing.
What is metadata?
Data about data; information describing data such as origin, context, content, structure, and history.
What is the difference between datum and data?
Datum is a single piece of information; data is a collection of many pieces.
What is data analysis?
Careful examination of data to discover facts, connections, patterns, hidden insights, and trends to help decision-making.
What is data analytics?
A broader field than data analysis that covers the entire data lifecycle, governance, and development of methods and tools for analysis.
What are the stages of the data lifecycle?
Collecting data, cleaning it (removing errors), organizing it, storing it, analyzing it, and governing it.
What is Descriptive Analytics?
Describes what happened in the past by summarizing data (e.g., last year's sales).
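As a rough illustration of the card above, summarizing past data can be as simple as computing a few aggregate statistics. The sketch below assumes a made-up list of monthly sales figures (`monthly_sales`) and a hypothetical helper `describe`; neither comes from the lecture notes.

```python
from statistics import mean

# Hypothetical monthly sales figures for last year (illustrative values only).
monthly_sales = [120, 135, 128, 150, 160, 155, 170, 165, 158, 149, 180, 210]


def describe(values):
    """Summarize past values: total, average, minimum, and maximum."""
    return {
        "total": sum(values),
        "average": mean(values),
        "min": min(values),
        "max": max(values),
    }


summary = describe(monthly_sales)
print(summary)
```

The summary only reports what already happened; it makes no claim about why or about what comes next.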
What is Diagnostic Analytics?
Tries to explain why something happened by digging into the causes of past events.
What is Predictive Analytics?
Tries to predict future outcomes based on past patterns and trends.
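One very simple way to predict from past trends is to fit a straight line to historical values and extend it one step forward. The sketch below uses ordinary least squares as a stand-in technique; the notes do not name a specific method, and `past_sales` and `fit_line` are hypothetical names with made-up data.

```python
def fit_line(ys):
    """Fit y = slope * x + intercept by ordinary least squares, with x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Covariance-like and variance-like sums for the least-squares slope.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical sales for four past periods (illustrative values only).
past_sales = [100, 110, 120, 130]
slope, intercept = fit_line(past_sales)

# Extend the fitted trend to the next period (x = 4).
forecast = slope * len(past_sales) + intercept
print(forecast)
```

Real predictive models are usually richer than a single trend line, but the idea is the same: learn a pattern from the past and apply it to a future point.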
What is Prescriptive Analytics?
Builds on predictions to suggest specific actions and the reasons for them.
What is Business Intelligence (BI)?
A system that helps a company understand its performance across activities, typically using a data warehouse.
What is a KPI?
A Key Performance Indicator; a metric that shows whether a business objective is being met, often shown in BI dashboards.
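To make the KPI idea concrete, a dashboard might compare an actual value against a target and report whether the objective is met. This is a minimal sketch under assumed numbers; `kpi_status`, the revenue figures, and the "met when actual >= target" rule are illustrative assumptions, not part of the notes.

```python
def kpi_status(actual, target):
    """Return the attainment ratio and whether the target has been met."""
    attainment = actual / target
    return {"attainment": attainment, "met": actual >= target}


# Hypothetical KPI: monthly revenue against a target (illustrative values only).
status = kpi_status(actual=45_000, target=50_000)
print(f"Attainment: {status['attainment']:.0%}, target met: {status['met']}")
```

Some KPIs are "lower is better" (for example, churn rate), in which case the comparison would be reversed.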
What are the five Big Data Vs mentioned in the notes?
Volume, Velocity, Variety, Veracity, and Value.
What does Volume mean in Big Data?
A huge amount of data requiring special storage and processing solutions.
What does Velocity mean in Big Data?
Data arriving very fast, requiring quick or real-time processing.
What does Variety mean in Big Data?
Different formats and types of data (text, images, videos, numbers) that must be integrated.
What does Veracity mean in Big Data?
Data quality and trustworthiness; distinguishing signal from noise and managing data from controlled vs. uncontrolled sources.
What does Value mean in Big Data?
The usefulness of data; how quickly it can be turned into insights, aided by good storage and clean, well-communicated results.
What are the main data types described in the notes?
Structured (organized in tables), Unstructured (free-form like text/images; ~80% of data; hard to process), Semi-structured (partially organized, e.g., XML/JSON).
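The difference between semi-structured and structured data can be seen by flattening JSON records into fixed-width rows. The sketch below assumes a small made-up JSON string (`raw`) in which one record lacks an optional field; the field names are hypothetical.

```python
import json

# Semi-structured input: records share some keys, but "tags" is optional.
raw = '[{"id": 1, "name": "Ada", "tags": ["vip"]}, {"id": 2, "name": "Lin"}]'
records = json.loads(raw)

# Structured output: every row has the same three columns (id, name, tags).
# Missing tags become an empty string; multiple tags are joined with commas.
rows = [
    (rec["id"], rec["name"], ",".join(rec.get("tags", [])))
    for rec in records
]

for row in rows:
    print(row)
```

Unstructured data such as free text or images has no such keys to map onto columns, which is part of why it is harder to process.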