A set of vocabulary-style flashcards covering key concepts from the lecture notes on data, analytics, quality, databases, time series, visualization, and SQL.
Data
Raw facts and figures used by computers to support decision making (input before processing).
Information
Processed data that informs decisions and actions.
Big Data
Extremely large and diverse data sets that require special methods to analyze.
6 Vs of Big Data
The six characteristics: Volume, Variety, Velocity, Veracity, Value, and Variability.
Volume
The amount of data available.
Variety
Different data types, including structured, semi-structured and unstructured data.
Velocity
The speed at which data is generated and processed.
Veracity
The degree of trustworthiness or reliability of data.
Value
Business value derived from data.
Variability
The ways data can be used, and how its meaning or interpretation can vary with context.
Data analytics
The science of analyzing raw data and turning it into useful information.
Data analysis
A subset of data analytics focused on evaluating data to gain insights.
Descriptive analytics
Analyses that describe what happened in the data.
Diagnostic analytics
Analyses that explain why something happened.
Predictive analytics
Analyses that forecast what will happen in the future.
Prescriptive analytics
Analyses that suggest actions to make something happen.
Data culture
An organizational approach where decisions are data-informed, data-driven, and data-inspired.
Data-informed
Decisions guided by data insights.
Data-driven
Decisions primarily guided by data evidence.
Data-inspired
Decisions influenced by data but not solely determined by it.
Cognitive bias
A systematic error in thinking that distorts judgment; data helps reduce or counteract it.
Competitive advantage
Gaining an edge over competitors through effective use of data.
Data literacy gap
96% of executives discount data they don't understand.
Problem Statement
Steps to frame business problems for data goals: understand situation, define goals, translate to data goals, frame problem, build metrics.
Data Wrangling
Cleaning, structuring, and enriching data for analysis.
Data quality
How good data is for use, including dimensions like accuracy, completeness, validity, consistency, integrity, and uniqueness.
Uniqueness
No duplicates; each value is distinct where required.
Accuracy
Data correctly reflects the real world or source.
Completeness
All required data is present.
Validity
Data conforms to defined formats and business rules.
Consistency
Data remains uniform across datasets and systems.
Integrity
Data relationships and structure are maintained accurately.
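Several of the data quality dimensions above can be checked mechanically. The following is a minimal sketch in Python, assuming a small hypothetical list of customer records; the field names, the email rule, and the uppercase country-code convention are illustrative assumptions, not rules from the lecture notes.

```python
import re

# Hypothetical customer records; field names and rules are assumptions.
records = [
    {"id": 1, "email": "ana@example.com", "country": "NL"},
    {"id": 2, "email": None, "country": "NL"},
    {"id": 2, "email": "bob@example", "country": "nl"},
]

# Assumed business rule: an email needs text, "@", text, ".", text.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Completeness: share of rows where the field is present."""
    return sum(r[field] is not None for r in rows) / len(rows)


def duplicate_ids(rows):
    """Uniqueness: IDs that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        if r["id"] in seen:
            dupes.add(r["id"])
        seen.add(r["id"])
    return dupes


def invalid_emails(rows):
    """Validity: non-missing emails that break the format rule."""
    return [
        r["email"]
        for r in rows
        if r["email"] is not None and not EMAIL_PATTERN.match(r["email"])
    ]


def inconsistent_countries(rows):
    """Consistency: country codes not written in the assumed uppercase form."""
    return [r["country"] for r in rows if r["country"] != r["country"].upper()]


print(completeness(records, "email"))
print(duplicate_ids(records))
print(invalid_emails(records))
print(inconsistent_countries(records))
```

Accuracy is left out on purpose: checking it requires comparing the data against the real world or a trusted source, which a self-contained check cannot do.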
Data discovery
Understanding what the data says and implies.
Data structuring
Transforming data into a standard tabular format with variables (columns) and observations (rows).
Dataset
A collection of values organized for analysis.
Variable
A column representing a measured attribute.
Observation
A row representing measurements on a unit across attributes.
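As a rough illustration of data structuring, the sketch below turns a nested record into a tabular layout with one column per variable and one row per observation. The names and measurements are made up for the example.

```python
# Hypothetical raw data: one nested entry per person.
raw = {
    "Ana": {"height_cm": 170, "age": 31},
    "Bob": {"height_cm": 182, "age": 45},
}

# Variables become columns.
columns = ["name", "height_cm", "age"]

# Each observation becomes one row, with values in column order.
rows = [[name, attrs["height_cm"], attrs["age"]] for name, attrs in raw.items()]

print(columns)
for row in rows:
    print(row)
```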
Relational database
A database organized into related tables with predefined relationships.
Database schema
Blueprint defining how data is organized and related in a relational database.
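A small sketch of a relational schema, using Python's built-in sqlite3 module with an in-memory database. The customers and orders tables are hypothetical; they only show how a schema declares tables and a predefined relationship (a foreign key) between them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# The schema: two related tables (hypothetical names and columns).
conn.executescript(
    """
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    """
)

conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (10, 1, 25.0)")

# The relationship lets the two tables be joined on customer_id.
rows = conn.execute(
    """
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    """
).fetchall()

# An order pointing at a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO orders VALUES (11, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```

Rejecting the orphaned order is also an example of the integrity dimension of data quality: the database keeps the relationship between the tables accurate.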
Time series data
Data points indexed in time order (time-stamped data).
Trend
General direction in data over time.
Cyclic variations
Recurring rises and falls in data that, unlike seasonal patterns, do not follow a fixed period (e.g., business cycles).
Seasonal variations
Regular patterns repeating within a fixed period (e.g., yearly).
Random movements
Unpredictable, irregular fluctuations in data.
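One simple way to separate a trend from a seasonal pattern is a moving average whose window equals the length of the season. The sketch below assumes hypothetical quarterly sales figures built from a steady rise plus a repeating four-quarter pattern; it is an illustration, not a method prescribed by the lecture notes.

```python
def moving_average(values, window):
    """Trailing moving average; returns one value per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical quarterly sales over three years.
sales = [10, 14, 8, 12, 14, 18, 12, 16, 18, 22, 16, 20]

# A window of 4 covers exactly one seasonal cycle, so the seasonal
# ups and downs cancel out and the underlying trend remains.
trend = moving_average(sales, window=4)
print(trend)
```

With real data the averaged series would still contain some random movement; here the made-up series has none, so the result is a smooth rising line.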
Excel data types
Common data types in Excel including text, numbers, dates, times, boolean, and errors.
Text
Alphanumeric strings stored in cells.
Number
Numeric values (integers or decimals).
Date
Calendar date values.
Time
Time values (clock time).
Boolean
TRUE (1) or FALSE (0) values.
Error
Spreadsheet error values such as #DIV/0!, #N/A, etc.
SQL data types
Common SQL data types such as TEXT, CHAR, VARCHAR, NUMBER, FLOAT, INTEGER, BOOLEAN, DATE, TIME, DATETIME, TIMESTAMP, and INTERVAL, plus the special NULL marker for missing values.
TEXT
Text data type in SQL for large strings.
CHAR
Fixed-length character data type in SQL.
VARCHAR
Variable-length character data type in SQL.
NUMBER
Numeric data type in some SQL dialects.
FLOAT
Floating-point numeric data type.
INTEGER
Whole-number numeric data type.
BOOLEAN
Data type holding TRUE or FALSE; some dialects store these as 1 and 0.
DATE
Date value without time.
TIME
Time value without date.
DATETIME
Date and time value.
TIMESTAMP
Date and time value with possible timezone awareness.
INTERVAL
A duration of time in SQL.
NULL
A special value representing missing or unknown data.
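Because NULL stands for missing or unknown data, it behaves differently from ordinary values in queries. The sketch below uses Python's sqlite3 module and a hypothetical readings table to show two common consequences: comparing with = NULL matches nothing, and aggregate functions such as AVG skip NULLs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
# Python's None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 1.5), ("b", None), ("c", 3.5)],
)

# "= NULL" is never true, so this matches no rows.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value = NULL"
).fetchone()[0]

# "IS NULL" is the correct test for missing values.
is_null = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE value IS NULL"
).fetchone()[0]

# AVG ignores the NULL row and averages only 1.5 and 3.5.
average = conn.execute("SELECT AVG(value) FROM readings").fetchone()[0]

print(equals_null, is_null, average)
```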
Data Visualization
The representation of data through graphics like charts and infographics to communicate insights.
McCandless method
A framework by David McCandless for good data visualization focusing on data, story, goal, and visual form.