A set of vocabulary flashcards covering core concepts, terms and definitions from UCS551 Chapters 1 & 2 on Introduction to Data Analytics and Data Understanding.
Data
A set of values relating to qualitative or quantitative variables; becomes information when interpreted in context.
Data Analytics
The process of inspecting, cleansing, transforming and modeling data to discover useful information, support conclusions and aid decision-making.
Big Data
Data that is expensive to manage and hard to extract value from because of its high volume, velocity, variety and related characteristics.
Volume (Big Data ‘V’)
The sheer size or amount of data that defines it as “big.”
Velocity (Big Data ‘V’)
The speed at which data is generated, processed and accessed.
Variety (Big Data ‘V’)
The diversity of data sources, formats, qualities and structures.
Variability (Big Data ‘V’)
The way data constantly changes, requiring interpretation of shifting meanings.
Veracity (Big Data ‘V’)
The accuracy, reliability and quality of data gathered.
Visualization (Big Data ‘V’)
The presentation of data in visual form to support managerial decision-making.
Value (Big Data ‘V’)
The benefit an organization gains after investing effort in the other V’s of big data.
Structured Data
Information organized in a predefined model that is easily searchable, e.g., relational databases and spreadsheets.
Unstructured Data
Information lacking a predefined structure, making it time-consuming to search using traditional methods, e.g., emails, images, documents.
Semi-structured Data
Data that does not reside in a relational database but has some organizational properties, e.g., XML or JSON.
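A minimal sketch of what "some organizational properties" means in practice, using Python's standard `json` module. The customer records and field names are hypothetical and chosen only for illustration: keys label each value, but the two records do not share a fixed set of fields.

```python
import json

# Hypothetical records: both describe customers, but their fields differ.
raw = """
[
  {"id": 1, "name": "Aina", "email": "aina@example.com"},
  {"id": 2, "name": "Ben", "phones": ["012-0000000", "013-0000000"]}
]
"""

records = json.loads(raw)

# Keys give the data some organization, but no fixed schema is enforced,
# so optional fields have to be read defensively.
for rec in records:
    email = rec.get("email", "n/a")
    phone_count = len(rec.get("phones", []))
    print(rec["id"], rec["name"], email, phone_count)
```

A relational table would force both records into the same columns; here each record carries its own structure.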
Streaming Data
Data that arrives continuously and must be processed in near real time.
Relational Data
Structured data stored in tables with rows and columns, often managed by SQL databases.
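As a rough illustration of rows, columns and SQL, the sketch below builds a small in-memory table with Python's built-in `sqlite3` module. The `customers` table and its contents are invented for the example.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The table fixes the columns (the predefined model) before any data arrives.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Aina", "Shah Alam"), (2, "Ben", "Ipoh"), (3, "Chen", "Ipoh")],
)

# Because every row has the same columns, searching is a one-line query.
cursor = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Ipoh",)
)
ipoh_names = [row[0] for row in cursor]
conn.close()

print(ipoh_names)
```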
Vector
A one-dimensional array storing elements of the same type.
Array
A collection of elements identified by index or index tuple; can be one-dimensional or multidimensional.
Matrix
A two-dimensional array of numbers arranged in rows and columns.
Data Frame
A tabular data structure that combines equal-length vectors as columns; each column holds a single data type, different columns may hold different types, and each row represents an observation.
List (Data Structure)
An ordered collection of elements that can be of different data types.
Factor (Data Structure)
A data type in statistical software used to handle categorical variables and their levels.
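The structures on the cards from Vector to Factor follow the terminology of statistical software such as R. The sketch below only approximates them with Python built-ins, as an analogy rather than an equivalent implementation; all names and values are made up for illustration.

```python
from array import array

# Vector: one-dimensional, every element has the same type (here: floats).
heights = array("d", [160.0, 172.5, 181.0])

# Matrix: two-dimensional arrangement of numbers in rows and columns.
matrix = [[1, 2, 3],
          [4, 5, 6]]
n_rows, n_cols = len(matrix), len(matrix[0])

# List: ordered collection whose elements may have different types.
mixed = [42, "text", 3.14, True]

# Data frame: equal-length columns, each of one type; one row per observation.
frame = {
    "name": ["Aina", "Ben", "Chen"],
    "height_cm": [160.0, 172.5, 181.0],
}
row_1 = {col: values[1] for col, values in frame.items()}

# Factor: a categorical variable stored as integer codes plus its levels.
sizes = ["small", "large", "medium", "small"]
levels = sorted(set(sizes))
codes = [levels.index(s) for s in sizes]

print(n_rows, n_cols, row_1, levels, codes)
```

Note that the levels here are sorted alphabetically, which is only one possible convention; an ordinal variable would need its levels listed in their meaningful order instead.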
Nominal Level of Measurement
Data categorized without any intrinsic order (e.g., gender, race).
Ordinal Level of Measurement
Categorical data with a meaningful order, but with intervals between categories that are unknown or not necessarily equal (e.g., satisfaction ratings).
Interval Level of Measurement
Numeric data with equal intervals but no true zero (e.g., temperature in °F).
Ratio Level of Measurement
Numeric data with equal intervals and a true zero, allowing ratios (e.g., weight, sales).
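The practical difference between the interval and ratio levels can be checked with a small calculation. The sketch below uses two arbitrary example temperatures and converts them to Kelvin, a scale with a true zero, to show why "80 °F is twice as hot as 40 °F" is not a valid statement.

```python
def fahrenheit_to_kelvin(temp_f: float) -> float:
    """Convert degrees Fahrenheit to kelvins."""
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15

low_f, high_f = 40.0, 80.0  # arbitrary example readings

# Differences are meaningful on an interval scale.
difference_f = high_f - low_f

# The ratio of the raw values is 2.0, but it depends on where zero was placed.
naive_ratio = high_f / low_f

# On a scale with a true zero the ratio is meaningful, and it is far from 2.
kelvin_ratio = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

print(difference_f, naive_ratio, round(kelvin_ratio, 2))
```

For ratio-level data such as weight or sales, zero means "none of the quantity", so a statement like "twice as much" holds in any unit.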
Univariate Data
A data set that consists of a single variable, often stored as a vector.
Multivariate Data
A data set containing multiple variables, commonly stored in a matrix or data frame.
Descriptive Analytics
Analytics that summarize what has happened over a given period.
Diagnostic Analytics
Analytics that investigate why something happened, using diverse data inputs and hypotheses.
Predictive Analytics
Analytics that forecast what is likely to happen in the near future.
Prescriptive Analytics
Analytics that recommend actions based on predictive insights.
Customer Analytics
Use of analytics to model and understand customer behavior and loyalty.
Credit Risk Analytics
Analytical techniques applied to credit data to assess and manage risk.
Retail Analytics
Analytics used to forecast demand and optimize retail operations.
Marketing Analytics
Analytics that evaluate product, pricing, promotion and distribution strategies.
Business Analytics (Churn)
Analyzing business data to identify customers likely to stop using a service and inform retention strategies.
Data Science
A multidisciplinary field that extracts insights from data using mathematics, statistics, AI and computer engineering.
Data Analytics Process – Ask
Define the problem and formulate questions to guide analysis.
Data Analytics Process – Prepare
Collect, combine and store relevant data for analysis.
Data Analytics Process – Process
Clean and verify data to ensure quality and readiness.
Data Analytics Process – Analyze
Search for patterns, relationships and trends within the data.
Data Analytics Process – Share
Communicate findings to stakeholders using reports and visualizations.
Data Analytics Process – Act
Implement decisions and actions based on analytical insights.
Central Tendency
Measures that describe the center of a data set: mean, median, mode.
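The three measures can be computed with Python's standard `statistics` module. The scores below are made-up values; the second list adds one extreme value to suggest how the mean and median react differently.

```python
import statistics

scores = [70, 85, 85, 90, 100]  # hypothetical test scores

mean_score = statistics.mean(scores)      # sum divided by count
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value

# One extreme value pulls the mean much more than the median.
with_outlier = scores + [500]
mean_shifted = statistics.mean(with_outlier)
median_shifted = statistics.median(with_outlier)

print(mean_score, median_score, mode_score)
print(mean_shifted, median_shifted)
```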
Dispersion
Measures that describe the spread of data: range, variance, standard deviation.
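A short sketch of the three dispersion measures, again with Python's `statistics` module and made-up scores. It assumes the data are a sample, so `variance` and `stdev` divide by n - 1; `pvariance` is shown for the case where the data are the whole population.

```python
import statistics

scores = [70, 85, 85, 90, 100]  # hypothetical test scores

# Range: distance between the largest and smallest value.
value_range = max(scores) - min(scores)

# Sample variance and standard deviation (divide by n - 1).
sample_variance = statistics.variance(scores)
sample_sd = statistics.stdev(scores)

# Population variance (divide by n), for comparison.
population_variance = statistics.pvariance(scores)

print(value_range, sample_variance, round(sample_sd, 2), population_variance)
```

The standard deviation is the square root of the variance, which puts it back in the same unit as the original data.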
Data Exploration
Initial analysis phase aimed at understanding data distributions, key attributes, correlations, outliers and quality.
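A first exploration pass might look like the sketch below. It checks data quality by counting missing values, then flags outliers with the common 1.5 × IQR rule. That rule and the sample values are assumptions made for illustration, not something the card prescribes.

```python
import statistics

# Hypothetical raw column; None marks a missing value.
raw_values = [70, 85, None, 85, 90, 100, 500]

# Quality check: how many observations are missing?
missing_count = sum(1 for v in raw_values if v is None)
values = sorted(v for v in raw_values if v is not None)

# Quartiles of the non-missing values (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Flag values more than 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(missing_count, outliers)
```

Flagged values are candidates for a closer look, not automatic deletions; an outlier can be a data-entry error or a genuine extreme observation.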