Big Data
The collection and analysis of data sets so large and complex that the traditional methods typically applied to such problems would be overwhelmed
Business analytics
The process of using statistical analysis and modeling to drive business decisions.
Case
A case is an individual about whom or which we have data. Also called a record or row.
Categorical (or qualitative) variable
A variable that names categories (whether with words or numerals)
Context
Ideally tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed
Cross-sectional data
Data taken from situations that vary over time but measured at a single instant. Such data are said to be a cross-section of the time series.
Data
Recorded values, whether numbers or labels, together with their context
Data mining (or predictive analytics)
The process of using a variety of statistical tools to analyze large databases or data warehouses
Data table
An arrangement of data in which each row represents a case and each column represents a variable
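As a minimal sketch of this layout, a data table can be modeled in Python as a list of cases, each holding one value per variable. The variable names (order_id, region, amount_usd) and all values below are hypothetical and chosen only for illustration.

```python
# Hypothetical data table: each dict is one case (row),
# and each key is one variable (column).
data_table = [
    {"order_id": "A101", "region": "West", "amount_usd": 25.50},
    {"order_id": "A102", "region": "East", "amount_usd": 40.00},
    {"order_id": "A103", "region": "West", "amount_usd": 12.25},
]


def column(table, variable):
    """Return the values of one variable across all cases."""
    return [case[variable] for case in table]


n_cases = len(data_table)               # number of rows
variables = list(data_table[0].keys())  # column names
amounts = column(data_table, "amount_usd")
```

In this sketch, order_id plays the role of an identifier variable, region is categorical, and amount_usd is quantitative with dollars as its units.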
Data warehouse
A large database of information collected by a company or other organization, usually to record the transactions that the organization makes, but also used for analysis via data mining.
Experimental unit
An individual in a study for which or for whom data values are recorded. Human experimental units are usually called subjects or participants.
Identifier variable
A categorical variable that records a unique value for each case, used to name or identify it
Metadata
Auxiliary information about variables in a database, typically including how, when, and where (and possibly why) the data were collected; who each case represents; and the definitions of all the variables
Nominal variable
A term that can be applied to a variable whose values are used only to name categories
Ordinal variable
A term that can be applied to a variable whose categorical values possess some kind of order
Participant
A human experimental unit. Also called a subject
Quantitative variable
A variable in which the numbers are values of measured quantities with units
Records
Information about an individual in a database.
Relational database
A database that stores and retrieves information. The information is kept in data tables that can be “related” to each other.
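One way to picture "related" tables is a small in-memory database built with Python's standard sqlite3 module. This is only a sketch: the table names (customers, orders), the column names, and the rows are all hypothetical. The shared customer_id column is what relates the two tables.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database with two related tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)", [(1, "Ana"), (2, "Ben")]
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0)],
    )
    conn.commit()
    return conn


def total_by_customer(conn):
    """Join the two tables on customer_id and total the order amounts."""
    query = (
        "SELECT c.name, SUM(o.amount) "
        "FROM customers AS c "
        "JOIN orders AS o ON o.customer_id = c.customer_id "
        "GROUP BY c.customer_id, c.name "
        "ORDER BY c.name"
    )
    return conn.execute(query).fetchall()


conn = build_example_db()
totals = total_by_customer(conn)
```

Each row of orders is a record about one transaction, and the join retrieves information that neither table holds on its own.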
Respondent
Someone who answers, or responds to, a survey
Spreadsheet
A layout designed for accounting that is often used to store and manage data tables. Excel is a common example.
Subject
A human experimental unit. Also called a participant
Time series
Data measured over time. Usually the time intervals are equally spaced or regularly spaced (e.g., every week, every quarter, or every year).
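The contrast between a time series and a cross-section can be sketched with a small Python example. The store names, quarters, and sales figures are hypothetical. One store followed across equally spaced quarters gives a time series, while all stores at a single quarter give a cross-section.

```python
# Hypothetical quarterly sales (thousands of dollars) for two stores.
sales = {
    "Store A": {"2023-Q1": 120, "2023-Q2": 135, "2023-Q3": 128, "2023-Q4": 150},
    "Store B": {"2023-Q1": 90, "2023-Q2": 95, "2023-Q3": 110, "2023-Q4": 105},
}

# Time series: one store measured at regularly spaced times.
time_series_a = list(sales["Store A"].items())


def cross_section(data, period):
    """Return every store's value at a single point in time."""
    return {store: series[period] for store, series in data.items()}


q2_cross_section = cross_section(sales, "2023-Q2")
```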
Units
A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams
Variable
A variable holds information about the same characteristic for many cases.