CHAPTER 6 ITM


analytics

The extensive use of data and quantitative analysis to support fact-based decision making within organizations.



big data

Data collections so enormous (terabytes or more) and complex (from sensor data to social media data) that traditional data management software, hardware, and analysis processes cannot handle them.



business intelligence

A wide range of applications, practices, and technologies for the extraction, transformation, integration, visualization, analysis, interpretation, and presentation of data to support improved decision making.



conversion funnel

A graphical representation that summarizes the steps a consumer takes in making the decision to buy your product and become a customer.



Cross-Industry Standard Process for Data Mining (CRISP-DM)

A six-phase structured approach for the planning and execution of a data mining project.



data lake

A “store everything” approach to big data that saves all the data in its raw and unaltered form.



data mart

A subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making.



data mining

A BI analytics tool used to explore large amounts of data for hidden patterns to predict future trends and behaviors for use in decision making.





data scientist

An individual who combines strong business acumen, a deep understanding of analytics, and a healthy appreciation of the limitations of data, tools, and techniques to deliver real improvements in decision making.



data warehouse

A large database that holds business information from many sources in the enterprise, covering all aspects of the company’s processes, products, and customers.



descriptive analysis

A preliminary data processing stage used to identify patterns in the data and answer questions about who, what, where, when, and to what extent.



Extract Transform Load (ETL) process

A data handling process that takes data from a variety of sources, edits and transforms it into the format used in the data warehouse, and then loads this data into the warehouse.
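
The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source records (`crm_rows`, `web_rows`), the table name `customer_spend`, and the in-memory SQLite database standing in for the warehouse are all hypothetical.

```python
import sqlite3

# Hypothetical source records in two different formats (assumed for illustration).
crm_rows = [{"name": " Alice ", "spend": "120.50"}, {"name": "bob", "spend": "80"}]
web_rows = [("CAROL", 45.0)]

def extract():
    """Extract: pull raw (name, spend) records from each source."""
    for row in crm_rows:
        yield row["name"], row["spend"]
    for name, spend in web_rows:
        yield name, spend

def transform(records):
    """Transform: tidy the names and convert spend to a float."""
    for name, spend in records:
        yield name.strip().title(), float(spend)

def load(records, conn):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS customer_spend (name TEXT, spend REAL)")
    conn.executemany("INSERT INTO customer_spend VALUES (?, ?)", records)
    conn.commit()

# An in-memory SQLite database stands in for the data warehouse.
warehouse = sqlite3.connect(":memory:")
load(transform(extract()), warehouse)

rows = warehouse.execute(
    "SELECT name, spend FROM customer_spend ORDER BY name"
).fetchall()
print(rows)
```

Keeping the steps as separate functions mirrors the definition: data is edited and transformed before it is loaded.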



genetic algorithm

An approach to solving problems based on the theory of evolution; uses the concept of survival of the fittest to find approximate solutions to optimization and search problems.
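
A minimal sketch of the idea, assuming a toy problem (maximize the number of 1s in a bit string). The population size, mutation rate, and tournament selection used here are illustrative choices, not part of the definition.

```python
import random

rng = random.Random(0)  # fixed seed so runs are repeatable
GENES, POP, GENERATIONS = 20, 30, 40

def fitness(bits):
    """Toy fitness: the number of 1s in the bit string."""
    return sum(bits)

def crossover(a, b):
    """Combine two parents at a random cut point."""
    cut = rng.randrange(1, GENES)
    return a[:cut] + b[cut:]

def mutate(bits, rate=0.02):
    """Flip each bit with a small probability."""
    return [1 - b if rng.random() < rate else b for b in bits]

def tournament(pop, k=3):
    """Survival of the fittest: the best of k random candidates is selected."""
    return max(rng.sample(pop, k), key=fitness)

population = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
initial_best = max(map(fitness, population))

for _ in range(GENERATIONS):
    elite = max(population, key=fitness)  # carry the best individual forward
    children = [
        mutate(crossover(tournament(population), tournament(population)))
        for _ in range(POP - 1)
    ]
    population = [elite] + children

best = max(population, key=fitness)
print(initial_best, fitness(best))
```

Because the best individual is always carried forward, the best fitness never decreases from one generation to the next. The result is still an approximate solution, not a guaranteed optimum.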



Hadoop

An open-source software framework including several software modules that provide a means for storing and processing extremely large data sets.




Hadoop Distributed File System (HDFS)

A system used for data storage that divides the data into subsets and distributes the subsets onto different servers for processing.



in-memory database (IMDB)

A database management system that stores the entire database in random access memory (RAM).





linear programming

A technique for finding the optimum value (largest or smallest, depending on the problem) of a linear expression (called the objective function) that is calculated based on the value of a set of decision variables that are subject to a set of constraints.
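
As a small worked example, assume a hypothetical problem with two decision variables: maximize 3x + 5y subject to x <= 4, 2y <= 12, 3x + 2y <= 18, and x, y >= 0. The sketch below checks every corner of the feasible region, which works for tiny problems because the optimum of a linear objective lies at a corner. Real solvers use more efficient methods such as the simplex algorithm.

```python
from itertools import combinations

# Each constraint is a*x + b*y <= c, stored as (a, b, c).
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def objective(x, y):
    """The linear expression to maximize."""
    return 3 * x + 5 * y

def feasible(x, y, tol=1e-9):
    """True when the point satisfies every constraint."""
    return all(a * x + b * y <= c + tol for a, b, c in constraints)

def vertices():
    """Intersect each pair of constraint lines and keep the feasible points."""
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines never intersect
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y):
            yield x, y

best = max(vertices(), key=lambda p: objective(*p))
print(best, objective(*best))
```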



MapReduce program

A composite program that consists of a Map procedure that performs filtering and sorting and a Reduce method that performs a summary operation.
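
The Map and Reduce steps can be imitated on one machine with a word count, the usual introductory example. This sketch is plain Python and does not use the Hadoop API; the `documents` list and function names are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical input, one string per document.
documents = ["big data big insights", "data lake data mart"]

def map_phase(doc):
    """Map: emit a (word, 1) pair for every word in a document."""
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    """Group the emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: summarize all values for one key, here by summing them."""
    return key, sum(values)

mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in sorted(shuffle(mapped).items()))
print(counts)
```

In a real cluster the Map calls run in parallel on the servers that hold each subset of the data, and only the grouped results travel to the Reduce step.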



Monte Carlo simulation

A simulation that enables you to see a spectrum of thousands of possible outcomes, considering not only the many variables involved, but also the range of potential values for each of those variables.
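
A minimal sketch, assuming a hypothetical profit model with two uncertain variables: units sold (uniform between 800 and 1,200) and unit cost (triangular between 4 and 7, most likely 5). The price, fixed costs, and distributions are invented for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so runs are repeatable
PRICE, FIXED_COSTS, TRIALS = 10.0, 3000.0, 10_000

def simulate_profit():
    """Draw one possible outcome from the ranges of the uncertain variables."""
    units = rng.uniform(800, 1200)
    unit_cost = rng.triangular(4.0, 7.0, 5.0)  # low, high, mode
    return units * (PRICE - unit_cost) - FIXED_COSTS

# Thousands of trials give a spectrum of outcomes rather than a single estimate.
outcomes = [simulate_profit() for _ in range(TRIALS)]
mean_profit = statistics.mean(outcomes)
loss_probability = sum(p < 0 for p in outcomes) / TRIALS

print(f"mean profit: {mean_profit:.0f}")
print(f"range: {min(outcomes):.0f} to {max(outcomes):.0f}")
print(f"chance of a loss: {loss_probability:.1%}")
```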



NoSQL database

A way to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases.



predictive analytics

A set of techniques used to analyze current data to identify future probabilities and trends, as well as make predictions about the future.



regression analysis

A method for determining the relationship between a dependent variable and one or more independent variables.
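
For one independent variable, the relationship can be estimated with ordinary least squares. The sketch below assumes made-up data (`ad_spend` as the independent variable, `sales` as the dependent variable) and fits the line sales = intercept + slope * ad_spend.

```python
# Hypothetical observations, invented for illustration.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable
sales = [3.1, 4.9, 7.2, 8.8, 11.0]     # dependent variable

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n

# Least-squares estimates: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales))
s_xx = sum((x - mean_x) ** 2 for x in ad_spend)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

def predict(x):
    """Estimate the dependent variable for a new value of the independent one."""
    return intercept + slope * x

print(f"slope={slope:.2f}, intercept={intercept:.2f}, predict(6)={predict(6):.2f}")
```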



scenario analysis

A process for predicting future values based on certain potential events.



self-service analytics

Training, techniques, and processes that empower end users to work independently to access data from approved sources to perform their own analyses using an endorsed set of tools.



text analysis

A process for extracting value from large quantities of unstructured text data.



time series analysis

The use of statistical methods to analyze time series data and determine useful statistics and characteristics about the data.
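
One simple statistic of this kind is a moving average, which smooths short-term fluctuations to expose the trend. The sketch assumes a hypothetical `monthly_sales` series and a three-month window.

```python
# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [100, 104, 110, 108, 115, 121, 119, 126]

def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

trend = moving_average(monthly_sales, 3)

# Average month-over-month change across the whole series.
average_change = (monthly_sales[-1] - monthly_sales[0]) / (len(monthly_sales) - 1)

print([round(value, 1) for value in trend])
print(round(average_change, 2))
```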



video analysis

The process of obtaining information or insights from video footage.



visual analytics

The presentation of data in a pictorial or graphical format.



word cloud

A visual depiction of a set of words that have been grouped together because of the frequency of their occurrence.
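
The sizing behind a word cloud can be sketched by counting word frequencies and scaling a font size to each count. The sample text and the 40-point maximum are assumptions; a real tool would also drop common words such as "and" and handle the layout.

```python
from collections import Counter

# Hypothetical input text, invented for illustration.
text = "data drives decisions and data drives analytics"

counts = Counter(text.split())
max_count = max(counts.values())
MAX_PT = 40  # font size given to the most frequent word (assumed)

# More frequent words get proportionally larger type.
sizes = {word: MAX_PT * count / max_count for word, count in counts.items()}

for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size:.0f}pt")
```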
