Decision Support Systems

92 Terms

1

Online Transaction Processing (OLTP)

Type of computer processing where the computer responds immediately to user requests and focuses on data capture

  • Transaction databases such as ATM, ERP, SCM, CRM, …

  • Main focus is on efficiency of routine tasks

  • Data capture

2

Online Analytical Processing (OLAP)

Processing for end-user ad hoc reports, queries, and analysis used for decision support

  • Data Warehouses or Data Marts

  • Main focus is converting data into information for decision support (queries)
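
A minimal sketch of the OLTP/OLAP contrast, assuming a hypothetical `sales` table in an in-memory SQLite database (the table, columns, and values are illustrative and not from the cards). The inserts stand in for routine transaction capture; the grouped query stands in for an ad hoc analytical report.

```python
import sqlite3

# Hypothetical sales table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

# OLTP-style work: many small writes that capture individual transactions.
transactions = [("East", 120.0), ("West", 80.0), ("East", 45.5), ("West", 200.0)]
for region, amount in transactions:
    conn.execute("INSERT INTO sales (region, amount) VALUES (?, ?)", (region, amount))
conn.commit()

# OLAP-style work: an ad hoc query that summarizes the data for a decision maker.
totals_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region")
)
print(totals_by_region)
```

In practice the analytical query would usually run against a data warehouse or data mart rather than the transaction database itself.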

3

Business Intelligence

An umbrella term that combines architectures, tools, databases, analytical tools, applications, and methodologies

  1. Enables interactive access to data

  2. Provides business managers with the ability to conduct appropriate analysis

4

Critical BI System Considerations

  • Developing or Acquiring BI Systems

    • Make versus Buy

    • BI Shells

  • Justification and Cost-Benefit Analysis

  • Security and Protection of Privacy

  • Integration to Other Systems and Applications

5

Analytics

The process of developing actionable decisions or recommendations for actions based on insights generated from historical data

  • Combination of technology, science, and statistics to solve problems

6

Descriptive Analytics

Refers to knowing what is happening in the organization and understanding some underlying trends and causes of such occurrences

  • Answering the question of what happened

  • Analysis of historical data

  • Enablers

    • DW

    • Data Visualization

      • Dashboards and Scorecards

7

Predictive Analytics

Aims to determine what is likely to happen in the future

  • Used to forecast whether customers are likely to switch to a competitor, what customers are likely to buy, and how likely customers are to respond to a promotion

  • Looking at the past to determine the future

  • Enablers

    • Data Mining

    • Text Mining / Web Mining

    • Forecasting (e.g., time-series)

8

Prescriptive Analytics

Aims to determine the best possible solution

  • To identify decisions or actions that will optimize the performance of a system

  • Used to set prices, create production plans, and identify the best locations for facilities

  • Uses both descriptive and predictive analytics to create the alternatives, then determines the best one

  • Enablers

    • Optimization

    • Simulation

    • Multi-Criteria Decision Modeling

9

Big Data Analytics

Data that cannot be stored or processed easily using traditional tools/means

  • Data that comes in different forms: structured, unstructured, large, continuous, etc

  • Major sources include clickstreams from websites, postings on social media, and data from traffic, sensors, and weather

  • Is worthless if it does not provide business value

10

Data

Collection of facts usually obtained as the result of experiments, observations, transactions, or experiences

11

The Nature of Data

  • Is the main ingredient in all forms of analytics

  • Usually obtained as a result of experiences, observations, and experiments

  • It may consist of numbers, words, images, etc.

  • Lowest level of abstraction (from which information and knowledge are derived)

  • Has to be carefully created/identified, collected, integrated, cleaned, transformed

  • Data quality and data integrity → Critical to analysis

12

Metrics for Analytics Ready Data

  • Source reliability

  • Content accuracy

  • Accessibility

  • Security and privacy

  • Richness

  • Currency/timeliness (up to date)

  • Validity and Relevancy

13

Structured Data

Organized information in a fixed format, easily searchable and analyzed, typically stored in databases

  • Targeted for computers to process

14

Unstructured/Textual Data

Data not organized in a pre-defined manner, often textual and challenging to analyze.

  • Targeted for humans to process

  • Need to be converted into some form of categorical or numeric representation

15

Semi-Structured Data

Data that does not have a fixed format but contains tags or markers to separate elements.

  • XML, HTML, Log files, etc.

16

Steps for readying data for analytics

  1. Data Consolidation: Get data together

  2. Data Cleaning: Dealing with missing values

  3. Data Transformation: Formatting

  4. Data reduction

    a. Variables (dimensional reduction, variable selection)

    b. Cases / Samples (Sampling, Balancing)
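
The four steps above can be sketched on a tiny made-up data set. Everything here is an assumption for illustration: the record fields, the choice of mean imputation for missing values, and min-max scaling as the transformation. Real projects would pick these techniques to suit the data.

```python
from statistics import mean

# Hypothetical records from two sources; field names are illustrative only.
source_a = [{"id": 1, "age": 34, "income": 52000}, {"id": 2, "age": None, "income": 61000}]
source_b = [{"id": 3, "age": 45, "income": 58000}, {"id": 4, "age": 29, "income": 47000}]

# 1. Consolidation: bring the data together.
records = source_a + source_b

# 2. Cleaning: fill missing ages with the mean of the known ages.
known_ages = [r["age"] for r in records if r["age"] is not None]
mean_age = mean(known_ages)
cleaned = [{**r, "age": r["age"] if r["age"] is not None else mean_age} for r in records]

# 3. Transformation: rescale income to the 0-1 range (min-max normalization).
low = min(r["income"] for r in cleaned)
high = max(r["income"] for r in cleaned)
transformed = [{**r, "income": (r["income"] - low) / (high - low)} for r in cleaned]

# 4. Reduction: keep only the variables needed for the analysis (variable selection).
reduced = [{"age": r["age"], "income": r["income"]} for r in transformed]
print(reduced)
```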

17

Statistics

A collection of mathematical techniques to characterize and interpret data

18

Descriptive Statistics

Describing the data (as it is) (mean, median, mode, etc.)

19

Inferential Statistics

Drawing inferences about the population based on the sample data (Regression…)

20

Dispersion

Measure of data spread around a central point.

  • If it is large, the mean is not a good representation of the data because there are larger differences between individual scores
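
A small illustration of why a large dispersion makes the mean less representative. The two score lists are made up; standard deviation is used here as one common dispersion measure (the card does not name a specific one).

```python
from statistics import mean, pstdev

# Two made-up score sets with the same mean but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, scores in (("tight", tight), ("wide", wide)):
    # pstdev is the population standard deviation.
    print(name, "mean =", mean(scores), "std dev =", round(pstdev(scores), 2))
```

Both lists share the same mean, but only in the tightly grouped list is that mean close to every individual score.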

21

Kurtosis

Describes the shape of a distribution's peak: how tall and skinny (or short and flat) it is relative to a normal distribution

22

Regression

  • Part of inferential statistics

  • Most widely known and used analytics technique in statistics

  • Used to characterize relationships between explanatory (input) and response (output) variables

  • Can be used for

    • Hypothesis testing (explanation)

    • Forecasting (prediction)
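
A minimal sketch of simple linear regression fitted by ordinary least squares, using made-up data with one explanatory variable. The variable names and the `predict` helper are hypothetical; the card does not prescribe a particular estimation method.

```python
# Made-up data: advertising spend (x) and sales (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

n = len(x)
x_mean, y_mean = sum(x) / n, sum(y) / n

# Ordinary least squares estimates for y = intercept + slope * x.
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
    (xi - x_mean) ** 2 for xi in x
)
intercept = y_mean - slope * x_mean


def predict(new_x):
    """Forecast the response for a new value of the explanatory variable."""
    return intercept + slope * new_x


print(slope, intercept, predict(6.0))
```

The fitted slope characterizes the relationship (explanation), while `predict` applies it to unseen input (prediction).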

23

Correlation vs. Regression

  1. Correlation makes no a priori assumption of whether one variable is dependent on the other(s)

  2. Correlation gives an estimate of the degree of association between the variables

  3. Regression implicitly assumes that there is a one-way effect from the explanatory variable(s) to the response variable

24

Logistic Regression

A statistical method used to model relationships between a binary dependent variable and one or more independent variables. It estimates the probability of the outcome.

  • Example: Will the student pass the class? → Yes/No
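
A sketch of how a logistic model turns inputs into a probability for the pass/fail example. The coefficients below are hypothetical values chosen for illustration, not estimates fitted to any data, and "hours studied" is an assumed input variable.

```python
import math

# Hypothetical coefficients, chosen only for illustration (not fitted to real data).
INTERCEPT = -4.0
COEF_HOURS = 0.8


def pass_probability(hours_studied):
    """Logistic function applied to a linear combination of the inputs."""
    z = INTERCEPT + COEF_HOURS * hours_studied
    return 1.0 / (1.0 + math.exp(-z))


for hours in (2, 5, 8):
    p = pass_probability(hours)
    # Classify as "Yes" when the estimated probability reaches 0.5.
    print(hours, round(p, 3), "Yes" if p >= 0.5 else "No")
```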

25

Business Report

A written document that contains information regarding business matters

Purpose: To improve managerial decisions

Sources: Data from inside and outside the organization (extract, transform, and load)

Format: Text + Tables + Graphs/Charts

Distribution: In print, email, portal/internet

26

Metric Management Reports

Help manage business performance through metrics (KPIs for internals)

27

Dashboard-Type Reports

Graphical presentation of several performance indicators in a single page using dials/gauges

28

Balanced Scorecard-Type Reports

A performance measurement and management methodology that helps translate an organization’s financial, customer, internal process, and learning and growth objectives and targets into a set of actionable initiatives

  • Strategic management system

  • Identifies measurements around vision and values

  • Focuses on growth

  • Heavy on strategic content

29

Data Visualization

The use of visual representations to explore, make sense of, and communicate data

  • Often includes charts, graphs, and other illustrations

30

Information

Aggregation, summarization, and contextualization of data

31

Visual analytics

The science of analytical reasoning facilitated by interactive visual interfaces

  • May use descriptive, predictive, and prescriptive analytics

32

Information Visualization

Graphical representation of data and information.

  • It uses visual elements like charts, graphs, maps, and infographics to present data

33

Dashboard Design

The fundamental challenge of dashboard design is to display all the required information on a single screen, clearly and without distraction, in a manner that can be assimilated quickly

34

What to look for in a dashboard

  • Use of visual components to highlight data and exceptions that require actions

  • Transparent to the user, meaning that they require minimal training and are extremely easy to use

  • Combine data from a variety of systems into a single summarized, unified view of the business

  • Enable drill-down or drill-through to underlying data sources and reports

  • Present a dynamic, real-world view with timely data

  • Require little coding to implement, deploy, and maintain

35

Performance Dashboards

Provide visual displays of important information that is consolidated and arranged on a single screen so that the information can be digested at a single glance and easily drilled into and explored further

  • Commonly used in BPM software suites and BI Platforms

36

Data Mining

Used to describe the process of discovering new patterns and developing intelligence from collected, organized, and stored data

  • The process of finding mathematical patterns from (usually) large data sets such as correlations, trends, or prediction models

  • Allows a better understanding of customers, operations, and solving organizational problems

  • Other names: Knowledge extraction, pattern analysis, knowledge discovery, information harvesting, pattern searching, etc.

37

Machine Learning

A subset of artificial intelligence that involves the use of algorithms and statistical models to enable computers to perform specific tasks without explicit instructions. It relies on patterns and inference from data to improve performance over time.

  • Applications include image recognition, natural language processing, robotics, and predictive analytics.

38

BI is the entry level to what?

Descriptive Analytics

39

Data Warehouse

A collection of integrated, subject-oriented databases designed to support DSS functions

40

Characteristics of Data Warehouses

  1. Subject-Oriented: Data organized by subject

  2. Integrated: different sources

  3. Time Variant (time-series): Historical data over time

  4. Nonvolatile: Data can’t be changed or updated

  5. Metadata

  6. Client/server, real-time/right-time/active

41

Data Mart

A departmental scale “DW” that stores only limited/relevant data

42

Enterprise Data Warehouse (EDW)

A data warehouse for an enterprise (CRM, SCM)

43

Metadata

Data about Data

  • Describes the contents of the data warehouse and its acquisition and use

44

DW Architecture

Three Tier Architecture

  1. Data acquisition software

  2. The data warehouse that contains the data & software

  3. Client (front-end) software that allows users to access and analyze data from the warehouse

Two Tier Architecture

  • The first two tiers from the three-tier structure are combined

45

Data Integration

Integration that combines three major processes

  1. Accessing the data

  2. Combining different views of the data (federation)

  3. Capturing changes to the data

46

Enterprise Application Integration (EAI)

A technology that provides a vehicle for pushing data from source systems into a data warehouse

ETL = Extract, Transform, Load

47

Extract, Transform, Load

Reading data from a database, converting extracted data into required format, writing the data into target database
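
The three ETL stages can be sketched with two in-memory SQLite databases. The `orders` and `fact_orders` tables, their columns, and the cleaning rules are all hypothetical and only illustrate the read, convert, write sequence described on the card.

```python
import sqlite3

# Hypothetical source and target databases; table and column names are illustrative.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (customer TEXT, amount_cents INTEGER)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(" alice ", 1250), ("BOB", 300), ("alice", 750)],
)

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE fact_orders (customer TEXT, amount_dollars REAL)")

# Extract: read the rows from the source database.
rows = source.execute("SELECT customer, amount_cents FROM orders").fetchall()

# Transform: standardize customer names and convert cents to dollars.
clean_rows = [(name.strip().title(), cents / 100) for name, cents in rows]

# Load: write the converted rows into the target database.
target.executemany("INSERT INTO fact_orders VALUES (?, ?)", clean_rows)
target.commit()

loaded = target.execute(
    "SELECT customer, SUM(amount_dollars) FROM fact_orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(loaded)
```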

48

Inmon Model: EDW Approach (top-down)

A data warehousing approach that starts with an enterprise data warehouse, integrating data across the organization before creating data marts.

49

Kimball Model: Data Mart (DM) Approach (bottom-up)

A bottom-up data warehousing approach that creates data marts from operational systems.

50

Additional DW Considerations: Hosted Data Warehouses (Outsourcing)

Benefits:

  • Requires minimal investment in in-house systems

  • Frees up capacity on in-house systems

  • Frees up cash flow

  • Makes powerful solutions affordable

  • Enables solutions that provide for growth

  • Offers better quality equipment and software

  • Provides faster connections

51

Dimensional Modeling

A retrieval-based system that supports high-volume query access

52

Star Schema

Most commonly used and simplest style of dimensional modeling

  • Contain a fact table surrounded by and connected to several dimension tables
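
A minimal star schema sketch in SQLite, assuming a hypothetical `fact_sales` fact table connected to two dimension tables, `dim_product` and `dim_date`. All table names, columns, and values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical star schema: one fact table referencing two dimension tables.
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, quarter TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    revenue REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop'), (2, 'Phone');
INSERT INTO dim_date VALUES (1, 'Q1'), (2, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 1000.0), (1, 2, 1500.0), (2, 1, 400.0), (2, 2, 600.0);
""")

# A typical query joins the fact table to its dimensions and aggregates a measure.
report = conn.execute("""
    SELECT p.name, d.quarter, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_date AS d ON d.date_id = f.date_id
    GROUP BY p.name, d.quarter
    ORDER BY p.name, d.quarter
""").fetchall()
print(report)
```

The fact table holds the numeric measures, and each dimension table supplies the descriptive attributes used to group or filter them.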

53

Snowflake Schema

An extension of star schema where the diagram resembles a snowflake

54

Multidimensionality

The ability to organize, present, and analyze data by several dimensions

  • Dimensions: Products, sales volume, head count, inventory, profit, actual vs. forecast, etc.

  • Measures: Money, sales volume, head count, inventory profit

  • Time: Daily, weekly, monthly, quarterly, or yearly

55

Scalability

Refers to the degree to which a system can adjust to changes in demand without major additional changes or investments

  • Main issues of scalability:

    • The amount of data in the warehouse

    • How quickly the warehouse is expected to grow

    • The number of concurrent users

    • The complexity of user queries

56

Good Scalability

Queries and other data-access functions will grow (ideally) linearly with the size of the warehouse

57

Business Performance Management

  • Strategy Focused

  • A real-time system that alerts managers to potential opportunities and impending threats, and empowers them to react through models and collaboration

  • AKA Corporate Performance Management (CPM), Enterprise Performance Management (EPM), Strategic Enterprise Management (SEM)

58

Performance Measurement System

A system that assists managers in tracking the implementations of business strategy by comparing actual results against strategic goals and objectives

59

Key Performance Indicator

A representation of a strategic objective and metric(s) that measures performance against a goal

  • Outcome: revenues, lagging indicators

  • Driver: Sales leads, leading indicators

60

Six Sigma

A performance management methodology that aims to reduce the number of defects in a business process to as close to zero defects per million opportunities (DPMO) as possible (target: 3.4 DPMO)

  • Performance measurement system

  • Establishes accountability for leadership for wellness and profitability

  • Maximizing profitability

  • Heavy on execution for profitability
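
As a worked example of the defects-per-million figure, the sketch below computes it for made-up process data. The `dpmo` helper and the sample numbers are hypothetical; only the 3.4 threshold comes from the card.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


# Made-up process data: 17 defects in 1,000 units with 5 defect opportunities each.
observed = dpmo(defects=17, units=1_000, opportunities_per_unit=5)
SIX_SIGMA_TARGET = 3.4  # defects-per-million figure quoted on the card

meets_target = observed <= SIX_SIGMA_TARGET
print(observed, meets_target)
```

A process like this one, at roughly 3,400 defects per million opportunities, is still far from the Six Sigma target.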

61

Effective Performance Measurement Should:

  • Focus on key factors

  • Be a mix of past, present, and future

  • Balance the needs of shareholders, employees, partners, suppliers, and other stakeholders

  • Start at the top and flow down to the bottom

  • Have targets that are based on research and reality rather than being arbitrary

62

Closed Loop Process to Optimize Business Performance:

Process Steps

  • Strategize

  • Plan

  • Monitor / analyze

  • Act / adjust

Has sub-process steps

63

Strategize (Process Step 1)

  1. Conduct a current situation analysis

  2. Determine planning horizon

  3. Conduct environment scan

  4. Identify critical success factors

  5. Complete a gap analysis

  6. Create a strategic vision

  7. Develop a business strategy

  8. Identify strategic objectives and goals

64

Operational Planning (Process Step 2)

A plan that translates an organization’s strategic objectives and goals into a set of well-defined tactics and initiatives, resource requirements, and expected results for some future time period (usually a year)

  • Operational planning can be:

    • Tactic-centric (operationally focused)

    • Budget-centric (financially focused)

65

Monitor / Analyze: How are We Doing? (Process Step 3)

A comprehensive framework for monitoring performance that should address two key issues:

  • What to monitor?

    • Critical success factors

    • Strategic goals and targets

  • Here is where KPIs, dashboards, reporting, and analytics are helpful

66

Act and Adjust: What Do We Need to Do Differently?

Success (or mere survival) depends on new projects: creating new products, entering new markets, acquiring new customers (or businesses), or streamlining some process

67

Types of Data Mining Patterns

Association: Establishing relationships among items

Prediction: Act of telling about the future

Cluster (Segmentation): Finding groups of entities with similar characteristics (unknown class labels)

Sequential (or time series) relationships

68

Time Series Forecasting

Values of the same variable are captured over time
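
One very simple way to forecast from such a series is a moving average. This technique is an assumption chosen for illustration (the card does not name a method), and the sales figures are made up.

```python
# Made-up monthly sales figures: the same variable captured over time.
sales = [100, 110, 105, 115, 120, 125]


def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)


forecast = moving_average_forecast(sales)
print(forecast)
```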

69

Data Mining Process: CRISP-DM (Cross-Industry Standard Process for Data Mining)

Highly repetitive and experimental

Proposed in the 1990s by a European consortium

  • Step 1: Business Understanding

  • Step 2: Data understanding

  • Step 3: Data Preparation

  • Step 4: Model Building

  • Step 5: Testing and Evaluation

  • Step 6: Deployment

70

Data Mining Process: SEMMA

Developed by SAS Institute

  • Sample: Generate representative sample of data

  • Explore: Visualization and basic description of the data

  • Modify: Select variables, transform variable representations

  • Model: Use variety of statistical and machine learning models

  • Assess: Evaluate the accuracy and usefulness of the models

71

Data Mining Process (Knowledge Discovery in Databases; KDD)

Sources of raw data → (Data Selection) → Target data → (Data Cleaning) → Preprocessed data → (Data Transformation) → Transformed data → (Data Mining) → Extracted patterns → (Internalization) → Knowledge (“Actionable Insight”)

72

Data Mining Methods: Classification

  • Most frequently used DM method

  • Part of the machine-learning family

  • Employ supervised learning

  • Learn patterns from past data (of previously labeled items), classify new data

  • The output variable is categorical (nominal or ordinal) in nature.

  • Classification versus regression? If numeric → Regression, If non-numeric → classification

73

Predictive Accuracy

The accuracy of a model in predicting class labels for new data.

74

Speed

Model building versus predicting/usage speed

75

Robustness

The ability of a model to make accurate predictions consistently, even in the presence of variability or uncertainty.

76

Sample Split

Split the data into 2 mutually exclusive sets: training (~70%) and testing (~30%)
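
A minimal sketch of such a split. The `split_sample` helper, the fixed seed, and the stand-in data are hypothetical; shuffling before cutting is an assumption that keeps the two sets random as well as mutually exclusive.

```python
import random


def split_sample(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and cut it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


data = list(range(100))  # stand-in for 100 labeled records
train_set, test_set = split_sample(data)
print(len(train_set), len(test_set))
```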

77

Decision Trees

  • Income + credit score (attributes) → class (loan risk low, medium, high)

  • Recursively divides a training set until each division consists of examples from one class

    1. Create a root node and assign all the training data to it

    2. Select the best splitting attribute

    3. Add a branch to the root node for each value of the split. Split the data into mutually exclusive subsets along the lines of the specific split
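
Step 2, selecting the best splitting attribute, can be sketched with information gain. That criterion is an assumption here, since the card does not say how "best" is measured, and the loan records are made up.

```python
from collections import Counter
from math import log2

# Made-up loan applicants: income and credit attributes with a risk class.
rows = [
    {"income": "high", "credit": "good", "risk": "low"},
    {"income": "high", "credit": "poor", "risk": "high"},
    {"income": "low", "credit": "good", "risk": "low"},
    {"income": "low", "credit": "poor", "risk": "high"},
    {"income": "medium", "credit": "good", "risk": "low"},
    {"income": "medium", "credit": "poor", "risk": "high"},
]


def entropy(subset):
    """Impurity of the class labels in a subset (0 means a single class)."""
    counts = Counter(r["risk"] for r in subset)
    return -sum(c / len(subset) * log2(c / len(subset)) for c in counts.values())


def information_gain(subset, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    groups = {}
    for r in subset:
        groups.setdefault(r[attribute], []).append(r)
    after = sum(len(g) / len(subset) * entropy(g) for g in groups.values())
    return entropy(subset) - after


gains = {a: information_gain(rows, a) for a in ("income", "credit")}
best = max(gains, key=gains.get)
print(gains, best)
```

In this toy data, splitting on `credit` produces subsets that each contain a single class, so it is chosen over `income`.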

78

Cluster Algorithms

Used when the data records do not have predefined class identifiers. Cases are sorted into groups so that the members of each group are similar

  • Used for automatic identification of natural groupings of things

  • Part of the machine-learning family

  • Employ unsupervised learning (Includes only descriptive attributes)

  • Learns the clusters of things from past data, then assigns new instances

  • There is not an output/target variable

79

What do Cluster Algorithms Do?

  • Identify natural groupings of customers

  • Provide characterization, definition, labeling of populations

80

k-Means Clustering Algorithm

k: pre-determined number of clusters

Algorithm (Step 0: determine the value of k)

  • Step 1: Randomly generate k points as initial cluster centers

  • Step 2: Assign each point to the nearest cluster center

  • Step 3: Re-compute the new cluster centers

  • Repetition Step: Repeat steps 2 and 3 until some convergence criterion is met (usually that the assignment of points to clusters becomes stable)
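
The steps above can be sketched for 2-D points as follows. This is an illustrative implementation under stated assumptions: initial centers are sampled from the data points (a common variant of random generation), distance is squared Euclidean, and the `k_means` function and sample points are hypothetical.

```python
import random


def k_means(points, k, seed=0, max_iterations=100):
    """Plain k-means on 2-D points; returns (centers, assignments)."""
    rng = random.Random(seed)
    # Step 1: pick k of the data points at random as the initial centers.
    centers = rng.sample(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Step 2: assign each point to its nearest center.
        new_assignments = [
            min(
                range(k),
                key=lambda c: (p[0] - centers[c][0]) ** 2 + (p[1] - centers[c][1]) ** 2,
            )
            for p in points
        ]
        # Convergence: stop once the assignments no longer change.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: recompute each center as the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centers, assignments


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, assignments = k_means(points, k=2)
print(centers, assignments)
```

With two well-separated groups like these, the first three points end up in one cluster and the last three in the other.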

81

Association Rule Mining

  • A very popular DM method in business

  • Finds interesting relationships (affinities) between variables (items or events)

  • Part of machine learning family

  • Employs unsupervised learning

  • There is no output variable

  • Also known as market basket analysis

  • Find strong relationships between products (e.g., beer and diapers)

82

The generic rule of Association Rule Mining

X → Y [S%, C%]

X, Y: Products and/or services

X: Left-hand-side (LHS)

Y: Right-hand-side (RHS)

S: Support (frequency): how often X and Y go together

C: Confidence: how often Y appears in transactions that contain X

Example: {Laptop Computer, Antivirus Software} → {Extended Service Plan} [30%, 70%]
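
Support and confidence for a rule of this form can be computed directly from a list of baskets. The transactions below are made up, so the resulting percentages differ from the 30% and 70% in the example, and the `support` and `confidence` helpers are hypothetical names.

```python
# Made-up transactions; each set is one customer's basket.
transactions = [
    {"laptop", "antivirus", "service_plan"},
    {"laptop", "antivirus", "service_plan"},
    {"laptop", "antivirus"},
    {"laptop", "mouse"},
    {"mouse"},
    {"laptop", "antivirus", "service_plan", "mouse"},
    {"antivirus"},
    {"laptop", "antivirus", "mouse"},
    {"mouse", "service_plan"},
    {"laptop"},
]


def support(itemset):
    """Share of all transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(lhs, rhs):
    """Share of the transactions containing lhs that also contain rhs."""
    return support(lhs | rhs) / support(lhs)


lhs, rhs = {"laptop", "antivirus"}, {"service_plan"}
rule_support = support(lhs | rhs)
rule_confidence = confidence(lhs, rhs)
print(f"support={rule_support:.0%}, confidence={rule_confidence:.0%}")
```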

83

Association Rule Mining Algorithms

  • Several algorithms have been developed for discovering (identifying) association rules

    • Apriori

    • Eclat

    • FP-Growth

  • The algorithms help identify frequent item sets, which are then converted to association rules

84

Data Mining Software Tools

Commercial

  • IBM SPSS Modeler (formerly Clementine)

  • SAS Enterprise Miner

  • Statistica - Dell/Statsoft

  • …many more

Free and/or Open Source

  • RapidMiner

  • Weka

  • R, …

85

Data Mining Mistakes

  • Selecting the wrong problem for data mining

  • Beginning without the end in mind

  • Not leaving sufficient time for data acquisition, selection, and preparation

  • Looking only at aggregated results and not at individual records/predictions

86

Supervised Learning

A machine learning approach where a model is trained using labeled data to learn patterns and make predictions on new data.

  • Classification: K-nearest neighbors, naïve Bayes, decision trees, logistic regression, sentiment analysis

  • Regression: Least Squares, Linear Regression, Forecasting, Non-Linear Regression, Intervention Analysis

87

Unsupervised Learning

A machine learning approach where a model identifies patterns and structures in unlabeled data without predefined outputs.

  • Clustering / Segmentation

  • K-Means

  • Outlier Detection

  • Markov Chains

88

Customer Relationship Management (Data Mining Applications)

  • Maximize return on marketing campaigns

  • Improve customer retention

  • Maximize Customer Value

  • Identify and treat most valued customers

89

Banking and Other Financial (Data Mining Applications)

  • Automate the loan application process

  • Detecting fraudulent transactions

  • Maximize customer value (cross-, up-selling)

  • Optimizing cash reserves with forecasting

90

Retailing and Logistics (Data Mining Applications)

  • Optimize inventory levels at different locations

  • Improve store layout and sales promotions

  • Optimize logistics by predicting seasonal effects

  • Minimize losses due to limited shelf life

91

Manufacturing and Maintenance (Data Mining Applications)

  • Predict / prevent machinery failures

  • Identify anomalies in production systems to optimize the use of manufacturing capacity

  • Discover novel patterns and improve product quality

92

Data Mining Applications

  • Computer hardware and software

  • Government and defense

  • Homeland security and law enforcement

  • Travel, entertainment, sports

  • Healthcare and medicine

  • Sports, …virtually everywhere…