Regression, Forecasting, and Data Mining in Business Analytics

107 Terms

1
New cards

What is the primary purpose of regression analysis in business?

To build mathematical and statistical models that characterize relationships between a dependent variable and one or more independent variables.

2
New cards

What are the three major categories of forecasting approaches?

Qualitative and judgmental techniques, statistical time-series models, and explanatory/causal methods.

3
New cards

What does R-squared measure in regression analysis?

A measure of how well the regression line fits the data: the proportion of variation in the dependent variable explained by the model. It ranges from 0 to 1, and a larger value indicates a better fit.

4
New cards

What is a simple linear regression?

A regression model that involves finding a linear relationship between one independent variable and one dependent variable.

5
New cards

What is the difference between simple and multiple linear regression?

Simple linear regression involves one independent variable, while multiple linear regression involves two or more independent variables.

6
New cards

What is the purpose of residual analysis in regression?

To check the assumptions of regression and assess the model's accuracy by examining the differences between observed and predicted values.

7
New cards

What assumptions must be checked regarding the residuals in linear regression?

Linearity, normality of errors, homoscedasticity, and independence of errors.

8
New cards

What is multicollinearity in regression analysis?

When two or more independent variables in the same regression model are highly correlated, making it difficult to isolate their effects.

9
New cards

What is the variance inflation factor (VIF)?

A measure used to estimate the degree of multicollinearity in regression analysis.
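
As a rough illustration, the sketch below computes VIF for the special case of exactly two independent variables, where the general definition VIF = 1 / (1 - R^2) reduces to 1 / (1 - r^2), with r the correlation between the two variables. The function names and sample data are hypothetical and not part of the card set.

```python
import math

def correlation(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    saa = sum((x - a_bar) ** 2 for x in a)
    sbb = sum((y - b_bar) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two independent variables (assumed case)."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Made-up data: the two predictors are moderately correlated.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
vif = vif_two_predictors(x1, x2)
```

A VIF of 1 means no correlation with the other predictor; larger values signal stronger multicollinearity.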

10
New cards

How can categorical independent variables be incorporated into regression models?

By coding them as dummy variables (0 or 1), so the effect of each category can be estimated and interpreted.
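
A minimal sketch of dummy coding, assuming one category is chosen as a baseline and left out so that the remaining 0/1 columns are measured relative to it. The helper name `dummy_code` and the region data are made up for illustration.

```python
def dummy_code(values, baseline):
    """Return one 0/1 column per category, omitting the baseline category."""
    categories = sorted(set(values) - {baseline})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical categorical variable with three levels.
regions = ["East", "West", "North", "East"]
dummies = dummy_code(regions, baseline="East")
```

Here `dummies` holds a `North` column and a `West` column; a row with zeros in both belongs to the baseline, `East`.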

11
New cards

What is a time series in forecasting?

A stream of historical data used to predict future values based on past trends.

12
New cards

What are the four components of a time series?

Random behavior, trends, seasonal effects, and cyclical effects.

13
New cards

What is the Delphi method in qualitative forecasting?

A forecasting technique that uses a panel of experts responding to a sequence of questionnaires, with feedback shared after each round.

14
New cards

What is the purpose of trendline analysis?

To show the movement of data and model relationships between variables over time.

15
New cards

What is the least-squares regression method?

A technique that finds the values of the slope and intercept that minimize the sum of the squared errors.
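
For simple linear regression, the least-squares slope and intercept have closed-form solutions. The sketch below applies them to a small made-up data set and also computes R-squared as 1 - SSE/SST; the function names are illustrative only.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Share of the variation in ys explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst

# Made-up observations that lie close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = fit_line(xs, ys)
r2 = r_squared(xs, ys, b0, b1)
```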

16
New cards

What is adjusted R-square?

A modified version of R-square that incorporates the sample size and the number of explanatory variables in the model.
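
One common form of the adjustment is 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the sample size and k is the number of explanatory variables. A small sketch with hypothetical inputs:

```python
def adjusted_r_squared(r2, n, k):
    """Adjust R-squared for sample size n and k explanatory variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than explanatory variables plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical model: R-squared of 0.90 from 11 observations and 2 predictors.
adj = adjusted_r_squared(0.90, n=11, k=2)
```

Adding a predictor that contributes little raises k without raising R-squared much, so the adjusted value can fall.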

17
New cards

What is a polynomial function used in predictive analytics?

A mathematical function such as the second-order form y = ax^2 + bx + c (higher orders add terms such as x^3), used to model curved relationships in data.

18
New cards

What is the significance of the standard error in regression?

It measures the variability of the observed Y-values from the predicted values.

19
New cards

What is the purpose of building good regression models?

To ensure accurate predictions and reliable insights from the data being analyzed.

20
New cards

What does it mean if R-squared equals 1.0?

It indicates a perfect fit between the regression model and the data.

21
New cards

What is the role of trend analysis in business forecasting?

To identify patterns and trends in historical data to make informed predictions about future performance.

22
New cards

What is a curvilinear regression model?

A nonlinear regression model that is often used when the relationship between the independent and dependent variables is not linear.

23
New cards

What is the importance of checking for normality of errors in regression?

To ensure that the residuals are normally distributed, which is an assumption for many statistical tests.

24
New cards

What is the significance of the scatter diagram in regression analysis?

It helps to visually assess the linearity of the relationship between the independent and dependent variables.

25
New cards

What is the role of indicators in qualitative forecasting?

Indicators are measures believed to influence the behavior of a variable we wish to forecast.

26
New cards

What are cyclical effects in time series?

Cyclical effects are ups and downs that play out over a much longer time frame than seasonal effects, such as several years.

27
New cards

What characterizes a stationary time series?

Stationary time series do not have trend, seasonal, or cyclical effects but are relatively constant and exhibit only random behavior.

28
New cards

What are two techniques to forecast stationary time series?

Moving average and simple exponential smoothing.
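
Both techniques can be sketched in a few lines. The code below assumes the simple exponential smoothing forecast is initialized with the first observation, which is a common convention but not stated on the card; the data and function names are made up.

```python
def moving_average_forecast(series, k):
    """Forecast the next period as the mean of the last k observations."""
    if k < 1 or k > len(series):
        raise ValueError("k must be between 1 and len(series)")
    return sum(series[-k:]) / k

def exponential_smoothing_forecast(series, alpha):
    """Next-period forecast using F[t+1] = alpha * A[t] + (1 - alpha) * F[t]."""
    forecast = series[0]  # assumed initialization: first forecast = first observation
    for actual in series:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast

demand = [10, 12, 11, 13]
ma = moving_average_forecast(demand, k=3)
ses = exponential_smoothing_forecast(demand, alpha=0.5)
```

A larger `alpha` (or a smaller `k`) makes the forecast react faster to recent observations; a smaller `alpha` smooths more heavily.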

29
New cards

What method is used to forecast time series with a linear trend?

Double exponential smoothing.
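
A minimal sketch of double exponential smoothing (Holt's method), which maintains a smoothed level and a smoothed trend. The initialization used here (level = first observation, trend = first difference) is an assumption; other choices are possible.

```python
def double_exponential_smoothing(series, alpha, beta, horizon=1):
    """Forecast `horizon` periods ahead from a smoothed level and trend."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]              # assumed initial level
    trend = series[1] - series[0]  # assumed initial trend
    for actual in series[1:]:
        prev_level = level
        level = alpha * actual + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend

# Made-up series with a perfectly linear trend of +2 per period.
sales = [2, 4, 6, 8, 10]
next_period = double_exponential_smoothing(sales, alpha=0.3, beta=0.2)
three_ahead = double_exponential_smoothing(sales, alpha=0.3, beta=0.2, horizon=3)
```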

30
New cards

What methods are commonly used to forecast time series with seasonality?

Linear regression with categorical (dummy) variables for the seasons, the Holt-Winters additive seasonality model without trend, and the Holt-Winters multiplicative seasonality model without trend.

31
New cards

Which models can be used to forecast time series with both seasonality and trend?

Holt-Winters additive seasonality model with trend and Holt-Winters multiplicative seasonality with trend.

32
New cards

What is the focus of data mining?

Data mining studies characteristics and patterns of data using statistical and analytical tools.

33
New cards

Name four common approaches in data mining.

Cluster analysis, classification, association, and cause-and-effect modeling.

34
New cards

What is cluster analysis?

Cluster analysis is a technique that groups a collection of objects into subsets or clusters, where objects in the same cluster are more similar to each other than to those in different clusters.

35
New cards

How is classification used in data mining?

Classification predicts how to classify a new data element, such as identifying fraudulent credit card transactions.

36
New cards

What does association analysis do?

Association analysis identifies natural associations among variables to create rules for target marketing or buying recommendations.

37
New cards

What is cause-and-effect modeling?

Cause-and-effect modeling develops analytic models to describe relationships between metrics that drive business performance.

38
New cards

What is the purpose of regression analysis in data mining?

Regression analysis helps predict relationships or future values of variables of interest.

39
New cards

What is the role of descriptive analytics in data mining?

Descriptive analytics helps identify patterns in data; predicting future outcomes from those patterns is the role of predictive analytics.

40
New cards

What is the significance of the Holt-Winters model?

The Holt-Winters model is significant for forecasting time series data that exhibit seasonality and trends.

41
New cards

What tools can be used for cluster analysis in Excel?

Excel can be used for simple cluster analysis through its data analysis tools.

42
New cards

What is the goal of conducting trendline analysis?

The goal of trendline analysis is to identify trends in data over time.

43
New cards

What does multiple linear regression analyze?

Multiple linear regression analyzes the relationship between two or more independent variables and a dependent variable.

44
New cards

What is the importance of regression assumptions?

Regression assumptions are important to ensure the validity of the regression model and its predictions.

45
New cards

What is qualitative forecasting?

Qualitative forecasting uses subjective judgment and opinion to make predictions about future events.

46
New cards

What is judgmental forecasting?

Judgmental forecasting involves making predictions based on expert opinions and insights.

47
New cards

What is the purpose of correlation analysis in data mining?

Correlation analysis helps identify the strength and direction of relationships between variables.

48
New cards

What is the significance of using Excel for data mining tasks?

Excel provides tools for analysis, visualization, and data manipulation, making it accessible for data mining tasks.

49
New cards

What is the outcome of effective data mining?

Effective data mining leads to better decision-making and improved business performance.

50
New cards

What is the relationship between customer satisfaction and contract renewal rates?

Understanding this relationship can lead to improved performance and retention strategies.

51
New cards

What should objects within clusters exhibit?

A high amount of similarity, while objects in different clusters should be dissimilar.

52
New cards

What is hierarchical clustering?

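A clustering method in which the data are not partitioned into a particular set of clusters in a single step, but through a series of partitions, which may run from one cluster containing all objects to n clusters of one object each.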

53
New cards

What are agglomerative clustering methods?

A subdivision of hierarchical clustering that proceeds by a series of fusions of the n objects into groups; it is the more commonly used of the two subdivisions.

54
New cards

What are divisive clustering methods?

A subdivision of hierarchical clustering that separates n objects into finer groupings successively.

55
New cards

What is Euclidean distance?

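A measure of distance between objects that extends the straight-line distance between two points on a plane to any number of dimensions: the square root of the sum of squared differences across all attributes.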
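
A short sketch of the calculation for points with any number of attributes; the function name and sample points are illustrative.

```python
import math

def euclidean_distance(p, q):
    """Square root of the sum of squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of attributes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d_plane = euclidean_distance((0, 0), (3, 4))        # two attributes
d_space = euclidean_distance((1, 2, 3), (4, 6, 3))  # three attributes
```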

56
New cards

What is single linkage clustering?

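An agglomerative method that starts from individual objects and repeatedly merges the two closest clusters until only one cluster remains, where the distance between clusters is the distance between their closest pair of objects.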

57
New cards

What is complete linkage clustering?

A method where the distance between clusters is defined as the distance between the most distant pair of objects from each cluster.

58
New cards

What is average linkage clustering?

A method where the distance between two clusters is defined as the average of distances between all pairs of objects from each group.
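
The linkage rules differ only in how they summarize the pairwise distances between two clusters. The sketch below compares single linkage (closest pair), complete linkage (most distant pair), and average linkage on two small made-up clusters, assuming Euclidean distance between objects.

```python
import math
from itertools import product

def pair_distances(cluster_a, cluster_b):
    """Euclidean distances between every object in a and every object in b."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    return min(pair_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    return max(pair_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    distances = pair_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)

# Two hypothetical clusters of one-attribute objects.
a = [(0.0,), (1.0,)]
b = [(4.0,), (6.0,)]
```

For these clusters the pairwise distances are 4, 6, 3, and 5, so the three rules give 3, 6, and 4.5 respectively.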

59
New cards

What is Ward's hierarchical clustering?

A clustering method that uses a sum-of-squares criterion.

60
New cards

What does a dendrogram visualize?

It visualizes different numbers of clusters at various stages of the clustering process.

61
New cards

What is classification in data mining?

The process of assigning a categorical outcome to one of two or more categories based on various data attributes.

62
New cards

What is a classification matrix?

A matrix that shows the number of cases classified correctly or incorrectly, assessing the effectiveness of a classification rule.
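
A minimal sketch of building such a matrix by counting (actual, predicted) pairs. The labels are made up, with 1 and 0 standing for two hypothetical classes.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count how often each (actual, predicted) combination occurs."""
    return Counter(zip(actual, predicted))

actual    = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]

matrix = classification_matrix(actual, predicted)
# Cases on the diagonal (actual == predicted) were classified correctly.
correct = sum(count for (a, p), count in matrix.items() if a == p)
accuracy = correct / len(actual)
```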

63
New cards

What does the k-nearest neighbors (k-NN) algorithm do?

It attempts to find records in a database that are similar to one we wish to classify.
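
A minimal sketch of the idea, assuming similarity is measured by Euclidean distance and the class is decided by majority vote among the k nearest records. The records, labels, and function name are hypothetical.

```python
import math
from collections import Counter

def knn_classify(records, new_point, k=3):
    """records: list of (features, label); return the majority label of the k nearest."""
    nearest = sorted(records, key=lambda rec: math.dist(rec[0], new_point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up labeled records with two numeric attributes each.
records = [
    ((1, 1), "low"), ((1, 2), "low"), ((2, 1), "low"),
    ((8, 8), "high"), ((9, 8), "high"), ((8, 9), "high"),
]
label_a = knn_classify(records, (2, 2))
label_b = knn_classify(records, (7, 8))
```

In practice the attributes are usually rescaled first so that no single attribute dominates the distance.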

64
New cards

What is discriminant analysis?

A technique for classifying observations into predefined classes based on predictor variables.

65
New cards

What is a cut-off value in discriminant analysis?

A rule for classifying observations using discriminant scores, determining group assignment based on a midpoint.

66
New cards

What is association rule mining?

A technique that uncovers interesting associations and correlation relationships among large sets of data.

67
New cards

What is market basket analysis?

An example of association rule mining that analyzes items purchased together in a transaction.

68
New cards

What does support for an association rule express?

One of two numbers that express the degree of uncertainty about an association rule: the number (or percentage) of transactions that include all items in both the antecedent and the consequent.

69
New cards

What is confidence in association rules?

The ratio of the number of transactions that include all items in both the antecedent and the consequent to the number of transactions that include all items in the antecedent.

70
New cards

What is lift in association rules?

The ratio of confidence to expected confidence, indicating the increase in probability of the consequent given the antecedent.
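
Support, confidence, and lift can be computed directly from transaction counts. The sketch below uses a tiny made-up basket data set and takes expected confidence to be the share of all transactions that contain the consequent.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support (count), confidence, and lift for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    ante = sum(1 for t in transactions if antecedent <= t)
    cons = sum(1 for t in transactions if consequent <= t)
    confidence = both / ante
    expected_confidence = cons / n
    return {
        "support": both,
        "confidence": confidence,
        "lift": confidence / expected_confidence,
    }

# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]
metrics = rule_metrics(baskets, antecedent={"bread"}, consequent={"milk"})
```

Here the lift is below 1, so in this made-up data buying bread makes milk slightly less likely than its overall rate.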

71
New cards

What are lagging measures?

Outcomes that tell what has happened, often external business results like profit or customer satisfaction.

72
New cards

What are leading measures?

Performance drivers that predict what will happen, usually internal metrics like employee satisfaction.

73
New cards

What are the four common approaches in data mining?

Cluster analysis, classification, association, and cause-and-effect modeling.

74
New cards

What is the purpose of cause-and-effect modeling?

To develop analytic models that describe the relationship between metrics that drive business performance.

75
New cards

What is the goal of what-if analysis?

To evaluate future scenarios and answer questions regarding potential outcomes.

76
New cards

What is Monte Carlo simulation?

A technique used to understand the impact of risk and uncertainty in prediction and forecasting models.

77
New cards

What is a data table in Excel?

A data table is a range of cells that shows how changing one or two variables in your formulas will affect the results.

78
New cards

What is what-if analysis?

What-if analysis is a process of changing the values in cells to see how those changes will affect the outcome of formulas on the worksheet.

79
New cards

What is a Monte Carlo simulation?

Monte Carlo simulation is a technique used to understand the impact of risk and uncertainty in prediction and forecasting models.

80
New cards

What are the three strategies for building models?

1. Using logic and business principles, 2. Using influence diagrams, 3. Using historical data.

81
New cards

What is verification in spreadsheet engineering?

Verification is the process of ensuring that a model is accurate and free from logical errors.

82
New cards

What are descriptive spreadsheet models focused on?

Descriptive spreadsheet models focus on understanding the past.

83
New cards

What do predictive spreadsheet models aim to do?

Predictive spreadsheet models focus on understanding the future.

84
New cards

What is the focus of prescriptive spreadsheet models?

Prescriptive spreadsheet models focus on identifying the best solution.

85
New cards

What is a random variate?

A random variate is a value randomly generated from a specified probability distribution.

86
New cards

What is a feasible solution in optimization?

A feasible solution is any solution that satisfies all constraints of a problem.

87
New cards

What is the objective function in optimization models?

The objective function is the function that needs to be maximized or minimized in an optimization problem.

88
New cards

What are decision variables in optimization models?

Decision variables are the unknown values that the model seeks to determine.

89
New cards

What is a binding constraint?

A binding constraint is a constraint that holds as an equality at the optimal solution.

90
New cards

What is an unbounded solution in optimization?

An unbounded solution occurs when the value of the objective can be increased or decreased without bound without violating any constraints.

91
New cards

What does the Solver Answer Report provide?

The Solver Answer Report provides basic information about the solution, including the values of the original and optimal objective function and decision variables.

92
New cards

What are the four possible outcomes when solving a linear optimization model?

1. Unique optimal solution, 2. Alternative optimal solutions, 3. Unbounded solution, 4. Infeasibility.

93
New cards

What is the purpose of the Scenario Manager in Excel?

The Scenario Manager allows users to create and manage different scenarios for data analysis.

94
New cards

What is Goal Seek in Excel?

Goal Seek is a tool that allows users to find the input value needed to achieve a specific goal in a formula.
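
The same kind of question can be answered in code with a simple root-finding routine. The sketch below uses bisection as a stand-in; it is not a description of how Excel implements Goal Seek, and the profit formula is a made-up example.

```python
def goal_seek(f, target, low, high, tol=1e-9, max_iter=200):
    """Find x in [low, high] where f(x) is close to target, by bisection.

    Assumes f(x) - target changes sign somewhere on the interval.
    """
    g_low = f(low) - target
    g_high = f(high) - target
    if g_low * g_high > 0:
        raise ValueError("target is not bracketed by [low, high]")
    for _ in range(max_iter):
        mid = (low + high) / 2
        g_mid = f(mid) - target
        if abs(g_mid) < tol or (high - low) / 2 < tol:
            return mid
        if g_low * g_mid <= 0:
            high = mid               # the sign change is in the lower half
        else:
            low, g_low = mid, g_mid  # the sign change is in the upper half
    return (low + high) / 2

def profit(quantity):
    # Hypothetical model: 4 per unit of margin, 3000 of fixed cost.
    return 4 * quantity - 3000

break_even = goal_seek(profit, target=0, low=0, high=2000)
```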

95
New cards

What is the first step in the Monte Carlo simulation process?

Develop the spreadsheet model.

96
New cards

What is the second step in the Monte Carlo simulation process?

Determine the probability distributions that describe the uncertain inputs to the model.

97
New cards

What is the third step in the Monte Carlo simulation process?

Identify the model output.

98
New cards

What is the fourth step in the Monte Carlo simulation process?

Determine the number of trials for the simulation.

99
New cards

What is the fifth step in the Monte Carlo simulation process?

Create a data table to summarize the values of the model output for the replications.

100
New cards

What is the sixth step in the Monte Carlo simulation process?

Compute summary statistics, percentiles, confidence intervals, frequency distributions, and histograms to interpret results.
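
The six steps above can be sketched outside a spreadsheet as well. The example below uses a made-up profit model with normally distributed demand; every number and name in it is an assumption for illustration, and a plain list takes the place of the Excel data table.

```python
import random
import statistics

def profit(demand, price=10.0, unit_cost=6.0, fixed_cost=3000.0):
    # Step 1: the model (hypothetical figures).
    return (price - unit_cost) * demand - fixed_cost

def simulate(trials=10_000, seed=42):
    rng = random.Random(seed)              # fixed seed so runs are repeatable
    results = []
    for _ in range(trials):                # Step 4: number of trials
        demand = rng.gauss(1000, 100)      # Step 2: assumed input distribution
        results.append(profit(demand))     # Steps 3 and 5: record the model output
    return results

# Step 6: summarize the simulated outputs.
results = simulate()
mean_profit = statistics.mean(results)
std_profit = statistics.stdev(results)
ordered = sorted(results)
p5 = ordered[int(0.05 * len(ordered))]     # approximate 5th percentile
p95 = ordered[int(0.95 * len(ordered))]    # approximate 95th percentile
prob_loss = sum(r < 0 for r in results) / len(results)
```

The spread between `p5` and `p95`, and the estimated probability of a loss, convey the risk that a single-point forecast would hide.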