ADMN3121H - CHAPTER 12 LECTURE SLIDES (Blackboard)

Chapter 12: Data Analytic Thinking and Prediction

Explain how management accountants can work with data to create value.
Identify the questions management wants to ask and the relevant data.
Explain the elements of a decision tree model.
Describe how to refine a decision tree model to ensure data represents the business context.
Explain how to validate predictions of full versus refined decision trees.
Evaluate predictions of different data science models to choose the best one for business needs and visualize and communicate model insights.
Describe how to use and deploy data science models.

Data Science: Use of data analytics to draw conclusions from data.
- Intersects with:
  1. Computer science and data skills
  2. Math and statistics
  3. Subject matter expertise in a specific area along with management accounting knowledge.

Predictive Modeling: A data science technique to make predictions using past or current data.
- Utilizes large data sets to train sophisticated algorithmic models.
- Models learn from training data to predict new records based on features of interest.
- Helps understand cost drivers; primary goal is value creation across the value chain.

Management accountants must identify the essential questions to examine before making business decisions.
Forming the right questions is critical to a successful data science undertaking.
With vast amounts of data available, evaluate whether an analysis is worth performing.

The management accountant's role is to evaluate data quality:
- Objectivity of data: Is it estimated or carefully measured?
- Relevance of data and costs for decisions.
Exploratory Data Analysis: Provides insight into a data set through numeric analysis (mean, median, etc.).
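A minimal sketch of the numeric side of exploratory data analysis, using only Python's standard library. The sample values and the name `loan_amounts` are hypothetical and are not taken from the course material.

```python
import statistics

# Hypothetical sample: amounts (in dollars) for a handful of records.
loan_amounts = [12000, 8500, 15000, 9800, 40000, 11000, 9500]

mean_amount = statistics.mean(loan_amounts)
median_amount = statistics.median(loan_amounts)
stdev_amount = statistics.stdev(loan_amounts)  # sample standard deviation

print(f"mean   = {mean_amount:.2f}")
print(f"median = {median_amount:.2f}")
print(f"stdev  = {stdev_amount:.2f}")
```

A mean well above the median, as in this made-up sample, is a hint that a few large values are pulling the average up and may deserve a closer look.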

Organizing and processing data:
- Identifying additional needed data and measuring variables.
- Data Leakage: Exclude data that would not be available at the time the prediction is made.
- Ensure features used for predictive modeling are appropriate, and account for legal considerations involving personal data.

Various models, from regression to neural networks, are used to analyze data.
Models fit data flexibly and rely on computational power.
Collaboration between management accountants and data scientists is key in model development.

Functional Relationship Models: Assume specific relationships between feature and target variables.
Decision Trees: Easy to interpret and build, segmenting the target variable using rules.
- Decision Tree Algorithm: Subdivides data based on features.
- Recursive Partitioning: Repeated subdivision of the data until each resulting rectangle (region of the feature space) is pure, i.e., contains observations of a single class.

Decision tree visualizations highlight:
- Decision nodes (shown as circles) and their connecting paths.
- Terminal nodes (shown as rectangles), which represent groupings that are pure.

Gini Impurity: Measures how mixed (impure) a collection of observations is.
- High impurity indicates a mixed set; lower impurity suggests dominance of one class.
- Steps to Calculate Gini Impurity:
  1. Establish baseline Gini impurity.
  2. Assess new Gini impurities for potential cuts.
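The two steps above can be sketched as follows, assuming the standard definition of Gini impurity (1 minus the sum of squared class proportions), which the slides do not spell out. The labels and the candidate cut are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(left, right):
    """Impurity after a cut: size-weighted average of the two sides."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n

# Step 1: baseline impurity of all records (hypothetical: 6 "repay", 4 "default").
all_labels = ["repay"] * 6 + ["default"] * 4
baseline = gini_impurity(all_labels)

# Step 2: impurity after one candidate cut that sends 6 records left and 4 right.
left_side = ["repay"] * 5 + ["default"] * 1
right_side = ["repay"] * 1 + ["default"] * 3
after_cut = weighted_gini(left_side, right_side)

print(f"baseline  = {baseline:.4f}")
print(f"after cut = {after_cut:.4f}")
print(f"reduction = {baseline - after_cut:.4f}")
```

The cut with the largest reduction relative to the baseline is the one a decision tree algorithm would typically choose at that node.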

Management accountants help refine the model by addressing:
1. Overfitting: The model adheres too closely to noise in the data.
2. Pruning: Limiting the tree's growth to a predetermined depth.

Overfitting: Reduces model effectiveness by capturing random noise, affecting future predictions.
- Recognizing overfitting is crucial for management accountants.
Pruning: Controls tree growth to enhance model performance, raising questions about optimal depth.
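A minimal sketch of recursive partitioning with an optional depth limit, to make the full versus pruned distinction concrete. It is an illustration only, not the algorithm used by any particular software package: the function names, the single-feature data set, and the choice to cut at observed values are all assumptions.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) cut with the lowest weighted Gini,
    or None when no cut improves on the unsplit node."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, max_depth=None, depth=0):
    """Recursively partition the records; max_depth=None grows the full tree."""
    can_grow = max_depth is None or depth < max_depth
    split = best_split(rows, labels) if can_grow else None
    if split is None:
        # Terminal node: predict the majority class of the records that reach it.
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], max_depth, depth + 1),
        "right": build_tree([r for r, _ in right], [y for _, y in right], max_depth, depth + 1),
    }

def predict(tree, row):
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

def count_leaves(tree):
    if "leaf" in tree:
        return 1
    return count_leaves(tree["left"]) + count_leaves(tree["right"])

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)

# Hypothetical data: one feature per record and a noisy binary outcome.
rows = [(1,), (2,), (3,), (4,), (5,), (6,), (7,), (8,)]
labels = [0, 0, 1, 0, 1, 1, 0, 1]

full_tree = build_tree(rows, labels)                  # grown until every leaf is pure
pruned_tree = build_tree(rows, labels, max_depth=1)   # limited to a single cut

print("full:  ", count_leaves(full_tree), "leaves, training accuracy", accuracy(full_tree, rows, labels))
print("pruned:", count_leaves(pruned_tree), "leaves, training accuracy", accuracy(pruned_tree, rows, labels))
```

The full tree fits the training records perfectly, which is exactly why training accuracy alone cannot reveal overfitting: part of that fit may be noise that will not recur in new records.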

Data scientists apply techniques to compare full vs. pruned decision trees:
1. Cross-validation using prediction accuracy.
2. Cross-validation using maximum likelihood.
3. Testing on holdout samples.

Cross-Validation for Prediction Accuracy: Compares model predictions against known outcomes on data held out from training.
Cross-Validation for Maximum Likelihood: Uses likelihood values computed on held-out data to gauge model performance.
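A minimal sketch of both cross-validation criteria, scored on held-out folds. To stay short it does not grow real trees: `fit_fixed_cuts` is a hypothetical stand-in whose cut points are fixed in advance, so only the leaf probabilities are learned from each training fold. The data, the cut points, and the smoothing rule are all assumptions.

```python
import math

def k_fold_indices(n, k):
    """Assign record i to fold i % k (deterministic; real projects usually shuffle)."""
    return [list(range(f, n, k)) for f in range(k)]

def fit_fixed_cuts(cuts):
    """Stand-in for a tree: cut points are given, leaf probabilities are learned."""
    def leaf_of(x):
        return sum(x > c for c in cuts)

    def fit(xs, ys):
        ones, totals = {}, {}
        for x, y in zip(xs, ys):
            leaf = leaf_of(x)
            ones[leaf] = ones.get(leaf, 0) + y
            totals[leaf] = totals.get(leaf, 0) + 1

        def predict_proba(x):
            leaf = leaf_of(x)
            # Laplace smoothing keeps probabilities away from exactly 0 or 1.
            return (ones.get(leaf, 0) + 1) / (totals.get(leaf, 0) + 2)

        return predict_proba

    return fit

def cross_validate(xs, ys, fit, k=4, cutoff=0.5):
    """Return (prediction accuracy, log-likelihood) pooled over held-out folds."""
    correct, log_lik = 0, 0.0
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        predict_proba = fit(train_x, train_y)
        for i in fold:
            p = predict_proba(xs[i])
            correct += int((p >= cutoff) == (ys[i] == 1))
            log_lik += math.log(p if ys[i] == 1 else 1 - p)
    return correct / len(xs), log_lik

# Hypothetical data: one feature and a binary outcome with one noisy record (x = 2).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 1, 0, 0, 1, 1, 1, 1]

shallow_acc, shallow_ll = cross_validate(xs, ys, fit_fixed_cuts([4.5]))
deep_acc, deep_ll = cross_validate(xs, ys, fit_fixed_cuts([2.5, 4.5, 6.5]))

print(f"one cut:    accuracy = {shallow_acc:.3f}, log-likelihood = {shallow_ll:.3f}")
print(f"three cuts: accuracy = {deep_acc:.3f}, log-likelihood = {deep_ll:.3f}")
```

Higher accuracy and a higher (less negative) log-likelihood both indicate better out-of-sample performance. On this made-up sample the one-cut model scores better on both, but that is a property of the data chosen here, not a general result.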

Evaluate models based on:
- Likelihood values, feature variable relevance, and misclassification rates.
- ROC Curve and Confusion Matrix as tools for visual evaluation.

Visualizing insights helps communicate model value:
- Decision trees show separation of classes.
- ROC curves depict the trade-off between the true positive rate and the false positive rate as the classification cutoff changes.
- Confusion matrices outline predicted vs. actual classifications.
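A minimal sketch of how a confusion matrix and the points on an ROC curve can be computed from predicted probabilities. The outcomes, the scores, and the 0.5 cutoff are hypothetical.

```python
def confusion_matrix(actual, predicted):
    """Count the four outcomes for a binary classifier (1 = positive class)."""
    pairs = list(zip(actual, predicted))
    return {
        "TP": sum(a == 1 and p == 1 for a, p in pairs),
        "FP": sum(a == 0 and p == 1 for a, p in pairs),
        "FN": sum(a == 1 and p == 0 for a, p in pairs),
        "TN": sum(a == 0 and p == 0 for a, p in pairs),
    }

def roc_points(actual, scores):
    """(false positive rate, true positive rate) at each cutoff found in the scores."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        predicted = [1 if s >= cutoff else 0 for s in scores]
        cm = confusion_matrix(actual, predicted)
        points.append((cm["FP"] / negatives, cm["TP"] / positives))
    return points

# Hypothetical actual outcomes and model-predicted probabilities.
actual = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

predicted = [1 if s >= 0.5 else 0 for s in scores]
cm = confusion_matrix(actual, predicted)
points = roc_points(actual, scores)

print("confusion matrix at cutoff 0.5:", cm)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
```

Each cutoff yields one confusion matrix and therefore one point on the ROC curve; plotting the points from (0, 0) to (1, 1) gives the curve itself.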

Collaborate with managers to operationalize models:
- Evaluate any necessary modifications and balance quantitative and qualitative assessments.
- Evaluate how sensitive payoffs are to the decisions made.
- Understanding statistical tools is key to creating value through informed decision-making.
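A minimal sketch of the payoff sensitivity evaluation mentioned above: attach a payoff to each cell of a confusion matrix and see how the total changes when one payoff assumption changes. All counts and dollar amounts are hypothetical.

```python
def total_payoff(counts, payoffs):
    """Sum of (number of records in each outcome) x (payoff per record)."""
    return sum(counts[outcome] * payoffs[outcome] for outcome in counts)

# Hypothetical confusion-matrix counts from acting on a model's predictions.
counts = {"TP": 3, "FP": 2, "FN": 1, "TN": 2}

# Hypothetical payoff per record, in dollars, for each outcome.
base_payoffs = {"TP": 300, "FP": -250, "FN": 0, "TN": 0}

base_total = total_payoff(counts, base_payoffs)
print("base case total payoff:", base_total)

# Sensitivity: vary the assumed loss on each false positive.
for fp_loss in (-150, -250, -350):
    scenario = dict(base_payoffs, FP=fp_loss)
    print(f"loss per false positive {fp_loss}: total payoff {total_payoff(counts, scenario)}")
```

If the decision to deploy the model flips within a plausible range of one payoff assumption, that assumption deserves closer scrutiny before the model is put into use.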