Perform the Analysis: Types of Data Analytics

Perform the Analysis Phase (AMPS Model)
- Focuses on selecting and applying the appropriate type of analytics to answer accounting questions.
Four Types of Analytics
- Descriptive – What happened?
- Diagnostic – Why did it happen?
- Predictive – Will it happen in the future?
- Prescriptive – What should we do based on what we expect will happen?
Descriptive Analytics
- Summarizes and organizes historical data to describe what has occurred.
- Tools: counts, sums, averages, ratios, graphs, sparklines, horizontal and vertical analysis.
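As an illustration of two of the tools listed above, the sketch below runs a vertical (common-size) analysis and a horizontal (period-over-period) analysis. The figures and the function names `vertical_analysis` and `horizontal_analysis` are hypothetical, chosen only for this example.

```python
# Hypothetical income statement figures (in thousands) for two years.
statements = {
    "2022": {"Revenue": 1000.0, "COGS": 600.0, "Operating expenses": 250.0},
    "2023": {"Revenue": 1200.0, "COGS": 690.0, "Operating expenses": 300.0},
}


def vertical_analysis(year_data, base="Revenue"):
    """Express each line item as a percentage of the base item (common-size)."""
    return {item: 100.0 * value / year_data[base] for item, value in year_data.items()}


def horizontal_analysis(prior, current):
    """Percentage change in each line item from the prior period to the current one."""
    return {item: 100.0 * (current[item] - prior[item]) / prior[item] for item in prior}


common_size_2023 = vertical_analysis(statements["2023"])
growth = horizontal_analysis(statements["2022"], statements["2023"])

for item in statements["2023"]:
    print(f"{item}: {common_size_2023[item]:.1f}% of revenue, {growth[item]:+.1f}% vs. prior year")
```

Vertical analysis answers "how large is each item relative to revenue?", while horizontal analysis answers "how much did each item change?". Both describe what happened without explaining why.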
Diagnostic Analytics
- Investigates the causes behind trends or anomalies in descriptive data.
- Tools: variance analysis, sequence checks, Benford’s Law, drill-downs, pivot tables, correlation/regression, hypothesis testing.
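Benford's Law, listed among the tools above, states that in many naturally occurring data sets the leading digit d appears with probability log10(1 + 1/d). The sketch below compares observed leading-digit frequencies against that expectation. The invoice amounts and helper names are hypothetical, and a sample this small is for illustration only; a real test would use a large population and a formal goodness-of-fit measure.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected proportion of leading digit d under Benford's Law: log10(1 + 1/d)."""
    return math.log10(1 + 1 / digit)


def leading_digit(amount):
    """First nonzero digit of a nonzero amount, read from scientific notation."""
    return int(f"{abs(amount):.15e}"[0])


def leading_digit_profile(amounts):
    """Map each digit 1-9 to a pair (observed share, expected share)."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("need at least one nonzero amount")
    counts = Counter(digits)
    n = len(digits)
    return {d: (counts.get(d, 0) / n, benford_expected(d)) for d in range(1, 10)}


# Hypothetical invoice amounts.
invoices = [123.45, 1890.00, 27.10, 310.00, 1450.75, 96.20, 118.00, 2400.00, 175.30, 530.00]
profile = leading_digit_profile(invoices)

for d, (observed, expected) in profile.items():
    print(f"digit {d}: observed {observed:.2f}, expected {expected:.3f}")
```

Large gaps between observed and expected shares do not prove anything on their own; they flag digits worth drilling into.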
Predictive Analytics
- Assesses likelihoods and future outcomes based on historical data.
- Tools: classification (e.g., Altman's Z-score), regression, time series forecasting.
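As a sketch of classification, the code below computes the original Altman Z-score for publicly traded manufacturing firms, Z = 1.2·X1 + 1.4·X2 + 3.3·X3 + 0.6·X4 + 1.0·X5, and assigns the conventional zones (above 2.99 safe, 1.81 to 2.99 grey, below 1.81 distress). The input figures and function names are hypothetical; other variants of the model use different coefficients and cutoffs.

```python
def altman_z(working_capital, retained_earnings, ebit, market_value_equity,
             sales, total_assets, total_liabilities):
    """Original Altman Z-score (public manufacturing firms)."""
    x1 = working_capital / total_assets            # liquidity
    x2 = retained_earnings / total_assets          # cumulative profitability
    x3 = ebit / total_assets                       # operating profitability
    x4 = market_value_equity / total_liabilities   # leverage
    x5 = sales / total_assets                      # asset turnover
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def classify_zone(z):
    """Map a Z-score to the conventional zones of the original model."""
    if z > 2.99:
        return "safe"
    if z >= 1.81:
        return "grey"
    return "distress"


# Hypothetical company figures (same currency units throughout).
z = altman_z(
    working_capital=200.0,
    retained_earnings=300.0,
    ebit=150.0,
    market_value_equity=800.0,
    sales=1200.0,
    total_assets=1000.0,
    total_liabilities=500.0,
)
print(f"Z = {z:.3f}, zone = {classify_zone(z)}")
```

The score turns historical ratios into a class label, which is what makes it a predictive rather than a descriptive tool.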
Prescriptive Analytics
- Identifies optimal actions by evaluating possible outcomes under constraints.
- Tools: sensitivity analysis, capital budgeting, marginal analysis, goal seek, what-if scenario planning.
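Goal seek and what-if analysis can be sketched outside a spreadsheet as well. The code below uses bisection to find the selling price that produces a target profit, then tabulates profit at a few alternative prices. The profit model, its default values, and the name `goal_seek` are assumptions made for this example; this is not how Excel's Goal Seek is implemented internally.

```python
def profit(price, units=5000, variable_cost=12.0, fixed_costs=40000.0):
    """Hypothetical profit model: contribution margin times units, less fixed costs."""
    return (price - variable_cost) * units - fixed_costs


def goal_seek(func, target, low, high, tol=1e-6, max_iter=200):
    """Find x in [low, high] with func(x) close to target, using bisection.

    Assumes func is continuous and that the target is bracketed by low and high.
    """
    f_low = func(low) - target
    f_high = func(high) - target
    if f_low == 0:
        return low
    if f_high == 0:
        return high
    if f_low * f_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = func(mid) - target
        if abs(f_mid) < tol or (high - low) / 2 < tol:
            return mid
        if f_low * f_mid < 0:
            high = mid                   # root lies in the lower half
        else:
            low, f_low = mid, f_mid      # root lies in the upper half
    return (low + high) / 2


# Goal seek: which price gives zero profit (break-even)?
break_even_price = goal_seek(profit, target=0.0, low=0.0, high=100.0)

# What-if / sensitivity: profit at a few candidate prices.
what_if = {price: profit(price) for price in (18.0, 20.0, 22.0)}

print(f"break-even price: {break_even_price:.2f}")
for price, value in what_if.items():
    print(f"price {price:.2f} -> profit {value:,.0f}")
```

Goal seek works backward from a desired output to the input that achieves it; the what-if table works forward from candidate inputs to their outcomes.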
Statistical Concepts in Analytics
- Population vs. Sample
  - Population: The entire group of individuals, objects, or data points that a researcher is interested in studying. It represents all possible observations.
  - Sample: A smaller subset drawn from the population, ideally representative of it. Data is collected from the sample to make inferences about the larger population.
- Parameters vs. Statistics
  - Parameters: Numerical values that describe a characteristic of an entire population. These are usually fixed but unknown.
  - Statistics: Numerical values that describe a characteristic of a sample. These are used to estimate population parameters.
- Probability Distributions: normal, uniform, Poisson
  - Probability Distribution: A mathematical function that describes the likelihood of different possible outcomes in an experiment.
  - Normal Distribution: A symmetric, bell-shaped distribution where most observations cluster around the central peak (mean), and values farther from the mean are less likely. Many natural phenomena follow this distribution.
  - Uniform Distribution: A distribution where all possible outcomes have an equal probability of occurrence within a given range. It looks like a rectangle when graphed.
  - Poisson Distribution: A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
- Hypothesis testing: null and alternative hypotheses, alpha, p-values, confidence intervals
  - Hypothesis Testing: A statistical method used to make decisions about a population based on sample data.
  - Null Hypothesis (H0): A statement that there is no effect, no difference, or no relationship. It is the statement being tested.
  - Alternative Hypothesis (H1 or HA): A statement that contradicts the null hypothesis, suggesting that there is an effect, a difference, or a relationship.
  - Alpha (α): The significance level, which is the probability of rejecting the null hypothesis when it is actually true (Type I error). Common values are 0.05 or 0.01.
  - p-values: The probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. A p-value below α is treated as sufficient evidence to reject the null hypothesis.
  - Confidence Intervals: A range of values, computed from sample statistics, constructed so that a stated share of such intervals (e.g., 95%) would contain the true population parameter under repeated sampling.
- t-tests and interpreting regression output
  - t-tests: Statistical tests used to compare the means of two groups or to compare a sample mean to a known population mean, especially when the population standard deviation is unknown and the sample size is small.
  - Interpreting Regression Output: Involves understanding the relationship between a dependent variable and one or more independent variables. Key elements to interpret include:
    - Regression coefficients: the change in the dependent variable for a one-unit change in the independent variable, holding any other independent variables constant.
    - p-values for coefficients: used to assess whether each coefficient is statistically significant.
    - R-squared: the proportion of variance in the dependent variable explained by the model.
    - Overall F-statistic: tests the overall significance of the model.
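To connect the regression output elements described above to the arithmetic behind them, the sketch below fits a simple (one-variable) least-squares regression by hand and reports the slope, intercept, R-squared, and the t-statistic for the slope. The data and the name `simple_ols` are hypothetical. The p-value is omitted because it requires the t distribution with n - 2 degrees of freedom, which the Python standard library does not provide.

```python
import math


def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x, with basic diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual and total sums of squares.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - sse / sst

    # Standard error of the slope and its t-statistic (H0: slope = 0).
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    t_slope = slope / se_slope

    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "se_slope": se_slope,
        "t_slope": t_slope,
    }


# Hypothetical data: advertising spend (x) and sales (y), both in thousands.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

result = simple_ols(ad_spend, sales)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Read the output in the same order as a regression table: the slope says how much sales change per unit of spend, the t-statistic says how many standard errors the slope is from zero, and R-squared says how much of the variation in sales the line accounts for.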
Excel Tools for Analysis
- Data Analysis Toolpak functions
- Regression and forecast tools
- PivotTables for drill-downs
- Goal Seek and scenario tools