Perform the Analysis: Types of Data Analytics

  1. Perform the Analysis Phase (AMPS Model)

    • Focuses on selecting and applying the appropriate type of analytics to answer accounting questions.

  2. Four Types of Analytics

    • Descriptive – What happened?

    • Diagnostic – Why did it happen?

    • Predictive – Will it happen in the future?

    • Prescriptive – What should we do based on what we expect will happen?

  3. Descriptive Analytics

    • Summarizes and organizes historical data to describe what has occurred.

    • Tools: counts, sums, averages, ratios, graphs, sparklines, horizontal and vertical analysis.
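
    As an illustration of two of the tools listed above, the sketch below computes a horizontal analysis (percent change between periods) and a vertical analysis (each line item as a share of revenue). The account names, the amounts, and the function names are hypothetical and chosen only for the example.

```python
# Hypothetical figures for two periods (illustrative only, not from the source).
prior = {"Revenue": 1000.0, "COGS": 600.0, "Operating expenses": 250.0}
current = {"Revenue": 1200.0, "COGS": 690.0, "Operating expenses": 300.0}


def horizontal_analysis(prior, current):
    """Percent change in each line item from the prior period to the current one."""
    return {item: (current[item] - prior[item]) / prior[item] * 100 for item in prior}


def vertical_analysis(period, base="Revenue"):
    """Each line item expressed as a percentage of the base item."""
    return {item: amount / period[base] * 100 for item, amount in period.items()}


horizontal = horizontal_analysis(prior, current)
vertical = vertical_analysis(current)

for item in current:
    print(f"{item:20s} change {horizontal[item]:6.1f}%   share of revenue {vertical[item]:6.1f}%")
```

    Horizontal analysis compares the same item across time; vertical analysis compares different items within one period.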

  4. Diagnostic Analytics

    • Investigates the causes behind trends or anomalies in descriptive data.

    • Tools: variance analysis, sequence checks, Benford’s Law, drill-downs, pivot tables, correlation/regression, hypothesis testing.
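
    Benford's Law, listed above, says that in many naturally occurring sets of amounts the leading digit d appears with probability log10(1 + 1/d). A minimal sketch of the comparison follows; the sample amounts and helper names are hypothetical, and a real test would use far more observations and a formal goodness-of-fit measure.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(amount):
    """Leading non-zero digit of a non-zero number."""
    # Strip the sign, then any leading zeros and decimal point.
    return int(str(abs(amount)).lstrip("0.")[0])


def first_digit_frequencies(amounts):
    """Observed share of each leading digit, ignoring zero amounts."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Hypothetical transaction amounts (illustrative only).
amounts = [123.0, 150.0, 2.5, 0.0031]
expected = benford_expected()
observed = first_digit_frequencies(amounts)

for d in range(1, 10):
    print(f"digit {d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

    Large gaps between observed and expected shares do not prove fraud; they flag accounts that may deserve a drill-down.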

  5. Predictive Analytics

    • Assesses likelihoods and future outcomes based on historical data.

    • Tools: classification (e.g., Altman's Z-score), regression, time series forecasting.
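
    To make the classification example concrete, the sketch below computes Altman's original Z-score for publicly traded manufacturers and assigns the commonly cited zones (above 2.99 safe, below 1.81 distress, grey in between). The input figures and function names are hypothetical; other versions of the model (private firms, non-manufacturers) use different coefficients and cutoffs.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, total_liabilities, sales, total_assets):
    """Original Altman Z-score (public manufacturing firms)."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def classify(z):
    """Map a Z-score to the commonly cited zones."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Hypothetical company figures (illustrative only).
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_value_equity=800, total_liabilities=500,
             sales=1200, total_assets=1000)
zone = classify(z)
print(f"Z = {z:.3f} -> {zone} zone")
```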

  6. Prescriptive Analytics

    • Identifies optimal actions by evaluating possible outcomes under constraints.

    • Tools: sensitivity analysis, capital budgeting, marginal analysis, goal seek, what-if scenario planning.
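
    The idea behind goal seek and what-if planning can be sketched in a few lines: search for the input value that makes a formula hit a target, then repeat under different assumptions. The example below finds break-even units with a simple bisection search. Bisection is used here only as a stand-in; it is not a description of how Excel's Goal Seek works internally, and the prices and costs are hypothetical.

```python
def profit(units, price=50.0, variable_cost=30.0, fixed_costs=10_000.0):
    """Profit for a given sales volume under a simple cost-volume-profit model."""
    return units * (price - variable_cost) - fixed_costs


def goal_seek(f, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with f(x) close to target, using bisection."""
    f_lo = f(lo) - target
    f_hi = f(hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed by lo and hi")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid) - target
        if abs(f_mid) < tol or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid               # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return (lo + hi) / 2


# Goal seek: how many units give zero profit?
break_even_units = goal_seek(profit, target=0.0, lo=0.0, hi=10_000.0)

# What-if / sensitivity: break-even volume under three hypothetical prices.
scenarios = {
    price: goal_seek(lambda u, p=price: profit(u, price=p), 0.0, 0.0, 10_000.0)
    for price in (45.0, 50.0, 55.0)
}

print(f"Break-even units at base price: {break_even_units:.1f}")
for price, units in scenarios.items():
    print(f"price {price:.0f}: break-even at {units:.1f} units")
```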

  7. Statistical Concepts in Analytics

    • Population vs. Sample

      • Population: The entire group of individuals, objects, or data points that a researcher is interested in studying. It represents all possible observations.

      • Sample: A subset or a smaller, representative group drawn from the population. Data is collected from the sample to make inferences about the larger population.

    • Parameters vs. Statistics

      • Parameters: Numerical values that describe a characteristic of an entire population. These are usually fixed but unknown.

      • Statistics: Numerical values that describe a characteristic of a sample. These are used to estimate population parameters.

    • Probability Distributions: normal, uniform, Poisson

      • Probability Distribution: A mathematical function that describes the likelihood of different possible outcomes in an experiment.

      • Normal Distribution: A symmetric, bell-shaped distribution where most observations cluster around the central peak (mean), and values farther from the mean are less likely. Many natural phenomena approximately follow this distribution.

      • Uniform Distribution: A distribution where all possible outcomes have an equal probability of occurrence within a given range. It looks like a rectangle when graphed.

      • Poisson Distribution: A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
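
    The three distributions above can be evaluated with Python's standard library. This is a minimal sketch: the helper names, the rate of 2 events per interval, and the 0 to 10 range are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def poisson_pmf(k, rate):
    """P(X = k) for a Poisson variable with the given mean rate."""
    return rate ** k * math.exp(-rate) / math.factorial(k)


def uniform_pdf(x, low, high):
    """Density of a continuous uniform distribution on [low, high]."""
    return 1 / (high - low) if low <= x <= high else 0.0


# Normal: share of observations within one standard deviation of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

# Poisson: chance of zero events when the mean rate is 2 per interval.
p_zero_events = poisson_pmf(0, rate=2)

# Uniform: every point in the range has the same density.
density_mid = uniform_pdf(5, low=0, high=10)

print(f"Normal, within 1 SD:   {within_one_sd:.4f}")
print(f"Poisson, P(X=0 | 2):   {p_zero_events:.4f}")
print(f"Uniform density [0,10]: {density_mid:.2f}")
```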

    • Hypothesis testing: null and alternative hypotheses, alpha, p-values, confidence intervals

      • Hypothesis Testing: A statistical method used to make decisions about a population based on sample data.

      • Null Hypothesis (H0): A statement that there is no effect, no difference, or no relationship. It is the statement being tested.

      • Alternative Hypothesis (H1 or Ha): A statement that contradicts the null hypothesis, suggesting that there is an effect, a difference, or a relationship.

      • Alpha (α): The significance level, which is the probability of rejecting the null hypothesis when it is actually true (Type I error). Common values are 0.05 or 0.01.

      • p-values: The probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. A p-value below α is treated as sufficient evidence to reject the null hypothesis.

      • Confidence Intervals: A range of values, derived from sample statistics, built by a procedure that captures the true population parameter at a stated rate across repeated samples (e.g., a 95% confidence interval).
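
    The pieces defined above fit together as follows. This sketch tests H0: mean = 49 against a two-sided alternative and builds a 95% confidence interval. It assumes a sample large enough for the normal approximation (a z-test), which keeps the example within the standard library; the data and the hypothesized mean are made up for illustration.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of 36 observations (illustrative only).
sample = [48, 52] * 18
mu0 = 49        # value of the mean claimed by the null hypothesis
alpha = 0.05    # significance level

n = len(sample)
x_bar = mean(sample)
se = stdev(sample) / n ** 0.5          # standard error of the mean

# Test statistic and two-sided p-value under the normal approximation.
z = (x_bar - mu0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = p_value < alpha

# Confidence interval at the (1 - alpha) level.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
ci = (x_bar - z_crit * se, x_bar + z_crit * se)

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
print(f"{(1 - alpha):.0%} CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```

    The two views agree: the null hypothesis is rejected at level α exactly when the hypothesized mean falls outside the (1 - α) confidence interval.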

    • t-tests and interpreting regression output

      • t-tests: Statistical tests used to compare the means of two groups or to compare a sample mean to a known population mean, especially when the population standard deviation is unknown and the sample size is small.

      • Interpreting Regression Output: Involves understanding the relationship between a dependent variable and one or more independent variables.

        • Key elements to interpret include: regression coefficients (the expected change in the dependent variable for a one-unit change in an independent variable, holding the other independent variables constant), p-values for coefficients (to assess statistical significance), R-squared (the proportion of variance in the dependent variable explained by the model), and the overall F-statistic (to test the overall significance of the model).
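
    The regression quantities named above can be computed by hand for a one-variable model. The sketch below fits ordinary least squares on a small made-up data set and reports the coefficient, its t-statistic, R-squared, and the F-statistic. The data are hypothetical, and the p-values are left out because they require the t and F distributions, which are not in the standard library.

```python
# Hypothetical data: x could be advertising spend, y could be sales.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Goodness of fit.
fitted = [intercept + slope * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
r_squared = 1 - sse / sst

# Significance: t-statistic for the slope and overall F-statistic.
mse = sse / (n - 2)
se_slope = (mse / sxx) ** 0.5
t_slope = slope / se_slope
f_stat = (sst - sse) / mse

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"R-squared = {r_squared:.4f}")
print(f"t (slope) = {t_slope:.2f}, F = {f_stat:.1f}")
```

    With a single independent variable the F-statistic equals the square of the slope's t-statistic, so both tests lead to the same conclusion.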

  8. Excel Tools for Analysis

    • Analysis ToolPak functions (the Data Analysis add-in)

    • Regression and forecast tools

    • PivotTables for drill-downs

    • Goal Seek and scenario tools