Perform the Analysis: Types of Data Analytics
Perform the Analysis Phase (AMPS Model)
Focuses on selecting and applying the appropriate type of analytics to answer accounting questions.
Four Types of Analytics
Descriptive – What happened?
Diagnostic – Why did it happen?
Predictive – Will it happen in the future?
Prescriptive – What should we do based on what we expect will happen?
Descriptive Analytics
Summarizes and organizes historical data to describe what has occurred.
Tools: counts, sums, averages, ratios, graphs, sparklines, horizontal and vertical analysis.
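Horizontal and vertical analysis reduce to simple ratios, so a short sketch may help. The figures and the function names below are hypothetical, chosen only for illustration: horizontal analysis compares each line item with the prior period, and vertical analysis expresses each item as a share of a base item (assumed here to be revenue).

```python
# Hypothetical income statement figures for two periods (illustrative only).
prior = {"revenue": 1000.0, "cogs": 600.0, "operating_expenses": 250.0}
current = {"revenue": 1200.0, "cogs": 690.0, "operating_expenses": 300.0}


def horizontal_analysis(prior, current):
    """Percentage change in each line item from the prior period."""
    return {k: (current[k] - prior[k]) / prior[k] * 100 for k in prior}


def vertical_analysis(statement, base="revenue"):
    """Each line item as a percentage of the base item."""
    return {k: v / statement[base] * 100 for k, v in statement.items()}


h = horizontal_analysis(prior, current)
v = vertical_analysis(current)
for item in current:
    print(f"{item:20s} change {h[item]:6.1f}%   share of revenue {v[item]:6.1f}%")
```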
Diagnostic Analytics
Investigates the causes behind trends or anomalies in descriptive data.
Tools: variance analysis, sequence checks, Benford’s Law, drill-downs, pivot tables, correlation/regression, hypothesis testing.
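As one illustration of a diagnostic tool, Benford's Law says the leading digit d of many naturally occurring amounts appears with probability log10(1 + 1/d). The sketch below compares observed leading-digit frequencies with that expectation. The helper names are hypothetical, and powers of 2 stand in for real transaction amounts only because they are known to follow the law closely.

```python
import math
from collections import Counter


def benford_expected(d):
    """Expected share of amounts whose leading digit is d (1 to 9)."""
    return math.log10(1 + 1 / d)


def first_digit(x):
    """Leading non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. '4.2e-03'.
    return int(f"{abs(x):.15e}"[0])


def first_digit_frequencies(amounts):
    """Observed share of each leading digit, ignoring zero amounts."""
    digits = [first_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


# Stand-in data (an assumption, not real ledger entries).
amounts = [2**k for k in range(1, 101)]
observed = first_digit_frequencies(amounts)
for d in range(1, 10):
    print(d, f"observed {observed[d]:.3f}", f"expected {benford_expected(d):.3f}")
```

Large gaps between the observed and expected shares do not prove manipulation; they only flag digits worth drilling into.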
Predictive Analytics
Assesses likelihoods and future outcomes based on historical data.
Tools: classification (e.g., Altman's Z-score), regression, time series forecasting.
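To make the classification example concrete, here is a minimal sketch of Altman's Z-score, assuming the original 1968 form for publicly traded manufacturing firms (later variants use different coefficients). The function names and the input figures are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Altman Z-score, original model for public manufacturing firms."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def classify(z):
    """Map a Z-score to the conventional zones of the original model."""
    if z > 2.99:
        return "safe"
    if z >= 1.81:
        return "grey"
    return "distress"


# Hypothetical company figures (illustrative only).
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_value_equity=800, sales=1200,
             total_assets=1000, total_liabilities=500)
print(f"Z = {z:.3f} -> {classify(z)}")
```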
Prescriptive Analytics
Identifies optimal actions by evaluating possible outcomes under constraints.
Tools: sensitivity analysis, capital budgeting, marginal analysis, goal seek, what-if scenario planning.
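Goal seek works backwards from a desired output to the input that produces it. The sketch below uses plain bisection as a stand-in; it is not a description of how Excel's Goal Seek is implemented. The profit model, its prices and costs, and the function names are all hypothetical.

```python
def goal_seek(f, target, lo, hi, tol=1e-6, max_iter=200):
    """Find x in [lo, hi] with f(x) close to target, using bisection."""
    f_lo = f(lo) - target
    f_hi = f(hi) - target
    if f_lo * f_hi > 0:
        raise ValueError("target is not bracketed by lo and hi")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid) - target
        if abs(f_mid) < tol or (hi - lo) / 2 < tol:
            return mid
        if f_lo * f_mid <= 0:
            hi = mid                  # solution lies in the lower half
        else:
            lo, f_lo = mid, f_mid     # solution lies in the upper half
    return (lo + hi) / 2


def profit(units, price=50.0, variable_cost=30.0, fixed_costs=40_000.0):
    """Hypothetical cost-volume-profit model."""
    return units * (price - variable_cost) - fixed_costs


break_even_units = goal_seek(profit, target=0.0, lo=0.0, hi=10_000.0)
print(f"Break-even volume: about {break_even_units:.0f} units")
```

Changing `target` answers a what-if question directly, for example the volume needed for a given profit.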
Statistical Concepts in Analytics
Population vs. Sample
Population: The entire group of individuals, objects, or data points that a researcher is interested in studying. It represents all possible observations.
Sample: A subset or a smaller, representative group drawn from the population. Data is collected from the sample to make inferences about the larger population.
Parameters vs. Statistics
Parameters: Numerical values that describe a characteristic of an entire population. These are usually fixed but unknown.
Statistics: Numerical values that describe a characteristic of a sample. These are used to estimate population parameters.
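The distinction can be seen in a small simulation. In this sketch the "population" is a simulated set of invoice amounts (an assumption, not real data), so its mean, the parameter, is known and can be compared with the mean of a random sample, the statistic.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated invoice amounts.
population = [rng.gauss(500, 100) for _ in range(10_000)]
population_mean = statistics.mean(population)   # parameter

# A random sample of 200 invoices drawn without replacement.
sample = rng.sample(population, 200)
sample_mean = statistics.mean(sample)           # statistic

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

In practice the population mean is unknown, which is why the sample mean is used as its estimate.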
Probability Distributions: normal, uniform, Poisson
Probability Distribution: A mathematical function that describes the likelihood of different possible outcomes in an experiment.
Normal Distribution: A symmetric, bell-shaped distribution where most observations cluster around the central peak (mean), and values farther from the mean are less likely. Many natural phenomena follow this distribution.
Uniform Distribution: A distribution where all possible outcomes have an equal probability of occurrence within a given range. It looks like a rectangle when graphed.
Poisson Distribution: A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
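A brief sketch of the three distributions, using only the Python standard library. The function names and the example rate of 3 events per interval are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def poisson_pmf(k, rate):
    """Probability of exactly k events when the mean rate per interval is `rate`."""
    return rate**k * math.exp(-rate) / math.factorial(k)


def uniform_pdf(x, a, b):
    """Density of a continuous uniform distribution on [a, b]."""
    return 1 / (b - a) if a <= x <= b else 0.0


# Normal: share of observations within one standard deviation of the mean.
standard_normal = NormalDist(mu=0, sigma=1)
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)

# Poisson: chance of zero events in an interval that averages 3 events.
p_zero_events = poisson_pmf(0, 3)

print(f"Normal, within 1 SD:    {within_one_sd:.4f}")
print(f"Uniform density on 0-10: {uniform_pdf(5, 0, 10):.2f}")
print(f"Poisson P(0 | rate=3):   {p_zero_events:.4f}")
```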
Hypothesis testing: null and alternative hypotheses, alpha, p-values, confidence intervals
Hypothesis Testing: A statistical method used to make decisions about a population based on sample data.
Null Hypothesis (H0): A statement that there is no effect, no difference, or no relationship. It is the statement being tested.
Alternative Hypothesis (H1 or Ha): A statement that contradicts the null hypothesis, suggesting that there is an effect, a difference, or a relationship.
Alpha (α): The significance level, which is the probability of rejecting the null hypothesis when it is actually true (Type I error). Common values are 0.05 or 0.01.
p-values: The probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. A p-value below α is treated as sufficient evidence against the null hypothesis, which is then rejected.
Confidence Intervals: A range of values, computed from sample statistics, that is expected to contain the true population parameter at a stated confidence level. For example, a 95% confidence interval comes from a procedure that captures the parameter in about 95% of repeated samples.
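The pieces above fit together as follows. This is a minimal sketch of a two-sided, one-sample test of a mean that uses the normal approximation, which is reasonable only for larger samples; the function name, the sample values, and the hypothesized mean of 100 are all hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def z_test_mean(sample, mu0, alpha=0.05):
    """Two-sided test of H0: population mean == mu0 (normal approximation)."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / math.sqrt(n)          # standard error of the mean
    z = (x_bar - mu0) / se
    std_normal = NormalDist()
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (x_bar - z_crit * se, x_bar + z_crit * se)
    return {"mean": x_bar, "z": z, "p_value": p_value,
            "ci": ci, "reject_h0": p_value < alpha}


# Hypothetical sample of 40 measurements (illustrative only).
sample = [99 + (i % 5) for i in range(40)]
result = z_test_mean(sample, mu0=100, alpha=0.05)
print(f"mean={result['mean']:.2f}  z={result['z']:.2f}  p={result['p_value']:.5f}")
print(f"95% CI: ({result['ci'][0]:.2f}, {result['ci'][1]:.2f})")
print("Reject H0" if result["reject_h0"] else "Fail to reject H0")
```

Rejecting H0 at α = 0.05 and finding that the hypothesized mean falls outside the 95% confidence interval are the same decision viewed two ways.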
t-tests and interpreting regression output
t-tests: Statistical tests used to compare the means of two groups or to compare a sample mean to a known population mean, especially when the population standard deviation is unknown and the sample size is small.
Interpreting Regression Output: Involves understanding the relationship between a dependent variable and one or more independent variables.
Key elements to interpret include: regression coefficients (indicating the change in the dependent variable for a one-unit change in the independent variable), p-values for coefficients (to assess statistical significance), R-squared (indicating the proportion of variance in the dependent variable explained by the model), and the overall F-statistic (to test the overall significance of the model).
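To show where these output elements come from, here is a minimal sketch of simple (one-variable) least-squares regression computed by hand. The data and names are hypothetical. It reports the t-statistic and F-statistic but not their p-values, since those need the t and F distributions, which are outside the standard library.

```python
import math


def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    mse = sse / (n - 2)                      # residual variance estimate
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / sst,          # share of variance explained
        "t_slope": slope / math.sqrt(mse / sxx),
        "f_stat": (sst - sse) / mse,         # overall model significance
    }


# Hypothetical data: advertising spend (x) and sales (y).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
for name, value in fit.items():
    print(f"{name:10s} {value:10.4f}")
```

With a single independent variable the F-statistic equals the square of the slope's t-statistic, so the two tests agree.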
Excel Tools for Analysis
Analysis ToolPak functions
Regression and forecast tools
PivotTables for drill-downs
Goal Seek and scenario tools