data-driven decision making (analytics)
Using facts, metrics, and data to guide strategic business decisions that align with your goals, objectives, and initiatives. Asking the right questions.
It is the science of applying a structured method to solve a business problem using data and analysis to drive impact.
data
A collection of facts used to identify patterns, draw conclusions, make predictions, and make decisions.
Data Science
Toward Insight:
This is the technical track, designed to derive insights from data.
Decision Science
Toward Impact:
This is the business track, designed to align stakeholders so that the valuable insights produced using the data science track can be inserted into the decision-making process and converted into action.
Describe the tasks an analyst may need to perform and the software they might use
Data analysts focus on business analytics and perform tasks such as:
-Accessing, Transforming, and Manipulating (MySQL, Microsoft Excel)
-Statistical Analyses (R, Python)
-Visualizing (Tableau, Power BI Desktop)
Describe metric data.
Quantitative—continuous values
Metric data or non-metric data:
Interval Scale
(Common arithmetic operations. Examples: numerical ranking for how service was today; the percentage supervisors assign to performers, from 0% = bad to 100% = good; a temperature scale for attitude, where low temperature = bad attitude and high temperature = good attitude)
Metric data
Metric data or non-metric data:
Ratio Scale (All Arithmetic Operations -- Amount purchased, Salesperson Sales volume, Likelihood of performing some act: 0% = No Likelihood to 100% = Certainty, Number of stores visited, Time spent viewing a particular web page, Number of web pages viewed)
Metric data
Metric data or non-metric data:
Mean, Median, Variance, Standard Deviation
Metric data
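A minimal sketch of the four metric-data statistics named above, using Python's standard `statistics` module. The `minutes_on_page` values are made up for illustration and are not from the course material.

```python
import statistics

# Hypothetical ratio-scale data: minutes spent viewing a web page.
minutes_on_page = [2.0, 4.0, 4.0, 5.0, 10.0]

mean_minutes = statistics.mean(minutes_on_page)
median_minutes = statistics.median(minutes_on_page)
# variance() and stdev() use the sample (n - 1) denominator.
sample_variance = statistics.variance(minutes_on_page)
sample_std_dev = statistics.stdev(minutes_on_page)

print(mean_minutes, median_minutes, sample_variance, sample_std_dev)
```

These statistics only make sense for metric data, because they rely on arithmetic between the values.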
Metric data or non-metric data:
Ordinal - ranking scale with counting and ordering
(Frequency, Mode, Median, Range)
Non-metric data
Metric data or non-metric data:
Dissatisfied to Delighted or
HS Diploma up to Graduate Degree
Non-metric data
Describe non-metric data
Qualitative—discrete values
Identify the three characteristics of big data
Volume, Variety, Velocity
Example of non-metric data
Nominal Scale (absolute value) with only counting (Frequency, Mode)
Ex: Yes-No, Female-Male, Buy-Did Not Buy, Postal Code ______
Identify the characteristics of valuable data
Relevance, Completeness, Accuracy, Timeliness
Describe the components of a balanced scorecard
Both financial and nonfinancial metrics matter. Looking forward, backward, internally, and externally
Identify Financial Metrics
Profit
Net Present Value (NPV)
Internal Rate of Return (IRR)
Payback
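As a rough illustration of two of these financial metrics, the sketch below computes NPV with the standard discounting formula and a simplified whole-period payback. The function names, the cash flows, and the 10% rate are assumptions chosen for the example; IRR (the rate at which NPV equals zero) is not computed here.

```python
def net_present_value(rate, cash_flows):
    """Discount each cash flow back to today; cash_flows[0] occurs at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def payback_period(cash_flows):
    """First period in which cumulative cash flow is non-negative, else None.

    Simplification: counts whole periods only, with no discounting.
    """
    cumulative = 0.0
    for period, cf in enumerate(cash_flows):
        cumulative += cf
        if cumulative >= 0:
            return period
    return None


# Hypothetical project: 1000 invested today, 400 returned in each of 3 years.
flows = [-1000.0, 400.0, 400.0, 400.0]
npv_at_10_percent = net_present_value(0.10, flows)
payback = payback_period(flows)
print(round(npv_at_10_percent, 2), payback)
```

In this made-up case the project pays back within three periods yet has a slightly negative NPV at a 10% discount rate, which is one reason the two metrics are usually read together.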
Identify Non-Financial Metrics
Brand Awareness
Product Trials
Churn
Customer Satisfaction (CSAT)
Customer Lifetime Value (CLTV)
Conversions
Identify Customer Metrics
Customer Behavior -- (Frequency of Firm Desired Behavior, Strength of Firm Desired Behavior, Behavioral Intentions) and
Customer Evaluations -- (of Service Provider, of Service Experience, of Goods, of Firm, of Self)
Name the steps in the BADIR process
1) Business question
2) Analysis Plan
3) Data Collection
4) Insights
5) Recommendation
M2: Identify the advantages of taking time to establish the Business Question
1) Reduction of iterations
2) Contributions with actionable recommendations
3) Recognition as a valued partner
4) Solutions originate from discussion not data
5) Quality of decision is proportional to the time invested in fully exploring what the problem is
M2: Name Information Seeking Questions
Who? What? When? Where? Why? How?
M2: Business Intent
Context: What happened? Why are you interested? What is the problem or opportunity?
M2: What are the 3 Business Intents?
1) context
2) impacted segment
3) potential reasons
M2: Business Intent
Impacted Segment: When did it take place? Where did it happen? Who is impacted?
M2: Business Intent
Potential Reasons: What might have caused this? What do you think drives this?
M2: Business Considerations
Timelines: What decisions need to be taken and by when?
M2: What are the 3 Business Considerations?
1) timelines
2) stakeholders
3) actions
M2: Business Considerations
Stakeholder: Who is asking? Who is the decision maker? Who will take action?
M2: Business Considerations
Actions: What action are you going to take based on this analysis? Is this required one time (ad hoc) or recurring (dashboard)?
Statistical analysis examples
-Correlation Analysis
-Trend Analysis
-Predictive Analytics (forecasting, linear regression, logistic regression, testing/experiments)
-Segmentation (between group comparisons)
Descriptive Analysis
-Categorization
-Identifying Patterns and Themes
What is this like?
Descriptive analysis examples
-Aggregate Analysis
-Trend Analysis
-Sizing/Estimation
-Segmentation
-Customer Life Cycle
Statistical Inference
-Identifying Relationships
-Determining Causality
Investigating the Why or What if?
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
1. Why has conversion dropped post-launch of a product?
Statistical Inference
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
2. How many elementary schools exist in New York State?
Descriptive
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
3. Determine if and why revenue growth for "Toys and All" has slowed down over the last few weeks?
Both Descriptive and/or Statistical Inference
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
4. Can you tell me which offer worked best in the last marketing campaign?
Both Descriptive and/or Statistical Inference
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
5. Are our London office employees younger than our Singapore office employees?
Descriptive
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
6. What are the time cycles for our customers to go from hearing about us to downloading the free game and then paying for the premium features?
Descriptive
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
7. Of our one million customers, to which 200K should I send the next marketing campaign to get the best ROI?
Both Descriptive and/or Statistical Inference
M2: Identify questions that would require descriptive analysis vs. statistical analysis --
8. What are the different use cases for which our customer is using our printers? What does it mean for us?
Descriptive
What is their behavior like?
Descriptive
What are their characteristics?
Descriptive
What are our sales like?
Descriptive
What is going on in the market environment?
Descriptive
Why do our customers behave this way?
Statistical Inference
Why are our sales going down?
Statistical Inference
What if?
Statistical Inference
M2: Identify three open-ended questions:
These types of questions prompt people to answer with sentences, lists, and stories. They give deeper and new insights.
1. What is your current understanding?
2. What have you considered?
3. What surprised you?
Closed-ended questions
Limit answers, and thus yield tighter statistics.
Divergent questions are
open-ended questions that encourage creative thinking and have more than one possible answer
M2: Identify divergent questions: Go/No-go
What decision are you thinking about now?
M2: Identify divergent questions: Clarification
What do you mean?
M2: Identify divergent questions: Assumptions
What are your assumptions?
M2: Identify divergent questions: Foundational
How do we know this to be true?
M2: Identify divergent questions: Action
What could or should be done?
M2: Identify divergent questions: Cause
What is the context? Why did this happen?
M2: Identify divergent questions: Effect
What will be the impact or outcome of deciding?
M2: Describe the benefits and steps involved in IWIK questioning
-Clarifies priorities
-Uncovers essential information needed
-Identifies Knowledge gaps
-Defines assumptions
-Reveals Biases.
What are the steps for IWIK?
1) Preparing questions before data
2) Asking the right people
3) Assessing Needs
4) Working Backwards
5) Examples
M3: Identify the steps involved in developing an analysis plan (5 building blocks)
1) Analysis Goals (research objective)
2) Hypotheses
3) Methodology
4) Specify Data
5) Project Plan
What does the project plan entail?
-Resources
-Roles
-Timelines
-Risks
methodology
how we are collecting the data, where we are collecting it from and techniques to analyze it
M3: Describe the ideal characteristics of an analysis goal
-Specific
-Measurable
-Attainable
-Relevant
-Time bound
analysis goal
More specific and more measurable. Determine and define research objectives. It lays out what you can answer directly with the data you have.
hypothesis
A scientific guess that proposes a relationship between two variables (e.g., "If x goes up, then y goes down").
how is a hypothesis generated
through brainstorming sessions
hypothesis testing is based upon
probable theory
hypotheses are usually stated as
supported or not supported
hypotheses are NOT stated as
proven or disproven
Independent Variables
Unknowns that may have a relationship with the dependent variable and no relationship with each other.
These are determined by the hypotheses developed to solve the business question
Independent Variables
Dependent Variables
A variable that is the object of the particular predictive analysis. It is determined by the business question that the model is designed to solve.
Is an example of a dependent variable
Conversion
Type 1 Error
Occurs when sample data suggests that a relationship does exist when in fact a relationship does not exist
Type 2 Error
Occurs when the sample data suggests that a relationship does not exist when in fact a relationship does exist
p-value
Probability value, or the observed (computed) significance level. p-values are compared to prespecified significance levels (0.1, 0.05, 0.01) to test a hypothesis.
significance level
A critical probability value associated with a statistical hypothesis test that indicates how likely an inference supporting a difference between an observed value and some statistical expectation is true.
If the p-value resulting from a statistical test is _______ the prespecified significance level, the results support a hypothesis implying differences.
less than
What is the acceptable amount of error traditionally set by researchers?
Most typically, researchers set the acceptable amount of error, and therefore the acceptable significance level, at 0.1, 0.05, or 0.01.
To illustrate, if an analyst is comparing sales in two districts and sets the acceptable Type I error at 0.1 and the p-value resulting from the test is 0.03, then the results
support a hypothesis suggesting differences in sales in the two districts.
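The decision rule in the cards above can be written as a one-line comparison. This is only a sketch; the function name `supports_difference` is hypothetical.

```python
def supports_difference(p_value, significance_level):
    """True when the p-value is less than the prespecified significance level."""
    return p_value < significance_level


# The two-district sales example: acceptable Type I error 0.1, p-value 0.03.
district_result = supports_difference(0.03, 0.1)
print(district_result)
```

With a stricter level of 0.01, the same p-value of 0.03 would no longer support the hypothesis of a difference.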
M3: Describe the steps involved in specifying the methodology
1) Determine level of granularity
2) Assign unique ID
3) Aggregate it.
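One possible reading of these three steps, sketched with made-up transaction rows: the chosen granularity is the customer, `customer_id` is the unique ID, and the aggregation is total spend per customer. All field names and values are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical raw data at transaction level.
transactions = [
    {"customer_id": "C1", "amount": 20.0},
    {"customer_id": "C2", "amount": 5.0},
    {"customer_id": "C1", "amount": 30.0},
]

# Aggregate to customer level, keyed by the unique ID.
totals = defaultdict(float)
for row in transactions:
    totals[row["customer_id"]] += row["amount"]

spend_by_customer = dict(totals)
print(spend_by_customer)
```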
methodology-data available
-Historical data
-Secondary data
methodology-data not available
-Observation
-Survey
-Experiment
Only begins once the complete analysis plan is agreed upon by the key stakeholders
methodology
Descriptive (summary statistics)
-Mean
-Median
-Mode
-Range
-Frequency
-Sample Size
-Measures of Variability
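A short sketch of a few of these summary statistics (frequency, mode, range, sample size) using only the standard library. The `ratings` and `store_visits` values are invented for the example.

```python
import statistics
from collections import Counter

# Hypothetical ordinal survey responses.
ratings = ["Satisfied", "Delighted", "Satisfied", "Dissatisfied", "Satisfied"]
frequency = Counter(ratings)           # count of each response
mode_rating = statistics.mode(ratings)  # most common response
sample_size = len(ratings)

# Hypothetical metric data: number of stores visited.
store_visits = [1, 3, 3, 7]
value_range = max(store_visits) - min(store_visits)

print(frequency, mode_rating, sample_size, value_range)
```

Frequency and mode work for non-metric data such as the ratings, while the range is computed here on metric data.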
Correlation (r)
The statistical measure of the linear relationship between two or more metric variables, as represented by the correlation coefficient (r) with a value at or between +1 and −1.
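For reference, a minimal pure-Python sketch of the Pearson correlation coefficient for two metric variables. The function name and the `ad_spend` and `sales` figures are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Hypothetical data: advertising spend and sales for five periods.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]
r = pearson_r(ad_spend, sales)
print(round(r, 4))
```

The result always falls between -1 and +1; values near either end indicate a strong linear relationship.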
M3: Identify questions that could be answered with a Correlation (r)
Look at variables that correlate with something that the business is trying to impact.
This analysis methodology is used most frequently to solve business problems related to understanding drivers of the business or an event (Best with Continuous variables).
A _______ can be used to identify whether a correlation is statistically significant by providing a p-value, allowing the analyst to determine if the hypothesis is supported or not.
t-test
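The t-test for a correlation uses the statistic t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The sketch below computes only that statistic; turning it into a p-value needs a t-distribution, which Python's standard library does not provide. The function name and inputs are hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a correlation r from n pairs differs from 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical example: r = 0.5 observed on 5 paired observations.
t_value = correlation_t_statistic(0.5, 5)
print(t_value)
```

The same r produces a larger t as the sample size grows, which is why a modest correlation can still be statistically significant in a large sample.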
M3: Identify questions that could be answered with a Cross-tabulation
Also known as contingency table analysis, is most often used to analyze categorical (nominal measurement scale) data.
_________ determine whether or not the two variables are independent.
Chi-square tests
Chi-square statistic
the primary statistic used for testing the statistical significance of the cross-tabulation table
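A minimal sketch of the chi-square statistic for a cross-tabulation, computed from observed counts and the counts expected under independence. The function name and the 2x2 table of counts are made up for the example.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical cross-tab: rows = offer A / offer B, columns = bought / did not buy.
observed_counts = [[30, 20], [20, 30]]
chi2_value = chi_square_statistic(observed_counts)
print(chi2_value)
```

A 2x2 table has one degree of freedom, for which the 0.05 critical value is about 3.84, so a statistic above that would suggest the two variables are not independent at that level.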
M3: Linear Regression
An approach to modeling the linear relationship between a scalar dependent variable and one or more independent variables.
_______________________ usually applied towards Customer lifetime value, cost of acquisition. Can be used to predict change in an outcome.
Linear Regression
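A minimal sketch of simple (one-variable) linear regression using the closed-form least-squares formulas. The function name and the data are hypothetical; real work with several independent variables would normally use a statistics package.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: months as a customer vs. cumulative spend.
months = [1, 2, 3, 4]
spend = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(months, spend)
predicted_month_5 = intercept + slope * 5
print(slope, intercept, predicted_month_5)
```

The slope is the predicted change in the dependent variable for a one-unit change in the independent variable.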
Logistic Regression
A special case of regression in which the dependent variable is not continuous. Instead, it is discrete, or categorical, and mostly binary (0/1).
_____________________ is commonly used when there are a number of independent decisions, or discrete actions, like churn and fraud prediction
Logistic Regression
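To show how a logistic regression model produces a binary-outcome probability, the sketch below applies the logistic (sigmoid) function to a linear score. The coefficients are invented, not fitted, and the function and feature names are hypothetical.

```python
import math


def predicted_probability(intercept, coefficients, features):
    """Probability of the positive class (for example, churn = 1)."""
    score = intercept + sum(c * f for c, f in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical model: churn driven by support tickets and months inactive.
churn_probability = predicted_probability(-2.0, [0.5, 0.8], [2, 1])
predicted_churn = 1 if churn_probability >= 0.5 else 0
print(round(churn_probability, 3), predicted_churn)
```

The 0.5 cutoff used to turn the probability into a 0/1 prediction is an assumption; in practice it is chosen to suit the business question.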
Decision Tree
Graphically depicts what-if analyses in a tree-like diagram with branches indicating the chances of some event occurring.
Cluster Analysis
Partitioning technique that can classify a large set of heterogeneous observations into a small number of homogenous groups (clusters).
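One common partitioning method is k-means; the sketch below is a simplified one-dimensional version with fixed starting centroids, offered as an illustration and not as the method the course prescribes. The function name, data, and starting centroids are all assumptions.

```python
def k_means_1d(values, initial_centroids, iterations=10):
    """Group 1-D values around centroids by repeated assign-and-average steps."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assign each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data: monthly purchases per customer.
purchases = [1, 2, 3, 10, 11, 12]
final_centroids, final_clusters = k_means_1d(purchases, [0, 5])
print(final_centroids, final_clusters)
```

Here six heterogeneous observations collapse into two homogeneous groups, low and high purchasers.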
What are the two steps of data collection in the BADIR process?
-Data pull
-Data cleansing and validation