Data and Decision Making
Measurement and Data
- Collecting measurements and selling data is a significant business.
- Institutions like banks use spending data to create models for fraud prevention.
- An unusual purchase triggers a confirmation call.
- Proactive data usage has prevented substantial credit card fraud.
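One simple way to picture "unusual purchase" detection is a z-score check against a customer's past spending. This is only a minimal sketch; real fraud models are far more elaborate, and the function name, threshold, and purchase amounts below are all hypothetical.

```python
from statistics import mean, stdev

def is_unusual(history, amount, threshold=3.0):
    """Return True if `amount` lies more than `threshold` sample standard
    deviations from the mean of the past purchases in `history`."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # every past purchase was identical
    return abs(amount - mu) / sigma > threshold

# Hypothetical purchase history in dollars (illustrative only).
past_purchases = [42.10, 18.75, 60.00, 35.20, 27.90, 51.30]

print(is_unusual(past_purchases, 45.00))    # a typical amount
print(is_unusual(past_purchases, 2500.00))  # far outside the usual range
```

A purchase that is flagged would then trigger the confirmation call described above.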
Appraising Quality of Measurements
- Crucial for making sound business decisions using statistics.
- What you measure and how you measure it is critical.
Procedure for Good Measurements
- Adequacy: Does the measurement adequately represent the concept?
- Accuracy: Is the data measured accurately?
- Sufficiency: Is there enough data to draw conclusions?
Well-Defined Concepts
- Some concepts are easy to define like speed, measured in miles per hour.
- Some concepts are hard to define, like intelligence: it is often measured through IQ, but what IQ actually means is debatable.
Example: Sales Price of a Textbook
- Well-defined because it can be measured in dollars, collected in large quantities, and measured accurately.
Example: Tall
- Not well-defined because it's relative and subjective.
Science and Data & The Scientific Method
- Gather information about the phenomenon.
- Formulate a preliminary generalization or hypothesis based on data.
- Collect further data to test the hypothesis.
- If supported, the hypothesis becomes a law; otherwise, revise the hypothesis and repeat.
- Statistics and data are fundamental to the scientific method.
- Carefully designed experiments supply the evidence to support or discredit new theories.
- Data collection is important for testing a hypothesis.
- Data gathering strategy: Avoid the influence of confounding variables.
Confounding Variables
- A variable not controlled or accounted for by the researcher, damaging experiment integrity.
Example: Marketing Plan and Increased Sales
- Ensure that the sales increase comes from the marketing plan, not from other factors such as a larger sales force or a reduced product cost.
Branches of Statistics
- Descriptive Statistics: Focuses on exploratory methods for examining data in order to generate hypotheses.
- Inferential Statistics: Uses data from a sample or experiment to test theories and draw conclusions about a population or parameter.
Making Decisions
- Clearly define the problem and influential variables.
- Decide upon objectives and decision criteria.
- Create alternative solutions.
- Compare alternatives using the established criteria.
- Implement the chosen alternative.
- Check the results.
- Define the problem! Any solution to the right problem is better than the best solution to the wrong problem.
Improving Processes
Example: Plastics Manufacturer
- Samples are taken hourly to check for defects.
- Data are plotted to see any potential production problems.
- Goal is to always improve the processes.
Plot Analysis
- Horizontal axis: Sample numbers (0-16).
- Vertical axis: Number of defects (0-14).
- Each sample's defect count is plotted; sample 14 has the most defects.
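The plot described above can be approximated with a short text chart. The defect counts here are hypothetical (the original data are not given); they are chosen only so that sample 14 has the most defects, as in the notes.

```python
# Hypothetical defect counts for samples 1..15 (not the original data).
defects = [3, 5, 2, 4, 6, 3, 5, 4, 2, 6, 5, 7, 4, 13, 5]

def worst_sample(counts):
    """Return (sample_number, defect_count) for the sample with the most
    defects. Sample numbers start at 1."""
    index = max(range(len(counts)), key=lambda i: counts[i])
    return index + 1, counts[index]

def text_plot(counts):
    """Build a text bar chart: one row per sample, one '*' per defect."""
    rows = [f"{n:>2} | {'*' * c} ({c})" for n, c in enumerate(counts, start=1)]
    return "\n".join(rows)

print(text_plot(defects))
print("Most defects:", worst_sample(defects))
```

A sample that stands out this way is a prompt to investigate the production process at that hour.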
Objectives and Criteria
- Defining objectives and developing criteria to evaluate alternative decisions.
Consumer Price Index (CPI)
- A summary statistic describing the overall price level.
- An economic measure of inflation that is used in labor contracts and to calculate cost-of-living increases in Social Security payments.
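As a rough illustration of how a price index summarizes the overall price level, the sketch below prices a fixed basket of goods in a base period and a current period. This is a simplified fixed-basket index, not the official CPI methodology; the items, prices, quantities, and function names are all hypothetical.

```python
def basket_cost(prices, quantities):
    """Total cost of buying the given quantities at the given prices."""
    return sum(prices[item] * quantities[item] for item in quantities)

def price_index(base_prices, current_prices, quantities):
    """Fixed-basket index: 100 means no change from the base period."""
    base = basket_cost(base_prices, quantities)
    current = basket_cost(current_prices, quantities)
    return 100.0 * current / base

def adjust_for_cost_of_living(payment, index):
    """Scale a base-period payment by the index (hypothetical helper)."""
    return payment * index / 100.0

# Hypothetical basket (illustrative numbers only).
quantities = {"bread": 10, "milk": 20, "fuel": 30}
base_prices = {"bread": 2.00, "milk": 1.00, "fuel": 3.00}
current_prices = {"bread": 2.20, "milk": 1.10, "fuel": 3.30}

index = price_index(base_prices, current_prices, quantities)
print(round(index, 1))
print(round(adjust_for_cost_of_living(1000.0, index), 2))
```

Every price in the example rises by 10 percent, so the index is about 110 and a payment tied to it rises by the same proportion.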
What/How to Measure
- If responsible for piston manufacturing, measure diameter, length, wall width, surface defects, etc.
- For automated stock trading, deciding variables to measure is less obvious.
Fuzzy Data and Concepts
- A precisely defined concept is usually easy to measure.
- The less precise the concept, the more difficult the measurement becomes.
- Measuring height is precise; measuring intelligence is fuzzy because of the definition.
- Fuzzy concept definitions create fuzzy measurements.
- Measurement assumes that everyone interprets a concept the same way, which may not hold for fuzzy concepts.
- The instrument used to measure a fuzzy concept ends up defining the concept.
- Poorly measured concepts are hard to fix.
- Question the data!
- Conclusions suggested by statistics can be no stronger than the quality of the measurements that produced the statistical evidence.
- Fuzzy measurements necessarily produce fragile conclusions.
Collecting Data
- Two methods: observation and controlled experiments.
- Method depends on problem and ethical/practical constraints.
Controlled Experiments
- Reveal the response of one variable to changes in another variable.
- The researcher controls the experiment's environment so that the effect of one variable on another can be isolated and measured.
Experiment Elements
- Control Group: Does not receive treatment.
- Experimental Group: Receives treatment.
- Treatment: The change to the explanatory variable.
Variable Definitions
- Response Variable: The variable of interest.
- Explanatory Variable: Affects the response variable.
Completely Randomized Design
- Experiment units are randomly assigned to different treatments.
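Random assignment can be sketched by shuffling the experimental units and dealing them out to the treatments in turn. This is a minimal sketch; the function name, the unit labels, and the use of a seed for reproducibility are assumptions made for illustration.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Randomly assign each unit to one treatment, keeping group sizes
    as equal as possible. Returns a dict: treatment -> list of units."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experimental units.
students = [f"student_{n}" for n in range(1, 11)]
assignment = completely_randomized(students, ["control", "experimental"], seed=1)

for treatment, members in assignment.items():
    print(treatment, members)
```

Because chance alone decides the groups, characteristics of the units that could act as confounding variables tend to balance out across treatments.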
Example: Statistics Before Finance
- Explanatory Variable: Whether or not a student takes statistics before finance.
- Response Variable: Student's grades in the finance course.
- Randomly select students, then randomly divide them into control and experimental groups.
- Control group (no statistics), experimental group (has statistics).
- Compare finance class results.
Before and After Study
- Each participant starts in the control group, and a baseline measurement is taken.
- A treatment is applied to the control group making it the experimental group, and another measurement is taken.
Studying for the SAT
- Take SAT, take prep course, then retake SAT.
- Explanatory variable: SAT prep course.
- Response variable: SAT scores.
- Compare each student's performance on the SAT before and after the course.
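The before-and-after comparison amounts to looking at each student's change in score. The sketch below computes the paired differences and their mean; the scores and the function names are hypothetical.

```python
from statistics import mean

def paired_differences(before, after):
    """Return after-minus-before for each participant, in order."""
    if len(before) != len(after):
        raise ValueError("each participant needs a before and an after score")
    return [a - b for b, a in zip(before, after)]

def mean_improvement(before, after):
    """Average change in score across all participants."""
    return mean(paired_differences(before, after))

# Hypothetical SAT scores for five students (illustrative only).
before = [1050, 1120, 980, 1200, 1010]
after = [1100, 1150, 1000, 1210, 1080]

print(paired_differences(before, after))
print(mean_improvement(before, after))
```

Since there is no separate control group, a positive average change does not by itself show that the prep course caused the improvement; simply taking the test a second time could also raise scores.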
Clinical Trials
- The control group is given a dummy or fake treatment called a placebo.
Placebo
- A fake treatment that contains none of the drug being tested.
Placebo Effect
- The subject improves or has a reaction after receiving the placebo, because they believe they are being treated, even though they haven't actually received the treatment.
Double Blind Studies
- To counteract the placebo effect, neither the subjects nor the evaluators know the group assignments.
- The gold standard for medical experiments.
Example: Treatment of Ulcers
- All participants swallowed a balloon with tubes attached.
- Half of the balloons were filled with a refrigerated solution, while the others were filled with a non-refrigerated solution.
- The placebo was a balloon filled with the non-refrigerated solution, whereas the actual treatment was a balloon filled with the refrigerated solution.
Observational Studies
- Data is collected and measured.
- No experimentation.
Examples
- Market data, government, sports, etc.
- The experimenter does not influence or interact with the subjects being observed.
Value of Observational Studies
- Examine trends, even without designed experiments.
- There may be confounding variables for which there cannot be control.
UC Berkeley Discrimination Study
- Overall, a higher percentage of male applicants were admitted.
- Women had higher acceptance rates in four of the six programs.
- An example of Simpson's paradox.
Simpson's Paradox
- An effect or trend appears in different data groups when considered separately.
- The trend disappears or reverses when the groups are combined.
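A small numeric sketch makes the reversal concrete. The admission counts below are hypothetical, not the actual Berkeley figures: women have the higher admission rate in each department, yet men have the higher rate overall, because most women applied to the department that admits fewer applicants.

```python
# Hypothetical (admitted, applied) counts per department and group.
applications = {
    "A": {"men": (80, 100), "women": (18, 20)},
    "B": {"men": (4, 20), "women": (30, 100)},
}

def department_rate(data, department, group):
    """Admission rate for one group within one department."""
    admitted, applied = data[department][group]
    return admitted / applied

def overall_rate(data, group):
    """Admission rate for one group with all departments combined."""
    admitted = sum(dept[group][0] for dept in data.values())
    applied = sum(dept[group][1] for dept in data.values())
    return admitted / applied

for department in applications:
    print(department,
          "men:", department_rate(applications, department, "men"),
          "women:", department_rate(applications, department, "women"))

print("overall men:", overall_rate(applications, "men"))
print("overall women:", overall_rate(applications, "women"))
```

Here the department an applicant chose acts as the confounding variable that drives the combined result.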
Surveys
- Purely descriptive; produces observational data.
- Belongs to the first two steps of the scientific method.
Running a Survey Steps
- Have specific goals.
- Consider alternatives for collecting data.
- Select samples to represent the population.
- Match question wording to the concepts being measured.
- Pretest the questionnaires.
- Construct quality checks.
- Use statistical analysis and reporting techniques.
- Disclose all methods used to conduct the survey.
Causal Factors
- Variables that can be controlled in a subsequent experiment.
- Factors or variables that influence the response variable.