Data and Decision Making

Measurement and Data

  • Collecting measurements and selling data is a significant business.
  • Institutions like banks use spending data to create models for fraud prevention.
    • Unusual purchase triggers a confirmation call.
    • Proactive data usage has prevented substantial credit card fraud.
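
The idea of flagging an unusual purchase can be sketched as follows. This is a minimal illustration, not how any bank actually models fraud: it assumes a simple z-score rule, and the function name is_unusual, the threshold of 3 standard deviations, and the purchase amounts are all hypothetical.

```python
from statistics import mean, stdev

def is_unusual(history, amount, threshold=3.0):
    """Flag a purchase whose amount is far from the cardholder's usual spending.

    Assumption: "unusual" means more than `threshold` sample standard
    deviations away from the mean of past purchase amounts.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past purchases were identical; anything different stands out.
        return amount != mu
    z = (amount - mu) / sigma
    return abs(z) > threshold

# Hypothetical past purchase amounts in dollars.
past_purchases = [42.0, 18.5, 63.0, 25.0, 37.5, 51.0]

print(is_unusual(past_purchases, 45.0))   # a typical amount
print(is_unusual(past_purchases, 900.0))  # far outside the usual range
```

A flagged purchase would then trigger the confirmation call described above; real models would likely also use variables such as location, merchant, and time of day.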

Appraising Quality of Measurements

  • Crucial for making sound business decisions using statistics.
  • What you measure and how you measure it are both critical.
Procedure for Good Measurements
  1. Adequacy: Does the measurement adequately represent the concept?
  2. Accuracy: Is the data measured accurately?
  3. Sufficiency: Is there enough data to draw conclusions?
Well-Defined Concepts
  • Some concepts are easy to define like speed, measured in miles per hour.
  • Some concepts are hard to define, like intelligence: it is measured through IQ, but what IQ actually means is debatable.
Example: Sales Price of a Textbook
  • Well-defined because it can be measured in dollars, collected in large quantities, and measured accurately.
Example: Tall
  • Not well-defined because it's relative and subjective.

Science and Data & The Scientific Method

  1. Gather information about the phenomenon.
  2. Formulate a preliminary generalization or hypothesis based on data.
  3. Collect further data to test the hypothesis.
  4. If supported, the hypothesis becomes a law; otherwise, revise the hypothesis and repeat.
  • Statistics and data are fundamental to the scientific method.
  • Carefully designed experiments supply the evidence to support or discredit new theories.
  • Data collection is important for testing a hypothesis.
  • Data-gathering strategy: avoid the influence of confounding variables.

Confounding Variables

  • A variable that the researcher does not control or account for, which damages the integrity of the experiment.
Example: Marketing Plan and Increased Sales
  • Ensure that the sales increase comes from the marketing plan and not from other factors, such as a larger sales force or a reduced product cost.

Branches of Statistics

  1. Descriptive Statistics
    • Focuses on exploratory methods for examining data in order to generate hypotheses.
  2. Inferential Statistics
    • Uses data from a sample or experiment to test theories and draw conclusions about a population or parameter.
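
The difference between the two branches can be sketched in a few lines. The sample values below are hypothetical, and the interval uses a normal approximation (z = 1.96) as a simplifying assumption; for a sample this small a t critical value would be more appropriate.

```python
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical sample of measurements.
sample = [12, 15, 11, 14, 13, 16, 12, 15]

# Descriptive statistics: summarize the sample itself.
summary = {
    "mean": mean(sample),
    "median": median(sample),
    "stdev": stdev(sample),
}

def approx_interval(data, z=1.96):
    """Rough 95% interval for the population mean (normal approximation)."""
    m = mean(data)
    standard_error = stdev(data) / sqrt(len(data))
    return (m - z * standard_error, m + z * standard_error)

# Inferential statistics: use the sample to say something about the population.
low, high = approx_interval(sample)

print(summary)
print(f"approximate 95% interval for the population mean: {low:.2f} to {high:.2f}")
```

The summary describes only the eight values collected; the interval is a statement about the larger population they were drawn from.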

Making Decisions

  1. Clearly define the problem and influential variables.
  2. Decide upon objectives and decision criteria.
  3. Create alternative solutions.
  4. Compare alternatives using the established criteria.
  5. Implement the chosen alternative.
  6. Check the results.
  • Define the problem! Any solution to the right problem is better than the best solution to the wrong problem.
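
Steps 2 through 4 can be sketched with a weighted scoring table. This is only one possible way to compare alternatives, not a method prescribed by these notes; the criteria, weights, option names, and scores are all hypothetical.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores into one number using the given weights."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)

# Step 2: decision criteria and their relative importance (hypothetical).
weights = {"cost": 0.5, "speed": 0.3, "risk": 0.2}

# Step 3: alternative solutions, each scored 1-10 per criterion (hypothetical).
alternatives = {
    "Option A": {"cost": 7, "speed": 5, "risk": 6},
    "Option B": {"cost": 5, "speed": 9, "risk": 8},
}

# Step 4: compare the alternatives using the established criteria.
best = max(alternatives, key=lambda name: weighted_score(alternatives[name], weights))

for name, scores in alternatives.items():
    print(name, round(weighted_score(scores, weights), 2))
print("chosen:", best)
```

Changing the weights changes the winner, which is why the criteria should be fixed before the alternatives are compared.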

Improving Processes

Example: Plastics Manufacturer
  • Samples are taken hourly to check for defects.
  • Data are plotted to see any potential production problems.
  • Goal is to always improve the processes.
Plot Analysis
  • Horizontal axis: Sample numbers (0-16).
  • Vertical axis: Number of defects (0-14).
  • Samples are represented, with sample 14 having the most defects.
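
A plot like the one described can be approximated in text. The defect counts below are hypothetical stand-ins (the original plot's values are not given here), chosen only so that sample 14 has the most defects and all counts fall within the 0-14 range of the vertical axis.

```python
# Hypothetical defect counts for hourly samples 1 through 16.
defects = [3, 5, 2, 4, 6, 3, 5, 4, 2, 6, 5, 7, 8, 13, 6, 4]

def worst_sample(counts):
    """Return (sample number, defect count) for the sample with the most defects."""
    return max(enumerate(counts, start=1), key=lambda pair: pair[1])

# Crude text plot: one row per sample, one star per defect.
for number, count in enumerate(defects, start=1):
    print(f"{number:2d} | {'*' * count}")

number, count = worst_sample(defects)
print(f"sample {number} has the most defects ({count})")
```

A spike like the one at sample 14 is the kind of signal that would prompt a closer look at the production process for that hour.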

Objectives and Criteria

  • Defining objectives and developing criteria to evaluate alternative decisions.
Consumer Price Index (CPI)
  • A summary statistic describing the overall price level.
  • An economic measure of inflation that is used in labor contracts and to calculate cost-of-living increases in Social Security payments.
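
As a rough illustration of how a price index summarizes the overall price level, the sketch below compares the cost of a fixed basket of goods in two periods. The basket, quantities, and prices are hypothetical, and the official CPI methodology is considerably more involved than this fixed-basket ratio.

```python
def price_index(basket, base_prices, current_prices):
    """Cost of a fixed basket today relative to the base period (base = 100)."""
    base_cost = sum(qty * base_prices[item] for item, qty in basket.items())
    current_cost = sum(qty * current_prices[item] for item, qty in basket.items())
    return 100.0 * current_cost / base_cost

# Hypothetical basket (quantities) and prices in dollars.
basket = {"bread": 10, "milk": 8, "gas": 20}
base_prices = {"bread": 2.00, "milk": 3.00, "gas": 3.00}
current_prices = {"bread": 2.20, "milk": 3.30, "gas": 3.30}

index = price_index(basket, base_prices, current_prices)
print(f"index: {index:.1f}")
print(f"price level change since base period: {index - 100:.1f}%")
```

An index above 100 means the same basket costs more than it did in the base period; a contract with a cost-of-living clause would adjust payments by a comparable percentage.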

What/How to Measure

  • If responsible for piston manufacturing, measure diameter, length, wall width, surface defects, etc.
  • For automated stock trading, deciding variables to measure is less obvious.

Fuzzy Data and Concepts

  • A precisely defined concept is usually easy to measure.
  • The less precise the concept, the more difficult the measurement becomes.
  • Measuring height is precise; measuring intelligence is fuzzy because of the definition.
  • Fuzzy concept definitions create fuzzy measurements.
  • Measuring a concept assumes that everyone interprets it the same way.
  • The instrument used to measure a fuzzy concept ends up defining the concept.
  • Poorly measured concepts are hard to fix.
  • Question the data!
  • Conclusions suggested by statistics can be no stronger than the quality of the measurements that produced the statistical evidence.
  • Fuzzy measurements produce fragile conclusions.

Collecting Data

  • Two methods: observation and controlled experiments.
  • Method depends on problem and ethical/practical constraints.

Controlled Experiments

  • Reveal the response of one variable to changes in another variable.
  • The researcher controls the experiment's environment so that the effect of one variable on another can be isolated and measured.
Experiment Elements
  • Control Group: Does not receive treatment.
  • Experimental Group: Receives treatment.
  • Treatment: The change to the explanatory variable.
Variable Definitions
  • Response Variable: The variable of interest.
  • Explanatory Variable: Affects the response variable.

Completely Randomized Design

  • Experiment units are randomly assigned to different treatments.
Example: Statistics Before Finance
  • Explanatory Variable: Whether or not a student takes statistics before finance.
  • Response Variable: Student's grades in the finance course.
  • Randomly select students, divide them into control and experimental groups.
  • Control group (no statistics), experimental group (has statistics).
  • Compare finance class results.
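
Random assignment for this example can be sketched as below. The student identifiers, the group size of 20, and the function name randomize are hypothetical; the seed is there only to make the illustration repeatable.

```python
import random

def randomize(units, seed=None):
    """Randomly split experimental units into control and experimental groups."""
    rng = random.Random(seed)
    shuffled = list(units)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}

# Hypothetical pool of students.
students = [f"student_{i}" for i in range(1, 21)]

groups = randomize(students, seed=42)
print("control (no statistics):", groups["control"])
print("experimental (statistics first):", groups["experimental"])
```

Because chance alone decides who lands in each group, differences such as prior ability tend to balance out between the groups instead of acting as confounding variables.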

Before and After Study

  • Each participant starts in the control group, and a first measurement is taken.
  • A treatment is then applied to the control group, making it the experimental group, and another measurement is taken.
Example: Studying for the SAT
  • Take SAT, take prep course, then retake SAT.
  • Explanatory variable: SAT prep course.
  • Response variable: SAT scores.
  • Compare each student's performance on the SAT before and after the course.
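
The before-and-after comparison amounts to looking at each student's own change in score. The scores below are hypothetical, and the sketch stops at the paired differences and their average; deciding whether the change is statistically meaningful would need an inferential test on top of this.

```python
from statistics import mean

# Hypothetical SAT scores for the same five students, in the same order.
before = [1050, 1120, 980, 1200, 1100]
after = [1090, 1150, 1010, 1190, 1160]

def paired_differences(before_scores, after_scores):
    """Per-participant change: after minus before."""
    if len(before_scores) != len(after_scores):
        raise ValueError("each participant needs both a before and an after score")
    return [a - b for b, a in zip(before_scores, after_scores)]

diffs = paired_differences(before, after)
print("changes:", diffs)
print("average change:", mean(diffs))
```

Pairing each student with themselves removes differences between students from the comparison, although other things that change between the two sittings (such as familiarity with the test) can still confound the result.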

Clinical Trials

  • The control group is given a dummy or fake treatment called a placebo.
Placebo
  • A fake treatment that contains none of the drug being tested.
Placebo Effect
  • The subject improves or reacts after receiving the placebo because they believe they have received the actual treatment.
Double Blind Studies
  • To counteract the placebo effect, neither the subjects nor the evaluators know which group each subject is in.
  • The gold standard for medical experiments.
Example: Treatment of Ulcers
  • All participants swallowed a balloon with tubes attached.
  • Half of the balloons were filled with a refrigerated solution, while the others were filled with a non-refrigerated solution.
  • The placebo was a balloon filled with the non-refrigerated solution; the actual treatment was a balloon filled with the refrigerated solution.

Observational Studies

  • Data is collected and measured.
  • No experimentation.
Examples
  • Market data, government, sports, etc.
  • The researcher does not influence or interact with what is being observed.

Value of Observational Studies

  • Examine trends, even without designed experiments.
  • There may be confounding variables for which there cannot be control.
UC Berkeley Discrimination Study
  • Overall, a higher percentage of male applicants than female applicants was admitted.
  • Women had higher acceptance rates in four of the six programs.
  • An example of Simpson's paradox.
Simpson's Paradox
  • An effect or trend that appears in separate groups of data disappears or reverses when the groups are combined.
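
A small worked example makes the reversal concrete. The numbers below are hypothetical and are not the Berkeley admissions data; they are chosen so that women have the higher admission rate in each program while men have the higher rate overall.

```python
# Hypothetical data: (admitted, applied) per group in each program.
applications = {
    "Program A": {"men": (48, 80), "women": (14, 20)},
    "Program B": {"men": (4, 20), "women": (20, 80)},
}

def rate(admitted, applied):
    """Admission rate as a fraction."""
    return admitted / applied

def overall_rate(data, group):
    """Admission rate for one group with all programs combined."""
    admitted = sum(program[group][0] for program in data.values())
    applied = sum(program[group][1] for program in data.values())
    return admitted / applied

for name, program in applications.items():
    print(name, "men:", rate(*program["men"]), "women:", rate(*program["women"]))

print("overall men:", overall_rate(applications, "men"))
print("overall women:", overall_rate(applications, "women"))
```

The reversal happens because most women in this made-up data applied to Program B, which admits far fewer applicants. The program applied to acts as a confounding variable.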

Surveys

  • Purely descriptive; produces observational data.
  • Belongs to the first two steps of the scientific method.
Steps for Running a Survey
  1. Have specific goals.
  2. Consider alternatives for collecting data.
  3. Select samples to represent the population.
  4. Match question wording to the concepts being measured.
  5. Pretest the questionnaires.
  6. Construct quality checks.
  7. Use statistical analysis and reporting techniques.
  8. Disclose all methods used to conduct the survey.

Causal Factors

  • Variables that can be controlled in a subsequent experiment.
  • Factors or variables that influence the response variable.