Business Statistics: Data and Data Preparation

Chapter 1: Data and Data Preparation

  • Date: 1/22/26

  • Institution: Wichita State University

  • Semester: Spring 2026

Definition of Data and Statistics

  • Data: Recorded facts or observations, which can be classified as numerical or non-numerical.

  • Statistics: Defined as the science that deals with the collection, preparation, analysis, presentation, and interpretation of data.

Steps to Good Statistical Analysis

  1. The RIGHT Data: Collect data that is not only of good quality but also relevant and appropriate for the analysis.

  2. Choosing the Appropriate Technique: Select the method that fits the type of data being analyzed.

  3. Clear Communication of Results: Present the findings with effective visuals and a clear verbal explanation.

Terminology in Statistics

  • Two main branches of statistics:

    • Descriptive Statistics: Concerns methods for summarizing and organizing data.

    • Inferential Statistics: Involves techniques that allow us to generalize findings from a sample to a population.

  • Key distinctions:

    • Sample vs. Population: A sample is a subset of a population used for analysis, while a population includes all members of a specified group.

    • Statistic vs. Parameter: A statistic is a numerical characteristic of a sample (e.g., sample mean), whereas a parameter is a numerical characteristic of a population (e.g., population mean).
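
To make the statistic vs. parameter distinction concrete, the sketch below treats a small made-up list of order values as the entire population, draws a random sample from it, and computes the mean of each. The data, the sample size of 4, and the names `population_orders` and `sample_orders` are illustrative assumptions, not part of the course material.

```python
import random
import statistics

# Hypothetical population: order values (in dollars) for every customer last month.
population_orders = [12.5, 40.0, 33.2, 18.9, 25.0, 61.3, 47.8, 22.1, 39.4, 15.6]

# Parameter: a numerical characteristic of the whole population.
population_mean = statistics.mean(population_orders)

# Sample: a subset of the population, drawn at random without replacement.
random.seed(1)  # fixed seed so the same sample is drawn on every run
sample_orders = random.sample(population_orders, k=4)

# Statistic: the same characteristic, computed from the sample only.
sample_mean = statistics.mean(sample_orders)

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

In practice the population mean is usually unknown, and inferential statistics uses the sample mean to estimate it.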

Types of Data

  • Qualitative (or Categorical): Descriptive data that can be classified into categories.

  • Quantitative: Numeric data that can be counted or measured.

    • Discrete Data: Countable quantities (e.g., the number of cars in a parking lot).

    • Continuous Data: Data that can take any value within a range (e.g., measurements like time or weight).

Levels of Measurement

  1. Nominal:

    • Classification into distinct categories; no inherent order.

    • Examples:

      • Department Name: Marketing, Sales, HR, Accounting

      • Customer Gender: Male, Female, Non-binary

      • Payment Method: Cash, Credit, PayPal, Apple Pay

      • Product Category: Electronics, Clothing, Groceries

      • Store Location Code: North, South, East, West

  2. Ordinal:

    • Classification into categories with a meaningful order, but the spacing between categories is not necessarily equal.

    • Examples:

      • Customer Satisfaction Rating: Very Unsatisfied to Very Satisfied

      • Employee Job Level: Entry, Associate, Manager, Director

      • Survey Agreement Scale: Strongly Disagree to Strongly Agree

      • Credit Rating: Excellent, Good, Fair, Poor

      • Product Review Star Rating: 1 star to 5 stars

  3. Interval:

    • Numeric scale with equal intervals between values but no true zero.

    • Examples:

      • Temperature in Fahrenheit or Celsius (e.g., 60°F, 75°F; 0 ≠ absence of temperature)

      • SAT Scores (Meaningful difference between scores, but no true zero value)

      • Calendar Years (1990, 2000, 2020; the starting point of the calendar is an arbitrary convention, not an absence of time)

      • Time of Day on a 12-hour clock (3 PM, 6 PM – circular; no absolute zero)

      • IQ Scores (Not meaningful to claim an IQ of 0 indicates no intelligence)

  4. Ratio:

    • Numeric data with equal intervals and a true zero point.

    • Examples:

      • Revenue ($): $0 means no revenue

      • Inventory Count: 0 items = none in stock

      • Distance Traveled (miles/km): 0 miles = no distance

      • Time to Complete a Task (minutes): 0 minutes = no time spent

      • Age of a Customer: 0 years = newborn
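
One way to see the difference between the interval and ratio levels is to change units and check whether a ratio survives. The sketch below compares 80°F with 40°F and 10 miles with 5 miles; the specific values and the helper names `fahrenheit_to_celsius` and `miles_to_km` are illustrative assumptions.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (temp_f - 32) * 5 / 9


def miles_to_km(miles):
    """Convert miles to kilometers (1 mile = 1.609344 km)."""
    return miles * 1.609344


# Interval scale: the zero point is arbitrary, so a ratio depends on the unit.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

# Ratio scale: zero means "none", so a ratio is the same in any unit.
ratio_miles = 10 / 5
ratio_km = miles_to_km(10) / miles_to_km(5)

print(f"80°F vs 40°F: ratio {ratio_f:.1f} in Fahrenheit, {ratio_c:.1f} in Celsius")
print(f"10 mi vs 5 mi: ratio {ratio_miles:.1f} in miles, {ratio_km:.1f} in km")
```

The temperature ratio is 2 in Fahrenheit but 6 in Celsius, so "twice as hot" has no fixed meaning. The distance ratio is 2 in both units, which is what a true zero point guarantees.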

Importance of Levels of Measurement

  1. Defines What We Can Do with the Data:

    • Nominal: Count frequencies, report percentages, or find the most frequent category (e.g., eye color, gender). Means and medians are not meaningful.

    • Ordinal: Rank and compare order (e.g., class rank, satisfaction ratings); medians are meaningful, but equal spacing between categories is not assumed.

    • Interval: Allows addition and subtraction, calculation of means, and measurement of differences (e.g., temperature). Ratios are not meaningful due to an arbitrary zero.

    • Ratio: Supports all arithmetic operations including meaningful ratios (e.g., height, income).
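
The operations listed above can be sketched with one appropriate summary per level. All data values here are made up for illustration, and the intermediate satisfaction labels ("Unsatisfied", "Neutral", "Satisfied") are an assumed scale; the notes only give its endpoints.

```python
import statistics

# Nominal: only counting makes sense, so the most frequent category is a valid summary.
payment_methods = ["Cash", "Credit", "Credit", "PayPal", "Credit", "Cash"]
most_common_payment = statistics.mode(payment_methods)

# Ordinal: order is meaningful, so the median response is valid. A mean is avoided
# because the spacing between categories is not assumed to be equal.
satisfaction_scale = ["Very Unsatisfied", "Unsatisfied", "Neutral",
                      "Satisfied", "Very Satisfied"]  # assumed 5-point scale
responses = ["Satisfied", "Neutral", "Very Satisfied", "Satisfied", "Unsatisfied"]
ranks = [satisfaction_scale.index(response) for response in responses]
median_response = satisfaction_scale[statistics.median_low(ranks)]

# Interval: differences and means are meaningful; ratios are not.
temperatures_f = [60, 75, 68]
mean_temperature = statistics.mean(temperatures_f)
temperature_change = temperatures_f[1] - temperatures_f[0]

# Ratio: a true zero makes "twice as much" a meaningful statement.
revenue_q1, revenue_q2 = 50_000, 100_000
revenue_ratio = revenue_q2 / revenue_q1

print("Most common payment method:", most_common_payment)
print("Median satisfaction response:", median_response)
print(f"Mean temperature: {mean_temperature:.1f}°F, change: {temperature_change}°F")
print(f"Q2 revenue is {revenue_ratio:.1f} times Q1 revenue")
```

`statistics.median_low` is used for the ordinal data because it always returns a value that actually occurs in the data, so the result maps back to a real category.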

  2. Prevention of Statistical Errors:

    • Treating nominal data as numeric (e.g., averaging