Stats Chapter 2

Chapter Overview: Statistics for Managers Using Microsoft® Excel®

Introduction

  • Authors: David M. Levine, David F. Stephan, Kathryn A. Szabat, Marking Bulger

  • Subject: Overview of organizing and visualizing data using Excel, including strategies for both categorical and numerical variables.

Objectives of Chapter 2

  • Understand how to:

    • Organize and visualize categorical variables.

    • Organize and visualize numerical variables.

    • Summarize a mix of variables.

    • Avoid common errors in data organization and visualization.

Organizing and Visualizing Data

Importance of Summarization

  • Tabular Summaries: Make the data easier to explore and support decision making.

  • Visual Summaries: Make patterns and trends in the data quick to spot.

  • DCOVA Process (Define, Collect, Organize, Visualize, Analyze): This chapter covers the Organize and Visualize steps.

Organizing Categorical Data

Types of Data Tables

  • Summary Table:

    • Tallies frequencies or percentages across categories.

    • Example: Devices used to watch TV shows:

      • Television Set: 49%

      • Tablet: 9%

      • Smartphone: 10%

      • Laptop/Desktop: 32%
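
As a minimal sketch of how a summary table like this could be tallied from raw responses: the 100 responses below are hypothetical, constructed only so that they reproduce the percentages listed above, and `summary_table` is an illustrative helper name rather than anything from the textbook.

```python
from collections import Counter

def summary_table(responses):
    """Return {category: (frequency, percentage)} for a list of categorical responses."""
    counts = Counter(responses)
    total = len(responses)
    return {category: (n, 100 * n / total) for category, n in counts.items()}

# Hypothetical sample of 100 responses, built to match the percentages above.
responses = (["Television Set"] * 49 + ["Tablet"] * 9
             + ["Smartphone"] * 10 + ["Laptop/Desktop"] * 32)

table = summary_table(responses)
for category, (frequency, percentage) in table.items():
    print(f"{category:<15} {frequency:>4} {percentage:>6.1f}%")
```

In Excel the same tally would typically come from a PivotTable or COUNTIF; the Python version is only meant to make the arithmetic explicit.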

Contingency Table

  • Purpose: Organizes two or more categorical variables to study relationships.

  • Structure: Rows represent one categorical variable and columns represent another.

  • Example Data: Invoices cross-classified by amount size (small, medium, large) and by whether they contain errors:

    • Small Amount: 170 No Errors, 20 Errors

    • Small Amount Total: 190

Data Analysis with Contingency Tables

Frequency Analysis

  • No Errors vs. Errors, with each cell shown as a percentage of the overall total of invoices:

    • Small Amounts: 42.50% no errors, 5.00% errors.

    • Medium Amounts: 25.00% no errors, 10.00% errors.

    • Large Amounts: 16.25% no errors, 1.25% errors.

Percentage Breakdown

  • Percentages of Row Totals: Highlight how the likelihood of an error varies with invoice size:

    • Medium invoices have a higher error rate (28.57%) than small invoices (10.53%).

  • Percentages of Column Totals: Show how invoices with errors are distributed across sizes:

    • 61.54% of invoices with errors are medium sized.
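
The three kinds of percentages above (overall total, row total, column total) can be sketched as follows. Only the Small row of counts is stated in these notes; the Medium and Large counts are an assumption, back-computed from the overall percentages, which imply 400 invoices in total (170 / 0.425 = 400).

```python
# Counts per invoice size as (no errors, errors).
# Only "Small" is given in the notes; "Medium" and "Large" are ASSUMED,
# back-computed from the overall percentages with 400 invoices in total.
counts = {
    "Small": (170, 20),
    "Medium": (100, 40),
    "Large": (65, 5),
}

grand_total = sum(sum(row) for row in counts.values())
col_totals = [sum(row[j] for row in counts.values()) for j in range(2)]

def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)

overall = {size: tuple(pct(c, grand_total) for c in row)
           for size, row in counts.items()}
by_row = {size: tuple(pct(c, sum(row)) for c in row)
          for size, row in counts.items()}
by_col = {size: tuple(pct(c, col_totals[j]) for j, c in enumerate(row))
          for size, row in counts.items()}

print("Overall %:", overall)
print("Row %:    ", by_row)
print("Column %: ", by_col)
```

With these counts, `by_row["Medium"][1]` is the 28.57% error rate for medium invoices and `by_col["Medium"][1]` is the 61.54% share of all erroneous invoices that are medium sized.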

Organizing Numerical Data

Ordered Array

  • Definition: A sequence of data in rank order (smallest to largest).

  • Purpose: Indicates range and helps identify outliers.

Frequency Distribution

  • Structure: Arranges the data into numerically ordered classes. Setting one up requires choosing:

    • The number of classes (5 to 15 recommended).

    • Class boundaries.

    • Width of class intervals.

Example of Frequency Distribution

  • Sort the raw data (e.g., temperatures) into an ordered array and examine it for patterns:

    • Compute class boundaries and midpoints, then tally the frequency of each class.
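
The steps above can be sketched in a few lines. The temperature readings are hypothetical and only illustrative. Rounding the width up to a whole number and starting the first class at 10 are assumptions made for convenience, and `class_width` and `frequency_distribution` are made-up helper names.

```python
import math

def class_width(values, num_classes):
    """Approximate interval width: range / number of classes, rounded up."""
    return math.ceil((max(values) - min(values)) / num_classes)

def frequency_distribution(values, start, width, num_classes):
    """Tally values into classes [lower, upper) and report each class midpoint."""
    rows = []
    for i in range(num_classes):
        lower = start + i * width
        upper = lower + width
        frequency = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, (lower + upper) / 2, frequency))
    return rows

# Hypothetical temperature readings, for illustration only.
temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

# Range is 58 - 12 = 46; 46 / 5 = 9.2, rounded up to a width of 10.
width = class_width(temperatures, num_classes=5)

# Starting the first class at 10 (an assumed, convenient lower boundary).
table = frequency_distribution(sorted(temperatures), start=10,
                               width=width, num_classes=5)

for lower, upper, midpoint, frequency in table:
    print(f"{lower:>3} but less than {upper:<3} "
          f"midpoint {midpoint:>5.1f}  frequency {frequency}")
```

Each class includes its lower boundary and excludes its upper one, so no value can fall into two classes.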

Visualizing Categorical Data

Graphical Displays

  • Bar Chart: Represents categorical data; bars indicate frequency or percentage.

  • Pie Chart: Illustrates percentage of categories within a whole.

  • Doughnut Chart: Similar to pie, but with a central hole.

  • Pareto Chart: Displays categories in descending order of frequency with a cumulative line.
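
To illustrate the ordering and cumulative line behind a Pareto chart, here is a small sketch that reuses the device percentages from the summary table example earlier in these notes. It computes only the numbers a chart would plot; the drawing itself is left to Excel or a plotting library.

```python
from itertools import accumulate

# Percentages from the "devices used to watch TV shows" summary table.
percentages = {
    "Television Set": 49,
    "Tablet": 9,
    "Smartphone": 10,
    "Laptop/Desktop": 32,
}

# Pareto order: largest category first.
ordered = sorted(percentages.items(), key=lambda item: item[1], reverse=True)

# Running total of the ordered percentages gives the cumulative line.
cumulative = list(accumulate(value for _, value in ordered))

pareto = [(name, value, total)
          for (name, value), total in zip(ordered, cumulative)]

for name, value, total in pareto:
    print(f"{name:<15} {value:>3}%  cumulative {total:>3}%")
```

The cumulative column shows that the two largest categories account for 81% of responses, which is the kind of "vital few" reading a Pareto chart is meant to support.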

Visualizing Numerical Data

Histogram

  • Definition: Vertical bar chart of a frequency distribution, with no gaps between adjacent bars.

  • Axes: Class boundaries on horizontal, frequency on vertical.

Percentage Polygon

  • Purpose: Plots the percentage for each class at its class midpoint and connects the points with lines; useful for comparing two or more groups.
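
A rough sketch of the points a percentage polygon would connect, assuming two hypothetical groups that share the same class boundaries. All numbers below are invented for illustration, and `polygon_points` is a made-up helper name.

```python
def polygon_points(boundaries, frequencies):
    """Return (class midpoint, percentage of total) pairs for one group."""
    total = sum(frequencies)
    midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
    return [(mid, 100 * f / total) for mid, f in zip(midpoints, frequencies)]

# Hypothetical class boundaries and frequencies for two groups of different sizes.
boundaries = [10, 20, 30, 40, 50, 60]
group_a = polygon_points(boundaries, [3, 6, 5, 4, 2])   # 20 observations
group_b = polygon_points(boundaries, [1, 2, 4, 2, 1])   # 10 observations

for (mid, pct_a), (_, pct_b) in zip(group_a, group_b):
    print(f"midpoint {mid:>5.1f}  group A {pct_a:>5.1f}%  group B {pct_b:>5.1f}%")
```

Converting frequencies to percentages is what makes groups of different sizes comparable on the same axes.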

Visualizing Relationships in Data

Scatter Plots

  • Definition: Plots paired observations of two numerical variables, one on each axis, to reveal possible relationships between them.

Time-Series Plot

  • Usage: Plots a numerical variable on the vertical axis against time on the horizontal axis to show patterns over time.

Best Practices for Data Visualization

  • Utilize simple visualizations; ensure clarity through proper labeling.

  • Start vertical axes at zero and maintain consistent scales.

  • Avoid complex or 3D chart types that can confuse data interpretation.

Common Pitfalls in Data Presentation

  • Poor presentation can obscure the data or create false impressions.

  • Proper scaling and the absence of chartjunk keep a visualization useful.

Chapter Summary

  • Focus on organizing and visualizing both categorical and numerical variables.

  • Cover strategies for summarizing mixed variables and avoiding visualization errors.