Chapter 4. Describing Data: Displaying and Exploring Data

1. Dot Plots (LO 4-1)

  • Definition: A dot plot is a graphical display where each data point is represented as a dot along a number line. Identical values are stacked.

  • Key Features:

    • Retains individual observation identity.

    • Useful for small datasets.

    • Easily identifies clusters and gaps.

  • Example: Comparing the number of vehicles serviced at two dealerships using dot plots.
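
The stacking idea can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's method: the function name `text_dot_plot` and the daily service counts are hypothetical, and the sketch assumes integer data.

```python
from collections import Counter

def text_dot_plot(values):
    """Build a text dot plot for integer data; identical values stack vertically."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)   # keep empty positions so gaps show
    width = max(len(str(v)) for v in axis) + 1   # one column per axis value
    rows = []
    for level in range(max(counts.values()), 0, -1):   # draw the tallest level first
        row = "".join(("*" if counts[v] >= level else "").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))   # number line
    return "\n".join(rows)

# Hypothetical numbers of vehicles serviced per day (not the textbook data).
dealer_a = [23, 24, 24, 25, 25, 25, 27]
dealer_b = [22, 25, 25, 28, 28, 29]
for name, data in (("Dealership A", dealer_a), ("Dealership B", dealer_b)):
    print(name)
    print(text_dot_plot(data))
```

Each plot here gets its own axis; for a side-by-side comparison the two plots would normally share one number line.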

2. Measures of Position (LO 4-2)

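  • Standard Deviation: The most widely used measure of dispersion. Measures of position describe spread differently, by locating the values that divide the sorted data into equal parts.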

  • Alternative Measures:

    • Quartiles: Divide data into four equal parts.

    • Deciles: Divide data into ten equal parts.

    • Percentiles: Divide data into 100 equal parts and indicate the relative position of a value within the dataset.

Percentile Computation
  • Formula: $L_p = (n + 1)\frac{P}{100}$

  • Example: Finding the 33rd percentile, median (50th percentile), and quartiles.
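
A short Python sketch of the location formula, where L_p is the position in the sorted data, n is the number of observations, and P is the desired percentile. The function names and the 15 data values are hypothetical. Interpolating between neighboring values when the location is fractional is an assumption about how a non-integer L_p is resolved.

```python
def percentile_location(n, p):
    """Location L_p = (n + 1) * P / 100 in the sorted data, counting from 1."""
    return (n + 1) * p / 100

def percentile(values, p):
    """Value at the p-th percentile; interpolates when L_p is fractional."""
    data = sorted(values)
    n = len(data)
    loc = min(max(percentile_location(n, p), 1), n)   # keep the location inside 1..n
    k = int(loc)                                      # whole part of the location
    if k >= n:
        return data[-1]
    # Move the fractional part of the way from the k-th to the (k+1)-th value.
    return data[k - 1] + (loc - k) * (data[k] - data[k - 1])

# Hypothetical sorted data with n = 15 observations.
data = list(range(10, 160, 10))
for p in (25, 33, 50, 75):
    print(p, percentile_location(len(data), p), percentile(data, p))
```

With n = 15 the 25th, 50th, and 75th percentiles fall at locations 4, 8, and 12, so the quartiles are read directly from the sorted list. Python's `statistics.quantiles` with its default `method="exclusive"` is documented as using the same (n + 1) convention.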

3. Box Plot (Box-and-Whisker Plot) (LO 4-3)

  • Definition: A graphical representation of a data distribution based on the five-number summary:

    1. Minimum Value

    2. Q1 (First Quartile, 25th Percentile)

    3. Median (Q2, 50th Percentile)

    4. Q3 (Third Quartile, 75th Percentile)

    5. Maximum Value

  • Interquartile Range (IQR):

    • $\mathrm{IQR} = Q_3 - Q_1$

    • Middle 50% of data falls within this range.

  • Outliers:

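    • Mild outliers: More than 1.5 × IQR below Q1 or above Q3, that is, below $Q_1 - 1.5 \times \mathrm{IQR}$ or above $Q_3 + 1.5 \times \mathrm{IQR}$.

    • Extreme outliers: More than 3 × IQR below Q1 or above Q3, that is, below $Q_1 - 3 \times \mathrm{IQR}$ or above $Q_3 + 3 \times \mathrm{IQR}$.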

  • Example: Creating a box plot for delivery times at Alexander’s Pizza.
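
The five-number summary, IQR, and outlier limits can be computed together. The sketch below is illustrative only: the function names and the delivery times are hypothetical (not the Alexander's Pizza data), and quartiles are located with the (n + 1) × P / 100 rule.

```python
def quartile(data, p):
    """Percentile of already-sorted data using location (n + 1) * p / 100."""
    loc = (len(data) + 1) * p / 100
    k = int(loc)
    if k < 1:
        return data[0]
    if k >= len(data):
        return data[-1]
    return data[k - 1] + (loc - k) * (data[k] - data[k - 1])

def box_plot_summary(values):
    """Five-number summary, IQR, and mild/extreme outliers for a box plot."""
    data = sorted(values)
    q1, median, q3 = (quartile(data, p) for p in (25, 50, 75))
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # mild-outlier limits
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr       # extreme-outlier limits
    extreme = [x for x in data if x < outer_low or x > outer_high]
    mild = [x for x in data
            if (x < inner_low or x > inner_high) and x not in extreme]
    return {
        "five_number": (data[0], q1, median, q3, data[-1]),
        "iqr": iqr,
        "mild_outliers": mild,
        "extreme_outliers": extreme,
    }

# Hypothetical delivery times in minutes.
times = [45, 13, 14, 15, 16, 17, 18, 19, 20, 21, 32]
print(box_plot_summary(times))
```

For these 11 values Q1 = 15 and Q3 = 21, so the IQR is 6; the limits are 30 (mild) and 39 (extreme) on the high side, which flags 32 as a mild outlier and 45 as an extreme one.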

4. Skewness (LO 4-4)

  • Definition: Measures the asymmetry of data distribution.

  • Types of Skewness:

    • Symmetric: Mean = Median = Mode (for a distribution with a single peak).

    • Positively Skewed (Right Skewed): Mean > Median > Mode.

    • Negatively Skewed (Left Skewed): Mean < Median < Mode.

    • Bimodal: Two peaks in the distribution.

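  • Skewness Coefficient (Pearson’s Estimate): $sk = \frac{3(\bar{x} - \text{Median})}{s}$, where $\bar{x}$ is the mean and $s$ is the standard deviation. Its value ranges from −3 to +3.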

    • If skewness < -1 or > +1 → Highly skewed.

    • If skewness between -1 and -0.5 or 0.5 and 1 → Moderately skewed.

    • If skewness between -0.5 and +0.5 → Approximately symmetric.

  • Example: Computing skewness for earnings per share of software companies.
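
A minimal Python sketch of Pearson's coefficient, sk = 3(mean − median) / s, together with the interpretation bands listed above. The function names and data are hypothetical, the sample standard deviation is assumed for s, and the treatment of values exactly at ±0.5 or ±1 is an assumption because the bands do not say which side they belong to.

```python
import statistics

def pearson_skewness(values):
    """Pearson's coefficient of skewness: 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)   # sample standard deviation (assumed)
    return 3 * (mean - median) / s

def describe_skewness(sk):
    """Map a coefficient to the bands in the notes; boundary handling is assumed."""
    if sk < -1 or sk > 1:
        return "highly skewed"
    if sk < -0.5 or sk > 0.5:
        return "moderately skewed"
    return "approximately symmetric"

# Hypothetical earnings-per-share style data with one large value on the right.
eps = [1, 2, 2, 3, 12]
sk = pearson_skewness(eps)
print(round(sk, 3), describe_skewness(sk))
```

The large value pulls the mean (4) above the median (2), so the coefficient is positive, which matches the Mean > Median pattern of a positively skewed distribution.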

5. Scatter Diagrams (Bivariate Analysis) (LO 4-5)

  • Definition: Graphical representation of the relationship between two variables.

  • How to Construct:

    • X-axis: Independent variable.

    • Y-axis: Dependent variable.

  • Example: Analyzing the relationship between profit earned on vehicle sales and the buyer’s age.
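
To show how the axes are assigned, here is a rough text-based scatter diagram in Python. It is only a sketch: `text_scatter`, the grid size, and the age and profit values are hypothetical, and it assumes each variable takes at least two distinct values. In practice a plotting library would normally be used.

```python
def text_scatter(xs, ys, width=20, height=8):
    """Character grid with x (independent) across and y (dependent) up."""
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column or row index inside the grid.
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"   # grid row 0 is the top of the plot
    body = "\n".join("|" + "".join(r).rstrip() for r in grid)
    return body + "\n+" + "-" * width

# Hypothetical buyer ages (x) and profit per vehicle sold (y).
ages = [20, 30, 40, 50, 60]
profits = [1000, 1500, 1400, 2000, 2500]
print(text_scatter(ages, profits))
```

Points that rise from lower left to upper right, as in this made-up data, suggest a positive relationship between the two variables.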

6. Contingency Tables (Cross-tabulation of Data) (LO 4-6)

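  • Definition: A table that classifies observations according to two categorical (nominal or ordinal) variables and shows the count for each combination of categories.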

  • Usage: Helps identify relationships between two variables.

  • Example: Comparing whether profits fall above or below the median at each dealership location.

  • Key Insights from Contingency Table Analysis:

    • Proportion of sales above and below the median at different dealerships.

    • Identifying patterns in categorical data.
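
A cross-tabulation of this kind can be sketched with a counter over category pairs. The function names, dealership labels, and records below are hypothetical and serve only to show the counting and the row proportions.

```python
from collections import Counter

def contingency_table(records):
    """Count observations for each (row category, column category) pair."""
    counts = Counter(records)
    rows = sorted({r for r, _ in records})
    cols = sorted({c for _, c in records})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

def row_proportions(table):
    """Share of each row's observations that falls in each column."""
    return {r: {c: n / sum(cells.values()) for c, n in cells.items()}
            for r, cells in table.items()}

# Hypothetical sales: (dealership, profit relative to the overall median).
sales = [
    ("A", "above"), ("A", "above"), ("A", "below"),
    ("B", "below"), ("B", "below"), ("B", "above"), ("B", "below"),
]
table = contingency_table(sales)
print(table)
print(row_proportions(table))
```

Comparing the row proportions rather than the raw counts makes dealerships with different sales volumes comparable.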


Flashcard Questions

Dot Plots (LO 4-1)

  1. What is a dot plot?

  2. What type of data is best suited for a dot plot?

  3. What do stacked dots in a dot plot indicate?

Measures of Position (LO 4-2)

  1. What is the most widely used measure of dispersion?

  2. How are quartiles different from percentiles?

  3. How do you compute the location of a percentile?

  4. What is the interquartile range (IQR)?

  5. What does the median represent in a dataset?

Box Plot (LO 4-3)

  1. What five statistics are needed to create a box plot?

  2. What does the box in a box plot represent?

  3. How are outliers represented in a box plot?

  4. What is the formula to identify mild and extreme outliers?

  5. How can a box plot show skewness in a dataset?

Skewness (LO 4-4)

  1. What does skewness measure?

  2. What does a positively skewed distribution look like?

  3. What does a negatively skewed distribution look like?

  4. How do we interpret a skewness coefficient of -0.3?

  5. What does it mean if the mean is greater than the median?

Scatter Diagrams (LO 4-5)

  1. What type of data is used in a scatter diagram?

  2. What does a positive correlation in a scatter diagram indicate?

  3. What does a negative correlation in a scatter diagram indicate?

  4. How is a scatter plot different from a dot plot?

  5. What are the axes in a scatter diagram used for?

Contingency Tables (LO 4-6)

  1. What is a contingency table?

  2. What types of data can be displayed in a contingency table?

  3. How is a contingency table different from a scatter diagram?

  4. What insights can be drawn from a contingency table?

  5. How can contingency tables help in business decision-making?