Displaying and Summarizing Quantitative Data

Module 2 - Section 2

Introduction to Quantitative Data

  • Quantitative variables often take many values, exemplified by the prices of walking shoes:

    • Example Prices: 90, 70, 70, 70, 75, 70, 65, 68, 60, 74, 70, 95, 75, 68, 85, 40, 65

  • A visual representation is needed to display the data and show how the values are distributed.

Types of Graphs by Variable Type

  • Categorical Variables:

    • Bar Chart

    • Pie Chart

  • Numerical Variables:

    • Dot Plots

    • Stem Plots

    • Histograms

    • Time Plots

    • Box Plots

    • Scatterplots

Dot Plots

  • Definition: A dot plot is a graphical display that portrays individual observations.

  • Construction Steps:

    1. Draw a horizontal (or vertical) line.

    2. Label the line with the name of the variable and mark regularly spaced values of the variable on it.

    3. For each observation, place a dot above (or next to) its value on the number line.

  • Important Notes:

    • The number of dots above a value indicates the frequency of occurrence of that value.

    • Dot plots are more effective for small datasets (n ≤ 50).

Example: Dotplot for Prices of Walking Shoes

  • Data for 17 walking shoes: 90, 70, 70, 70, 75, 70, 65, 68, 60, 74, 70, 95, 75, 68, 85, 40, 65

  • The dot plot of these data shows the frequency of each price and the overall distribution; for example, five dots sit above $70, the most common price.
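
The construction steps can be sketched in a few lines of Python. This is only a minimal text version: the function name `text_dot_plot` is an illustrative assumption, the axis runs vertically (dots next to each value), and only observed values are listed rather than a regularly spaced scale.

```python
from collections import Counter

prices = [90, 70, 70, 70, 75, 70, 65, 68, 60, 74, 70, 95, 75, 68, 85, 40, 65]

def text_dot_plot(values):
    """Return (counts, lines): the frequency of each value and a sideways text dot plot."""
    counts = Counter(values)
    # One row per distinct observed value, in increasing order; one "o" per observation.
    lines = [f"{value:>3} | " + "o" * counts[value] for value in sorted(counts)]
    return counts, lines

counts, lines = text_dot_plot(prices)
print("\n".join(lines))
```

The number of dots in a row is the frequency of that price, so the longest row marks the most common value.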

Describing a Distribution

  • Important aspects to describe a plot include:

    • Shapes: Determine the nature of the distribution (unimodal, bimodal, multimodal).

    • Modes: Characterization based on the number of humps or peaks in the distribution.

    • Symmetry or Skewness: Aspects related to the balance of the distribution.

    • Deviations or Outliers: Identifying unusual values that deviate from the overall pattern.

    • Center: Refers to the value that divides the data in half, indicating a typical value.

    • Spread: Measures the range of values and the concentration around the center.

Shapes of Distributions

  • Unimodal: One peak.

  • Bimodal: Two peaks; may indicate two different groups within the data.

  • Multimodal: More than two peaks; rarely occurs.

  • Uniform: No clear modes (flat distribution).

Skewness in Data

  • Symmetric: A vertical line can divide the graph into two halves that are mirror images of each other.

  • Positively Skewed: Histogram stretches out more to the right; the upper tail is longer.

  • Negatively Skewed: Histogram stretches out more to the left; the lower tail is longer.

  • Outlier: An observation that falls outside the overall pattern.

Center and Spread of Data

  • Center: Value that splits the data in half, indicating typical values within the dataset.

    • Measures of central tendency include mean, median, and mode.

  • Spread: Refers to the variation within the data.

    • Important measures include range, standard deviation, and interquartile range (IQR).
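
As an illustration, the measures named above can be computed for the walking shoe prices with Python's standard `statistics` module. Two choices here are assumptions rather than anything stated in the lecture: the standard deviation is the sample version (dividing by n - 1), and the quartiles use the module's default convention, which may differ slightly from a textbook's hand method.

```python
import statistics

prices = [90, 70, 70, 70, 75, 70, 65, 68, 60, 74, 70, 95, 75, 68, 85, 40, 65]

# Measures of center
mean = statistics.mean(prices)
median = statistics.median(prices)
mode = statistics.mode(prices)

# Measures of spread
data_range = max(prices) - min(prices)
std_dev = statistics.stdev(prices)  # sample standard deviation (assumption)
q1, _, q3 = statistics.quantiles(prices, n=4)  # quartiles, default convention (assumption)
iqr = q3 - q1

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={data_range} stdev={std_dev:.2f} IQR={iqr}")
```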

Stem-and-Leaf Displays (Stemplots)

  • Definition: A stemplot or stem-and-leaf display effectively shows individual observations of quantitative data; suitable for small datasets (n ≤ 50).

  • Construction Steps:

    1. Order the data from smallest to largest.

    2. Divide each number into two parts: the stem (leading digits) and the leaf (last digit).

    3. Label the bins using the stem.

    4. List stems in a column and record leaf portions in the corresponding rows.

    5. Provide a key to decode the stemplot.

Example: Stemplot for Prices of Walking Shoes

  • Ordered Prices: 40, 60, 65, 65, 68, 68, 70, 70, 70, 70, 70, 74, 75, 75, 85, 90, 95

  • Stem and Leaf Representation (stem = tens digit, leaf = ones digit):

    • 4 | 0

    • 5 |

    • 6 | 0 5 5 8 8

    • 7 | 0 0 0 0 0 4 5 5

    • 8 | 5

    • 9 | 0 5

  • Key: 7 | 5 represents $75.
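
The stemplot construction steps can be sketched as follows. The function name `stemplot` is hypothetical, and the sketch assumes non-negative whole-number values whose stem is the tens digit and whose leaf is the ones digit, as in the shoe prices.

```python
prices = [90, 70, 70, 70, 75, 70, 65, 68, 60, 74, 70, 95, 75, 68, 85, 40, 65]

def stemplot(values):
    """Map each stem (tens digit) to its sorted list of leaves (ones digit)."""
    # Include every stem between the smallest and largest, even if it has no leaves.
    stems = {stem: [] for stem in range(min(values) // 10, max(values) // 10 + 1)}
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    return stems

plot = stemplot(prices)
for stem, leaves in plot.items():
    print(f"{stem} | " + " ".join(str(leaf) for leaf in leaves))
print("Key: 7 | 5 represents $75")
```

Keeping the empty stem 5 in the display matters: it shows the gap between $40 and the remaining prices.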

Back-to-back Stemplot

  • Definition: A back-to-back stemplot compares the distributions of two groups side by side.

  • Construction: Follow the same steps as for a stemplot, but use one shared column of stems and place one group's leaves to its left and the other group's leaves to its right.
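
A back-to-back stemplot could be sketched as below. The function name `back_to_back` and both groups of values are hypothetical and exist only to show the layout; they are not data from the lecture.

```python
def back_to_back(left, right):
    """Return text rows of the form 'left leaves | stem | right leaves'."""
    all_values = left + right
    rows = []
    for stem in range(min(all_values) // 10, max(all_values) // 10 + 1):
        # Left-hand leaves read outward from the stem, so list them in decreasing order.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        left_text = " ".join(map(str, left_leaves))
        right_text = " ".join(map(str, right_leaves))
        rows.append(f"{left_text:>10} | {stem} | {right_text}")
    return rows

# Hypothetical values for illustration only; not data from the lecture.
group_a = [62, 65, 71, 78]
group_b = [58, 64, 64, 70, 83]

rows = back_to_back(group_a, group_b)
print("\n".join(rows))
```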

Histogram

  • Definition: The most common graph for numerical data; it visualizes the distribution of the underlying variable.

  • Description: Utilizes bars to represent frequency or relative frequency of measurements falling within specified equal-width intervals (bins).

Constructing a Histogram

  • Steps to create:

    1. Decide on intervals of equal length (bins) for data.

    2. Use the left-inclusive method for class intervals: each interval includes its left endpoint but not its right one, so a boundary value such as 70 is counted in [70; 80), not in [60; 70).

    3. Create a frequency table for intervals.

    4. Mark interval boundaries on the horizontal axis and frequency/relative frequency on the vertical axis.

    5. Draw the bars to represent the class intervals with heights according to frequency or relative frequency.
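
The first three steps can be sketched in Python using the walking shoe prices and a bin width of 10. The function name `frequency_table` and its parameters are illustrative assumptions; the integer division is what applies the left-inclusive rule.

```python
prices = [90, 70, 70, 70, 75, 70, 65, 68, 60, 74, 70, 95, 75, 68, 85, 40, 65]

def frequency_table(values, start, width, n_bins):
    """Count values in left-inclusive bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for value in values:
        # Floor division sends a boundary value such as 70 to the bin it opens.
        index = (value - start) // width
        if not 0 <= index < n_bins:
            raise ValueError(f"{value} falls outside the bins")
        counts[index] += 1
    return counts

counts = frequency_table(prices, start=40, width=10, n_bins=6)
for i, count in enumerate(counts):
    low = 40 + 10 * i
    # A row of '#' characters stands in for the height of each bar.
    print(f"[{low}; {low + 10}) {'#' * count}")
```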

Example: Histogram for Prices of Walking Shoes

  • Data for Analysis: Prices of walking shoes include: 40, 60, 65, 65, 68, 68, 70, 70, 70, 70, 70, 74, 75, 75, 85, 90, 95

  • Use a bin width of 10 for the histogram.

  • The frequency table gives the count of prices in each class interval:

    • Class Intervals: [40; 50), [50; 60), [60; 70), [70; 80), [80; 90), [90; 100)

    • Frequencies: 1, 0, 5, 8, 1, 2

Relative Frequency and Proportions

  • Important calculations:

    • Proportion of walking shoe prices at or above $70, from the frequency table: (8 + 1 + 2) / 17 = 11/17 ≈ 64.71%.

    • Percent of prices below $70, from the same table: (1 + 0 + 5) / 17 = 6/17 ≈ 35.29%.

Frequency Table Example:

  Class Interval    Frequency    Relative Frequency
  [40; 50)              1              5.88%
  [50; 60)              0              0.00%
  [60; 70)              5             29.41%
  [70; 80)              8             47.06%
  [80; 90)              1              5.88%
  [90; 100)             2             11.76%
  Total                17            100.00%
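
The relative frequencies and the two proportions can be checked with a short Python sketch. The frequencies are copied from the table; the variable names are illustrative assumptions.

```python
intervals = [(40, 50), (50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]
frequencies = [1, 0, 5, 8, 1, 2]
total = sum(frequencies)

# Relative frequency = class frequency / total number of observations.
relative = [count / total for count in frequencies]

# Intervals are left-inclusive, so "at or above 70" starts with the bin [70; 80).
at_or_above_70 = sum(c for (low, _), c in zip(intervals, frequencies) if low >= 70) / total
below_70 = sum(c for (_, high), c in zip(intervals, frequencies) if high <= 70) / total

for (low, high), count, share in zip(intervals, frequencies, relative):
    print(f"[{low}; {high})  {count:>2}  {share:.2%}")
print(f"at or above $70: {at_or_above_70:.2%}")
print(f"below $70: {below_70:.2%}")
```

These proportions can be read off the table only because 70 is a class boundary; a cutoff inside a bin, such as $72, would require the original data.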

Important Notes

  • Data values are clearly retained with dot plots and stem-and-leaf plots but are lost in histograms.

  • In bar charts (for categorical data), the spaces between bars show that the categories are distinct.

  • Conversely, histogram bars touch one another, so a gap indicates an interval with no observations, such as [50; 60) in the walking shoe data.

Conclusion

  • Summary of methods to visualize and summarize quantitative data includes dot plots, stem-and-leaf displays, and histograms.
