CH 4 Describing Data: Displaying and Exploring Data
1. Dot Plots (LO 4-1)
Definition: A dot plot is a graphical display where each data point is represented as a dot along a number line. Identical values are stacked.
Key Features:
Shows every individual observation, so no detail is lost to grouping.
Most useful for small datasets.
Makes clusters and gaps in the data easy to see.
Example: Comparing the number of vehicles serviced at two dealerships using dot plots.
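As a rough illustration of how stacking works, the sketch below prints a text-only dot plot turned on its side: one row per value on the number line, one asterisk per observation. The function name `text_dot_plot` and the service counts are hypothetical, made up for this sketch, and the function assumes whole-number data.

```python
from collections import Counter


def text_dot_plot(values):
    """Return one text row per integer on the number line, with one '*' per observation."""
    counts = Counter(values)
    rows = []
    # Walk the whole number line so that gaps show up as empty rows.
    for v in range(min(values), max(values) + 1):
        rows.append(f"{v:>3} | " + "*" * counts[v])
    return rows


# Hypothetical data: vehicles serviced per day at one dealership.
serviced = [23, 25, 25, 27, 27, 27, 30, 30, 36]
for row in text_dot_plot(serviced):
    print(row)
```

Rows with several asterisks mark identical values stacked on top of each other, and runs of empty rows mark gaps. Printing two such plots over the same range gives a simple way to compare two dealerships.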
2. Measures of Position (LO 4-2)
Standard Deviation: The most widely used measure of dispersion.
Alternative Measures (based on position in the ordered data):
Quartiles: Divide the ordered data into four equal parts.
Deciles: Divide the ordered data into ten equal parts.
Percentiles: Divide the ordered data into 100 equal parts and so indicate the relative position of a value within the dataset.
Percentile Computation
Formula: $L_p = (n + 1)\dfrac{P}{100}$
Example: Finding the 33rd percentile, median (50th percentile), and quartiles.
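The location formula can be sketched in a few lines of Python. Here n is the number of observations and P is the desired percentile; when the location is not a whole number, this sketch interpolates between the two neighboring ordered values, which is one common convention and is assumed here. The function name `percentile` and the sample data are hypothetical.

```python
def percentile(data, p):
    """Value at the p-th percentile, using the location L_p = (n + 1) * p / 100."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100   # 1-based position in the ordered data
    lower = int(location)          # whole-number part of the location
    fraction = location - lower    # fractional part, used to interpolate
    if lower < 1:                  # location falls before the first value
        return ordered[0]
    if lower >= n:                 # location falls at or after the last value
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


# Hypothetical data set with n = 9 observations.
sample = [12, 15, 18, 20, 22, 25, 28, 30, 35]
print(percentile(sample, 33))   # 33rd percentile
print(percentile(sample, 50))   # median (50th percentile)
print(percentile(sample, 25), percentile(sample, 75))   # Q1 and Q3
```

For this sample the median sits at location (9 + 1) × 50/100 = 5, so it is the fifth ordered value, 22. Statistical software often uses slightly different percentile conventions, so small differences from this method are expected.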
3. Box Plot (Box-and-Whisker Plot) (LO 4-3)
Definition: A graphical representation of a data distribution based on the five-number summary:
Minimum Value
Q1 (First Quartile, 25th Percentile)
Median (Q2, 50th Percentile)
Q3 (Third Quartile, 75th Percentile)
Maximum Value
Interquartile Range (IQR):
$\text{IQR} = Q_3 - Q_1$
The middle 50% of the data falls between Q1 and Q3.
Outliers:
Mild outliers: More than 1.5 * IQR below Q1 or above Q3 (but no more than 3 * IQR).
Extreme outliers: More than 3 * IQR below Q1 or above Q3.
Example: Creating a box plot for delivery times at Alexander’s Pizza.
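The five-number summary and the outlier rules can be put together in a short sketch. It assumes quartiles are located with L_p = (n + 1) × P/100 and interpolation, and that "beyond" means measured from Q1 and Q3. The function names and the delivery times are hypothetical and are not the data from the pizza example.

```python
def percentile(data, p):
    """Value at the p-th percentile, using the location L_p = (n + 1) * p / 100."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * p / 100
    lower = int(location)
    fraction = location - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


def box_plot_summary(data):
    """Five-number summary, IQR, and mild/extreme outliers for a list of numbers."""
    ordered = sorted(data)
    q1, median, q3 = (percentile(ordered, p) for p in (25, 50, 75))
    iqr = q3 - q1
    # Inner fences sit 1.5 * IQR outside the box, outer fences 3 * IQR outside it.
    low_inner, high_inner = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    low_outer, high_outer = q1 - 3 * iqr, q3 + 3 * iqr
    extreme = [x for x in ordered if x < low_outer or x > high_outer]
    mild = [x for x in ordered
            if (x < low_inner or x > high_inner) and x not in extreme]
    return {"min": ordered[0], "q1": q1, "median": median, "q3": q3,
            "max": ordered[-1], "iqr": iqr, "mild": mild, "extreme": extreme}


# Hypothetical delivery times in minutes.
times = [13, 15, 15, 16, 18, 18, 19, 20, 22, 35, 45]
print(box_plot_summary(times))
```

With these times, Q1 = 15 and Q3 = 22, so IQR = 7. The inner fences are 4.5 and 32.5 and the outer fences are -6 and 43, which makes 35 a mild outlier and 45 an extreme one.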
4. Skewness (LO 4-4)
Definition: Measures the asymmetry of data distribution.
Common Distribution Shapes:
Symmetric: Mean = Median = Mode.
Positively Skewed (Right Skewed): Mean > Median > Mode.
Negatively Skewed (Left Skewed): Mean < Median < Mode.
Bimodal: Two peaks in the distribution.
Skewness Coefficient (Pearson’s Estimate):
If skewness < -1 or > +1 → Highly skewed.
If skewness between -1 and -0.5 or 0.5 and 1 → Moderately skewed.
If skewness between -0.5 and +0.5 → Approximately symmetric.
Example: Computing skewness for earnings per share of software companies.
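These notes do not state the formula behind the coefficient. A commonly used form of Pearson's coefficient of skewness is sk = 3(mean − median)/s, where s is the sample standard deviation; the sketch below assumes that form and applies the interpretation thresholds listed above. The function names and the earnings figures are hypothetical, and treating exactly ±0.5 and ±1 as belonging to the milder category is a choice made for this sketch.

```python
import statistics


def pearson_skewness(data):
    """Pearson's coefficient of skewness: 3 * (mean - median) / s."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    s = statistics.stdev(data)   # sample standard deviation (n - 1 divisor)
    return 3 * (mean - median) / s


def describe_skewness(sk):
    """Apply the rule-of-thumb thresholds from the notes."""
    if sk < -1 or sk > 1:
        return "highly skewed"
    if sk < -0.5 or sk > 0.5:
        return "moderately skewed"
    return "approximately symmetric"   # includes the boundary values +/-0.5


# Hypothetical earnings per share, with a long right tail.
eps = [0.1, 0.4, 0.5, 1.1, 1.5, 3.2, 6.4, 12.9]
sk = pearson_skewness(eps)
print(round(sk, 2), describe_skewness(sk))
```

A positive result means the mean exceeds the median, which matches the positively skewed pattern described above. A coefficient of -0.3 would be read as approximately symmetric.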
5. Scatter Diagrams (Bivariate Analysis) (LO 4-5)
Definition: Graphical representation of the relationship between two variables.
How to Construct:
X-axis: Independent variable.
Y-axis: Dependent variable.
Example: Analyzing the relationship between profit earned on vehicle sales and the buyer’s age.
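A scatter diagram is drawn from (x, y) pairs, and the direction of the cloud of points can be summarized with a number. The sketch below pairs up the observations and computes the sample correlation coefficient as one possible summary; the coefficient is not defined in these notes and is used here as an assumption. The ages and profits are hypothetical.

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: buyer's age (x, independent) and profit on the sale (y, dependent).
ages = [25, 32, 41, 48, 55, 63]
profits = [1100, 1400, 1350, 1900, 2100, 2400]

points = list(zip(ages, profits))   # the pairs that would be plotted
r = correlation(ages, profits)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(points)
print(direction)
```

A positive value means the points tend to rise from left to right, and a negative value means they tend to fall. The plot itself is still worth drawing, because a single number can hide curved patterns and outliers.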
6. Contingency Tables (Cross-tabulation of Data) (LO 4-6)
Definition: A table that cross-classifies observations by two categorical variables and shows the count in each combination of categories.
Usage: Helps identify relationships between two variables.
Example: Cross-tabulating vehicle profit (above or below the median) against dealership location.
Key Insights from Contingency Table Analysis:
Proportion of sales above and below the median at different dealerships.
Identifying patterns in categorical data.
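Building a contingency table amounts to counting how many observations fall into each pair of categories. The sketch below does this with the standard library; the function name `contingency_table`, the location labels, and the profit categories are all hypothetical.

```python
from collections import Counter


def contingency_table(rows, cols):
    """Count observations for every (row category, column category) pair."""
    counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Nested dict: table[row][col] is the number of observations in that cell.
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}


# Hypothetical sales: dealership location and whether profit was above or below the median.
location = ["North", "South", "North", "East", "South", "North", "East", "South"]
profit = ["Above", "Below", "Below", "Above", "Above", "Above", "Below", "Below"]

table = contingency_table(location, profit)
for loc, cells in table.items():
    total = sum(cells.values())
    share_above = cells["Above"] / total
    print(loc, cells, f"{share_above:.0%} above the median")
```

Dividing each cell by its row total, as in the last loop, gives the proportion of sales above the median at each location, which is the comparison described in the key insights.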
Flashcard Questions
Dot Plots (LO 4-1)
What is a dot plot?
What type of data is best suited for a dot plot?
What do stacked dots in a dot plot indicate?
Measures of Position (LO 4-2)
What is the most widely used measure of dispersion?
How are quartiles different from percentiles?
How do you compute the location of a percentile?
What is the interquartile range (IQR)?
What does the median represent in a dataset?
Box Plot (LO 4-3)
What five statistics are needed to create a box plot?
What does the box in a box plot represent?
How are outliers represented in a box plot?
What is the formula to identify mild and extreme outliers?
How can a box plot show skewness in a dataset?
Skewness (LO 4-4)
What does skewness measure?
What does a positively skewed distribution look like?
What does a negatively skewed distribution look like?
How do we interpret a skewness coefficient of -0.3?
What does it mean if the mean is greater than the median?
Scatter Diagrams (LO 4-5)
What type of data is used in a scatter diagram?
What does a positive correlation in a scatter diagram indicate?
What does a negative correlation in a scatter diagram indicate?
How is a scatter plot different from a dot plot?
What are the axes in a scatter diagram used for?
Contingency Tables (LO 4-6)
What is a contingency table?
What types of data can be displayed in a contingency table?
How is a contingency table different from a scatter diagram?
What insights can be drawn from a contingency table?
How can contingency tables help in business decision-making?