Histogram Explained
Visualizing Data Distribution: Histograms
Introduction
- The goal is to visualize the distribution of ages in a restaurant to understand the demographic makeup (young, teenage, middle-aged, senior).
- A raw list of ages doesn't give a clear sense of the distribution.
Buckets/Bins
- One way to organize the data is to group the ages into buckets (also called bins).
- Count the number of people in each bucket.
Example: Age Buckets
- Buckets are defined in ten-year ranges:
- 0-9
- 10-19
- 20-29
- 30-39
- 40-49
- 50-59
- 60-69
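The ten-year ranges above can be computed rather than listed by hand. A minimal Python sketch, assuming whole-number ages; the function name `bucket_label` and the `width` parameter are illustrative and not part of the original notes:

```python
def bucket_label(age, width=10):
    """Return the label of the bucket that contains `age`, e.g. 23 -> "20-29"."""
    low = (age // width) * width      # lower edge of the bucket
    high = low + width - 1            # upper edge, inclusive
    return f"{low}-{high}"

print(bucket_label(7))    # 0-9
print(bucket_label(19))   # 10-19
print(bucket_label(60))   # 60-69
```

Integer division keeps the edges unambiguous: an age of 19 lands in 10-19 and an age of 20 lands in 20-29, so no value falls into two buckets.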
Counting People in Each Bucket
- 0-9: 6 people
- 10-19: 3 people
- 20-29: 5 people
- 30-39: 1 person
- 40-49: 2 people
- 50-59: 2 people
- 60-69: 1 person
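The notes give only the per-bucket counts, not the raw ages. The sketch below uses a made-up list of 20 ages, chosen so that it reproduces those counts, to show how the tally could be done with Python's `collections.Counter`. The list `ages` is hypothetical and is not data from the original example.

```python
from collections import Counter

# Hypothetical ages, invented so the tallies match the counts in the notes.
ages = [1, 3, 4, 6, 7, 9,        # 0-9
        12, 15, 18,              # 10-19
        21, 23, 25, 27, 29,      # 20-29
        34,                      # 30-39
        42, 47,                  # 40-49
        51, 58,                  # 50-59
        65]                      # 60-69

# Key each age by the lower edge of its ten-year bucket, then count.
counts = Counter((age // 10) * 10 for age in ages)

for low in sorted(counts):
    print(f"{low}-{low + 9}: {counts[low]}")
```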
Histograms
- A histogram is a visualization that groups the data into categories (buckets) and plots how many values fall into each one.
Creating the Histogram
- X-axis: Buckets (age ranges)
- Y-axis: Number of people (frequency)
Plotting the Data
- 0-9: Bar extends to 6 on the number-of-people axis.
- 10-19: Bar extends to 3.
- 20-29: Bar extends to 5.
- 30-39: Bar extends to 1.
- 40-49: Bar extends to 2.
- 50-59: Bar extends to 2.
- 60-69: Bar extends to 1.
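The bars listed above can be drawn as a quick text chart. This is only a sketch: it draws the bars sideways (buckets down the side, counts running to the right) instead of the upright layout described in the notes, and the helper name `render_bars` is made up for illustration. The counts themselves come from the notes.

```python
# Bucket counts taken from the notes.
counts = {
    "0-9": 6, "10-19": 3, "20-29": 5, "30-39": 1,
    "40-49": 2, "50-59": 2, "60-69": 1,
}

def render_bars(counts):
    """Return one text line per bucket; the bar length equals the count."""
    return [f"{label:>5} | {'#' * n} {n}" for label, n in counts.items()]

for line in render_bars(counts):
    print(line)
```

The first line printed is `  0-9 | ###### 6`, the longest bar, which matches the tallest bar in the histogram.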
Interpreting the Histogram
- The histogram provides a visual sense of the age distribution in the restaurant.
- Example: A restaurant that gives away toys might have more younger people.
- It helps reveal patterns, such as many young adults with kids, or grandparents bringing children.
- In this instance, the restaurant has a lot of kids (6 of the 20 people are under 10) but few senior citizens (1 person in the 60-69 bucket).
General Applicability
- Histograms can be applied to various types of data, not just ages, to visualize distributions.
Comparison with Dot Plots
- Unlike dot plots, which plot individual data points, histograms group data into buckets.
- For data like these ages, a dot plot would not convey much, because it shows each value separately.
- Histograms are useful when individual data points don't provide much information on their own.
- Histograms provide a summary of how many values fall within a given range.
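The contrast with a dot plot can be made concrete by counting what each chart would show. The sketch below assumes a hypothetical list of 20 ages, invented to match the bucket counts in the notes; the real ages are not given, so the dot-plot figures here illustrate the idea and are not results from the original example.

```python
from collections import Counter

# Hypothetical ages, invented to match the bucket counts in the notes.
ages = [1, 3, 4, 6, 7, 9, 12, 15, 18, 21,
        23, 25, 27, 29, 34, 42, 47, 51, 58, 65]

dots = Counter(ages)                              # dot plot: one stack per exact age
bars = Counter((age // 10) * 10 for age in ages)  # histogram: one bar per bucket

print(len(dots), max(dots.values()))   # 20 1  (20 distinct ages, tallest stack is 1)
print(len(bars), max(bars.values()))   # 7 6   (7 buckets, tallest bar is 6)
```

With these made-up ages every dot-plot stack has height 1, so the chart is flat, while the seven bucket counts show where the ages cluster.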