Stem and Leaf Displays

Definition: A stem and leaf display is a graphical method of displaying data that is particularly useful when the dataset is not overly large.
Purpose: It provides an intuitive way to visualize the distribution, shape, and individual values of a dataset.

Initial Example: The number of touchdown passes thrown by each of the 31 NFL teams in the 2000 season.
- Display Components:
  - Stems: The column on the left holds the stems, which represent the tens digits. In this example, the stems are:
    - 3 (representing 30-39)
    - 2 (representing 20-29)
    - 1 (representing 10-19)
    - 0 (representing 0-9)
  - Leaves: The digits to the right of each stem are the leaves, which represent the ones digits. A stem combined with one of its leaves gives one exact value in the dataset.
Example Interpretation:
- Top Row: The stem of 3 has leaves 2, 3, 3, and 7, representing four teams with 32, 33, 33, and 37 touchdown passes.
- Second Row: The stem of 2 has 12 leaves, representing:
  - 2 occurrences of 20 touchdowns
  - 3 occurrences of 21 touchdowns
  - 3 occurrences of 22 touchdowns
  - 1 occurrence of 23 touchdowns
  - 2 occurrences of 28 touchdowns
  - 1 occurrence of 29 touchdowns
- Third Row: Left for students to interpret as an exercise.
- Fourth Row: The stem of 0 has leaves 6 and 9, representing two teams with 6 and 9 touchdown passes.
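The row-by-row reading above can be reproduced with a short Python sketch. It is only an illustration: the function name `stem_and_leaf` is hypothetical, and the data are limited to the values spelled out in these notes, so the stem-1 row is left empty because that row is an exercise.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines, largest stem first, leaves sorted ascending."""
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v, 10)  # tens digit, ones digit
        rows[stem].append(leaf)
    lines = []
    # Walk every stem from highest to lowest, including empty ones.
    for stem in range(max(rows), min(rows) - 1, -1):
        leaves = "".join(str(leaf) for leaf in sorted(rows.get(stem, [])))
        lines.append(f"{stem}|{leaves}")
    return lines

# Only the values described in the notes; the stem-1 teams are omitted.
touchdowns = [32, 33, 33, 37,
              20, 20, 21, 21, 21, 22, 22, 22, 23, 28, 28, 29,
              6, 9]

for line in stem_and_leaf(touchdowns):
    print(line)
```

Printing empty stems keeps the vertical scale even, so gaps in the data remain visible.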
Key Observations:
- A stem and leaf display clarifies the shape of data distributions effectively.
- It allows viewers to readily identify ranges, trends, and counts.
- Example conclusion: Most teams threw between 10 and 29 touchdown passes, and only a few teams fell above or below this range.

Splitting Stems:
- This technique makes the display clearer when a single stem holds many leaves.
- Each stem is split into two rows: one for leaves 0-4 and one for leaves 5-9.
- Example: The stem of 3 gets one row for 35-39 and a separate row for 30-34.
- Effectiveness: Splitting stems can make a display easier to read because it prevents too many values from being lumped into one row.
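A minimal Python sketch of stem splitting follows. The function name `split_stem_and_leaf` is hypothetical, and the data again include only the touchdown values spelled out in these notes, so both stem-1 rows print empty.

```python
def split_stem_and_leaf(values):
    """Build a display with each stem split into two rows.

    The upper row of a stem holds leaves 5-9, the lower row leaves 0-4.
    """
    rows = {}
    for v in values:
        stem, leaf = divmod(v, 10)
        half = 1 if leaf >= 5 else 0  # 1 = upper half (5-9), 0 = lower half (0-4)
        rows.setdefault((stem, half), []).append(leaf)

    stem, half = max(rows)  # start at the highest occupied row
    lowest = min(rows)      # stop after the lowest occupied row
    lines = []
    while (stem, half) >= lowest:
        leaves = "".join(str(d) for d in sorted(rows.get((stem, half), [])))
        lines.append(f"{stem}|{leaves}")
        # Step down one row: upper half, then lower half, then the next stem.
        stem, half = (stem, 0) if half == 1 else (stem - 1, 1)
    return lines

# Only the values described in the notes; the stem-1 teams are omitted.
touchdowns = [32, 33, 33, 37,
              20, 20, 21, 21, 21, 22, 22, 22, 23, 28, 28, 29,
              6, 9]

for line in split_stem_and_leaf(touchdowns):
    print(line)
```

With these values the 12 leaves on stem 2 separate into a row of 3 (25-29) and a row of 9 (20-24).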
Back-to-Back Stem and Leaf Displays:
- This variation compares two distributions by placing them on either side of a common column of stems.
- Example Used: Comparing touchdown passes from the 1998 and 2000 seasons.
- The stems run down the middle; the leaves for one season extend to the left and the leaves for the other season extend to the right.
- Reading across a stem shows how the number of teams in that range changed between the two seasons.
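The layout can be sketched in Python as below. The function name `back_to_back` and both datasets are hypothetical; the season values are made up for illustration and are not the 1998 or 2000 NFL figures.

```python
def back_to_back(left_values, right_values):
    """Return display lines with a shared stem column in the middle."""
    def group(values):
        rows = {}
        for v in values:
            stem, leaf = divmod(v, 10)
            rows.setdefault(stem, []).append(leaf)
        return rows

    left, right = group(left_values), group(right_values)
    top = max(max(left), max(right))
    bottom = min(min(left), min(right))
    width = max(len(leaves) for leaves in left.values())  # pad the left side

    lines = []
    for stem in range(top, bottom - 1, -1):
        # Left leaves are reversed so the smallest leaf sits next to the stem.
        l = "".join(str(d) for d in sorted(left.get(stem, []), reverse=True))
        r = "".join(str(d) for d in sorted(right.get(stem, [])))
        lines.append(f"{l:>{width}}|{stem}|{r}")
    return lines

# Hypothetical data for two seasons.
season_a = [11, 14, 19, 21, 25, 33]
season_b = [9, 12, 18, 18, 22, 27, 31]

for line in back_to_back(season_a, season_b):
    print(line)
```

Right-aligning the left-hand leaves keeps the stem column straight, which is what lets the two sides be compared row by row.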

Characteristics of Data for Suitable Stem and Leaf Displays:
- Whole numbers are preferred, ideally values that split into a stem of one or two digits and a one-digit leaf.
- All-positive data fit the basic format directly; negative values need the negative stems and special zero handling described under Negative and Zero Handling.
- If the data contain decimals or large numbers, they should be rounded first, preferably to two-digit accuracy.

Data on Aggressive Thinking:
- Context: Study on the speed of naming aggressive words when primed by either a weapon or non-weapon word.
- Result Interpretation: Each value is the difference in naming time between the two priming conditions.
  - Positive differences indicate faster pronunciation of aggressive words after a weapon word.
  - Negative differences indicate slower pronunciation of aggressive words after a weapon word.
- Example Values: The differences range from 43.2 (43.2 milliseconds faster) down to -27.4 (27.4 milliseconds slower).
Negative and Zero Handling:
- The display for the aggressive thinking data uses negative stems to represent negative values.
- Two zero stems are needed so that small values on either side of zero stay in separate rows:
  - A stem of 0 for values from 0 to 9.
  - A stem of -0 for values between 0 and -9.
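One way to implement the two zero stems is sketched below in Python. The function name `signed_stem_and_leaf` is hypothetical, and the list of differences is made up except for the two extremes quoted above (43.2 and -27.4). For brevity the sketch prints only stems that actually occur, and it relies on Python's built-in `round`, which rounds exact halves to the nearest even integer.

```python
def signed_stem_and_leaf(values):
    """Stem and leaf lines for data that may be negative.

    Values are rounded to whole numbers first. Negative values use a
    negative stem, so -3 appears on stem "-0" rather than on stem "0".
    """
    rows = {}
    for x in values:
        v = round(x)
        tens, leaf = divmod(abs(v), 10)
        # The sign is kept in the key so that 0 and -0 stay separate rows.
        key = (1, tens) if v >= 0 else (-1, -tens)
        rows.setdefault(key, []).append(leaf)

    lines = []
    for sign, tens in sorted(rows, reverse=True):  # largest values first
        label = f"{tens}" if sign > 0 else f"-{abs(tens)}"
        leaves = "".join(str(d) for d in sorted(rows[(sign, tens)]))
        lines.append(f"{label:>2}|{leaves}")
    return lines

# Hypothetical differences in milliseconds (only the extremes come from the notes).
diffs = [43.2, 19.6, 8.1, 3.4, -2.7, -6.3, -14.8, -27.4]

for line in signed_stem_and_leaf(diffs):
    print(line)
```

Here 8.1 and 3.4 land on stem 0 while -2.7 and -6.3 land on stem -0, which is the separation the special zero handling is meant to achieve.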

Data Size:
- Stem and leaf displays are generally used for datasets of up to about 200 observations.
Population Dataset Example:
- Populations of 185 US cities in 1998, each rounded to the nearest 10,000 residents before being plotted.
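The rounding step for large numbers can be sketched as follows. The function name `population_to_stem_leaf` and the population figures are hypothetical and are not taken from the 1998 dataset; the sketch assumes the leaf is the 10,000s digit and the stem counts the 100,000s.

```python
def population_to_stem_leaf(population):
    """Round to the nearest 10,000 and split into (stem, leaf)."""
    # Population in units of 10,000, with exact halves rounded up.
    units = (population + 5_000) // 10_000
    # Stem = number of 100,000s, leaf = 10,000s digit.
    return divmod(units, 10)

# Hypothetical populations, for illustration only.
for population in (127_400, 463_789, 1_744_058):
    stem, leaf = population_to_stem_leaf(population)
    print(f"{population:>9,} -> {stem}|{leaf}")
```

Under this scheme cities of 463,789 and 458,000 residents would both appear as 4|6, which shows the kind of detail that rounding gives up.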
Judgment in Graphing Choice:
- Before choosing this format, assess whether the dataset can be represented well as a stem and leaf display. Some datasets lose important detail if they are rounded too heavily.
- Choosing an effective statistical display depends on good judgment and an understanding of the particular dataset.