Modeling Distributions of Quantitative Data
Chapter 2: Modeling Distributions of Quantitative Data
Introduction to Percentiles and Location in a Distribution
- A test score is easier to interpret relative to the other scores in the class than as a raw value.
- Example: Emily scores 43 out of 50 on a statistics test. Whether she should be satisfied depends on where her score stands relative to her classmates' scores.
- Key Concepts Addressed in Section 2.1:
- Describing location using percentiles.
- New graphical representation for percentiles (cumulative relative frequency graph).
- Understanding individual performance based on mean and standard deviation.
- Density Curves: Provide visual estimations of individual locations within a distribution, particularly with data fitting a Normal distribution pattern.
Activity for Understanding Height Distribution
- Teacher marks a floor number line (height scale from 58 to 78 inches).
- Class stands according to their height, creating a human dot plot.
- Teacher records height distribution and displays for reference.
- Class discussion on percentiles of individual heights, mean, and standard deviation computations based on this dot plot:
- How many students have heights below their own?
- Calculate mean and standard deviation for class height, confirming with peers.
- Discuss height's position relative to the mean:
- Standardized scores (z-scores) provide insights into how far above or below the mean a height lies.
- Discussion on unit transformation effects (inches to centimeters):
- There are 2.54 cm in 1 inch. Converting heights to centimeters multiplies measures of center and variability by 2.54 but leaves the shape, percentiles, and z-scores unchanged.
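A short derivation shows why z-scores survive the change of units. The symbols are assumptions introduced here for illustration: x is a height in inches, y = 2.54x is the same height in centimeters, and the bars and s denote the class mean and standard deviation in each unit.

```latex
\begin{align*}
\bar{y} &= 2.54\,\bar{x} \\
s_y &= 2.54\,s_x \\
z_y &= \frac{y - \bar{y}}{s_y}
     = \frac{2.54\,x - 2.54\,\bar{x}}{2.54\,s_x}
     = \frac{x - \bar{x}}{s_x}
     = z_x
\end{align*}
```

The factor 2.54 cancels, so a student who is one standard deviation above the mean in inches is still one standard deviation above the mean in centimeters.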
Section 2.1: Describing Location in a Distribution
Learning Targets for Section 2.1
- Locate an individual value within a distribution using percentiles.
- Use cumulative relative frequency graphs for estimating percentiles.
- Understand and calculate standardized scores (z-scores).
- Analyze how adding, subtracting, or scaling data affects distribution characteristics.
Measuring Location: Percentiles
- Definition of Percentile:
- The p-th percentile is the value with p% of the observations less than or equal to it.
- For example, if Emily's score (43) is at the 84th percentile, then 84% of the scores in her class are less than or equal to 43.
- Caution:
- An observation is said to be at a certain percentile, not in it (e.g., Emily is at the 84th percentile, not in the 84th percentile).
Example Calculation of Percentiles
- Data Set of 25 Test Scores from Mr. Tabor's Class:
- Scores: 35, 18, 37, 38, 42, 41, 25, 37, 36, 32, 12, 43, 31, 29, 32, 48, 44, 45, 38, 40, 45, 38, 38, 40, 22.
- Total Students = 25
- Calculation Example:
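- Jacob's score of 18: 2 of the 25 scores are less than or equal to 18, so his percentile = 2/25 = 0.08, the 8th percentile.
- Maria is at the 48th percentile: 48% of 25 = 12 scores are less than or equal to hers, so her score is the 12th value in the ordered list, 37.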
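The counting rule used in these examples can be sketched in Python. This is a minimal illustration, assuming the "less than or equal to" convention that Jacob's and Emily's percentiles follow; the helper names `percentile_of` and `score_at_percentile` are hypothetical and not part of the original notes.

```python
# Test scores from Mr. Tabor's class (copied from the data set above).
scores = [35, 18, 37, 38, 42, 41, 25, 37, 36, 32, 12, 43, 31,
          29, 32, 48, 44, 45, 38, 40, 45, 38, 38, 40, 22]


def percentile_of(value, data):
    """Percent of observations that are less than or equal to `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)


def score_at_percentile(p, data):
    """Smallest observed value whose percentile is at least p."""
    for x in sorted(data):
        if percentile_of(x, data) >= p:
            return x
    raise ValueError("p must be between 0 and 100")


print(percentile_of(18, scores))        # Jacob: 8.0
print(percentile_of(43, scores))        # Emily: 84.0
print(score_at_percentile(48, scores))  # Maria: 37
```

Other textbooks and software count only observations strictly below the value, which would give slightly different percentiles for the same data.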
Understanding Quartiles
- The three quartiles divide the data into groups:
- Q1 separates the lowest 25%, Q2 represents the median (50%), and Q3 separates the lowest 75%.
- Percentiles are most meaningful for large data sets; in a small data set a single observation can span several percentiles.
- Caution: A high percentile isn’t inherently positive (e.g., high cholesterol at the 90th percentile).
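As a rough sketch, the quartiles of the 25 test scores from Mr. Tabor's class can be computed as follows. Quartile conventions differ between textbooks and software; this example assumes the "median of each half" rule, with the overall median left out of both halves when the number of observations is odd. The function name `quartiles` is hypothetical.

```python
from statistics import median

# Test scores from Mr. Tabor's class.
scores = [35, 18, 37, 38, 42, 41, 25, 37, 36, 32, 12, 43, 31,
          29, 32, 48, 44, 45, 38, 40, 45, 38, 38, 40, 22]


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves rule (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle observation when n is odd.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles(scores)
print(q1, q2, q3)  # 31.5 38 41.5
```

Under this rule Q1 is the median of the 12 lowest scores and Q3 is the median of the 12 highest scores.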
Cumulative Relative Frequency Graphs
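- These graphs plot, for each class of values, the cumulative relative frequency: the proportion (or percent) of observations that fall in that class or any lower class.
- They allow a visual estimate of the percentile of an individual value, or of the value at a given percentile.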
- Example: The age of U.S. Presidents when inaugurated illustrates age distribution:
- Frequency table of ages (excerpt):
| Age Range (years) | Frequency | Cumulative Frequency |
| --------- | --------- | -------------------- |
| 40-45 | 2 | 2 |
| 45-50 | 7 | 9 |
| 50-55 | 13 | 22 |
- Cumulative relative frequency helps to find percentiles visually in the graph.
- Definition of Cumulative Relative Frequency Graph: A graph that plots points corresponding to the cumulative percentage of observations.
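The columns of such a table can be built with a running sum. Because the presidents table above is only an excerpt, this sketch uses the 25 test scores from Mr. Tabor's class instead, grouped into classes of width 10; the class boundaries in `edges` are an assumption made for illustration.

```python
from itertools import accumulate

# Test scores from Mr. Tabor's class.
scores = [35, 18, 37, 38, 42, 41, 25, 37, 36, 32, 12, 43, 31,
          29, 32, 48, 44, 45, 38, 40, 45, 38, 38, 40, 22]

# Hypothetical class boundaries: 10-19, 20-29, 30-39, 40-49.
edges = [10, 20, 30, 40, 50]
classes = list(zip(edges, edges[1:]))

# Frequency: number of scores with lower <= score < upper.
frequencies = [sum(1 for x in scores if lower <= x < upper)
               for lower, upper in classes]

# Cumulative frequency is a running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency divides each running total by n.
cumulative_relative = [c / len(scores) for c in cumulative]

for (lower, upper), f, c, r in zip(classes, frequencies,
                                   cumulative, cumulative_relative):
    print(f"{lower}-{upper - 1}: freq={f}, cum={c}, cum rel={r:.2f}")
```

The last cumulative relative frequency is always 1.00 (100%), which is a quick check that no observation was left out of the classes.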
Example Analysis Using Cumulative Graph
- Locate President Obama in the distribution:
- Estimate the cumulative relative frequency (in percent) at his inauguration age, 47.463 years.
- Interpretation: he was relatively young at inauguration (about the 12th percentile).
- Estimation of the 65th Percentile:
- Reading across from 65% on the vertical axis to the graph, then down to the age axis, gives approximately age 58.
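Reading a cumulative relative frequency graph amounts to linear interpolation between plotted points. The sketch below assumes a graph built from the 25 test scores in Mr. Tabor's class, grouped into hypothetical classes of width 10 (the presidents data is not fully listed here); the function names are illustrative only.

```python
# Points on the graph: (class boundary, cumulative percent at or below it).
# Derived from the 25 test scores with assumed classes 10-19, ..., 40-49.
points = [(10, 0.0), (20, 8.0), (30, 20.0), (40, 64.0), (50, 100.0)]


def percentile_at(value, pts):
    """Estimate the percentile of `value`: go up to the graph, then across."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= value <= x1:
            return y0 + (value - x0) * (y1 - y0) / (x1 - x0)
    raise ValueError("value is outside the graphed range")


def value_at(percent, pts):
    """Estimate the value at a percentile: go across to the graph, then down."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if y1 > y0 and y0 <= percent <= y1:
            return x0 + (percent - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("percent must be between 0 and 100")


print(round(percentile_at(35, points), 1))  # 42.0
print(round(value_at(65, points), 1))       # 40.3
```

Estimates read from a grouped graph are approximate and can differ from percentiles computed directly from the raw data.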
Standardized Scores (z-scores)
- Definition: A z-score describes how many standard deviations a value is from the mean.
- Formula:
z = \frac{\text{value} - \text{mean}}{\text{standard deviation}}
- Example: Emily scored 43; for the class, the mean is 35.44 and the standard deviation is 8.77, so:
z = \frac{43 - 35.44}{8.77} = 0.86
- Interpretation: Emily's score is 0.86 standard deviations above the mean.
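Emily's z-score can be reproduced from the raw scores in Mr. Tabor's class. This sketch uses Python's `statistics` module; `stdev` is the sample standard deviation (dividing by n - 1), which is the version that reproduces the 8.77 quoted above. The function name `z_score` is hypothetical.

```python
from statistics import mean, stdev

# Test scores from Mr. Tabor's class.
scores = [35, 18, 37, 38, 42, 41, 25, 37, 36, 32, 12, 43, 31,
          29, 32, 48, 44, 45, 38, 40, 45, 38, 38, 40, 22]


def z_score(value, data):
    """Number of sample standard deviations `value` lies from the mean."""
    return (value - mean(data)) / stdev(data)


emily_z = z_score(43, scores)
print(round(mean(scores), 2))   # 35.44
print(round(stdev(scores), 2))  # 8.77
print(round(emily_z, 2))        # 0.86
```

A negative result would indicate a score below the mean; Jacob's 18, for example, gives a z-score of about -1.99.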
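Transforming Data
- Addition/Subtraction:
- Adding or subtracting the same constant to every observation shifts measures of center and location (mean, median, quartiles, percentiles) by that constant but does not change variability (range, IQR, standard deviation) or shape.
- Example: Adding 5 points to every score raises the mean and median by 5 but leaves the standard deviation and the shape of the distribution unchanged.
- Multiplication/Division:
- Multiplying or dividing every observation by the same positive constant multiplies or divides measures of center, location, and variability by that constant but does not change shape.
- In both cases each observation keeps its relative position, so its percentile and z-score are unchanged.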
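These effects can be checked numerically. The sketch below applies two transformations to the 25 test scores from Mr. Tabor's class: adding 5 points and doubling every score. The doubling factor and the helper names `summary` and `z_scores` are assumptions chosen for illustration.

```python
from statistics import mean, median, stdev

# Test scores from Mr. Tabor's class.
scores = [35, 18, 37, 38, 42, 41, 25, 37, 36, 32, 12, 43, 31,
          29, 32, 48, 44, 45, 38, 40, 45, 38, 38, 40, 22]


def summary(data):
    """Center (mean, median) and variability (sample SD) of a data set."""
    return {"mean": mean(data), "median": median(data), "sd": stdev(data)}


def z_scores(data):
    """Standardize every observation using the data's own mean and SD."""
    m, s = mean(data), stdev(data)
    return [(x - m) / s for x in data]


shifted = [x + 5 for x in scores]  # add 5 points to every score
scaled = [x * 2 for x in scores]   # multiply every score by 2

for label, data in [("original", scores), ("plus 5", shifted),
                    ("times 2", scaled)]:
    s = summary(data)
    print(f"{label}: mean={s['mean']:.2f}, "
          f"median={s['median']}, sd={s['sd']:.2f}")

# Relative standing is preserved: z-scores match up to rounding error.
same_z = all(abs(a - b) < 1e-9
             for a, b in zip(z_scores(scores), z_scores(scaled)))
print(same_z)  # True
```

Adding 5 moves the mean and median up by 5 with the same standard deviation, while doubling multiplies all three by 2.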
Section Summary
- Percentiles and z-scores are vital for contextual understanding of individual data points.
- Use cumulative relative frequency graphs to estimate the percentile of a value or the value at a given percentile.
- Adding, subtracting, multiplying, or dividing by a constant rescales the data without changing the shape of the distribution.
Exercises and Practice
- Calculation of z-scores and understanding percentiles in various contexts (e.g., height, scores, income).
- Practical applications of cumulative frequency and transformation effects on data distributions across various disciplines.