Describing Graphed Distributions
Types of Graphical Representations
Stem-and-Leaf Plots: A device for displaying quantitative data where each data value is split into a "stem" (the leading digit or digits) and a "leaf" (the final digit).
Histograms: A graphical representation showing the frequency distribution of a dataset by using bars.
Polygons: A line graph of frequencies, formed by connecting points plotted at the midpoints of the tops of the histogram's bars.
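To make the stem-and-leaf idea concrete, here is a minimal sketch that splits each score into a tens stem and a units leaf. The function name `stem_and_leaf` is an illustrative choice, and the sample holds only the first scores listed in the maths ability example in these notes; the rest of that dataset is not reproduced here.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return one text row per stem, using tens as stems and units as leaves."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 58 -> stem 5, leaf 8
        rows[stem].append(str(leaf))
    # Include empty stems so gaps in the data stay visible.
    stems = range(min(rows), max(rows) + 1)
    return [f"{s:>2} | {' '.join(rows[s])}".rstrip() for s in stems]

# Only the scores visible in the notes; the remainder is elided there.
scores = [38, 53, 54, 58, 60, 62, 64, 65, 68, 70, 71, 73, 74, 76, 77]

for row in stem_and_leaf(scores):
    print(row)
```

Keeping the empty stem for the 40s shows that 38 sits apart from the other scores, which is the kind of feature a stem-and-leaf plot is meant to reveal.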
Aspects of Graphed Distributions
Central Tendency: A measure that represents the center or typical value of a dataset. Common measures include mean, median, and mode.
Dispersion (Variability): The extent to which data points in a dataset differ from each other and from the mean. Common measures include range, variance, and standard deviation.
Skewness: A measure of the asymmetry of the probability distribution.
Positive skewness indicates that the tail on the right side of the distribution is longer or fatter than the left.
Negative skewness indicates that the tail on the left side is longer or fatter than the right.
Kurtosis: Describes the "peakedness" of a distribution and the heaviness of its tails relative to a normal distribution.
Leptokurtic: Distribution is thin and peaked ($\text{Pearson kurtosis} > 3$).
Mesokurtic: Distribution is moderate in shape ($\text{Pearson kurtosis} = 3$).
Platykurtic: Distribution is flat ($\text{Pearson kurtosis} < 3$).
Modality: Refers to the number of peaks (modes) in the distribution. A distribution can be unimodal, bimodal, or multimodal.
Outliers: Data points that differ significantly from other observations, potentially influencing the mean and other statistical measures.
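The measures of centre and spread listed above can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the `describe` helper and the sample scores are hypothetical, and the 1.5 × IQR rule used to flag outliers is one common convention, not a rule stated in these notes.

```python
import statistics

def describe(values):
    """Summarise centre, spread, and IQR-rule outliers for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # assumed outlier fences
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),     # one entry per peak value
        "range": max(values) - min(values),
        "variance": statistics.variance(values),   # sample variance (n - 1)
        "sd": statistics.stdev(values),            # sample standard deviation
        "outliers": [v for v in values if v < low or v > high],
    }

# Hypothetical scores with one extreme value.
sample = [52, 55, 58, 60, 60, 61, 63, 65, 67, 120]

for name, value in describe(sample).items():
    print(f"{name}: {value}")
```

In this sample the single extreme score pulls the mean above the median, which illustrates how an outlier can influence the mean while leaving the median largely unchanged.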
Example Maths Ability Scores
Scores: Scaled from 0 to 150.
Example Scores: 38, 53, 54, 58, 60, 62, 64, 65, 68, 70, 71, 73, 74, 76, 77, … , 147
Frequency Table of Maths Ability
Structure: Displays the frequency (number of occurrences) of specific scores alongside their percentages and cumulative percentages.
Columns in Table:
Valid Frequency: The count of data points for given scores.
Percent: Percentage representation of the valid frequency in relation to total observations.
Cumulative Percent: Running total of the percentages, showing the percentage of scores that fall at or below a particular value.
| Score | Valid Frequency | Percent | Cumulative Percent |
|---|---|---|---|
| 38 | 1 | 0.6 | 0.6 |
| 53 | 1 | 0.6 | 1.1 |
| 54 | 1 | 0.6 | 1.7 |
| … | … | … | … |
| 147 | 1 | … | … |
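A table with these columns can be built from raw scores in a few lines. The following is a minimal sketch: `frequency_table` is a hypothetical helper, and the sample is invented for illustration, since the full maths ability dataset is not listed in these notes.

```python
from collections import Counter

def frequency_table(scores):
    """Return rows of (score, frequency, percent, cumulative percent)."""
    counts = Counter(scores)
    total = len(scores)
    rows, running = [], 0
    for score in sorted(counts):
        running += counts[score]  # running count of scores at or below this one
        rows.append((
            score,
            counts[score],
            round(100 * counts[score] / total, 1),
            round(100 * running / total, 1),
        ))
    return rows

# Hypothetical scores, not the dataset from the table above.
sample = [40, 50, 50, 60, 60, 60, 70, 80]

print("Score | Frequency | Percent | Cumulative Percent")
for score, freq, pct, cum in frequency_table(sample):
    print(f"{score} | {freq} | {pct} | {cum}")
```

Each cumulative figure is rounded from the running count instead of being summed from already-rounded percents. That is one way a sequence such as 0.6, 1.1, 1.7 can arise from three equal frequencies.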
Displaying Distributions Graphically
Histograms: Graphical representation that shows the distribution of scores by dividing the data into bins and counting the frequency of scores in each bin.
Polygons: Show the same binned frequencies as a histogram, but as a line connecting points plotted at the midpoint of each bin instead of as bars.
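The binning step behind a histogram, and the midpoints a polygon connects, can be sketched as follows. The helper names, the bin width of 10, and the starting value of 30 are all assumptions chosen for illustration; the scores are only those visible in the example list in these notes.

```python
def histogram(scores, bin_width, start):
    """Count integer scores in equal-width bins [lower, lower + bin_width).

    Assumes start is no greater than the smallest score.
    """
    n_bins = (max(scores) - start) // bin_width + 1
    counts = [0] * n_bins
    for s in scores:
        counts[(s - start) // bin_width] += 1
    return [(start + i * bin_width, c) for i, c in enumerate(counts)]

def polygon_points(bins, bin_width):
    """Frequency-polygon vertices as (bin midpoint, frequency) pairs."""
    return [(lower + bin_width / 2, count) for lower, count in bins]

WIDTH = 10  # assumed bin width
scores = [38, 53, 54, 58, 60, 62, 64, 65, 68, 70, 71, 73, 74, 76, 77]

bins = histogram(scores, bin_width=WIDTH, start=30)
for lower, count in bins:
    print(f"{lower}-{lower + WIDTH - 1}: {'#' * count}")

print(polygon_points(bins, WIDTH))
```

Changing the bin width changes the apparent shape of the distribution, so it is worth trying more than one width before describing the graph.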
The Normal Distribution
Characteristics: Most graphed distributions are discussed in relation to the normal distribution, characterized by its bell-shaped curve.
Natural Variables: Many naturally occurring variables, such as height, weight, and IQ, tend to follow an approximately normal distribution.
Example: Individuals with extremely high or low scores are rare, whereas most people cluster around the average.
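How rare extreme scores are under a normal distribution can be quantified with `statistics.NormalDist` from the standard library. This is a sketch under stated assumptions: `share_within` is a hypothetical helper, and the IQ scaling of mean 100 and standard deviation 15 is a common convention assumed here for illustration, not a figure taken from these notes.

```python
from statistics import NormalDist

def share_within(k, dist=NormalDist()):
    """Proportion of a normal distribution within k standard deviations of its mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")

# Assumed scaling for illustration only: mean 100, SD 15.
iq = NormalDist(mu=100, sigma=15)
print(f"share scoring above 130: {1 - iq.cdf(130):.1%}")
```

Roughly 68% of values fall within one standard deviation of the mean, about 95% within two, and about 99.7% within three, which is why very high and very low scores are uncommon.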
Description of Graphed Distributions
When describing a graphed distribution, refer to the following:
Kurtosis: Assess the peakedness (or flatness) of the graph.
Dispersion: Consider the spread or variability of the data.
Presence of Outliers: Evaluate if any extreme values deviate significantly from the rest.
Skewness: Consider if the distribution leans to one side (positive or negative skew).
Modality: Identify the number of peaks in the distribution.
Kurtosis Explained
Kurtosis Definition: Refers to how tall and sharp the peaks of a distribution are compared to a normal distribution.
Leptokurtic Distribution: Tall and thin; $\text{Pearson kurtosis} > 3$.
Mesokurtic Distribution: Moderately peaked; $\text{Pearson kurtosis} = 3$.
Platykurtic Distribution: Flat; $\text{Pearson kurtosis} < 3$.
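The three categories above can be checked numerically. The sketch below computes Pearson kurtosis as the fourth central moment divided by the squared variance (population form). The `classify` helper and its tolerance of 0.1 around 3 are assumptions added for illustration, since a sample value is rarely exactly 3; both datasets are hypothetical.

```python
def pearson_kurtosis(values):
    """Moment-based Pearson kurtosis, m4 / m2**2; a normal curve gives 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

def classify(kurtosis, tol=0.1):
    """Label a kurtosis value; tol is an assumed band around 3."""
    if kurtosis > 3 + tol:
        return "leptokurtic"
    if kurtosis < 3 - tol:
        return "platykurtic"
    return "mesokurtic"

flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]    # hypothetical: evenly spread values
peaked = [1, 5, 5, 5, 5, 5, 5, 5, 9]  # hypothetical: tight centre, two extremes

for name, data in (("flat", flat), ("peaked", peaked)):
    k = pearson_kurtosis(data)
    print(f"{name}: kurtosis = {k:.2f} ({classify(k)})")
```

Some software reports excess kurtosis (Pearson kurtosis minus 3) instead, in which case the comparison point is 0, not 3.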
Skewness Explained
Definition of Skewness: The degree of asymmetry of a distribution.
A positively skewed distribution: The tail on the right is longer; typically $\text{mean} > \text{median} > \text{mode}$.
A negatively skewed distribution: The tail on the left is longer; typically $\text{mean} < \text{median} < \text{mode}$.
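The relationship between the direction of skew and the ordering of mean, median, and mode can be illustrated with a small example. The sketch uses the moment coefficient of skewness (third central moment divided by the variance raised to the power 1.5), which is one of several skewness measures. The `skewness` helper and both datasets are hypothetical.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data with a long right tail, and its mirror image.
right_tailed = [2, 2, 2, 3, 3, 4, 4, 5, 6, 7, 9, 13]
left_tailed = [15 - x for x in right_tailed]

for name, data in (("right-tailed", right_tailed), ("left-tailed", left_tailed)):
    print(
        f"{name}: skewness = {skewness(data):+.2f}, "
        f"mean = {statistics.mean(data)}, "
        f"median = {statistics.median(data)}, "
        f"mode = {statistics.mode(data)}"
    )
```

The ordering of mean, median, and mode is a rule of thumb for unimodal data and does not hold for every dataset, so it is safer to confirm the direction of skew from the graph or from a computed coefficient.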
Conclusion on Distribution Characteristics
To effectively describe the distributions observed in data:
Assess the central tendency, variability, presence of outliers, kurtosis, skewness, and modality.