The median is the middle value in a sorted dataset.
For an odd number of elements, the median is the middle number.
For an even number of elements, the median is the average of the two middle numbers.
Example: Data set = [19, 20] → Median = (19 + 20) / 2 = 19.5
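The odd/even rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `median_of` and the second sample list are assumptions made for the example.

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median of empty data is undefined")
    ordered = sorted(values)   # the definition assumes sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([19, 20]))      # 19.5
print(median_of([20, 16, 18]))  # 18
```

Sorting inside the function means callers do not need to pre-sort their data.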
The mean is calculated by adding all values and dividing by the count of values.
Example: Mean of [19, 20] → (19 + 20) / 2 = 19.5
The mode is the most frequently occurring number in a dataset.
Example: In [17, 18, 18, 20], 18 is the mode.
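As a quick check of the mean and mode examples, here is a short sketch using Python's standard `statistics` module. The variable names are assumptions for illustration. `multimode` is used because it returns every most-frequent value, which matters when a dataset has more than one mode.

```python
from statistics import mean, multimode

pair = [19, 20]
ages = [17, 18, 18, 20]

# Mean: sum of the values divided by how many there are.
mean_value = mean(pair)
manual_mean = sum(pair) / len(pair)

# Mode: the most frequent value(s), returned as a list.
modes = multimode(ages)

print(mean_value)   # 19.5
print(manual_mean)  # 19.5
print(modes)        # [18]
```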
An outlier is a value that significantly differs from the rest of the dataset.
Example: In [16, 17, 18, 54], 54 is an outlier.
There is no single universal rule for defining outliers; identifying them typically depends on judgment and context.
The median is often a better measure of central tendency for skewed datasets or datasets with outliers, because it is far less affected by extreme values than the mean.
Example: In [16, 17, 18, 54], the outlier 54 pulls the mean up to 26.25, while the median stays at 17.5, so the median better represents the central value. This situation is common in age datasets.
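To make the effect of an outlier concrete, the following sketch compares the mean and median of the earlier outlier example with and without the value 54. It uses the standard `statistics` module; the variable names are assumptions for illustration.

```python
from statistics import mean, median

with_outlier = [16, 17, 18, 54]
without_outlier = [16, 17, 18]

# The outlier drags the mean upward but barely moves the median.
print(mean(with_outlier), median(with_outlier))        # 26.25 17.5
print(mean(without_outlier), median(without_outlier))  # 17 17
```

Removing one value shifts the mean by more than 9, while the median moves by only 0.5.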
Minimum: The smallest value in the dataset.
Example: In [16, 17, 18, 20], the minimum is 16.
Maximum: The largest value in the dataset.
Example: In [16, 17, 18, 20], the maximum is 20.
The range indicates the spread of the dataset and is calculated as:
Range = Maximum - Minimum.
Example: Range of [16, 54] = 54 - 16 = 38.
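The minimum, maximum, and range map directly onto Python's built-in `min` and `max`. The helper name `value_range` is an assumption for this sketch.

```python
def value_range(values):
    """Return maximum minus minimum for a non-empty list of numbers."""
    return max(values) - min(values)


ages = [16, 17, 18, 20]
print(min(ages), max(ages))   # 16 20
print(value_range(ages))      # 4
print(value_range([16, 54]))  # 38
```

Because the range uses only the two extreme values, a single outlier can change it dramatically.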
Quartiles are the three values that divide a sorted dataset into four equal parts:
Q1 (1st quartile): Median of the lower half of data.
Q2 (2nd quartile): The median of the dataset.
Q3 (3rd quartile): Median of the upper half of data.
To find Q1 and Q3, the data should be in order:
Example Data: [16, 17, 18, 19, 20, 22, 23, 30, 54]
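Q2 = median of the full dataset → 20
Q1 = median of the lower half [16, 17, 18, 19] → (17 + 18) / 2 = 17.5
Q3 = median of the upper half [22, 23, 30, 54] → (23 + 30) / 2 = 26.5
With an odd number of values, the median (20) is left out of both halves here; some conventions include it and give slightly different quartiles.
The IQR is the difference between Q3 and Q1 and measures the spread of the middle 50% of the data:
IQR = Q3 - Q1.
Example: IQR = 26.5 - 17.5 = 9.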
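The "median of each half" procedure can be sketched as follows. This is one convention among several: it assumes that, for an odd number of values, the overall median is excluded from both halves. Other conventions (and library defaults) may give slightly different quartiles. The function name `quartiles_by_halves` is an assumption for the example.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Assumes at least two values. For an odd count, the overall
    median is excluded from both the lower and upper half.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle element when the count is odd.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)


data = [16, 17, 18, 19, 20, 22, 23, 30, 54]
q1, q2, q3 = quartiles_by_halves(data)
iqr = q3 - q1
print(q1, q2, q3)  # 17.5 20 26.5
print(iqr)         # 9.0
```

Note that the outlier 54 sits in the upper half but does not inflate the IQR the way it inflates the range (54 - 16 = 38).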
The IQR is especially useful for describing spread when an outlier is present, because it ignores the extreme values at both ends.
Percentiles indicate the relative standing of a value in a dataset.
Example: If a score is in the 80th percentile, it means the score is higher than 80% of the data points.
Percentiles can be determined by sorting the data and finding the value below which a given percentage of the data points fall.
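A score's percentile standing can be estimated by counting how many data points fall below it. The sketch below uses the "strictly below" definition, which is one of several in common use; the function name `percentile_rank` and the sample scores are hypothetical.

```python
def percentile_rank(values, score):
    """Return the percentage of data points strictly below `score`."""
    if not values:
        raise ValueError("percentile rank of empty data is undefined")
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)


# Hypothetical test scores, used only for illustration.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 95))  # 80.0
```

Here 8 of the 10 scores are below 95, so a score of 95 sits at the 80th percentile under this definition.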
Measures of Central Tendency: Mean, Median, Mode (represents the center of a dataset)
Measures of Spread: Range, IQR (describes how spread out the values are)
It's important not to confuse the two when analyzing data.