2.1 Describing Location in a Distribution
CUMULATIVE RELATIVE FREQUENCY
instead of wanting to know what percent of the data falls into a particular interval, we often want to know what percent falls below a certain value
to do this, we compute the cumulative relative frequency of a class: the sum of the relative frequency of that class and the relative frequencies of all the classes below it
0.01, 0.045, 0.09, 0.15, 0.265, 0.435, 0.675, 0.87, 0.965, 1
9%, 96.50%, 85%, cannot be determined, 41%
the graph of such a distribution is called a cumulative relative frequency plot
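A minimal sketch of the running-sum calculation described above. The per-class relative frequencies are an assumption: they were back-computed by differencing the cumulative values listed in these notes, since the original frequency table is not shown.

```python
from itertools import accumulate

# Relative frequency of each class, lowest class first.
# Assumption: recovered by differencing the cumulative values in the notes.
relative_freqs = [0.01, 0.035, 0.045, 0.06, 0.115, 0.17, 0.24, 0.195, 0.095, 0.035]

# Each cumulative value is the relative frequency of that class
# plus the relative frequencies of all classes below it.
cumulative = [round(total, 3) for total in accumulate(relative_freqs)]

print(cumulative)
```

The last cumulative value should always be 1 (100%), which is a quick check that the relative frequencies were entered correctly.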
PERCENTILE
the Pth percentile of a distribution is the value in the distribution such that P percent of the observations lie at that level or below
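The definition above can be turned around to find which percentile a given value sits at. This is a rough sketch: the function name `percent_at_or_below` and the test scores are made up for illustration and do not come from the notes.

```python
def percent_at_or_below(value, data):
    """Return the percent of observations in data that are <= value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)


# Hypothetical test scores, used only to illustrate the definition.
scores = [62, 70, 71, 75, 78, 80, 84, 85, 90, 95]

# 7 of the 10 scores are at or below 84, so 84 sits at the 70th percentile.
print(percent_at_or_below(84, scores))
```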
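THE STANDARD DEVIATION AS A RULER
use the standard deviation to compare values from different distributions: the value that lies more standard deviations above its own mean is the more impressive one
the standard deviation measures the typical distance of the observations from the mean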
when we are using the standard deviation as a ruler to measure how far an observation is above or below the mean, we are using a standardized score, or a z-score
a z-score tells us how many standard deviations the original observation falls from the mean and in what direction
negative z-scores lie to the left of the mean
positive z-scores lie to the right of the mean
the formula is z = (observation - mean) / (standard deviation); since the numerator is measured in the same units as the denominator, the resulting z-score does not have units associated with it
when we standardize a value, we are actually shifting the data (by subtracting the mean) and rescaling the data (by dividing by the standard deviation)
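A short sketch of standardizing, using the usual definition z = (x - mean) / sd. The function name `z_score` and the numbers (a class mean of 80 with a standard deviation of 6) are hypothetical and chosen only to show the sign and size of the result.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean (signed)."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    # Shift by subtracting the mean, then rescale by dividing by the sd.
    return (x - mean) / sd


# Hypothetical class: mean score 80, standard deviation 6.
print(z_score(86, 80, 6))  # 1.0, one standard deviation above the mean
print(z_score(71, 80, 6))  # -1.5, one and a half standard deviations below
```

Because the units cancel in the division, z-scores from different distributions can be compared directly.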
SHIFTING AND RESCALING DATA
adding the same number c to each value of a distribution (or subtracting it from each value) will not change the overall shape or spread of the distribution; it will only shift the distribution c units to the right or left
the mean, median, quartiles, min, and max will all shift “c” units to the left or right
the range, IQR, and standard deviation will remain the same
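The effect of shifting can be checked numerically. This is a sketch with a made-up data set and shift c = 10. It uses Python's `statistics` module, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from the textbook's quartile rule.

```python
import statistics


def summary(values):
    """Collect measures of center/position and measures of spread."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),
    }


data = [2, 4, 4, 5, 7, 9, 11]      # hypothetical data
c = 10                             # hypothetical shift
shifted = [x + c for x in data]

before = summary(data)
after = summary(shifted)

# Measures of center and position move by c; measures of spread do not change.
for name in before:
    print(name, before[name], after[name])
```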