2.1 Describing Location in a Distribution

CUMULATIVE RELATIVE FREQUENCY

  • instead of wanting to know what percent of the data falls into a particular interval, we often want to know what percent falls below a certain value

    • we will compute the cumulative relative frequency, which is the sum of the relative frequency of that class and all the classes below that

  • example cumulative relative frequencies, one per class: 0.01, 0.045, 0.09, 0.15, 0.265, 0.435, 0.675, 0.87, 0.965, 1

    • 9%, 96.50%, 85%, cannot be determined, 41%

  • the graph of such a distribution is called a cumulative relative frequency plot
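
As a rough sketch of the running-total idea above, the Python snippet below builds cumulative relative frequencies from class counts. The counts are hypothetical (they are not given in these notes) and were chosen only so that the running totals match the cumulative values listed earlier; the variable names are likewise illustrative.

```python
from itertools import accumulate

# Hypothetical class counts (assumed for illustration), lowest class first.
counts = [2, 7, 9, 12, 23, 34, 48, 39, 19, 7]
total = sum(counts)

# Relative frequency: the share of all observations that fall in each class.
relative = [c / total for c in counts]

# Cumulative relative frequency: the share of observations in that class
# and every class below it. Accumulating the counts first and dividing once
# gives the same result as summing the relative frequencies.
cumulative = [running / total for running in accumulate(counts)]

for rel, cum in zip(relative, cumulative):
    print(f"relative = {rel:.3f}   cumulative = {cum:.3f}")
```

The last cumulative value is always 1, because every observation falls in the top class or below it.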

PERCENTILE

  • the Pth percentile of a distribution is the value in the distribution such that P percent of the observations lie at that level or below
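
One way to check the definition above is to count, for a given value, what percent of the observations lie at or below it. The function name `percent_at_or_below` and the scores are assumptions made for this sketch, not part of the notes.

```python
def percent_at_or_below(data, value):
    """Return the percent of observations in data that are <= value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)


# Hypothetical test scores (assumed for illustration).
scores = [62, 70, 74, 78, 81, 85, 88, 90, 93, 97]

# 6 of the 10 scores are at or below 85, so 85 sits at the 60th percentile.
print(percent_at_or_below(scores, 85))
```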

THE STANDARD DEVIATION AS A RULER

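  • use the standard deviation to compare values from different distributions: the value that lies more standard deviations above its own mean is the more exceptional one

  • the standard deviation measures the typical distance of the observations from the mean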

  • when we are using the standard deviation as a ruler to measure how far an observation is above or below the mean, we are using a standardized score, or a z-score

  • a z-score tells us how many standard deviations the original observation falls from the mean and in what direction

    • negative z-scores lie to the left of the mean

    • positive z-scores lie to the right of the mean

  • z = (x - mean) / (standard deviation); since the numerator is measured in the same units as the denominator, the units cancel and the resulting z-score does not have units associated with it

  • when we standardize a value, we are actually shifting the data (by subtracting the mean) and rescaling the data (by dividing by the standard deviation)
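
A minimal sketch of standardizing, assuming the usual formula z = (x - mean) / (standard deviation). The two tests, their means, and their standard deviations are made-up numbers used only to show how z-scores put different scales on a common ruler.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies from the mean (sign gives direction)."""
    return (x - mean) / sd


# Hypothetical results (assumed for illustration):
# a math test with mean 80 and SD 4, an English test with mean 85 and SD 10.
math_z = z_score(86, mean=80, sd=4)      # 6 points above the mean, SD of 4
english_z = z_score(90, mean=85, sd=10)  # 5 points above the mean, SD of 10

print(math_z, english_z)
```

Here the raw English score is higher, but the math score lies more standard deviations above its mean, so it is the more impressive result relative to its own distribution.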

SHIFTING AND RESCALING DATA

  • adding or subtracting the same number c to each value of a distribution does not change the overall shape or spread of the distribution; it only shifts the distribution c units to the right or left

    • the mean, median, quartiles, min, and max will all shift “c” units to the left or right

    • the range, IQR, and standard deviation will remain the same
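
The effect of shifting can be checked numerically. The sketch below uses a small made-up data set and c = 10; the `summary` helper is hypothetical, and the quartiles come from Python's `statistics.quantiles` default method, which may not match the quartile rule used in a given textbook.

```python
import statistics


def summary(values):
    """Collect measures of center/position and measures of spread."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": statistics.pstdev(values),
    }


# Hypothetical data and shift amount (assumed for illustration).
data = [2, 4, 4, 6, 9]
c = 10
shifted = [x + c for x in data]

before = summary(data)
after = summary(shifted)

# Measures of center and position move by c; measures of spread do not move.
for name in before:
    print(f"{name:>6}: {before[name]} -> {after[name]}")
```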