Measures of Variability

Variability and Measures of Central Tendency

  • Discussion of variability and its importance alongside measures of central tendency.

  • Two primary characteristics of a score distribution: central tendency and variability.

Measures of Variability

  • Different metrics available for measuring variability, some more useful than others.

  • Range

    • Defined as the difference between the largest and smallest scores.

    • Reported in standard descriptive statistics output; it uses only the two extreme scores, not deviation scores.
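
As a minimal sketch, the range can be computed with Python's built-in max and min. The scores list is an illustrative assumption, not data from the lecture.

```python
# Illustrative scores (an assumption, not data from these notes).
scores = [12, 11, 10, 9, 8]

# Range = largest score minus smallest score.
score_range = max(scores) - min(scores)
print(score_range)  # 4
```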

Deviation Scores

  • Definition: Deviation score is the difference between a score and the mean (Score - Mean).

  • Deviation scores always sum to zero. Example calculation displayed:

    • Mean = 10; Scores: 12 (2), 11 (1), 10 (0), 9 (-1), 8 (-2); Sum of deviations = 0.

  • Deviation scores alone are not a good summary of a dataset's variability: each one describes a single score, and because positive and negative deviations cancel, their sum is always zero.
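
The cancellation described above can be checked with a short sketch. The five scores are illustrative values chosen so that the mean is 10.

```python
# Illustrative scores with a mean of 10 (an assumption for this sketch).
scores = [12, 11, 10, 9, 8]

mean = sum(scores) / len(scores)

# Deviation score = score - mean.
deviations = [x - mean for x in scores]

print(deviations)       # [2.0, 1.0, 0.0, -1.0, -2.0]
print(sum(deviations))  # 0.0
```

With data that are not whole numbers, floating-point rounding may leave the sum at a tiny non-zero value rather than exactly 0.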

Sum of Squares

  • Squaring deviation scores eliminates negative signs, resolving the cancellation issue when summed.

  • Definition: Sum of squares refers to the total of squared deviation scores.

    • Example: Squaring the deviation scores before summing gives a total that reflects variability instead of cancelling to zero (e.g., 1^2 + 0^2 + (-1)^2 + (-2)^2 ...).
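
A minimal sketch of the sum of squares, using illustrative scores (an assumption, not the lecture's full dataset):

```python
# Illustrative scores with a mean of 10.
scores = [12, 11, 10, 9, 8]

mean = sum(scores) / len(scores)

# Square each deviation score, then add the squares together.
sum_of_squares = sum((x - mean) ** 2 for x in scores)

print(sum_of_squares)  # 10.0  (4 + 1 + 0 + 1 + 4)
```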

Variance

  • The sample variance is obtained by dividing the sum of squares by N - 1, where N is the number of observations. Dividing by N - 1 rather than N accounts for degrees of freedom.

  • Degrees of Freedom: N - 1. Explained with a practical example:

    • In choosing four numbers that must sum to a certain total, only three numbers can be chosen freely; the last is determined by the total.
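
The degrees-of-freedom idea can be sketched in a few lines. The total and the three freely chosen values are hypothetical numbers picked for illustration.

```python
# Hypothetical constraint: four numbers must add up to this total.
total = 20

# Any three values can be picked freely...
free_choices = [3, 9, 6]

# ...but the fourth is then forced by the total, so it is not free.
last = total - sum(free_choices)

print(last)  # 2
```

The same logic applies to deviation scores: because they must sum to zero, once N - 1 of them are known the last one is fixed.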

Calculating Variance

  • To calculate variance for a sample:

    1. Calculate mean.

    2. Compute each deviation score.

    3. Square each deviation score.

    4. Sum the squared deviations.

    5. Divide by N - 1.
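
The five steps above can be sketched as a small function. The function name sample_variance and the scores are assumptions for illustration; the result is compared against statistics.variance from Python's standard library, which also divides by N - 1.

```python
import statistics


def sample_variance(scores):
    """Sample variance following the five manual steps."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to divide by N - 1")
    mean = sum(scores) / n                    # 1. calculate the mean
    deviations = [x - mean for x in scores]   # 2. compute each deviation score
    squared = [d ** 2 for d in deviations]    # 3. square each deviation score
    sum_of_squares = sum(squared)             # 4. sum the squared deviations
    return sum_of_squares / (n - 1)           # 5. divide by N - 1


# Illustrative scores (an assumption, not lecture data).
scores = [12, 11, 10, 9, 8]

print(sample_variance(scores))      # 2.5
print(statistics.variance(scores))  # library result for comparison
```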

Standard Deviation

  • Found by taking the square root of the variance, returning to original measurement units.

  • Definition: Standard deviation describes roughly how far scores typically fall from the mean, expressed in the original units of measurement.

  • Important to memorize procedures for manual calculation. Quiz/exam questions may focus on this.
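
As a quick check on a manual calculation, the sketch below takes the square root of the sample variance and compares it with statistics.stdev. The scores are illustrative assumptions.

```python
import math
import statistics

# Illustrative scores (an assumption, not lecture data).
scores = [12, 11, 10, 9, 8]

# Sample variance (sum of squares divided by N - 1).
variance = statistics.variance(scores)

# Standard deviation = square root of the variance.
sd = math.sqrt(variance)

print(sd)
print(statistics.stdev(scores))  # library result for comparison
```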

Interquartile Range (IQR)

  • Based on dividing the ordered dataset into quartiles; the IQR is the spread of the middle 50% of scores and is reported alongside the median.

  • IQR is useful when the median is the preferred measure of central tendency, because both are resistant to skew and extreme scores.

  • Process for IQR:

    • Order the scores and identify the median; the first quartile (Q1) is the median of the lower half and the third quartile (Q3) is the median of the upper half.

    • Calculate IQR as the difference between the upper and lower quartiles (Q3 - Q1).

    • Example: If Q3 = 8 and Q1 = 7, then IQR = 1.
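
The process above can be sketched with the median-of-halves approach. The function name iqr_median_of_halves and the scores are assumptions for illustration. Statistical software often uses other quartile conventions (for example, interpolation), so its Q1 and Q3 may differ slightly from this manual method.

```python
import statistics


def iqr_median_of_halves(scores):
    """Return (Q1, Q3, IQR) using the median of each half of the ordered data."""
    data = sorted(scores)
    if len(data) < 2:
        raise ValueError("need at least two scores")
    half = len(data) // 2
    lower = data[:half]    # lower half (the middle score is left out when N is odd)
    upper = data[-half:]   # upper half
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    return q1, q3, q3 - q1


# Illustrative scores (an assumption, not lecture data).
scores = [1, 3, 4, 6, 7, 8, 9, 12]
q1, q3, iqr = iqr_median_of_halves(scores)

print(q1, q3, iqr)  # 3.5 8.5 5.0
```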

Reporting Statistics Based on Data Distribution

  • When datasets are skewed, use the median and IQR for reporting instead of mean and standard deviation.

  • If datasets are symmetrical, mean and standard deviation are appropriate.

  • Importance of using appropriate measures depending on data distribution.
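
To see why the choice matters, the sketch below compares two illustrative datasets (assumptions, not lecture data) that differ only in one extreme score. The extreme score pulls the mean and standard deviation upward, while the median stays the same.

```python
import statistics

# Two illustrative datasets: identical except for one extreme high score.
symmetric = [8, 9, 10, 11, 12]
skewed = [8, 9, 10, 11, 60]

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(
        name,
        "mean:", statistics.mean(data),
        "sd:", round(statistics.stdev(data), 2),
        "median:", statistics.median(data),
    )
```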

Conclusion

  • Reviewed measures of variability: range, deviation scores, sum of squares, variance, standard deviation, and interquartile range.

  • Advocated for proper handling of skewed datasets with median and IQR, alongside mean and standard deviation for symmetrical datasets.
