Discussion of variability and its importance alongside measures of central tendency.
Two primary characteristics of a score distribution: central tendency and variability.
Different metrics available for measuring variability, some more useful than others.
Range
Defined as the difference between the largest and smallest scores.
Reported among descriptive statistics; unlike the measures that follow, it is not based on deviation scores.
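A minimal Python sketch of the range calculation. The scores are hypothetical values chosen for illustration, not data from the lecture.

```python
# Hypothetical scores, for illustration only
scores = [12, 11, 10, 9, 8]

# Range = largest score minus smallest score
score_range = max(scores) - min(scores)
print(score_range)  # 4
```

Only the two extreme scores enter the calculation, so the range says nothing about how the remaining scores are spread.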
Definition: Deviation score is the difference between a score and the mean (Score - Mean).
Deviation scores always sum to zero. Example calculation displayed:
Mean = 10; Scores: 12 (2), 11 (1), 10 (0), 9 (-1), 8 (-2); Sum = 0.
Deviation scores are not ideal for representing the variability of a dataset: each one describes only a single score, and because they always sum to zero, adding or averaging them gives no overall measure.
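The zero-sum property can be checked with a short Python sketch. The five scores are a hypothetical set whose mean is 10.

```python
# Hypothetical scores with a mean of 10
scores = [12, 11, 10, 9, 8]

mean = sum(scores) / len(scores)

# Deviation score = score - mean
deviations = [x - mean for x in scores]

print(deviations)       # [2.0, 1.0, 0.0, -1.0, -2.0]
print(sum(deviations))  # 0.0
```

The positive and negative deviations cancel exactly, which is why their sum cannot serve as a measure of spread.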
Squaring deviation scores eliminates negative signs, resolving the cancellation issue when summed.
Definition: Sum of squares refers to the total of squared deviation scores.
Example: Summing the squared deviation scores gives a non-negative total that grows as scores spread further from the mean (e.g., 1^2 + 0^2 + (-1)^2 + (-2)^2 ...).
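A minimal Python sketch of the sum of squares, using hypothetical scores with a mean of 10.

```python
# Hypothetical scores with a mean of 10
scores = [12, 11, 10, 9, 8]
mean = sum(scores) / len(scores)

# Square each deviation score so negative signs disappear
squared_deviations = [(x - mean) ** 2 for x in scores]

# Sum of squares = total of the squared deviation scores
sum_of_squares = sum(squared_deviations)

print(squared_deviations)  # [4.0, 1.0, 0.0, 1.0, 4.0]
print(sum_of_squares)      # 10.0
```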
Variance: obtained by dividing the sum of squares by N - 1, where N is the number of observations in the sample. Dividing by N - 1 rather than N accounts for degrees of freedom.
Degrees of Freedom: N - 1. Explained with a practical example:
In choosing four numbers that must sum to a fixed total, only three can be chosen freely; the fourth is determined by the total.
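The same idea as a short Python sketch. The total and the three freely chosen numbers are hypothetical.

```python
# Hypothetical fixed total that four numbers must sum to
total = 40

# Three numbers can be chosen freely
free_choices = [7, 15, 4]

# The fourth is forced by the total, so it is not free to vary
last = total - sum(free_choices)

print(last)  # 14
```

With N = 4 values and one constraint, N - 1 = 3 values are free. In a sample, the constraint is that the deviation scores must sum to zero.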
To calculate variance for a sample:
Calculate mean.
Compute each deviation score.
Square each deviation score.
Sum the squared deviations.
Divide by N - 1.
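The five steps above can be sketched in Python as follows. The function name `sample_variance` and the scores are hypothetical; the result is cross-checked against the standard library's `statistics.variance`, which also divides by N - 1.

```python
import statistics


def sample_variance(scores):
    """Sample variance computed step by step (illustrative helper)."""
    n = len(scores)
    mean = sum(scores) / n                    # 1. calculate the mean
    deviations = [x - mean for x in scores]   # 2. compute each deviation score
    squared = [d ** 2 for d in deviations]    # 3. square each deviation score
    sum_of_squares = sum(squared)             # 4. sum the squared deviations
    return sum_of_squares / (n - 1)           # 5. divide by N - 1


# Hypothetical scores with a mean of 10
scores = [12, 11, 10, 9, 8]
variance = sample_variance(scores)

print(variance)                     # 2.5
print(statistics.variance(scores))  # 2.5
```

The function needs at least two scores, since N - 1 would otherwise be zero.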
Standard deviation: found by taking the square root of the variance, which returns the measure to the original measurement units.
Definition: Standard deviation measures how much scores deviate from the mean on average.
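A minimal Python sketch of the square-root step, using hypothetical scores. `statistics.variance` and `statistics.stdev` are standard-library functions for the sample (N - 1) versions.

```python
import math
import statistics

# Hypothetical scores with a mean of 10
scores = [12, 11, 10, 9, 8]

variance = statistics.variance(scores)  # sample variance (divides by N - 1)
sd = math.sqrt(variance)                # back to the original units

print(round(sd, 2))  # 1.58
```

The variance here is in squared units; the standard deviation of about 1.58 is in the same units as the scores themselves.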
Important to memorize procedures for manual calculation. Quiz/exam questions may focus on this.
Interquartile range (IQR): defined by dividing the ordered dataset into quartiles; used primarily when the median is the measure of central tendency.
IQR is useful when the median is preferred because of its resistance to skew and extreme scores.
Process for IQR:
Identify the median of the ordered scores; the first quartile (Q1) is the median of the lower half and the third quartile (Q3) is the median of the upper half.
Calculate IQR as the difference between the upper and lower quartiles (Q3 - Q1).
Example: If Q3 = 8 and Q1 = 7, then IQR = 1.
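The process above can be sketched in Python. The function name `iqr_median_of_halves` and the scores are hypothetical, and the sketch assumes that with an odd number of scores the median itself is left out of both halves. Statistical software often uses other quartile conventions, so results may differ slightly.

```python
import statistics


def iqr_median_of_halves(scores):
    """Return (Q1, Q3, IQR) using medians of the lower and upper halves."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n, the middle score belongs to neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    q1 = statistics.median(lower)
    q3 = statistics.median(upper)
    return q1, q3, q3 - q1


# Hypothetical scores chosen so that Q1 = 7 and Q3 = 8
scores = [5, 7, 7, 7, 8, 8, 9]
q1, q3, iqr = iqr_median_of_halves(scores)

print(q1, q3, iqr)  # 7 8 1
```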
When datasets are skewed, use the median and IQR for reporting instead of mean and standard deviation.
If datasets are symmetrical, mean and standard deviation are appropriate.
Importance of using appropriate measures depending on data distribution.
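A short Python sketch of why the choice matters. Both datasets are hypothetical; the second replaces one score with an extreme value to create skew.

```python
import statistics

# Hypothetical symmetrical scores
symmetric = [8, 9, 10, 11, 12]
# Hypothetical skewed scores: one extreme high value
skewed = [8, 9, 10, 11, 52]

for name, data in [("symmetric", symmetric), ("skewed", skewed)]:
    print(
        name,
        "mean:", statistics.mean(data),
        "sd:", round(statistics.stdev(data), 2),
        "median:", statistics.median(data),
    )

sd_symmetric = statistics.stdev(symmetric)
sd_skewed = statistics.stdev(skewed)
```

In this sketch the single extreme score pulls the mean from 10 to 18 and inflates the standard deviation, while the median stays at 10.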
Reviewed measures of variability, including range, variance, standard deviation, and IQR.
Advocated for proper handling of skewed datasets with median and IQR, alongside mean and standard deviation for symmetrical datasets.