Using Statistics and Measurement Error To Inform Training
Using Reliability Measures for Decision Making
Smallest Worthwhile Change (SWC)
- Commonly used to estimate a worthwhile change in performance.
- Based on effect size statistics.
- Cohen's thresholds (1988) for effect sizes:
- Small: 0.2
- Moderate: 0.5
- Large: 0.8
- Hopkins (2000) adapted these:
- Small worthwhile change: 0.2 x between-subject standard deviation
- Moderate worthwhile change: 0.5 x between-subject standard deviation
- Large worthwhile change: 0.8 x between-subject standard deviation
- Calculation: Multiply the effect size threshold by the between-subject standard deviation.
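The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `smallest_worthwhile_change` and the squad scores are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed for the between-subject standard deviation.

```python
import statistics


def smallest_worthwhile_change(scores, threshold=0.2):
    """Return threshold x between-subject SD for one score per athlete.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    if len(scores) < 2:
        raise ValueError("need scores from at least two athletes")
    return threshold * statistics.stdev(scores)


# Hypothetical baseline test scores (kg), one value per athlete
baseline_kg = [100.0, 110.0, 120.0, 130.0, 140.0]

swc_small = smallest_worthwhile_change(baseline_kg)                    # 0.2 x SD
swc_moderate = smallest_worthwhile_change(baseline_kg, threshold=0.5)  # 0.5 x SD

print(f"Small SWC: {swc_small:.2f} kg, moderate SWC: {swc_moderate:.2f} kg")
```

Because the result scales directly with the between-subject standard deviation, a more heterogeneous squad produces a larger SWC for the same threshold.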
Application
- Strength and conditioning coaches often look for small but worthwhile changes.
- Moderate effect sizes may be used for younger athletes with larger potential gains.
SWC and Reliability
- Typical error: \frac{\text{standard deviation of the difference scores between trials}}{\sqrt{2}}
- If SWC > typical error: Changes exceeding SWC are considered real changes.
- If typical error > SWC: Changes reaching SWC may be due to normal variation (biological, technological, or protocol-related).
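The comparison between SWC and typical error might look like the sketch below. It assumes typical error is computed from the standard deviation of the between-trial difference scores (the usual test-retest definition) and that SWC uses the 0.2 threshold. The function name `typical_error` and all data values are hypothetical.

```python
import math
import statistics


def typical_error(trial_1, trial_2):
    """Typical error: SD of the between-trial difference scores / sqrt(2)."""
    diffs = [second - first for first, second in zip(trial_1, trial_2)]
    return statistics.stdev(diffs) / math.sqrt(2)


# Hypothetical test-retest scores (kg) for the same five athletes
trial_1 = [100.0, 110.0, 120.0, 130.0, 140.0]
trial_2 = [102.0, 109.0, 123.0, 129.0, 142.0]

te = typical_error(trial_1, trial_2)
swc = 0.2 * statistics.stdev(trial_1)  # small SWC from between-subject SD

if swc > te:
    verdict = "changes exceeding the SWC can be treated as real"
else:
    verdict = "changes reaching the SWC may only reflect normal variation"

print(f"TE = {te:.2f} kg, SWC = {swc:.2f} kg: {verdict}")
```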
Limitations of SWC
- The thresholds are arbitrary, so a change may be deemed meaningful without any practical justification.
- Represents a mean worthwhile change for a population, not individual responses.
- Individual athletes respond differently to training interventions.
- Depends on the sample distribution (assumes normally distributed scores).
- Precision depends on sample size.
- Effect sizes were originally designed for large samples (as in psychology), while strength and conditioning often deals with smaller samples.
- Small sample sizes can inflate the perceived SWC.
Datson et al. Study
- Paper examines whether SWC reflects real changes in female soccer players.
- Findings: SWC (0.2 x between-participants standard deviation) often missed changes practitioners considered real.
Considerations
- Assess SWC from a reliability standpoint (is measurement error > SWC?).
- Consider assumptions of effect sizes (sample size, distribution).
- Contextualize the SWC result based on measurement error and individual athletes.
- SWC is just a number; consider the practical significance for the specific situation.
Smallest Detectable Difference (SDD)
- Uses the standard error of measurement to calculate limits of meaningful difference.
- Conceptually similar to the limits of agreement approach.
- Formula: 1.96 \times \sqrt{2} \times \text{standard error of measurement}
- Note: the transcript gives the multiplier as 1.6; the correct value is 1.96.
- Example: 3RM deadlift study with three sessions separated by 48 hours.
- Dashed lines on the figure represent the calculated SDD.
- In the example, SDD was approximately 6 kg.
- All observations in the study fell within the 6 kg SDD.
- The 3RM test (standard error of measurement = 2.8 kg) was considered repeatable/reliable.
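As a rough illustration of the SDD formula, the sketch below applies 1.96 x sqrt(2) x standard error of measurement to a hypothetical value. The function names and the 2.0 kg standard error of measurement are assumptions chosen for illustration and are not taken from the deadlift study.

```python
import math


def smallest_detectable_difference(sem, z=1.96):
    """SDD = z x sqrt(2) x standard error of measurement (z = 1.96 for 95%)."""
    return z * math.sqrt(2) * sem


def is_detectable(change, sdd):
    """True when the absolute change is larger than the SDD."""
    return abs(change) > sdd


sem_kg = 2.0  # hypothetical standard error of measurement
sdd_kg = smallest_detectable_difference(sem_kg)

print(f"SDD = {sdd_kg:.2f} kg")
print(is_detectable(7.5, sdd_kg))   # a 7.5 kg gain lies outside the SDD
print(is_detectable(-4.0, sdd_kg))  # a 4 kg drop lies within the SDD
```

Changes that fall inside the SDD band cannot be separated from measurement error at the chosen confidence level, which is why the band is drawn as limits around zero change.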
Considerations
- SDD is generally an arbitrary number based on a formula.
- The 6 kg SDD in the example may not be meaningful for all populations.
- Consider practitioner experience and athlete-specific factors when interpreting SDD.
- Statistical methods are important for determining measurement error, but experience is crucial for determining what a real change in performance means.
Analytical vs. Statistical Goals
- Reliability is not a binary (yes/no) question.
- It depends on the population, testing protocol, and athlete familiarity.
- Familiarize athletes with testing before making decisions based on data.
- Reliability is a spectrum, not a fixed state.
Determining Acceptable Measurement Error
- Analytical goals: What is the test being used for?
- Elite athletes (close to adaptive ceiling): Require highly accurate tests with low measurement error to detect small but meaningful changes.
- Lower-level athletes: A higher level of measurement error may be acceptable because motor patterns are less developed and adaptation is rapid.
Contextualizing Statistical Outcomes
- Do not rely solely on statistics (e.g., SWC).
- Consider the context of the athlete(s).
- Decide, based on experience and the specific situation, whether the statistical outcome applies.
- What level of measurement error (e.g., coefficient of variation, typical error) is acceptable?
- Consult with colleagues to determine what constitutes a practical change.
- Even with SDD, determine if the calculated difference is meaningful for the specific group of athletes.
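One way to put a number on the "acceptable measurement error" question is to express typical error as a coefficient of variation. The sketch below uses one common convention (typical error divided by the grand mean, times 100). The function name, the data, and the 5% cut-off are hypothetical; the cut-off in particular is a placeholder for whatever the analytical goal demands.

```python
import math
import statistics


def typical_error_cv_percent(trial_1, trial_2):
    """Typical error expressed as a percentage of the grand mean."""
    diffs = [second - first for first, second in zip(trial_1, trial_2)]
    te = statistics.stdev(diffs) / math.sqrt(2)
    grand_mean = statistics.mean(trial_1 + trial_2)
    return 100 * te / grand_mean


# Hypothetical test-retest scores (kg) for the same five athletes
trial_1 = [100.0, 110.0, 120.0, 130.0, 140.0]
trial_2 = [102.0, 109.0, 123.0, 129.0, 142.0]

cv = typical_error_cv_percent(trial_1, trial_2)

# Hypothetical cut-off: set this from the analytical goal, not from convention
acceptable_cv_percent = 5.0
acceptable = cv <= acceptable_cv_percent

print(f"CV = {cv:.2f}% (acceptable for this purpose: {acceptable})")
```

A stricter cut-off would suit elite athletes near their adaptive ceiling, while a looser one may be tolerable for developing athletes.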