Descriptive Data Analysis
Descriptive data analysis involves techniques and statistical methods to describe the characteristics and performance of a group.
- Focuses on the overall sample.
- Can be used to compare individual scores to the group.
- Differs from inferential data analysis: it summarizes the sample but does not draw conclusions beyond it, for example about the effectiveness of training methods.
Types of Descriptive Statistics
- Measures of Frequency: How often a score or test result occurs.
- Measures of Distribution: The shape of the frequency distribution (for example, normal or skewed).
- Measures of Central Tendency: Points around which most scores are concentrated (mean, median, mode).
- Measures of Variability: The spread of the data.
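As a minimal sketch of a measure of frequency, the snippet below counts how often each score occurs. The vertical jump scores are hypothetical values invented for illustration, not data from these notes.

```python
from collections import Counter

# Hypothetical vertical jump scores (cm) for ten athletes (illustrative only).
jump_cm = [52, 55, 55, 58, 60, 55, 58, 62, 52, 55]

# Count how often each score occurs.
frequency = Counter(jump_cm)

# Print a simple frequency table in ascending score order.
for score in sorted(frequency):
    print(f"{score} cm: {frequency[score]} athlete(s)")
```

Plotting these counts as a histogram is one way to see the shape of the distribution.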
Measures of Central Tendency
Indicate the typical or central value around which test scores or results cluster.
Mode
- Describes the most frequent score in the test.
- Suitable for nominal data (data that cannot be ranked).
- Example: Determining the most frequently achieved score in a 1RM test or analyzing simple knowledge tests.
Median
- Describes the middle score of the data.
- 50% of the data is below this value, and 50% is above.
- Requires ordinal, interval, or ratio data.
- Should not be used with nominal data.
Mean
- Calculated as the sum of the scores divided by the number of scores.
- Provides an average of the group's performance.
- Formula: $\bar{x} = \frac{\sum x_i}{n}$, where $x_i$ represents the individual scores and $n$ is the number of scores.
- Sensitive to the distribution of the data: outliers pull the mean toward the extreme values, so it may not reflect a typical score.
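The three measures can be compared on one small dataset. This is a sketch using Python's standard `statistics` module; the 1RM values are hypothetical and include one deliberately high score to show how an outlier affects the mean.

```python
import statistics

# Hypothetical 1RM back squat results (kg) for seven athletes (illustrative only).
# The 180 kg score acts as an outlier.
squat_1rm_kg = [100, 110, 110, 120, 125, 130, 180]

mode_kg = statistics.mode(squat_1rm_kg)      # most frequent score
median_kg = statistics.median(squat_1rm_kg)  # middle score once sorted
mean_kg = statistics.mean(squat_1rm_kg)      # sum of scores / number of scores

print(f"mode={mode_kg} kg, median={median_kg} kg, mean={mean_kg} kg")
```

Here the single 180 kg result lifts the mean (125 kg) above the median (120 kg), while the median itself is unchanged by how large that top score is.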
Data Distribution
Normal Distribution
- Data is clustered around a central point (symmetrical histogram).
- The mean is generally the most appropriate measure of central tendency.
Skewed Distributions
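- Negatively Skewed: The tail of the distribution extends further to the left; most scores cluster at the higher end.
- Positively Skewed: The tail of the distribution extends further to the right; most scores cluster at the lower end.
- The median should be used in these cases as it is not affected by the magnitude of extreme scores.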
Importance of Distribution
- Many statistical methods (particularly those based on means) assume normally distributed data.
- In a normal distribution, about 68% of values fall within one standard deviation of the mean and about 95% fall within two standard deviations.
- If data is not normally distributed, using the median and appropriate statistical models is recommended.
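One rough way to check the direction of skew is the moment coefficient of skewness, which is positive for a longer right tail and negative for a longer left tail. The sketch below is an illustration only: the `skewness` helper and the sprint times are hypothetical, and a formal normality check would need more than this single number.

```python
import statistics

def skewness(scores):
    """Moment coefficient of skewness (population form): m3 / m2**1.5.

    Assumes the scores are not all identical (otherwise m2 is zero).
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical 30 m sprint times (s): most athletes are close together,
# with two much slower times forming a tail to the right.
sprint_s = [4.8, 4.9, 4.9, 5.0, 5.0, 5.1, 5.2, 6.4, 7.0]

print(f"skewness = {skewness(sprint_s):.2f}")
print(f"mean     = {statistics.fmean(sprint_s):.2f} s")
print(f"median   = {statistics.median(sprint_s):.2f} s")
```

In this positively skewed example the mean sits above the median because the two slow times pull it toward the tail, which is why the median is the safer summary here.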
Measures of Variability
Describe the spread of the data.
Range
- Calculated as the difference between the maximum and minimum values in the dataset.
- Provides the least amount of information about the spread, because it depends only on the two most extreme scores.
Interquartile Range
- Gives a representative idea of the spread around the median.
- Calculated as the difference between the 75th percentile and the 25th percentile, so it covers the middle 50% of the scores (25% on either side of the median).
Standard Deviation
- Describes how far scores typically differ from the mean.
- Indicates how well the mean represents the group's scores.
- Smaller standard deviation: athletes are closer to the mean.
- Larger standard deviation: greater variability in the data.
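The three measures of variability can be sketched with the standard `statistics` module. The jump scores are hypothetical. Two assumptions are worth flagging: `statistics.stdev` computes the sample standard deviation (dividing by n - 1), and `statistics.quantiles` uses its default "exclusive" method, so other software may report slightly different quartiles.

```python
import statistics

# Hypothetical countermovement jump heights (cm) for eight athletes (illustrative only).
jump_cm = [50, 52, 55, 55, 58, 60, 62, 70]

# Range: difference between the largest and smallest score.
score_range = max(jump_cm) - min(jump_cm)

# Quartiles: n=4 returns the three cut points (25th, 50th, 75th percentiles).
q1, q2, q3 = statistics.quantiles(jump_cm, n=4)

# Interquartile range: spread of the middle 50% of scores.
iqr = q3 - q1

# Sample standard deviation (divides by n - 1).
sd = statistics.stdev(jump_cm)

print(f"range={score_range} cm, IQR={iqr:.2f} cm, SD={sd:.2f} cm")
```

Note that the 70 cm score widens the range considerably but has much less effect on the interquartile range.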
Describing Individual Level Characteristics
Percentiles
- Rank athletes' scores from worst to best.
- Assign a percentile based on their rank within the data range.
- Example: An athlete at the 60th percentile has 60% of the scores below them.
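A minimal sketch of the percentile idea, using the definition in the example above (the percentage of scores below the athlete). The `percentile_rank` helper and the scores are hypothetical. It assumes a higher score is better, and other definitions of percentile rank handle tied scores differently.

```python
def percentile_rank(scores, athlete_score):
    """Percentage of scores in the group that are strictly below athlete_score."""
    below = sum(1 for s in scores if s < athlete_score)
    return 100 * below / len(scores)

# Hypothetical vertical jump scores (cm) for a squad of ten (illustrative only).
group_jump_cm = [48, 50, 52, 55, 57, 58, 60, 62, 65, 70]

# Six of the ten scores are below 60 cm, so this athlete is at the 60th percentile.
athlete_percentile = percentile_rank(group_jump_cm, 60)
print(athlete_percentile)  # 60.0
```

For tests where a lower value is better (for example sprint times), the comparison would need to be reversed.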
Standard Scores
- Describe how far an individual's test scores are from the mean of the group.
- Z Score:
  - A common standard score in strength and conditioning.
  - Formula: $z = \frac{x - \bar{x}}{s}$, where $x$ is the athlete's test score, $\bar{x}$ is the mean score for the group, and $s$ is the standard deviation for the group.
  - Indicates how many standard deviations an athlete's score is above or below the group mean.
- Standard Error of the Mean:
  - Describes how far the population mean is likely to be from the sample mean.
- Confidence Intervals:
  - Give a range around the sample mean, built from the standard deviation or standard error of the mean, that indicates how accurately the sample mean estimates the population mean.
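The z score, standard error of the mean, and a confidence interval can be sketched together. Several things here are assumptions rather than content from these notes: the jump data are hypothetical, the standard error is computed with the common formula SD / sqrt(n), and the 95% interval uses the normal-approximation multiplier 1.96. With small samples a t-distribution critical value is often used instead, which gives a wider interval.

```python
import math
import statistics

# Hypothetical countermovement jump heights (cm) for ten athletes (illustrative only).
group_cmj_cm = [38, 40, 41, 42, 44, 45, 46, 48, 50, 56]

n = len(group_cmj_cm)
group_mean = statistics.fmean(group_cmj_cm)
group_sd = statistics.stdev(group_cmj_cm)  # sample SD (divides by n - 1)

def z_score(score, mean, sd):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Standard error of the mean: assumed here to be SD / sqrt(n).
sem = group_sd / math.sqrt(n)

# Approximate 95% confidence interval for the mean (normal approximation, 1.96).
ci_low = group_mean - 1.96 * sem
ci_high = group_mean + 1.96 * sem

print(f"z score for 56 cm: {z_score(56, group_mean, group_sd):.2f}")
print(f"SEM: {sem:.2f} cm")
print(f"95% CI for the mean: {ci_low:.1f} to {ci_high:.1f} cm")
```

A positive z score places the athlete above the group mean and a negative one below it, which makes results from tests with different units comparable.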