Descriptive statistics involve summarizing and analyzing numerical data to draw meaningful conclusions.
Two main categories:
Measures of Central Tendency
Measures of Dispersion
Key Terms
Descriptive statistics: Use of graphs, tables, and summary statistics to identify trends and analyze sets of data.
Measures of central tendency: General term for any measure of the average value in a set of data.
Measures of Central Tendency
Key measures:
Mean
Median
Mode
Mean
Definition: The arithmetic average calculated by adding all values and dividing by the number of values.
Calculation example:
Given scores: $5, 7, 7, 9, 10, 11, 12, 14, 15, 17$.
Total = 107, Number of scores = 10
Mean: $\text{Mean} = \frac{107}{10} = 10.7$.
Characteristics:
Most sensitive of the measures of central tendency because it includes all scores in calculation.
Can be easily distorted by extreme values.
Example: Replacing 17 with 98 changes mean from 10.7 to 18.8.
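The arithmetic above can be checked with a short Python sketch. It assumes the ten scores are 5, 7, 7, 9, 10, 11, 12, 14, 15, 17 (a set that matches the stated total of 107) and uses the standard-library `statistics.mean`; the variable names are illustrative only.

```python
from statistics import mean

# Assumed ten-score set; it sums to 107, matching the notes.
scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
original_mean = mean(scores)

# Replace the highest score (17) with an extreme value (98).
with_outlier = scores[:-1] + [98]
outlier_mean = mean(with_outlier)

print(original_mean)  # 10.7
print(outlier_mean)   # 18.8
```

A single extreme score moves the mean by about eight points, which is why it is described as easily distorted.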
Median
Definition: The middle value when scores are arranged from lowest to highest.
Calculation:
Odd number of scores: Directly identified.
Even number of scores: Average of the two middle scores.
Example with ten scores: The middle scores are 10 and 11, so the median is $(10 + 11) / 2 = 10.5$.
Strengths:
Not affected by extreme values, unlike the mean.
Easy to calculate once arranged.
Limitations:
Less sensitive since it ignores the actual values of the lower and higher numbers.
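As a rough illustration of the odd/even rule, here is a minimal Python sketch. The function name `median_of` is hypothetical, and the score list is the assumed ten-score set (5, 7, 7, 9, 10, 11, 12, 14, 15, 17) that matches the totals in these notes.

```python
def median_of(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)          # arrange from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle score
        return ordered[mid]
    # even count: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
print(median_of(scores))                 # 10.5

# Replacing 17 with an extreme score leaves the median unchanged.
print(median_of(scores[:-1] + [98]))     # 10.5
```

The standard library's `statistics.median` follows the same rule and could be used instead.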
Mode
Definition: The most frequently occurring value in a dataset.
Characteristics:
A dataset can have two modes (bimodal), more than two (multimodal), or no mode at all if all values are different.
Very easy to calculate, but may not represent the dataset well.
Example: For the scores $5, 7, 7, 9, 10, 11, 12, 14, 15, 17$, the mode is 7, which is not representative of the dataset.
Important in categorical data analysis, where it may be the only measure available (e.g., favorite dessert).
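One way to find the mode, sketched in Python with `collections.Counter`. The helper name `modes` and the dessert data are hypothetical, and the score list is the assumed ten-score set used for the mean. The function assumes a non-empty list and returns an empty list when every value occurs only once, following the "no mode" convention in these notes.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs most often, or [] if all are unique."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []                     # all values different: no mode
    return [value for value, count in counts.items() if count == highest]

scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
print(modes(scores))                  # [7]
print(modes([1, 2, 2, 3, 3]))         # [2, 3]  (bimodal)
print(modes([1, 2, 3]))               # []      (no mode)

# Categorical data: the mode is the only usable "average" here.
desserts = ["cake", "pie", "cake", "ice cream"]
print(modes(desserts))                # ['cake']
```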
Measures of Dispersion
Definition: Measures that describe the spread of scores in a dataset.
Focus on two measures:
Range
Standard Deviation
Range
Definition: Difference between the highest and lowest values plus one as a correction.
Calculation:
$\text{Range} = (\text{Highest Value} - \text{Lowest Value}) + 1$.
Example:
For scores: $0, 47, 49, 50, 51, 53, 54, 56, 56, 57, 100$, the range is $\text{Range} = (100 - 0) + 1 = 101$.
Advantages:
Easy to calculate.
Limitations:
Only considers two extreme values; may not represent the overall data distribution well.
Example highlights how extreme values (like 0 and 100) can misrepresent the general trend of scores.
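The range calculation, including the plus-one correction used in these notes, might be sketched in Python as follows. The function name `score_range` and its `correction` parameter are illustrative assumptions.

```python
def score_range(values, correction=1):
    """Highest value minus lowest value, plus the correction used in the notes."""
    return (max(values) - min(values)) + correction

scores = [0, 47, 49, 50, 51, 53, 54, 56, 56, 57, 100]
print(score_range(scores))        # 101

# Dropping the two extreme scores (0 and 100) shows how tightly
# the remaining scores are actually grouped.
print(score_range(scores[1:-1]))  # 11
```

The gap between 101 and 11 shows how much two extreme scores can inflate the range.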
Standard Deviation
Definition: A sophisticated measure of dispersion that indicates how far scores deviate from the mean.
Characteristics:
Larger standard deviation indicates greater spread of scores.
Suggests not all participants were affected similarly by the independent variable (IV).
Smaller standard deviation indicates scores are closely clustered around the mean.
Calculation:
Calculate the mean, compute differences from the mean for each score, square these differences, and then average them (variance). The standard deviation is the square root of the variance.
Limitations:
Can also be distorted by extreme values, similar to the mean, and may not show all details of data distribution.
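The calculation steps for standard deviation can be sketched in Python. This version divides by the number of scores, following the "average them" wording above; some courses and calculators divide by one less than the number of scores for sample data, so check which convention applies. The function name `population_sd` and the small example dataset are hypothetical, and the ten-score set is the assumed one used for the mean.

```python
from math import sqrt

def population_sd(values):
    """Square root of the average squared difference from the mean."""
    m = sum(values) / len(values)                     # step 1: mean
    squared_diffs = [(x - m) ** 2 for x in values]    # steps 2-3: differences, squared
    variance = sum(squared_diffs) / len(values)       # step 4: average (variance)
    return sqrt(variance)                             # step 5: square root

# Small hypothetical dataset with mean 5 and variance 4.
print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))        # 2.0

# An extreme score inflates the standard deviation, as it does the mean.
scores = [5, 7, 7, 9, 10, 11, 12, 14, 15, 17]
clustered_sd = population_sd(scores)
spread_sd = population_sd(scores[:-1] + [98])
print(spread_sd > clustered_sd)                       # True
```

The standard library offers `statistics.pstdev` for this divide-by-N version and `statistics.stdev` for the divide-by-(N-1) version.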
Application of Concepts
Importance of understanding which measure of central tendency to use based on data characteristics:
Consider extreme scores: if present, median is more suitable; otherwise, mean is generally preferred.
Mode is primarily relevant for categorical data.
Study Tips
Familiarize yourself with the specifications on how to calculate these statistics, particularly mean, median, mode, and range. Calculators can be used for assistance.
Understanding how standard deviation is calculated makes data spread easier to interpret. Practice with different datasets to observe how the standard deviation changes.
Pay attention to extreme scores when deciding which measure of central tendency to use, as they can significantly impact the mean.