Topic Five Spring 2020
DESCRIBING DATA: VISUAL AND NUMERICAL DESCRIPTIONS
Dr. Erin K. Freeman
TOPIC FIVE OBJECTIVES
Explain the purpose of visual and numerical descriptions
Distinguish between good and poor visual descriptions
Create and interpret visual charts and graphs
Define and distinguish measures of center in data distributions
Define and distinguish measures of variability in data distributions
Understand the concepts of shape and outliers in data distributions
WHY VISUAL DESCRIPTIONS?
Simplify interpretation of large information sets
Visual aids often more effective than words in conveying messages
Statistical graphics preferred for data summarization
Creating and interpreting data visualizations is an essential skill
DATA VISUALIZATION
Characteristics: simple, thorough, accurate, impactful
Innovations informed by neuroscience
Includes descriptive and exploratory data analysis
BANDWIDTH OF SENSES (David McCandless)
Sight: 1250 MB/s
Touch: 125 MB/s
Hearing: 12.5 MB/s
Sight: roughly the same bandwidth as a computer network
COMMON GRAPHICAL ERRORS
Omitting baselines (zero points)
Manipulating axes
Cherry-picking data
Using inappropriate chart types
Excessive grid lines or irrelevant labels
Failing to adjust dollar amounts for inflation
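To make the omitted-baseline error concrete, here is a minimal sketch of how starting the axis above zero exaggerates a difference between two bars. The function name `apparent_ratio` and the values 50, 55, and 45 are hypothetical, chosen only for illustration.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights when the value axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)

# Hypothetical values: the second bar is truly 10% larger than the first.
honest = apparent_ratio(50, 55)                  # axis starts at zero
distorted = apparent_ratio(50, 55, baseline=45)  # axis starts at 45

print(f"Zero baseline: second bar drawn {honest:.2f}x as tall")
print(f"Baseline at 45: second bar drawn {distorted:.2f}x as tall")
```

With the axis starting at 45, the same 10% difference is drawn as a bar twice as tall.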
DESCRIPTIVE STATISTICS
Measures of Center
Mode: Most frequently occurring score
Median: Middle value that divides the ordered data in half; robust against outliers
Mean: Arithmetic average, sensitive to extremes
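The three measures of center can be compared on one small data set. This is a sketch using Python's standard `statistics` module; the `scores` list is hypothetical and includes one extreme value to show how differently each measure reacts.

```python
import statistics

# Hypothetical scores; 100 is an extreme value.
scores = [2, 3, 3, 4, 5, 6, 100]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # arithmetic average

print(f"mode={mode}, median={median}, mean={mean:.2f}")
```

The single extreme score pulls the mean well above the median and mode, which stay near the bulk of the data.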
Measures of Variability
Range: Difference between max and min values
Interquartile Range (IQR): Spread of middle 50% of data
Variance: Average squared distance of scores from the mean
Standard Deviation: Square root of the variance; typical distance of scores from the mean
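The measures of variability can be sketched the same way. Two assumptions are made here that the slides do not specify: the population formulas (`pvariance`, `pstdev`) are used rather than the sample versions, and quartiles come from the "inclusive" method of `statistics.quantiles` (Python 3.8+). Other quartile conventions can give slightly different IQR values. The `scores` list is hypothetical.

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 5, 7, 8, 12]

data_range = max(scores) - min(scores)  # max minus min

# Quartiles; the "inclusive" method is one of several conventions.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1  # spread of the middle 50% of the data

variance = statistics.pvariance(scores)  # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)      # square root of the variance

print(f"range={data_range}, IQR={iqr}, "
      f"variance={variance:.2f}, SD={std_dev:.2f}")
```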
SHAPE OF DISTRIBUTION
Normal Distribution: Symmetric, unimodal
Skewness: Named for the direction of the longer tail; affects which measure of central tendency to choose
Commonly skewed distributions: income (positive), grades (negative)
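The direction of skew can be checked numerically. The sketch below uses the moment coefficient of skewness (the mean of the cubed z-scores), which is one common definition and is not named in the slides. Both data sets are hypothetical, and `skewness` is an illustrative helper rather than a library function.

```python
import statistics

def skewness(data):
    """Population moment coefficient of skewness: the mean cubed z-score."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)

# Hypothetical incomes (thousands of dollars): long tail to the right.
incomes = [28, 32, 35, 38, 41, 45, 250]
# Hypothetical exam grades: long tail to the left.
grades = [35, 78, 82, 85, 88, 90, 92]

for name, data in [("incomes", incomes), ("grades", grades)]:
    print(f"{name}: skewness={skewness(data):+.2f}, "
          f"mean={statistics.mean(data):.1f}, "
          f"median={statistics.median(data)}")
```

In the positively skewed sample the mean sits above the median; in the negatively skewed sample it sits below.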
CENTRAL TENDENCY
Summary statistic reflecting typical value
Different measures minimize different kinds of error (mean: squared error; median: absolute error)
Central tendency decisions influenced by data shape
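One way to see what "minimizing error" means: the mean is the value with the smallest total squared error, and the median is the value with the smallest total absolute error. The sketch below checks this by brute force over a grid of candidate centers. The data set and the grid spacing are hypothetical choices for illustration.

```python
import statistics

# Hypothetical data with one large value.
data = [1, 2, 3, 4, 20]

def sum_squared_error(center):
    """Total squared distance of every score from `center`."""
    return sum((x - center) ** 2 for x in data)

def sum_absolute_error(center):
    """Total absolute distance of every score from `center`."""
    return sum(abs(x - center) for x in data)

# Candidate centers 0.0, 0.1, ..., 20.0.
candidates = [i / 10 for i in range(201)]

best_squared = min(candidates, key=sum_squared_error)
best_absolute = min(candidates, key=sum_absolute_error)

print(f"mean={statistics.mean(data)}, best for squared error={best_squared}")
print(f"median={statistics.median(data)}, best for absolute error={best_absolute}")
```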
FIVE NUMBER SUMMARY
Max, Upper Hinge (Q3), Median, Lower Hinge (Q1), Min
Visualized via box plots to assess shape and outliers
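A five number summary can be computed with the standard library, as sketched below. The helper `five_number_summary` is hypothetical, and the "inclusive" quartile method is an assumption: hinges and quartiles follow slightly different conventions across textbooks and software, so results may differ a little on other data.

```python
import statistics

def five_number_summary(data):
    """Return min, Q1, median, Q3, and max for a list of numbers."""
    ordered = sorted(data)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }

# Hypothetical scores.
summary = five_number_summary([7, 2, 12, 4, 8, 5, 4])
print(summary)
```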
BOX PLOT
Represents five number summary
Helps visualize data shape and identify outliers
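Box plots usually flag outliers with a numerical rule. A common convention, assumed here because the slides do not state one, marks any point more than 1.5 × IQR beyond the quartiles. The helper `box_plot_outliers` and the `scores` list are hypothetical.

```python
import statistics

def box_plot_outliers(data, k=1.5):
    """Return points outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical scores with one unusually large value.
scores = [2, 4, 4, 5, 7, 8, 30]
outliers = box_plot_outliers(scores)
print(outliers)
```

Here Q1 = 4 and Q3 = 7.5, so the fences sit at -1.25 and 12.75, and only the score of 30 falls outside them.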
IMPLICATIONS
Assessing variability provides context beyond central tendency
Knowing central tendency alone can be misleading
Importance of visual representations in data interpretation
NEXT TOPIC
Topic 6: Normal Distributions