Class 2
Class Overview
Title: Analysis of Univariate Data
Focus: Graphics for qualitative and quantitative data
Course: Introduction to Statistics for Social Sciences, Statistics I
Institution: Department of Statistics, UC3M
Chapter Overview
Chapter 2: Analysis of Univariate Data
Frequency Table
Charts for Qualitative and Quantitative Discrete Data
Frequency Table for Continuous Data
Charts for Continuous Data
Recommended Reading: Wikipedia on misleading graphics
Frequency Tables
Purpose of Frequency Tables
Definition: A method for organizing, classifying, and summarizing sample information.
Frequency: The number (absolute frequency) or proportion (relative frequency) of times a value occurs in the sample.
Example: Preferred Political Party
Sample Size: 66 UC3M Students
Data: Responses from students regarding their preferred political party.
Data Breakdown
PSOE: 20 votes (30%)
PP: 18 votes (27%)
UP: 11 votes (17%)
VOX: 9 votes (14%)
Cs: 3 votes (5%)
Más Madrid: 3 votes (5%)
Other: 2 votes (3%)
Total Votes: 66 (100%)
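The percentages in this breakdown are relative frequencies: each count divided by the sample size. A minimal sketch of that calculation, using the counts listed above (the function name relative_frequencies is only illustrative):

```python
# Absolute frequencies copied from the breakdown above.
counts = {
    "PSOE": 20, "PP": 18, "UP": 11, "VOX": 9,
    "Cs": 3, "Más Madrid": 3, "Other": 2,
}

n = sum(counts.values())  # sample size


def relative_frequencies(counts):
    """Return each category's share of the total as a fraction."""
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


rel = relative_frequencies(counts)
for party, count in counts.items():
    print(f"{party:<12}{count:>4}{rel[party]:>8.1%}")
print(f"{'Total':<12}{n:>4}{sum(rel.values()):>8.1%}")
```

Because each percentage is rounded separately, the rounded values in the list add up to 101 rather than exactly 100.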
Quantitative Data Example
Variable: Number of Times Tested for COVID-19
Sample: 66 UC3M Students
Observed Values: 0, 1, or 2 tests.
Frequency Breakdown
0 Times: 44 students (67%)
1 Time: 21 students (32%)
2 Times: 1 student (2%)
Total Students: 66 (100%)
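For a discrete quantitative variable the table is built the same way, by tallying how often each value appears. The sketch below rebuilds the raw observations from the counts above and tallies them with collections.Counter; frequency_table is a hypothetical helper name, not part of the course material.

```python
from collections import Counter

# Raw observations rebuilt from the table: 44 zeros, 21 ones, 1 two.
tests = [0] * 44 + [1] * 21 + [2] * 1


def frequency_table(values):
    """Return (value, absolute, relative) rows, sorted by value."""
    tally = Counter(values)
    n = len(values)
    return [(v, tally[v], tally[v] / n) for v in sorted(tally)]


table = frequency_table(tests)
for value, absolute, relative in table:
    print(f"{value} tests: {absolute:>2} students ({relative:.1%})")
```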
Charts for Qualitative Data
Bar Chart
Usage: Suitable for qualitative (or discrete) data.
Importance: For ordinal data, keep the bars in the natural order of the categories.
Cumulative Frequency Bar Chart
Requirement: Only for ordinal or discrete quantitative data. Cumulative frequencies are meaningful only when the values have a natural order.
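Cumulative frequencies are running totals taken in the order of the values. As a rough sketch, reusing the COVID-19 test counts from the earlier frequency table, the bar heights can be computed and drawn as text bars (the 40-character bar width is an arbitrary choice):

```python
from itertools import accumulate

values = [0, 1, 2]       # ordered values of the variable
absolute = [44, 21, 1]   # absolute frequencies from the COVID-19 table

n = sum(absolute)
cumulative = list(accumulate(absolute))        # running totals
cumulative_rel = [c / n for c in cumulative]   # cumulative proportions

for v, c, p in zip(values, cumulative, cumulative_rel):
    bar = "#" * round(p * 40)  # text bar: 40 characters = 100%
    print(f"<= {v}: {c:>2} ({p:>6.1%}) {bar}")
```

The last bar always reaches the full sample, which is a quick check that no observation was dropped.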
Misleading Visuals
Issues with Bar Charts
Awareness: Choices in visual representation, such as a vertical axis that does not start at zero, can mislead the reader about what the data show.
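One well-known distortion is a truncated vertical axis, which exaggerates differences between bars. The sketch below compares the true ratio of two counts from the party example with the ratio of the drawn bar heights. The axis start of 15 is a hypothetical value chosen for illustration, and apparent_ratio is an illustrative helper name.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must lie below both values")
    return (a - baseline) / (b - baseline)


psoe, pp = 20, 18  # counts from the party example

true_ratio = apparent_ratio(psoe, pp)               # axis starts at 0
truncated = apparent_ratio(psoe, pp, baseline=15)   # hypothetical axis start

print(f"true ratio: {true_ratio:.2f}, drawn ratio: {truncated:.2f}")
```

With the axis starting at zero the bars differ by about 11%. With the truncated axis the first bar is drawn about 67% taller than the second, even though the data are unchanged.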
Pie Charts
Guidelines for Use
Application: For dichotomous or polytomous data with few categories.
Critique of Pie Charts
Quote by John Tukey: "There is no data that can be displayed in a pie chart that cannot be displayed better in some other type of chart."
Continuous Data Analysis
Frequency Table for Continuous Data
Example: Steps Walked Yesterday
Sample Size: 66 students
Data: One step count per student for the previous day, ranging from low to high activity levels.
Data Grouping
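Need for Grouping: Continuous data must be grouped into intervals before they can be tabulated. The number of intervals is often chosen with a rule of thumb such as √n, where n is the sample size.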
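As an illustration of the √n rule, the sketch below computes the number of equal-width intervals and their edges. The sample size n = 66 comes from the example. The range of 0 to 16000 steps is a hypothetical assumption, since the raw data are not reproduced here, and sqrt_rule_intervals is an illustrative name.

```python
import math


def sqrt_rule_intervals(n, lowest, highest):
    """Return (k, width, edges) for k = round(sqrt(n)) equal-width intervals."""
    k = round(math.sqrt(n))
    width = (highest - lowest) / k
    edges = [lowest + i * width for i in range(k + 1)]
    return k, width, edges


# n = 66 is from the example; the 0 to 16000 range is hypothetical.
k, width, edges = sqrt_rule_intervals(66, 0, 16000)
print(k, width)
print(edges)
```

Rounding to the nearest integer is one possible convention. Rounding up is also common, and in practice the result is often adjusted so that the interval limits are round numbers.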
Histogram
Purpose: To show the shape of the distribution and the frequency of each interval.
Frequency Polygon
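Definition: A line graph joining the points plotted above the midpoint of each interval at the height of that interval's frequency. The cumulative version (cumulative frequency polygon, or ogive) instead shows cumulative proportions.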
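The interval midpoints that a frequency polygon joins can be computed directly from the interval edges. A minimal sketch, using hypothetical grouped step counts (the edges and frequencies below are invented for illustration and are not the course data):

```python
def polygon_points(edges, frequencies):
    """Pair each interval's midpoint with that interval's frequency."""
    if len(edges) != len(frequencies) + 1:
        raise ValueError("need exactly one more edge than frequencies")
    midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    return list(zip(midpoints, frequencies))


# Hypothetical grouped step counts for 66 students.
edges = [0, 4000, 8000, 12000, 16000]
frequencies = [10, 28, 20, 8]

points = polygon_points(edges, frequencies)
for midpoint, frequency in points:
    print(f"midpoint {midpoint:>8.0f}: frequency {frequency}")
```

Joining these points with straight line segments gives the polygon. Plotting running totals against the upper limit of each interval instead would give the cumulative version.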
Application of Histograms
Effect of Bar Count: Adjusting the number of bars can dramatically change the visual output and interpretation of data.
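To see this effect without plotting, the same data can be counted into different numbers of equal-width intervals. The step counts below are hypothetical values invented for illustration, and bin_counts is an illustrative helper rather than a library function.

```python
def bin_counts(data, k):
    """Count observations in k equal-width intervals spanning the data range."""
    lowest, highest = min(data), max(data)
    width = (highest - lowest) / k
    if width == 0:
        raise ValueError("all observations are equal; nothing to bin")
    counts = [0] * k
    for x in data:
        # The maximum would land just past the last interval, so clamp it.
        index = min(int((x - lowest) / width), k - 1)
        counts[index] += 1
    return counts


# Hypothetical step counts, invented for illustration only.
steps = [1200, 2500, 3100, 3300, 4800, 5200, 5400, 5600, 7000, 7100,
         7300, 9800, 10200, 10400, 12500, 15000]

for k in (2, 4, 8):
    print(k, bin_counts(steps, k))
```

With few intervals the detail of the distribution is hidden, while with many intervals some bars become empty or hold a single observation, so the apparent shape depends on k.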
Examples of Data Visualization
Example: Weekly Cannabis Consumption
Grouped Data: Consumption values are divided into intervals, and the table lists the absolute and relative frequency of each interval.
Exercises
Exercise 1
Task: Analyze a graphic of life expectancy in 2007 across continents. Evaluate whether it is appropriate and suggest possible alternatives.
Exercise 2
Task: Analyze a graphic of maximum heart rate during physical activity, based on 303 subjects. Identify the type of variable and the chart used. Discuss whether the graphic is suitable and suggest alternatives.