Class 2

Class Overview

  • Title: Analysis of Univariate Data

  • Focus: Graphics for qualitative and quantitative data

  • Course: Introduction to Statistics for Social Sciences, Statistics I

  • Institution: Department of Statistics, UC3M

Chapter Overview

Chapter 2: Analysis of Univariate Data

  1. Frequency Table

  2. Charts for Qualitative and Quantitative Discrete Data

  3. Frequency Table for Continuous Data

  4. Charts for Continuous Data

    • Recommended Reading: Wikipedia article on misleading graphs

Frequency Tables

Purpose of Frequency Tables

  • Definition: A method for organizing, classifying, and summarizing sample information.

  • Frequency: Represents the number or proportion of times a value occurs in the sample.

Example: Preferred Political Party

  • Sample Size: 66 UC3M Students

  • Data: Responses from students regarding their preferred political party.

Data Breakdown

  • PSOE: 20 votes (30%)

  • PP: 18 votes (27%)

  • UP: 11 votes (17%)

  • VOX: 9 votes (14%)

  • Cs: 3 votes (5%)

  • Más Madrid: 3 votes (5%)

  • Other: 2 votes (3%)

  • Total Votes: 66 (100%)
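
The relative frequencies in this table follow from dividing each count by the sample size. A minimal sketch of that arithmetic, using the counts listed above (the variable names are illustrative, not part of the course material):

```python
# Absolute frequencies from the Preferred Political Party example.
counts = {
    "PSOE": 20, "PP": 18, "UP": 11, "VOX": 9,
    "Cs": 3, "Más Madrid": 3, "Other": 2,
}

n = sum(counts.values())  # sample size

# Relative frequency = absolute frequency / sample size.
rel_freq = {party: c / n for party, c in counts.items()}

for party, c in counts.items():
    print(f"{party:<12}{c:>4}{rel_freq[party]:>8.1%}")
print(f"{'Total':<12}{n:>4}{sum(rel_freq.values()):>8.1%}")
```

Because each percentage in the table is rounded to a whole number, the rounded values add up to 101 rather than 100; the unrounded relative frequencies sum to exactly 1.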

Quantitative Data Example

Variable: Number of Times Tested for COVID-19

  • Sample: 66 UC3M Students

  • Observed Values: 0, 1, or 2 tests per student.

Frequency Breakdown

  • 0 Times: 44 students (67%)

  • 1 Time: 21 students (32%)

  • 2 Times: 1 student (1%)

  • Total Students: 66 (100%)
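
For a discrete variable the table is built the same way: tally how often each value occurs, then divide by the sample size. The sketch below rebuilds a raw sample from the counts above (the individual responses are not given in the notes, so the list is a reconstruction) and tallies it with `collections.Counter`:

```python
from collections import Counter

# Raw sample rebuilt from the counts in the notes: 44 zeros, 21 ones, one 2.
tests = [0] * 44 + [1] * 21 + [2] * 1

freq = Counter(tests)  # absolute frequency of each observed value
n = len(tests)         # sample size

for value in sorted(freq):
    share = freq[value] / n
    print(f"{value} test(s): {freq[value]:>2} students ({share:.1%})")
```

Printed with one decimal place, the share for two tests is about 1.5%, which the table above lists as 1%.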

Charts for Qualitative Data

Bar Chart

  • Usage: Suitable for qualitative (or discrete) data.

  • Importance: For ordinal data, keep the categories in their natural order along the axis.

Cumulative Frequency Bar Chart

  • Requirement: Only for ordinal or discrete quantitative data, because cumulative frequencies are meaningful only when the values can be ordered.
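
Cumulative frequencies are running totals taken in the order of the values. A minimal sketch using the COVID-19 test counts from the earlier example (the variable names are illustrative):

```python
from itertools import accumulate

values = [0, 1, 2]    # number of tests, in increasing order
counts = [44, 21, 1]  # absolute frequencies from the notes
n = sum(counts)

cum_counts = list(accumulate(counts))  # running totals: 44, 65, 66
cum_rel = [c / n for c in cum_counts]  # cumulative relative frequencies

for v, c, r in zip(values, cum_counts, cum_rel):
    print(f"at most {v} test(s): {c} students ({r:.1%})")
```

The last cumulative count always equals the sample size, which is a quick check that the table is consistent.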

Misleading Visuals

Issues with Bar Charts

  • Awareness: Design choices, such as a vertical axis that does not start at zero, can exaggerate or hide differences and lead to a wrong interpretation of the data.
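
One well-known way a bar chart can mislead is a vertical axis that does not start at zero. The notes do not single out a specific issue, so the sketch below is only an illustration: it compares the drawn heights of the PSOE and PP bars from the earlier example with a full axis and with a hypothetical axis starting at 15.

```python
def drawn_ratio(a, b, baseline=0):
    """Ratio of drawn bar heights for values a and b when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)

psoe, pp = 20, 18  # counts from the political party example

full_axis = drawn_ratio(psoe, pp)               # axis starts at 0
truncated = drawn_ratio(psoe, pp, baseline=15)  # hypothetical axis starting at 15

print(round(full_axis, 2))  # 1.11
print(round(truncated, 2))  # 1.67
```

With the full axis the PSOE bar is about 11% taller than the PP bar, matching the data. With the truncated axis it is drawn about 67% taller, although the counts are unchanged.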

Pie Charts

Guidelines for Use

  • Application: For dichotomous or polytomous data with few categories.

Critique of Pie Charts

  • Quote by John Tukey: "There is no data that can be displayed in a pie chart that cannot be displayed better in some other type of chart."

Continuous Data Analysis

Frequency Table for Continuous Data

Example: Steps Walked Yesterday

  • Sample Size: 66 students

  • Data: One step count per student for the previous day, ranging from low to high activity levels.

Data Grouping

  • Need for Grouping: Continuous data must be grouped into class intervals before tabulating. A common rule of thumb sets the number of intervals close to √n (about 8 for n = 66).
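
A minimal sketch of the √n rule for the step-count example. The sample size comes from the notes; the range of 0 to 16000 steps is a hypothetical assumption, since the raw data are not listed.

```python
import math

n = 66                   # sample size from the notes
k = round(math.sqrt(n))  # sqrt(66) is about 8.12, so 8 intervals

# Hypothetical range of step counts (not given in the notes).
low, high = 0, 16000
width = (high - low) / k  # equal-width intervals

edges = [low + i * width for i in range(k + 1)]
intervals = list(zip(edges[:-1], edges[1:]))

for a, b in intervals:
    print(f"[{a:.0f}, {b:.0f})")
```

The rule only suggests a starting point. In practice the interval limits are usually rounded to convenient values, and the last interval is closed on the right so that the maximum is included.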

Histogram

  • Purpose: To show the shape of the distribution. Each bar spans one interval, and its area is proportional to the frequency of that interval.

Frequency Polygon

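  • Definition: A line graph joining the points placed at the midpoint of each interval, at the height of its frequency or relative frequency. The cumulative version instead plots the cumulative proportion at the upper limit of each interval.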
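
The points that such a line graph connects can be computed directly from a grouped frequency table. The sketch below uses hypothetical class limits and counts (the notes do not give the grouped step data) and computes both the interval midpoints with their relative frequencies and the cumulative proportions at the upper class limits.

```python
from itertools import accumulate

edges = [0, 4000, 8000, 12000, 16000]  # hypothetical class limits (steps)
counts = [10, 30, 20, 6]               # hypothetical frequencies, n = 66
n = sum(counts)

# Frequency polygon: one point per interval, at the interval midpoint.
midpoints = [(a + b) / 2 for a, b in zip(edges[:-1], edges[1:])]
polygon = [(m, c / n) for m, c in zip(midpoints, counts)]

# Cumulative version: cumulative proportion reached at each upper limit.
cumulative = [(upper, c / n) for upper, c in zip(edges[1:], accumulate(counts))]

print(polygon)
print(cumulative)
```

Joining the `polygon` points gives the frequency polygon. Joining the `cumulative` points gives a curve that rises to 1 at the upper limit of the last interval.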

Application of Histograms

  • Effect of Bar Count: Adjusting the number of bars can dramatically change the visual output and interpretation of data.
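
To see how the number of bars changes the picture, the sketch below bins one small data set in two ways. The ten values are hypothetical and chosen only for illustration, and `bin_counts` is a helper written for this example rather than a library function.

```python
def bin_counts(data, k, low, high):
    """Count observations in k equal-width intervals over [low, high]."""
    width = (high - low) / k
    counts = [0] * k
    for x in data:
        # An observation equal to `high` goes into the last interval.
        i = min(int((x - low) / width), k - 1)
        counts[i] += 1
    return counts

data = [1, 2, 2, 3, 7, 8, 8, 9, 9, 10]  # hypothetical observations

for k in (2, 5):
    print(f"{k} bars:")
    for c in bin_counts(data, k, low=0, high=10):
        print("  " + "#" * c)
```

With 2 bars the counts are 4 and 6, which looks like a fairly even split. With 5 bars the counts are 1, 3, 0, 1, 5, which reveals an empty interval in the middle and two separate clusters.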

Examples of Data Visualization

Example: Weekly Cannabis Consumption

  • Grouped Data: Consumption values are divided into intervals, and the frequency and relative frequency are reported for each interval.

Exercises

Exercise 1

  • Task: Analyze a graphic of life expectancy in 2007 across continents. Evaluate whether the chart is appropriate and suggest alternatives.

Exercise 2

  • Task: Analyze a chart of maximum heart rate during physical activity for 303 subjects. Identify the variable type and the chart type used. Discuss whether the chart is suitable and suggest alternatives.