ECO221_LECTURE4

Lecture Overview

  • Course: ECO 221 Economic Statistics

  • Instructor: Professor Jose Xilau

  • Lecture: 4

Deadlines & Assignments

  • Homework 1: Due today at 11:59 PM on Gradescope

  • Homework 2: Due Monday at 11:59 PM on Gradescope

  • Reading Assignments:

    • Read TH Chapter 4

    • Complete reading comprehension questions for TH Chapter 4

    • Optional reading of Moore & McCabe 1.2, 2.2

Previous Lecture Recap

  • Data Definition:

    • Overview of data and exploratory data analysis.

  • Methods of Displaying Data:

    • Line graphs, bar graphs, histograms.

Current Lecture Focus

Relationships Between Two Variables

  • Importance of understanding how two variables interact and relate to each other.

  • Key methods for displaying relationships, chiefly scatterplots, and how to produce these graphs in Stata.

Critical Questions Before Analyzing Data

  • Purpose of Data Collection:

    • Why was the data collected?

  • Identification of Subject:

    • Who or what do the data describe?

  • Definitions and Measurements:

    • How is each variable defined? What are the units of measurement?

  • Data Collector Insights:

    • Who collected the data and what were their motivations?

  • Context:

    • In what context was the information provided? Did respondents know they were being studied?

  • Subject Knowledge:

    • Consider what you already know about the topic being studied.

Importance of Analyzing Multiple Variables

  • Often the interest lies in the relationship between two or more variables rather than in any single variable on its own.

  • Both variables must be measured on the same observational units (the same individuals, firms, countries, and so on).
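The requirement that both variables be measured on the same observational units can be sketched in Python. This is a minimal illustration, not part of the lecture: the student IDs, the values, and the helper name `pair_by_unit` are all hypothetical. Units that are missing either measurement are dropped, since they cannot contribute a point to a two-variable analysis.

```python
# Hypothetical measurements keyed by student ID (illustrative values only).
beers = {"s01": 5, "s02": 2, "s03": 9, "s04": 8}
bac = {"s01": 0.10, "s02": 0.03, "s03": 0.19, "s05": 0.07}


def pair_by_unit(x_by_unit, y_by_unit):
    """Return (unit, x, y) rows for units measured on both variables."""
    shared = sorted(set(x_by_unit) & set(y_by_unit))
    return [(unit, x_by_unit[unit], y_by_unit[unit]) for unit in shared]


pairs = pair_by_unit(beers, bac)
for unit, x, y in pairs:
    print(unit, x, y)
```

Here s04 has no blood alcohol reading and s05 has no beer count, so only s01, s02, and s03 survive the pairing.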

Examples of Variable Relationships

Case Study: Beer Consumption vs. Blood Alcohol Levels

  • Conducted at Ohio State University where student volunteers drank varying amounts of beer.

  • Findings: Higher beer consumption correlates with higher blood alcohol content.

Case Study: Dog Breed vs. Lifespan

  • Researchers at University of Adelaide studied 164 dog breeds to determine lifespan influences.

  • Conclusion: Larger breeds had a higher likelihood of dying from cancer at a younger age, attributed to selective breeding practices.

Understanding Association Between Variables

  • Definition of Association:

    • Two variables are associated if knowing the value of one provides information about the value of the other.

  • Examples of associations:

    • The number of beers consumed is associated with blood alcohol content; dog breed is associated with lifespan.

Strength and Nature of Associations

  • Associations range from strong to weak: some hold for nearly every observation, while others are general tendencies with many exceptions.

    • Example: Smokers generally have shorter lifespans, yet some may outlive non-smokers.

Response vs. Explanatory Variables

  • Response Variable: Measures the outcome of a study.

  • Explanatory Variable: Accounts for changes in the response variable.

  • Example:

    • Blood Alcohol Content (Response) and Number of Beers (Explanatory).

Scatterplots

  • Definition:

    • Graphical representation to show relationships between two quantitative variables.

  • Axes Setup:

    • Y-Axis: typically the response variable.

    • X-Axis: typically the explanatory variable.

  • Each observation is represented as a point based on the values of both variables.
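To make the axes setup concrete, the sketch below places each observation on a coarse character grid, with the explanatory variable running left to right and the response variable running bottom to top. It is only a rough text stand-in for a real graph: the function name `text_scatter` and the data values are hypothetical. In Stata, a scatterplot is typically drawn with `scatter yvar xvar`, with the response variable listed first.

```python
def text_scatter(xs, ys, width=40, height=10):
    """Render (x, y) pairs as a rough character grid.

    x increases left to right; y increases bottom to top.
    """
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1  # avoid dividing by zero if all x are equal
    y_span = (max(ys) - y_min) or 1
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 of the grid is the top line
    body = "\n".join("|" + "".join(line) for line in grid)
    return body + "\n+" + "-" * width


# Hypothetical data: number of beers (explanatory), blood alcohol content (response).
beers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
bac = [0.01, 0.03, 0.04, 0.07, 0.06, 0.10, 0.09, 0.12, 0.19]

plot = text_scatter(beers, bac)
print(plot)
```

Each asterisk is one observation, positioned by its pair of values, which is exactly what a scatterplot encodes.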

Best Practices in Scatterplots

  • Causality Note:

    • A scatterplot can reveal an association but does not by itself establish causation; arguing for causation requires identifying a plausible causal mechanism.

Analyzed Example: Crime and Ice Cream Sales

  • A linked article discusses the correlation between rising ice cream sales and rising crime rates.
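One way two variables can move together without either causing the other is a shared driver. The simulation below is a hedged sketch of that idea only: the common cause (daily temperature), the coefficients, and the noise levels are all assumptions invented for illustration, not figures from the article or the lecture. Neither simulated series depends on the other, yet the two come out strongly correlated.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # fixed seed so the simulated data are reproducible

# Assumed common cause: daily temperature. All numbers below are made up.
temperature = [random.uniform(0, 35) for _ in range(200)]
ice_cream = [10 * t + random.gauss(0, 20) for t in temperature]
crime = [2 * t + random.gauss(0, 10) for t in temperature]

r = pearson_r(ice_cream, crime)
print(round(r, 2))
```

By construction, ice cream sales never enter the formula for crime, so the strong correlation here reflects the shared dependence on temperature alone.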

Key Observations in Scatterplots

  1. Patterns/Shapes

  2. Strength of Relationship

  3. Direction of Relationship: Positive or negative.

  4. Outliers: Observations that deviate markedly from the overall pattern.

Patterns and Strength in Scatterplots

  • Linear Relationships:

    • Indicated when the points fall roughly along a straight line (upward or downward sloping).

  • Non-Linear Relationships:

    • Indicated when the points follow a curve rather than a straight line, so the slope changes across the range of the explanatory variable.

  • Directionality:

    • Upward slopes indicate positive relationships and downward indicate negative relationships.
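Strength and direction can also be summarized numerically. One common summary for linear relationships is the correlation coefficient r, which these notes do not define, so the sketch below is supplementary. The function names, the 0.7 cutoff for "strong", and the data are illustrative assumptions, not course conventions.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def describe(r, strong=0.7):
    """Label direction and strength; the 0.7 cutoff is an arbitrary assumption."""
    direction = "positive" if r > 0 else "negative" if r < 0 else "none"
    strength = "strong" if abs(r) >= strong else "weak or moderate"
    return direction, strength


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])      # points on an upward line
r_down = pearson_r(x, [10, 8, 6, 4, 2])    # points on a downward line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])  # y = x squared

for r in (r_up, r_down, r_curve):
    print(round(r, 2), describe(r))
```

The third case is a reminder that r measures only linear association: the points lie exactly on a curve, yet r is zero, which is why the scatterplot should be inspected before relying on any single number.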

Outliers

  • Individual observations that fall outside the main pattern of the data and can distort interpretation if not examined.
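A simple way to screen for outliers, offered here as a rough sketch rather than the course's method, is to fit a straight line by least squares and flag points whose vertical distance from the line is unusually large. The function names, the threshold `k=2.0`, and the data are hypothetical. Note that an outlier pulls the fitted line toward itself and inflates the measured spread, so this check is a starting point for inspection, not a verdict.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(xs, ys, k=2.0):
    """Return points whose residual exceeds k times the residual spread.

    The cutoff k is an arbitrary assumption chosen for illustration.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = math.sqrt(sum(e ** 2 for e in residuals) / (len(residuals) - 2))
    return [(x, y) for x, y, e in zip(xs, ys, residuals) if abs(e) > k * spread]


# Hypothetical data: y is twice x everywhere except one unusual point at x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 6, 8, 30, 12, 14, 16, 18, 20]

flagged = flag_outliers(xs, ys)
print(flagged)
```

A flagged point should prompt a look back at the data (a recording error, a different population, or a genuinely unusual case) rather than automatic deletion.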

Closing Remarks

  • Reminder about upcoming assignments and reading tasks to prepare for the next lecture.
