Course: ECO 221 Economic Statistics
Instructor: Professor Jose Xilau
Lecture: 4
Homework 1: Due today at 11:59 PM on Gradescope
Homework 2: Due Monday at 11:59 PM on Gradescope
Reading Assignments:
Read TH Chapter 4
Complete reading comprehension questions for TH Chapter 4
Optional reading of Moore & McCabe 1.2, 2.2
Data Definition:
Overview of data and exploratory data analysis.
Methods of Displaying Data:
Line graphs, bar graphs, histograms.
Importance of understanding how two variables interact and relate to each other.
Key methods to display relationships, such as scatterplots and using Stata for graphical representation.
Purpose of Data Collection:
Why was the data collected?
Identification of Subject:
Who or what do the data describe?
Definitions and Measurements:
How is each variable defined? What are the units of measurement?
Data Collector Insights:
Who collected the data and what were their motivations?
Context:
In what context was the information provided? Did respondents know they were being studied?
Subject Knowledge:
Consider what you already know about the topic being studied.
Often the interest lies in how two or more variables relate to each other, rather than in any single variable on its own.
Both variables must be measured on the same observational units (for example, the same individuals).
Example study: conducted at Ohio State University, where student volunteers drank varying amounts of beer.
Findings: Higher beer consumption correlates with higher blood alcohol content.
Example study: researchers at the University of Adelaide studied 164 dog breeds to determine what influences lifespan.
Conclusion: Larger breeds had a higher likelihood of dying from cancer at a younger age, attributed to selective breeding practices.
Definition of Association:
Two variables are associated if knowing the value of one tells you something about the value of the other.
Examples of associations:
Number of beers affects blood alcohol content; dog breed correlates with lifespan.
Associations range from strong to weak: some show a clear trend with few exceptions, while others hold only on average.
Example: Smokers generally have shorter lifespans, yet some may outlive non-smokers.
Response Variable: Measures the outcome of a study.
Explanatory Variable: May explain or influence changes in the response variable.
Example:
Blood Alcohol Content (Response) and Number of Beers (Explanatory).
Definition:
A scatterplot is a graphical representation of the relationship between two quantitative variables.
Axes Setup:
Y-Axis: typically the response variable.
X-Axis: typically the explanatory variable.
Each observation is represented as a point based on the values of both variables.
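The axes setup above can be sketched in a few lines. The lecture uses Stata for graphs; this sketch instead uses Python with only the standard library and draws a rough text grid rather than a real chart. The values in `observations`, and the helper names `to_points` and `text_scatter`, are hypothetical and chosen only for illustration; they are not data from any study mentioned in these notes.

```python
# Hypothetical illustrative values, NOT data from any study in these notes.
observations = [
    {"beers": 1, "bac": 0.02},
    {"beers": 2, "bac": 0.03},
    {"beers": 3, "bac": 0.05},
    {"beers": 4, "bac": 0.06},
    {"beers": 5, "bac": 0.09},
    {"beers": 6, "bac": 0.10},
]


def to_points(records, x_key, y_key):
    """Return one (x, y) pair per observational unit."""
    return [(r[x_key], r[y_key]) for r in records]


def scale(value, low, high, size):
    """Map value from the range [low, high] onto an index 0 .. size - 1."""
    if high == low:
        return 0
    return round((value - low) / (high - low) * (size - 1))


def text_scatter(points, width=30, height=10):
    """Draw points on a character grid: x runs left to right, y bottom to top."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        col = scale(x, min(xs), max(xs), width)
        row = scale(y, min(ys), max(ys), height)
        grid[height - 1 - row][col] = "*"  # row 0 of the grid is the top
    return "\n".join("".join(line) for line in grid)


# Explanatory variable on the x-axis, response variable on the y-axis.
points = to_points(observations, x_key="beers", y_key="bac")
plot = text_scatter(points)
print(plot)
```

Each `*` is one observation, placed by its value on both variables. Swapping `x_key` and `y_key` would still produce a valid plot, but it would put the response on the horizontal axis, against the usual convention.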
Causality Note:
A scatterplot can reveal a correlation, but correlation does not imply causation; a causal mechanism must be identified to argue for causation.
Example: an article discusses the correlation between rising ice cream sales and increasing crime rates.
Features to Examine in a Scatterplot:
Patterns/Shapes
Strength of Relationship
Direction of Relationship: Positive or negative.
Outliers: Values that differ significantly from the overall pattern.
Linear Relationships:
Indicated when the points fall roughly along a straight line (upward or downward sloping).
Non-Linear Relationships:
Indicated when the points do not follow a straight line, for example when the slope changes across the range of the data.
Directionality:
An upward slope indicates a positive relationship; a downward slope indicates a negative relationship.
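Direction and strength can also be summarized numerically. The sketch below computes the Pearson correlation coefficient r, a standard measure that these notes do not define: its sign gives the direction of a linear relationship, and values near +1 or -1 suggest a strong one. The data lists and the helper names `pearson_r` and `direction` are hypothetical, used only for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def direction(r):
    """Describe the direction of a linear relationship from the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear direction"


# Hypothetical illustrative values, not data from the lecture.
xs = [1, 2, 3, 4, 5]
ys_up = [2, 4, 5, 4, 6]      # tends to rise as x rises
ys_down = [10, 8, 7, 5, 3]   # tends to fall as x rises

r_up = pearson_r(xs, ys_up)
r_down = pearson_r(xs, ys_down)
print(direction(r_up), round(r_up, 2))
print(direction(r_down), round(r_down, 2))
```

Because r only measures linear association, a clearly curved pattern can have r near zero, so the number is a complement to looking at the scatterplot and not a replacement for it.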
Outliers:
Individual values that fall outside the main pattern and may skew interpretation.
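One rough way to flag candidate outliers in a roughly linear scatterplot is to fit a straight line and look for points that sit unusually far from it. The sketch below is a heuristic under stated assumptions, not a method from the lecture: it uses a least-squares line, and the cutoff `k = 3.0` times the median absolute residual is an arbitrary choice. The data values and the helper names `fit_line` and `flag_outliers` are hypothetical.

```python
from statistics import mean, median


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(xs, ys, k=3.0):
    """Return the points whose distance from the fitted line is unusually large.

    A point is flagged when its absolute residual exceeds k times the
    median absolute residual. The cutoff k is an assumption, not a rule.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    cutoff = k * median([abs(r) for r in residuals])
    return [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > cutoff]


# Hypothetical illustrative values: roughly y = 2x, with one point far above.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 20.0, 14.1, 16.0]

flagged = flag_outliers(xs, ys)
print(flagged)
```

An outlier also pulls the fitted line toward itself, which is one way it can skew interpretation. A flagged point is a prompt to examine the observation, not proof that it is an error.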
Reminder about upcoming assignments and reading tasks to prepare for the next lecture.