Econometrics
Week 1
08/26/2025
Midterm:
October 9
December 2
Final:
Work in groups of 2
Write Empirical paper
Need to use Stata
Represent what you learn
Buy textbook and read it
Homework (08/28/2025)
Important statistical concepts used in Econometrics:
Measure of central tendency:
Mean
Median
Measures of dispersion:
Variance
Standard Deviation
Minimum, Maximum, and Range
Skewness and Kurtosis
Correlation, covariance
Confidence interval
Mean
Measure of central tendency
The mean is the arithmetic average of the data.
Suppose we have N observations of X; then the mean is the sum of the X’s divided by N.
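A minimal sketch of the mean calculation described above. The sample values are made up for illustration and are not from the lecture.

```python
# Hypothetical sample of N = 5 observations of X (made-up numbers).
x = [2.0, 4.0, 4.0, 5.0, 10.0]

def mean(values):
    """Arithmetic average: the sum of the observations divided by N."""
    return sum(values) / len(values)

x_bar = mean(x)
print(x_bar)  # 5.0
```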
Median
Another measure of central tendency
Median is the middle observation when the data are arranged from smallest to largest.
Sometimes called the 50th percentile.
Half the observations lie below the median and half the observations lie above the median.
It is the central observation for an odd number of observations and the average of the two middle data points for an even number of observations.
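The odd and even cases for the median can be sketched as follows. The function name and the sample values are illustrative assumptions, not part of the lecture.

```python
def median(values):
    """Middle observation of the sorted data (50th percentile)."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd N: the central observation
    # even N: average of the two middle data points
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_sample = [7, 1, 3, 9, 5]    # made-up data, N = 5
even_sample = [7, 1, 3, 9]      # made-up data, N = 4
print(median(odd_sample))   # 5
print(median(even_sample))  # 5.0
```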
Measures of Dispersion:
Variance
Measure of dispersion (how scattered the data is)
The variance (sample) is calculated by subtracting the mean from each observation, squaring that value, adding up all N values, and then dividing that by the number of observations less one.
Standard deviation
Another measure of dispersion
Measures roughly how far, on average, the values in the dataset lie from the mean
It is the square root of the variance
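A short sketch of the sample variance and standard deviation, following the steps in the notes (subtract the mean, square, add up, divide by N - 1, then take the square root). The data are made up for illustration.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((v - x_bar) ** 2 for v in values) / (n - 1)

def sample_std(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))

x = [2.0, 4.0, 4.0, 5.0, 10.0]   # hypothetical sample, mean = 5
print(sample_variance(x))  # 9.0
print(sample_std(x))       # 3.0
```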
Covariance and Correlation Coefficient
Provide a numerical value for the strength and direction of the linear relationship between two variables.
Only concerned with the strength of the relationship, not its cause.
No causal effect is implied!
Covariance:
Measure of the linear relationship between two random variables. Think of variance, which measures how X varies with itself; covariance measures how X varies with Y.
Correlation Coefficient:
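Degree of joint variation between Y and X as a fraction of the individual variations in Y and X. Because it is scaled, it removes the interpretation problem of the covariance:
rxy = Cov(X, Y) / (sX · sY), where sX and sY are the standard deviations of X and Y.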
Covariance and Correlation Coefficient Interpretation
Covariance:
Positive:
Above average values of X associated with above average values of Y
Negative:
Above average values of X associated with below average values of Y
Problem with the covariance measure:
We cannot tell whether the magnitude is large or small, because it depends on the units we choose.
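The units problem can be seen in a small sketch. It assumes the sample covariance divides by N - 1, to match the sample variance in these notes; the data and variable names are made up for illustration.

```python
def sample_covariance(xs, ys):
    """Average product of deviations from the means (divides by N - 1, an assumption)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: the same X recorded in two different units.
x_thousands = [1, 2, 3, 4, 5]                 # e.g. income in thousands of dollars
x_dollars = [1000 * v for v in x_thousands]   # the same income in dollars
y = [2, 4, 5, 4, 5]

print(sample_covariance(x_thousands, y))  # 1.5
print(sample_covariance(x_dollars, y))    # 1500.0
```

The sign is positive in both cases, but the magnitude changes by a factor of 1000 just from relabeling the units.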
Correlation Coefficient:
If all data points in a data set fall on a positively sloped line, rxy = 1.
The closer to positive 1, the stronger the positive linear relationship.
If all the data points in a data set fall on a negatively sloped line, rxy = -1.
The closer to negative 1, the stronger the negative linear relationship.
If there is no linear relationship between X and Y, then rxy = 0.
The closer to 0, the weaker the linear relationship.
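The cases above can be checked with a small sketch. It uses the standard sample correlation (covariance divided by the product of the standard deviations, where the N - 1 terms cancel); the function name and all data are illustrative assumptions.

```python
import math

def correlation(xs, ys):
    """Sample correlation coefficient rxy, always between -1 and 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]
on_rising_line = [2 * v + 1 for v in x]     # all points on a positively sloped line
on_falling_line = [-2 * v + 1 for v in x]   # all points on a negatively sloped line
scattered = [2, 4, 5, 4, 5]                 # positive but imperfect relationship

print(round(correlation(x, on_rising_line), 3))    # 1.0
print(round(correlation(x, on_falling_line), 3))   # -1.0
print(round(correlation(x, scattered), 3))         # 0.775

# No linear relationship: Y = X squared on a symmetric range gives rxy = 0,
# even though Y is completely determined by X.
x_sym = [-2, -1, 0, 1, 2]
print(round(correlation(x_sym, [v ** 2 for v in x_sym]), 3))  # 0.0

# Changing the units of X leaves rxy unchanged.
print(round(correlation([1000 * v for v in x], scattered), 3))  # 0.775
```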
Random Variables
A random variable is a numerical outcome of a random process.
Two types:
Discrete random variables - take on countable values (e.g., the number of heads in a series of coin tosses).
Continuous random variables - take on any value within an interval (e.g., height or income).
Notation:
Often denoted by capital letters (X,Y)
Values:
Represented by lowercase letters (x,y)
In econometrics, random variables are used to model uncertainty in data.
Random Variable and Expectation
Expectation (or expected value) represents the long-run average of a random variable.
It provides a measure of the “center” of the distribution.
For a discrete random variable X:
E(X) = Σ x · P(X = x), summed over all possible values x
Where P(X = x) is the probability that the random variable X takes the value “x”.
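A sketch of the expectation as a probability-weighted sum, using the number of heads in two fair coin tosses as a hypothetical example (the probabilities below are an assumption of fair, independent tosses). The simulation at the end illustrates the "long-run average" idea; its exact value depends on the random draws.

```python
import random

# Hypothetical distribution: X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}   # maps each value x to P(X = x)

def expectation(pmf):
    """Sum of each value x weighted by its probability P(X = x)."""
    return sum(x * p for x, p in pmf.items())

expected_heads = expectation(pmf)
print(expected_heads)  # 1.0

# Long-run average: draw many values of X and average them.
random.seed(0)
draws = random.choices(list(pmf.keys()), weights=list(pmf.values()), k=100_000)
sample_average = sum(draws) / len(draws)
print(sample_average)  # close to 1.0, not exactly equal
```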
08/28/2025
Econometrics
Literally means “economic measurement”
Econometrics is the science and art of using economic theory and statistical techniques to analyze economic data.
Econometrics attempts to quantitatively bridge the gap between economic theory and the real world.
Venn Diagram:
Economics on the left
Statistics on the right
Econometrics in the middle
Week 2
09/02/2025
Regression Equation
Y = B0 + B1X
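A minimal sketch of evaluating the equation above, reading B0 as the intercept and B1 as the slope. The coefficient values are made up for illustration and are not estimates from any data.

```python
def predict(x, b0, b1):
    """Value of Y on the line Y = B0 + B1*X."""
    return b0 + b1 * x

b0, b1 = 2.0, 0.5   # hypothetical intercept and slope

print(predict(10.0, b0, b1))                          # 7.0
# A one-unit increase in X changes Y by B1.
print(predict(11.0, b0, b1) - predict(10.0, b0, b1))  # 0.5
```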