Quantitative Economics and Econometrics


113 Terms

1
New cards

autocorrelation

serial correlation of the same variable across two time periods

e.g. Bob is selected to participate in the survey where he reports his mood every week. The response in week 1 may be systematically correlated to the response in week 2.

2
New cards

unbiased estimator

Expectation of the estimator = true value of the parameter, i.e. E(β-hat) = β

3
New cards

Variance (general formula)

E(x²) - [E(x)]²
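The identity can be checked numerically; a small sketch with simulated data (numpy assumed, numbers chosen for illustration):

```python
import numpy as np

# Verify Var(x) = E(x²) - [E(x)]² on simulated data (true variance = 4)
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=100_000)

lhs = x.var()                        # direct variance, mean((x - x.mean())**2)
rhs = np.mean(x**2) - np.mean(x)**2  # E(x²) - [E(x)]²

print(lhs, rhs)  # the two agree up to floating-point error
```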

4
New cards

Panel data

Data that follows the same cross-sectional units across multiple time periods

5
New cards

Repeated cross-section? Is there autocorrelation?

In each time period, a new sample is selected. This means that across time periods and within time period, there is independence.

NO there is no autocorrelation.

6
New cards

tobit - when to use

there exist corner solutions, or censoring

7
New cards

when will truncation (or nonresponse) lead to OLS be unbiased/consistent

if the selection criterion is wholly independent of u

on the other hand, if the selection criterion depends on y (e.g. s=1 iff y>c), then s depends on u

8
New cards

first-differenced equation

taking two cross sections in two time periods, then subtracting to get a “change in” OLS eqn

9
New cards

distributed lag model

when there are lagged indep variables included in the regression

e.g. trash2025 = b0+ b1*govbudget2023 + b2*govbudget2022

10
New cards

time dummies in first-difference eqns

  1. why do we need them?

  2. how do we interpret the coeff on the time dummy

  1. to capture aggregate time effects that shift all units in a given period (first-differencing itself removes the time-invariant fixed effect a)

  2. since it would be 1 for the year and -1 for the year after, it measures the effect of entering/exiting that year on y

11
New cards

chow test

  1. what is it for

  2. how do we do it

  3. when is DiD or First Differences better?

write the formula

  1. for testing whether regression coefficients differ before and after a structural break

  2. same as in F-test: compute the F-statistic, but make sure the regression has a break dummy interacted with the regressors, so coefficients can differ BEFORE and AFTER the break

  3. DiD is better for a TREATMENT to determine causality. First differences is better when you know there are fixed effects influencing individuals/cities in diff ways

formula: let SSR1+SSR2=SSRUR; then it has the same form as the F statistic.

[SSRR - (SSR1 + SSR2)] / q

———

(SSR1+SSR2) / df

12
New cards

problems with first-differencing

  1. key assumption (strict exogeneity): the error u must be uncorrelated with x in all periods

  2. sometimes variables do not vary much across two periods (or even more! e.g. educ)

13
New cards

how to deal with OVB (5 possibilities)

  1. include it as a regressor

  2. include more controls s.t. conditional exogeneity holds

  3. use panel data, fixed effects / first-differencing, so every individual is observed multiple times (this assumes the OVB doesn't vary over time)

  4. IVs

  5. run a randomised controlled experiment

14
New cards
  1. what is the OLS variance

  2. how is the IV variance different

  3. compare sizes

  1. sigma² / SSTx

  2. there is also a R²(x,z) term in the denominator

  3. IV variance is larger

15
New cards

how to know if we should use OLS or TSLS

Hausman Test

16
New cards

CEV assumption

ALL regressors (including the TRUE, correctly measured regressor) are uncorrelated with the measurement error

17
New cards

bias vs consistency vs efficiency

bias = the estimator systematically misses the true value: E(β-hat) ≠ β

consistency = the estimator converges to the true value as the sample grows: plim(β-hat) = β

efficiency = lowest variance among comparable estimators (so it needs a smaller sample for a given precision)

18
New cards

what does logit/probit replace? why is it better?

replaces LPM

better because it only returns fitted values (probabilities) between 0 and 1

19
New cards

what is the beta in a logit/probit (interpret)

the estimated beta in a logit/probit model is the value which maximises the log-likelihood across all observations

20
New cards

How do we correct for sample selection bias

Heckman Correction

21
New cards

How do we perform Heckman Correction

  1. Write the equation. s=1[gamma*z + v] is the selection criteria

  2. Perform a probit and we are interested in the Inverse Mills Ratio lambda(gamma*z) for each observation

  3. End goal: regress y on x and Lambda(Gamma*z)
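The inverse Mills ratio used in step 2 can be computed with only the standard library; a sketch (the helper names are my own, not from the course):

```python
import math

def norm_pdf(z):
    # standard normal density
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills(z):
    # lambda(z) = pdf(z)/cdf(z): the extra regressor added in step 3
    return norm_pdf(z) / norm_cdf(z)

print(inverse_mills(0.0))  # = 2/sqrt(2*pi) ≈ 0.7979
```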

22
New cards

What does Tobit rely on? (assumptions)

Homoskedasticity of error terms is a MUST

Error terms MUST also be normally distributed

23
New cards

Is it a problem to truncate x or truncate y?

Truncate y, because this leads to systematic bias in the sample

24
New cards

Truncation vs censoring?

Truncation is when values above/below a certain y are completely tossed away from a sample. They are out of the dataset.

Censoring is when ALL individuals are sampled, they are in the dataset, but the dependent variable is restricted to a certain range (e.g. top-coding, where salary is recorded as one category, "500,000 gbp and more")

Truncation - use MLE truncreg (OR Heckit if truncation is from an unobserved process, like self-selection bias e.g. unemployed people are excluded from the sample)

Censoring - use Tobit

25
New cards

What happens (to the regression, and so to the coefficients) when you use OLS to erroneously predict a top-coded regression sample?

Since values above a certain y are censored-off, then the top-coded-out responses artificially ‘flatten’ the OLS regression

OLS produces coefficients well below the true value (the slope is attenuated toward zero)

26
New cards

Incidental truncation (sample selection bias), definition?

When you toss away samples that don’t meet a certain selection criteria (e.g. s=1[y>50]) and the selection criteria is not random

However, if the selection criterion is based on x and not y, OLS remains consistent and unbiased. E.g. you use IQ to predict wage but restrict the sample to IQ > 100: OLS is still valid, the interpretation just changes slightly, in that you are estimating the effects for the population with IQ > 100

27
New cards

Why is it only a problem if we truncate y?

Truncating x does not give any issues, unbiasedness and consistency still hold.

But doing so for y is problematic because y and u are correlated. Hence, by truncating y, the sample selection is correlated with u (MLR 4 violated)

28
New cards

Steps to perform Hausman Test?

H0: cov(x1,u) = 0 (exogeneity) → use OLS

Ha: cov(x1,u) =/= 0 (endogeneity) → use TSLS

  1. Regress x on z to obtain fitted values of x (I.e. run the first stage of TSLS)

  2. For each observation, subtract x-hat from x, to yield v-hat

  3. Regress y on x (original values) and v-hat

  4. T-test to check significance of coefficient on v-hat
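A numpy sketch of these four steps on simulated data (the data-generating process and all numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
u = rng.normal(size=n)
z = rng.normal(size=n)                        # instrument: correlated with x, not with u
x = 0.8 * z + 0.5 * u + rng.normal(size=n)    # x is endogenous (depends on u)
y = 1.0 + 2.0 * x + u                         # true slope = 2

def ols(X, y):
    # coefficients, residuals, and classical standard errors
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * XtX_inv))
    return b, resid, se

ones = np.ones(n)
# Steps 1-2: first stage, keep the residual v-hat = x - x-hat
_, v_hat, _ = ols(np.column_stack([ones, z]), x)
# Step 3: regress y on x (original values) and v-hat
b, _, se = ols(np.column_stack([ones, x, v_hat]), y)
# Step 4: t-statistic on v-hat's coefficient
t_vhat = b[2] / se[2]
print(t_vhat)  # far from 0 here, so reject exogeneity -> use TSLS
```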

29
New cards

Derive the OLS (SLR) beta-hat from the definition.

check tablet notes for answer derivation steps

30
New cards

Derive the OLS (MLR) beta-hat step by step using the partialling-out method.

  1. Regress xi1 on xi2, with ri1 being the error term and alphas as coefficients

  2. Regress y on ri1 (y = B0 + B1ri1 + v)

  3. Rewrite the OLS SLR beta-hat.

31
New cards

What are the ways we can test for multiple exclusion restriction in MLE? And why do we want to do it?

Compare this with the OLS case. Why is there a different methodology for MLE?

  1. Wald statistic (in x)

  2. LM statistic (in m)

  3. LR statistic (in y) - we know the form: LR = 2(lnL_UR - lnL_R)

the point is to check whether adding more indep variables makes a difference, or to see whether the indep variable is affecting the dep var at all

In OLS, we do the same using the t-statistic. MLE has diff ways to find bc u is not assumed normally distributed.

32
New cards

weak vs strict exogeneity

weak: E(u|x) = 0 (the error is uncorrelated with the regressors in the same period)

strict: E(u|X) = 0 (the error must be uncorrelated with the regressors in ALL time periods - past, present, and future)

33
New cards

OLS exogeneity vs TS exogeneity

(Why do we need strict exogeneity in TS?)

under cross-sections/OLS, we need just weak exogeneity because the samples are chosen from a random process

under TS, we need strict exogeneity because samples are not random.

34
New cards

Bidirectionality in TS? Is it ok

If y affects x too, (endogeneity), generally TS3 condition is violated - strict exogeneity does not hold true

35
New cards

How do we model an exponential time series

use log specification on y, linear on independent variables

36
New cards

spurious regression problem

when x is erroneously concluded to have an effect on y,

when in fact an omitted variable, TIME, naturally causes both x and y to increase together.

37
New cards

When there exist serially correlated errors, standard OLS variances are invalid because…

They assume that the covariances of the errors across observations are zero. 

38
New cards

The number of lags in an AR or DL model is best chosen using…

An information criterion like BIC (Bayesian Information Criterion)

39
New cards

What are the appropriate ways to adjust for seasonality? (3)

  1. Include month dummies

  2. Use trigonometric functions as controls

  3. Use an annual change transformation (1st differencing)

40
New cards

Why does QLR have different test statistics to T? Is it larger or smaller than T?

The QLR test statistic is the maximum out of many test statistics. It is therefore larger than T.

41
New cards

Random walk process - definition? What happens to the variance? And what about stationarity? What type of trend is this?

When is the only time a random walk “OLS consistent”?

When the autocorrelation term (Beta) is exactly equal to one. Variance increases linearly over time. Violates stationarity and weak dependence. This is a stochastic trend (unit root)

A random walk can be OLS consistent ONLY if the error term is stationary AND the regressor and regressand are cointegrated (cointegrated = fancy way of saying they move together. e.g. short term and long term interest rates)

42
New cards

Because of lack of what in TS do we need additional/new requirements compared to OLS? And what are these requirements?

Lack of random sampling

New requirements: Weak dependence, stationarity, and strict exogeneity

43
New cards

MA model (Check TS requirements)

Moving average model

  • Satisfies both weak dependence and stationarity

  • Errors from two or more periods ago have no effect on current y

y_t = beta·e_(t-1) + e_t

44
New cards

Assumption we need to make to establish stationarity?

u is independent of the independent variables: cov(u_t, x_t) = 0

45
New cards

AR(q) model assumption for weak dependence?

|Beta| must be less than 1.

46
New cards

Derive the mean and variance of a stable AR model (What must we assume? What abt the autocorrelation?)

(ASSUME STATIONARITY. As it is autoregressive, there exists autocorrelation)

By stationarity, E(yt) = E(yt-1). Take expectation of both sides of the model. Then E(yt) = B0 / (1-B1)


Take variance of both sides of model. By stationarity, Var(yt) = Var(yt-1).

Var(yt) = sigma²e / (1 - B1²)
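Both results can be checked by simulation; a sketch with illustrative parameter values (b0 = 2, b1 = 0.5, sigma_e = 1, so the theory gives mean = 4 and variance = 4/3):

```python
import numpy as np

rng = np.random.default_rng(2)
b0, b1, sigma_e = 2.0, 0.5, 1.0
T = 200_000
e = sigma_e * rng.normal(size=T)

# Simulate the stable AR(1): y_t = b0 + b1*y_{t-1} + e_t
y = np.empty(T)
y[0] = b0 / (1 - b1)          # start at the stationary mean
for t in range(1, T):
    y[t] = b0 + b1 * y[t - 1] + e[t]

print(y.mean())   # theory: b0/(1 - b1) = 4
print(y.var())    # theory: sigma_e²/(1 - b1²) = 4/3 ≈ 1.333
```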

47
New cards

Derive the mean and variance of a random walk AR? And with drift term?

With drift term:

  • The mean is E(yt) = alpha0·t + E(y0)

  • The variance is Var(yt) = sigma²e·t

48
New cards

Log-likelihood function (Write down the formula) for Probit/Logit

What does the MLE try to do?

The MLE chooses beta to maximise the log-likelihood function

49
New cards

CDF of Probit

Probit uses the normal distribution.

The CDF is therefore the area underneath the PDF. PDF = 1/sqrt(2pi) * exp(-z²/2)

So take the integral of PDF evaluated from -inf to z.

50
New cards

CDF of Logit

CDF = exp(z)/(1+exp(z)) - the logistic function
51
New cards

How do we test whether to include more lags in forecasting model? (1 do and 1 don’t)

In the first place, use AIC/BIC to determine the best number of lags.

If unsure, DO compute the RMSFE in the regression w/ and w/o the new lag variable. If it FALLS then yes include the lag.

DO NOT test for coefficient significance on lag (this does not necessarily tell us whether the new variable improves forecasting - pvalues are good for within-sample inference but not OOS forecasting)

52
New cards

forecast interval assumptions and formula

assume:

u is normally distributed

if using transformations (like log), the forecasted change is small

formula:

y-hatT+1|T ± 1.96(RMSFE-hat)

53
New cards

Is it ok if coefficients are biased in forecasting

Yes because we just care whether the forecasting is correct

54
New cards

How do I obtain the effect of x on y if using PROBIT/LOGIT?

If x is a discrete regressor, then simply plug in values into the cdf. For example if we want to measure the effect of going from x=3 to x=4,

CDF(B0 + B1(4)) - CDF(B0 + B1(3))

If x is a continuous regressor, then we find the marginal effect = G'(B0 + B1x)·B1, i.e. take the derivative of the CDF and evaluate at x=3. In the probit case G' is the Normal PDF. In the logit case PDF = CDF(1-CDF)
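A stdlib sketch of both cases for a probit, with made-up coefficients b0 = 0, b1 = 1 (not from any real fit):

```python
import math

def norm_pdf(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

b0, b1 = 0.0, 1.0   # illustrative coefficients

# Discrete regressor: effect of moving from x=3 to x=4
discrete_effect = norm_cdf(b0 + b1 * 4) - norm_cdf(b0 + b1 * 3)

# Continuous regressor: marginal effect PDF(b0 + b1*x)*b1 evaluated at x=3
marginal_effect = norm_pdf(b0 + b1 * 3) * b1

print(discrete_effect, marginal_effect)
```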

55
New cards

Logit pdf

PDF = CDF(1-CDF)

Where CDF = exp(.)/(1+exp(.))

56
New cards

What is the key assumption to probit

We are using the normal distribution, specifically the standard normal (mean = 0, variance = 1)

We need to standardise normal distributions that are nonstandard, by dividing by sigma

57
New cards

Partial effect at the average (Formula?)

Average partial effect (Formula?)

PEA = pdf(BTx)*Bj

APE = (1/n)·Σᵢ pdf(BᵀXᵢ)·Bj (the partial effect averaged over all observations)

58
New cards

how do we know whether the data fits well for probit/logit/tobit

use Pseudo R2 = 1- [ln(LUR)/lnLR]

Where L is the likelihood (so ln L is the log-likelihood)

59
New cards

Log-likelihood function (Write down the formula) for Tobit?

How is the function different if it is not ‘above’ censored but instead ‘below’ censored?

Note: The ENTIRETY of ci - BtX should be divided by sigma (error in the slides)

If it is below censored, then the sign obviously flips, but also no need to do 1- in the first bit.

60
New cards

What does beta measure in Tobit?

Beta measures the effect of xj on the latent variable y*, not on the observed y

61
New cards

Tobit conditional partial effect (what is it? Formula?)

Conditional partial effect is E(y|x, y>0)

62
New cards

Inverse mills ratio formula

PDF/CDF

63
New cards

Tobit unconditional partial effect (What is it similar to? Definition? Formula?)

Unconditional partial effect is E(y|x).

The formula is similar to the probit/logit continuous PEA. Except note that we use cdf instead of pdf. and we must standardise the value.


“Unconditional” because it is NOT conditional on y being greater or less than the censor threshold c

64
New cards

how do we know the autocorrelation of an AR model?

how do we know if an AR model displays weak dependence?

The coefficient on the lag is the sample autocorrelation

If the AR model has |B| < 1 → weak dependence

If it is a random walk (B = 1) or |B| > 1 → no weak dependence

65
New cards

percentage change in time series formula

100Δln(Yt)

= 100(ln(Yt) - ln(Yt-1))

this can be shown to be approximately equal to 100 × (the ordinary percentage change in Yt):

= 100(ln(Yt/Yt-1))

= 100(ln(1+(Yt - Yt-1)/Yt-1))

approx equal to 100(Yt - Yt-1)/Yt-1)

Holds for small changes in Yt
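The approximation is easy to check numerically (the values below are chosen for illustration):

```python
import math

# 100*ln(Y_t/Y_{t-1}) approximates the percentage change for small changes
y_prev, y_now = 100.0, 102.0

log_pct = 100.0 * (math.log(y_now) - math.log(y_prev))
exact_pct = 100.0 * (y_now - y_prev) / y_prev

print(log_pct, exact_pct)  # ≈ 1.98 vs 2.00: close for a small change
```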

66
New cards

Autocorrelation formula

ρ_h = Cov(Yt, Yt-h) / sqrt(Var(Yt)·Var(Yt-h))
67
New cards

General formula for correlation

This is how the correlation coeff, r, is calculated

68
New cards

Sample autocorrelation formula

the denominator is simply var(yt) because we have assumed var(yt) and var(yt-1) are the same under stationarity.

69
New cards

Textbook definition of stationarity

Historical relationships can be generalised to the future because the distribution of Yt is same over time.

Formally, the joint distribution of (Yt+1, Yt+2, …, Yt+k) does not depend on t

70
New cards

What is the formula for forecast error? And hence the formula for MSFE? RMSFE?

Forecast error = YT+1 - Y-hatT+1|T

MSFE = E[(forecast error)²]

RMSFE = root of MSFE

71
New cards

How is the QLR test statistic found? Explain in detail

For every time period between [.15T, .85T] = [t1, t2],

  1. Create a regression equation for each break date you would like to test (every period in the data). With a dummy variable =1 if t>break date, =0 if not.

  2. yt = b0 + b1yt-1 + a0t + a1t*yt-1 + u

  3. The null hypothesis is H0: alpha0 = alpha1 = 0. The alternative hypothesis is at least one of them is not equal zero. Repeat for every possible time, t.

  4. Compute the joint f-statistic, and find the maximum one.

note: the QLR cannot be performed on the last 15% of data, but we can still check whether there is a structural break in the last 15% by comparing the POOS forecasts with the actual data
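The loop above can be sketched in numpy on simulated data with a deliberate intercept break mid-sample (all parameter values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
T = 300
y = np.empty(T)
y[0] = 0.0
for t in range(1, T):
    b0 = 0.0 if t < 150 else 2.0          # intercept break at t = 150
    y[t] = b0 + 0.5 * y[t - 1] + rng.normal()

def ssr(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return resid @ resid

Y, Ylag = y[1:], y[:-1]
n = len(Y)
ones = np.ones(n)
ssr_r = ssr(np.column_stack([ones, Ylag]), Y)   # restricted: no break terms

q = 2                                           # restrictions: a0 = a1 = 0
f_stats = []
for tau in range(int(0.15 * n), int(0.85 * n)):
    d = (np.arange(n) > tau).astype(float)      # dummy: 1 after candidate break
    X_u = np.column_stack([ones, Ylag, d, d * Ylag])
    ssr_u = ssr(X_u, Y)
    df = n - X_u.shape[1]
    f_stats.append(((ssr_r - ssr_u) / q) / (ssr_u / df))

qlr = max(f_stats)
print(qlr)  # large here, since the series really does break mid-sample
```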

72
New cards

What are the three ways to calculate RMSFE?

Sample error method

Final prediction error method

Pseudo out-of-sample method

73
New cards

POOS method (Steps?) and hence write the formula for MSFE under POOS

  1. Choose a number of out-of-sample observations, P; the remaining s = T - P will be your in-sample observations (P is usually the last 10-20% of the sample)

  2. Estimate the regression on the first s observations.

  3. Calculate Y-hat for 1 period ahead of s, and find the forecast error.

  4. Roll forward one period at a time, repeating for every period in P, until you have P forecast errors
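The steps above can be sketched in numpy for an AR(1), using made-up data and the last 10% as the hold-out:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 400
y = np.empty(T)
y[0] = 0.0
for t in range(1, T):
    y[t] = 1.0 + 0.5 * y[t - 1] + rng.normal()

P = 40                          # hold out the last 10% as pseudo out-of-sample
errors = []
for s in range(T - P, T):
    # Re-estimate the AR(1) on data up to s, then forecast one step ahead
    Y, Ylag = y[1:s], y[:s - 1]
    X = np.column_stack([np.ones(len(Y)), Ylag])
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    y_hat = b[0] + b[1] * y[s - 1]
    errors.append(y[s] - y_hat)

rmsfe = np.sqrt(np.mean(np.square(errors)))
print(rmsfe)  # close to the error std dev (1.0) when the model is right
```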

74
New cards

Sample error method for finding MSFE

ASSUME: Stationarity

  • Write out the MSFE formula and expand for both

  • We have the oracle forecast variance and estimation error

75
New cards

Final prediction method for finding MSFE

Assume: Stationarity, homoskedasticity of errors

T = observations

p = number of lags

77
New cards

deterministic vs stochastic

stationairty implies…?

deterministic: the trend follows a fixed, predictable pattern. If the series is stationary once this trend is removed (trend-stationary), the trend is deterministic.

stochastic: the trend appears to be random

78
New cards

Dickey-Fuller test (What is it for? How do we do it?)

It determines whether an AR model has a stochastic trend or not (unit root or not)

H0: B1 = 1 HA: B1 < 1

steps:

  1. let a1 = B1-1, regress deltaY on Yt-1

  2. calculate the Dickey-Fuller statistic a1/SE(a1)

  3. compare to critical values

note: we DO NOT use standard t critical values - under H0 the statistic has a nonstandard (Dickey-Fuller) distribution, so we must compare to the DF critical values
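The two computational steps can be sketched in numpy on a simulated stationary AR(1), where the test should reject the unit root (data and parameters invented for illustration):

```python
import numpy as np

# Simulate a stationary AR(1) (b1 = 0.5, so there is no unit root)
rng = np.random.default_rng(5)
T = 500
y = np.empty(T)
y[0] = 0.0
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + rng.normal()

# Step 1: regress delta-y on y_{t-1} (with intercept); a1 = B1 - 1
dy, ylag = np.diff(y), y[:-1]
X = np.column_stack([np.ones(len(dy)), ylag])
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ dy
resid = dy - X @ b
sigma2 = resid @ resid / (len(dy) - 2)
se = np.sqrt(np.diag(sigma2 * XtX_inv))

# Step 2: the Dickey-Fuller statistic a1-hat / SE(a1-hat)
df_stat = b[1] / se[1]
print(df_stat)

# Step 3: compare to DF critical values (5%, with intercept: about -2.86),
# NOT to normal/t critical values
```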

79
New cards

How do we check for structural breaks?

Use the QLR test for the centre 70%. For the last 15%, the POOS observations can be used to generate a forecast. Plot this against the true values and see if they deviate.

QLR test steps:

  1. Create a regression equation for each break date you would like to test (every period in the data). With a dummy variable =1 if t>break date, =0 if not.

  2. yt = b0 + b1yt-1 + a0t + a1t*yt-1 + u

  3. The null hypothesis is H0: alpha0 = alpha1 = 0. The alternative hypothesis is at least one of them is not equal zero. Repeat for every possible time, t.

  4. Compute the joint f-statistic, and find the maximum one.

80
New cards

What is meant by “dynamic causal effect”

In a distributed lag model, a particular Xt will have effects that span not only for Yt but also Yt+1, Yt+2, and so on…

The dynamic causal effect is the effect of Xt on all Yt, Yt+1, Yt+2, …

i.e. the sequence B1, B2, B3, …

81
New cards

Anytime a DL model includes a human element, it is likely that…

Strict exogeneity may be violated.

Because humans are forward-thinking. Changing future value of Y would likely influence a different pattern for X

82
New cards

Autocorrelation is a problem for the …

It does not affect…

But, autocorrelation won’t be a problem if we assume…

STANDARD ERRORS in OLS. This leads to problems for inference. That’s why we use HAC SEs.

It does not affect the accuracy/consistency/biasedness of the coefficients

If we assume the errors are serially uncorrelated, then autocorrelation is not a problem for the standard errors. BUT… this is often a terrible assumption, which is why we use HAC SEs.

83
New cards

How do we find the MSPE (steps)

The mean-squared prediction error

  • Use M-fold Cross Prediction

This is a similar process to what we did for finding MSFE.

STEPS:

  1. Determine m, the number of folds. Divide your sample N by m as equal as possible.

  2. Using all but subsample m=1, estimate OLS and find the MSPE-hat for m=1

  3. Repeat for m=2 (Do include m=1 here) and so on, until all the m has been done

  4. Calculate the MSPE: the average of the squared prediction errors pooled across all m folds
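A numpy sketch of m-fold cross prediction for a simple linear model (the fold count and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 600, 5                       # n observations, m folds
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

idx = np.arange(n)
folds = np.array_split(idx, m)      # divide the sample as equally as possible

sq_errors = []
for hold_out in folds:
    # Estimate OLS on everything except the held-out fold
    train = np.setdiff1d(idx, hold_out)
    X_tr = np.column_stack([np.ones(len(train)), x[train]])
    b, *_ = np.linalg.lstsq(X_tr, y[train], rcond=None)
    # Predict the held-out fold and keep the squared errors
    y_hat = b[0] + b[1] * x[hold_out]
    sq_errors.append((y[hold_out] - y_hat) ** 2)

mspe = np.mean(np.concatenate(sq_errors))
print(mspe)  # close to the error variance (1.0) when the model is right
```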

84
New cards

MSPE. What are the two ways to estimate it. Report the formula for the more ‘strict’ way to estimate it.

You can use m-fold cross prediction or use MSPE_OLS. MSPE_OLS is more strict because HOMOSKEDASTICITY IS REQUIRED.

where k is the number of regressors…

85
New cards

OVB bias direction intuition?

Bias direction = sign of corr w/ x * sign of effect on y

86
New cards

checks for internal/external validity (5)

  1. OVB

  2. Linear model misspecification

  3. Errors-in-variables bias (measurement error)

  4. Sample selection bias

  5. Simultaneous causality bias

87
New cards

OLS estimates (b1 and b0) for binary explanatory variable

B-hat1 = ybar1 - ybar0

B-hat0 = ybar0

88
New cards

average treatment effect

ATE = ybar(1) - ybar(0)

However, this is not observed: it would require knowing what would happen to EVERY individual if they were somehow both in the control and the treatment.

Instead, we only observe y0-bar and y1-bar. y0-bar is not the same as ybar(0), because it is the mean of only the observations assigned 0.

89
New cards

Sample variance of u formula (under OLS)

sigma-hat² = SSR/(n - k - 1) = Σ(u-hat)²/(n - k - 1)
90
New cards

If both the regressor and regressand are in the log-form, how do we interpret the coefficient on the regressor?

e.g. log(y) = b0 + b1(logx)

A 1% increase in x is associated with a b1% increase in y.

91
New cards

Variance of B1-hat for MLR

The denominator is also equiv. to sum(ri1-hat)²

92
New cards

Derive the proof of OVB

True model: y = B0 + B1x1 + B2x2 + u, but we regress y on x1 only.

Let d1 be the slope from regressing x2 on x1. Substituting the true model into the short regression gives

E(B1-tilde) = B1 + B2·d1

so the bias is B2·d1 = (effect of the omitted variable on y) × (its relationship with x1)
93
New cards

SER/RMSE formula

SER means standard error for the regression.

= sqrt(SSR/(n - df))

NOTE: The red highlighted should be n-df

ALSO: SSR=sum(u-hat)²

94
New cards

HAC / Newey West standard errors

Heteroskedasticity- and Autocorrelation-Consistent standard errors: they adjust the OLS standard errors so that inference remains valid when the errors are serially correlated and/or heteroskedastic. Newey-West is the most common HAC estimator - it adds weighted sample autocovariances of the errors up to a chosen truncation lag.

95
New cards

F-statistic formula (there are three forms)

where q = number of restrictions and df = n - k - 1

R² form of the formula:

F = [(R²UR - R²R)/q] / [(1 - R²UR)/df]

and finally the joint significance of an unrestricted model can be tested if the R² is known. But the test must be on ALL the included variables.

F = (R²/q) / [(1 - R²)/df]
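The R² form can be wrapped in a one-line helper (the numbers below are made up for illustration):

```python
def f_stat(r2_ur, r2_r, q, df):
    """F = [(R2_UR - R2_R)/q] / [(1 - R2_UR)/df]."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / df)

# e.g. dropping 2 regressors lowers R² from 0.50 to 0.40, with df = 50
print(f_stat(0.50, 0.40, q=2, df=50))   # ≈ 5.0
```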

96
New cards

consistency formula (how do we know if b1-hat is consistent)

show that plim(b1-hat) = b1

97
New cards

how to prove the asymptotic normal distribution?

  1. show that the estimator is unbiased

  2. show that the variance of the estimator is = …

  3. then form (estimator - true value) / standard deviation: the numerator is centred at zero (by 1) and the denominator is the standard deviation from 2, so the expression converges asymptotically to the normal dist (0,1)

98
New cards

What is the overidentification test for?
Write the steps.

It is to check the validity of an instrument, namely, that the IVs are exogenous

REQUIRED: more instruments than endogenous variables

H0: all IVs are exogenous

HA: H0 is false

  1. Regress ui-hat (from 2nd stage of TSLS) on all instruments and the known exogenous variables (z1, z2, …, x2, x3, …)

  2. Look at the generated R²

  3. The test statistic is nR² ~ χ² with df = (number of instruments - number of endogenous regressors). Reject H0 if it has a low p value

99
New cards

IV Estimator (TSLS) steps

regress x on z and all other exogenous variables

regress y on x-hat and all other exogenous variables

the b1 obtained from the 2nd stage regression is the iv estimator
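A numpy sketch of the two stages on simulated data with a deliberately endogenous x (the data-generating process is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000
u = rng.normal(size=n)
z = rng.normal(size=n)                        # instrument
x = 0.8 * z + 0.5 * u + rng.normal(size=n)    # endogenous regressor
y = 1.0 + 2.0 * x + u                         # true b1 = 2

def fit(X, y):
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b

ones = np.ones(n)
# Stage 1: regress x on z, keep the fitted values x-hat
g = fit(np.column_stack([ones, z]), x)
x_hat = g[0] + g[1] * z
# Stage 2: regress y on x-hat; its slope is the IV estimator
b_iv = fit(np.column_stack([ones, x_hat]), y)[1]
# Naive OLS for comparison
b_ols = fit(np.column_stack([ones, x]), y)[1]
print(b_ols, b_iv)  # OLS is biased upward here; TSLS recovers ~2
```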

100
New cards

relevance test

for IV estimation: check whether the instrument is relevant (actually correlated with the endogenous variable) or weak

  1. run 1st stage of TSLS (regress x on z and other exog vars)

  2. test the coefficient on z using t-test