random variable
a numerical summary of a random outcome
outcome
the mutually exclusive result of a random process
variable
a measurable characteristic of a population
sample space
the set of all possible outcomes of a random process
event
a subset of the sample space
estimator
a function of the sample data used to infer the estimand
estimand
the true value in the observable population which is to be estimated
target/structural parameter
the specific unknown population parameter that is to be estimated
central limit theorem
when N is sufficiently large, the sampling distribution of the estimated mean approaches a normal distribution, whatever the population's own distribution. Its variance (σ2/N) therefore shrinks predictably as N grows.
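The cards reference no code, but a minimal pure-Python simulation (made-up uniform data; the sample size N and repetition count are arbitrary) illustrates the theorem:

```python
import random
import statistics

# Draw many samples from a non-normal (uniform) population and record each
# sample mean. The distribution of those means is approximately normal,
# centred on the population mean 0.5, with standard deviation sigma/sqrt(N).
random.seed(0)
N = 100       # observations per sample
reps = 2000   # number of samples
means = [statistics.fmean(random.random() for _ in range(N)) for _ in range(reps)]

print(round(statistics.fmean(means), 2))   # close to the population mean 0.5
# The spread of the means is close to sqrt(1/12)/sqrt(100), i.e. about 0.029:
print(0.02 < statistics.stdev(means) < 0.04)
```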
Gauss-Markov OLS Assumptions
Exogeneity
No multicollinearity
Linear relationship between dependent var & independent var
Homoskedasticity
No autocorrelation
Independent & normally distributed error term
properties of OLS under Gauss-Markov assumptions
B-L-U-E
Best linear unbiased estimator
exogeneity
the x-variables and the error term are not correlated: E(εi | X) = 0. Therefore, neither the error term nor the dependent variable (Y) influence the explanatory variables (X) since they are determined outside of the model.
multicollinearity
correlation between ≥2 explanatory variables, violating the Gauss-Markov assumptions, often because they measure a similar trait
homoskedasticity
the variance (σ2) of the error term (ε) is constant throughout the sample. Therefore, the dispersion of residuals is similar for all X. This can be visually detected through a rectangle-shaped mass of residuals in a scatter plot of residuals (the absence of change in the residuals as X changes)
heteroskedasticity
The variance (σ2) of the error term (ε) is not constant throughout the sample. Therefore, the dispersion of residuals is dissimilar for all X. This violates Gauss-Markov assumptions of OLS, meaning that the standard errors are no longer efficient; however, the sample is still unbiased. In terms of BLUE, the Heteroskedasticity sample is no longer efficient (E). This can be visually detected through a conical or trumpet-shaped pattern in a scatter plot of the residuals.
autocorrelation / serial correlation
correlation between the error term and its own past values. This violation of OLS assumptions often occurs in time series data.
endogeneity
correlation between explanatory vars and the error term, violating OLS assumptions: E(εi | X) ≠ 0. This creates a bilateral causal relationship between the X and Y variables. It is best detected through the Hausman test, run in Stata with the command hausman.
residuals
the differences between observed (actual) values and the estimated values predicted by the model. This is shown in SSR
grounds to reject the null hypothesis (Ho) and propose the alternative hypothesis (Ha) / statistical significance
p-value < significance level (α)
insufficient grounds to reject the null hypothesis / statistical insignificance
p-value > significance level (α)
p-value
the probability of observing a z-stat, t-stat, F-stat, etc. with an absolute value ≥ the observed results (more extreme than the observed results)
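Since the standard normal CDF is available through math.erf, a short sketch (1.96 is the familiar two-sided 5% critical value) shows how a p-value is computed from a z-statistic:

```python
import math

# Two-sided p-value for an observed z-statistic:
# p = 2 * (1 - Phi(|z|)), where Phi(z) = (1 + erf(z / sqrt(2))) / 2
# is the standard normal CDF.
def p_value(z):
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - phi)

print(round(p_value(1.96), 2))   # 0.05: borderline at the 5% level
print(p_value(3.0) < 0.01)       # True: strong evidence against Ho
```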
Triple S method of analysing variable coefficients
sign, size & significance of a variable's estimated coefficient
dummy variable
a numerical var expressed as 0 or 1 to represent categorical data, often gender, race, union membership, etc.
elasticity
the % change in the dependent var resulting from a 1% change in the independent var
linear-linear model (Y = f[X])
ΔY = β × ΔX
linear-log model (Y=f[logX])
ΔY = (β/100) × %ΔX
log-linear model (logY = f[X])
%ΔY = (100 × β) × ΔX
log-log model (logY = f[logX])
%ΔY = β × %ΔX
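A hypothetical arithmetic check of the log-log case (the exponent 1.5 is made up): if Y = X^1.5, the log-log slope is the elasticity, so a 1% rise in X raises Y by about 1.5%.

```python
import math

# If Y = X ** 1.5, then log Y = 1.5 * log X: the log-log slope is 1.5.
x1, x2 = 100.0, 101.0              # a 1% increase in X
y1, y2 = x1 ** 1.5, x2 ** 1.5
slope = (math.log(y2) - math.log(y1)) / (math.log(x2) - math.log(x1))
pct_change_y = (y2 - y1) / y1 * 100

print(round(slope, 3))             # 1.5, the elasticity
print(round(pct_change_y, 2))      # about 1.5 (% change in Y per 1% change in X)
```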
internal validity
a regression that successfully yields inferences applicable to the chosen population
external validity
a regression whose inferences made from a sample can also be applied to other populations
Variance Inflation Factor (VIF)
a method of identifying multicollinearity by quantifying how much correlation between predictor variables inflates the variance of a regression coefficient. For each predictor j, VIFj = 1/(1 − Rj²), where Rj² comes from regressing predictor j on the other predictors; multicollinearity is generally of concern if VIF > 10. To run this in Stata, use the command vif.
Breusch-Pagan test for heteroskedasticity
Ho: constant variance/homoskedasticity (σ1 = σ2, etc.)
Ha: non-constant variance/heteroskedasticity (σ1 ≠ σ2, etc.)
To run this test in Stata, use the command estat hettest
robust standard errors
standard errors adjusted for heteroskedasticity. Add the Stata option , robust after the last independent variable in a regression command line
standard errors
= standard deviation / square root(# of observations), i.e. σ/√N
Type I neoclassical measurement error
the error is uncorrelated with the true value of the variable (eg: independent inaccuracies in reporting one’s weight)
Type II neoclassical measurement error
the error is correlated with the true-value or with other variables (eg: many observations - often self-reported - intentionally misrepresent a characteristic like income, education)
conditions for instrumented regression
relevance: the instrument must correlate with the problematic endogenous variable.
exclusion restriction: the instrument only affects the outcome through the endogenous x-variable
rule of thumb for weak instrument identification
an instrument is considered weak if the first-stage F-statistic for its significance is < 10
Use the Stata command estat firststage
Hausman test
A test to determine if the estimator is consistent & efficient (adheres to BLUE)
Ho: the regressor is exogenous (E(Xiεi) = 0)
Ha: the regressor is endogenous (E(Xiεi) ≠ 0)
linear probability model
an OLS model with a binary (limited) dependent variable, used to estimate the probability of an outcome. These models can suffer from issues like predicted probabilities outside the 0-1 range and heteroskedasticity.
probit model
models the probability of an event's occurrence as the standard normal cumulative distribution function of a linear index of the independent variables; its coefficients are interpreted as z-score changes in the probit index. Use the Stata command probit.
logit model
models the log-odds of an event's occurrence. This model follows the logistic distribution, so its coefficients are interpreted in terms of the "odds" of an event happening. Use the Stata command logit
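A toy comparison (made-up coefficients b0, b1) of why the logistic link avoids the LPM's out-of-range predictions:

```python
import math

b0, b1 = -2.0, 0.5   # made-up coefficients

def lpm(x):
    # Linear probability model: the fitted value can leave [0, 1].
    return b0 + b1 * x

def logit_prob(x):
    # Logistic CDF of the linear index: always strictly inside (0, 1).
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

print(lpm(20))                  # 8.0 -- an impossible "probability"
print(0 < logit_prob(20) < 1)   # True: the logit prediction stays bounded
```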
latent variable
a variable that cannot be observed, but can be inferred from other observable variables (eg: intelligence as measured through a test score). This type of variable often appears in logit or probit models
maximum likelihood estimation
estimating the parameters of an assumed probability distribution by maximising a likelihood function, so that the observed data are most probable under the assumed statistical model.
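A minimal grid-search sketch of the idea for a Bernoulli probability p (the 0/1 data are made up); the likelihood peaks at the sample proportion:

```python
import math

data = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]   # made-up data: k = 7 of n = 10
k, n = sum(data), len(data)

def log_lik(p):
    # Log-likelihood of k successes in n Bernoulli(p) trials.
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Search a grid of candidate p values for the likelihood maximiser.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_lik)
print(p_hat)   # 0.7: the MLE equals the sample proportion k / n
```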
Stata margins command
computes marginal effects after an estimation command
multinomial regressions
regressions for categorical data with no order/ranking that are calculated according to maximum likelihood estimation. This method assumes the independence of irrelevant alternatives and predicts the log odds of an observation being classified as a respective category.
cross-sectional data
data that provides a ‘snapshot’ of multiple observations at a given point in time (time is constant)
time-series data
data for only one variable collected at successive, recurring intervals to capture change over time
panel data
a combination of cross-sectional data and time series data
Chow Test for structural change
A statistical test to determine if the coefficients in two separate regression models are equal, often used in DiD regressions to examine changes between two groups and/or changes before/after an intervention.
Ho: the coefficients are equal across the two groups/periods (no structural break exists)
Ha: the coefficients differ (a structural break exists)
Difference-in-Differences
a causal estimation method using control and treatment groups to compare trends pre- and post-intervention. It addresses biases from pre-existing differences between the two groups and from time trends that would have occurred regardless of the intervention.
This method assumes:
parallel trends
exogeneity
conditional independence
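With made-up group means, the estimator reduces to simple arithmetic:

```python
# Mean outcomes for each group and period (hypothetical numbers).
treat_pre, treat_post = 10.0, 16.0
control_pre, control_post = 9.0, 11.0

# DiD nets out the common time trend (captured by the control group).
did = (treat_post - treat_pre) - (control_post - control_pre)
print(did)   # 4.0: the estimated treatment effect
```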
balanced panel data
panel data with an equal number of observations in each cross-section and time period
unbalanced panel data
panel data with an unequal number of observations in each cross-section and time period
fixed effects (FE) Model
a model for panel data that treats each entity's unobserved characteristics as fixed over time and allows them to be correlated with the explanatory variables
random effects (RE) model
a model for panel data that assumes each entity's unmeasured characteristics are random and uncorrelated with the explanatory variables
Stata command xtset
declares the entity and time variables so panel data commands run properly
error term (ε)
the unexplained portion of a dependent variable's variance that's not accounted for by the independent variables in a model
conditions for proper instrumented regression
instrument correlates with the endogenous variable
the instrument does not correlate with the error term
If these conditions are met, the instrument affects the outcome only through the endogenous variable
1st stage instrumentation
regress the endogenous variable on the instrument(s) and the exogenous controls to obtain fitted values
2nd stage instrumentation
regress the outcome on the 1st-stage fitted values in place of the endogenous variable
methods of addressing Heteroskedasticity
drop the hetsked variable
cluster the standard error
subdivide the hetsked variable into several new variables based on common traits and their residuals
use the log form of the hetsked variable
use hetsked-robust standard errors (Stata option , robust at the end of a regression command)
taking the log of an explanatory variable (ln[X])
a strategy to address heteroskedasticity in an explanatory variable by stabilising its variance (σ2). If this is done, the variable's coefficient must be interpreted in % change terms
F-test / joint F-test
A statistical test assessing the joint significance of several coefficients to determine whether their variables should be included in a final regression model. This test can also compare the goodness-of-fit of several different suggested models. To run it in Stata, use the command test x1 x2 after a regression output.
Ho: B2 = B3 etc. = 0 (the included variables are not significantly different to 0 and therefore do not statistically explain change in Y)
Ha: B2 ≠ B3 etc. ≠ 0 (the included variables are significantly different from 0 and therefore do statistically explain change in Y)
order condition for instrumented regression
there are at least as many instruments as endogenous variables; with exactly one instrument per endogenous variable, the model is just-identified
over-identification
there exist more instruments than endogenous variables
under-identification
there are insufficient instruments compared to endogenous variables
2-stage least squares (2SLS)
an estimation method for instrumented regression: the endogenous variable is first regressed on the instrument(s), then its fitted values replace it in the structural equation. Use the Stata command ivregress 2sls
long & narrow panel data
panel data w/ long time dimension & narrow range of subjects
short & wide panel data
panel data w/ short time dimension & wide range of subjects
long & wide panel data
panel data w/ long time dimension & wide range of subjects
short & narrow panel data
panel data w/ short time dimension & narrow range of subjects
heterogeneity bias
bias resulting from the omission of the unobserved fixed effect