P-value
The closer the p-value is to 0, the stronger the evidence that beta is not equal to 0 (i.e., against the null Ho: beta = 0)
R²
The amount of variance in Y that's explained by the variance in X
Homoskedasticity
Constant variance of errors across observations.
Serial Correlation
Error terms are correlated with each other across observations (e.g., over time)
Conditional Mean Independence
Conditional mean of the error term, given the regressors, does not depend on x1 (controlling for x2, the effect of x1 can be estimated as if x1 were randomly assigned)
F-test (SSR)
((SSRr - SSRu)/q) / (SSRu/(n-k-1))
F test R2
((R2u-R2r)/q)/((1-R2u)/(n-k-1))
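The two F-test formulas above are algebraically equivalent. A minimal sketch checking this numerically (the function names and example numbers are mine, for illustration):

```python
import numpy as np

def f_from_ssr(ssr_r, ssr_u, q, n, k):
    """F-statistic from restricted/unrestricted SSR; q restrictions, k regressors."""
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))

def f_from_r2(r2_u, r2_r, q, n, k):
    """The same F-statistic computed from each model's R-squared."""
    return ((r2_u - r2_r) / q) / ((1 - r2_u) / (n - k - 1))

# They agree because SSR = TSS * (1 - R^2) with a common TSS.
tss = 100.0
r2_u, r2_r = 0.60, 0.45
ssr_u, ssr_r = tss * (1 - r2_u), tss * (1 - r2_r)
n, k, q = 50, 3, 2
print(np.isclose(f_from_ssr(ssr_r, ssr_u, q, n, k),
                 f_from_r2(r2_u, r2_r, q, n, k)))  # True: same F either way
```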
Elasticity
Percentage change in the dependent variable resulting from a 1% change in the independent variable
Consistency
As n goes to infinity, estimator approaches the true value
Root MSE
Biased estimator of the standard deviation of the population error term
Population of Interest
Population to which results are inferred
Internal Validity
Inferences are valid for the population studied
External Validity
Generalizability of results to other populations.
Internal validity requirements
Estimator is unbiased and consistent, the distribution of test statistics is correct, and hypothesis tests have desired significance
Simultaneous Causality
X causes Y and Y causes X
Panel Data
Data collected over time for the same subjects.
Exogeneity
Variable uncorrelated with error term in model.
Endogeneity
Variable correlated with error term
IV Regression
Instrumental Variable Regression; solves OVB, measurement error, and simultaneous causality
Instrument Validity
Cov(z,x) != 0 (relevance)
Cov(z,u) = 0 (independence)
Z isn't part of the model (Z doesn't explain y)
J Test
Can only be done under overidentification!! (m > k) Regress the TSLS residuals on the Zs and Ws, then do an F-test on the joint significance of the Zs.
J test distribution/statistic
J = mF ~ chi-squared(m - k)
Maximum Likelihood
Estimation method maximizing probability of observed data.
Hausman Test
Ho: Xi exogenous, H1: Xi endogenous, see if t-test is significant
Pseudo R²
Alternative measure of fit for non-linear models.
Population studied
Population that was sampled
ESS
Sum (Yhat-Ybar)^2
TSS
Sum(Yi-Ybar)^2
SSR
Sum uhati^2
Efficiency
Smallest variance
SER
sqrt(SSR/(n-(k+1)))
R^2
ESS/TSS = 1 - SSR/TSS
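The sums of squares above satisfy TSS = ESS + SSR whenever the regression includes an intercept, which is why the two R² formulas agree. A small sketch on simulated data (variable names and numbers are mine):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
X = rng.normal(size=(n, k))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

# OLS with an intercept via least squares
Xc = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
yhat = Xc @ beta
uhat = y - yhat

tss = np.sum((y - y.mean()) ** 2)       # Sum(Yi - Ybar)^2
ess = np.sum((yhat - y.mean()) ** 2)    # Sum(Yhat - Ybar)^2
ssr = np.sum(uhat ** 2)                 # Sum uhat^2

r2 = ess / tss
ser = np.sqrt(ssr / (n - (k + 1)))      # standard error of the regression

print(np.isclose(tss, ess + ssr))       # True: TSS = ESS + SSR
print(np.isclose(r2, 1 - ssr / tss))    # True: the two R^2 formulas agree
```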
LSA 1
E(ui|Xi) = 0
LSA 2
(Xi, Yi) iid
LSA 3
Large outliers unlikely (finite 4th moments)
LSA 4
Var(ui | Xi) = σ² (constant), homoskedasticity
LSA 5
Cov(ui, uj | Xi, Xj) = 0 for i ≠ j, no serial correlation (error terms are uncorrelated with each other across observations)
Type 1 error
Reject a true null
Type 2 error
Fail to reject a false null
var(ax)
a^2 var(x)
Adjusted R2
1 - ((n-1)/(n-k-1)) * (SSR/TSS)
Adjust R2 alternate formula
1- (s^2u/s^2y)
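The two adjusted-R² formulas are identical: substituting s²_uhat = SSR/(n-k-1) and s²_Y = TSS/(n-1) into the alternate form recovers the first. A quick numeric check (the fit statistics are hypothetical):

```python
# Hypothetical fit statistics, for illustration only
n, k = 60, 3
tss, ssr = 500.0, 180.0

# First form: 1 - ((n-1)/(n-k-1)) * (SSR/TSS)
adj_r2 = 1 - ((n - 1) / (n - k - 1)) * (ssr / tss)

# Alternate form: 1 - s^2_uhat / s^2_Y, using the
# degrees-of-freedom-adjusted variance estimates
s2_u = ssr / (n - k - 1)
s2_y = tss / (n - 1)
alt = 1 - s2_u / s2_y

print(abs(adj_r2 - alt) < 1e-12)  # True: algebraically the same quantity
```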
Cov(X+c, Y)
Cov(X,Y)
Cov(X+Y,Z)
Cov (X,Z) + Cov (Y,Z)
Problem with OVB
Violates LSA 1, leads to biased and inconsistent results
Positive bias
β > 0 & Cov(x1, x2) > 0 (effect and correlation in the same direction)
Negative Bias
β > 0 & Cov(x1, x2) < 0 (effect and correlation in opposite directions)
OVB formula
Bhat1 -> B1 + corr(x1,u) * (Su/Sx) (population correlation times the ratio of the std devs of u and x1)
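The OVB formula can be verified by simulation: omit a relevant, correlated regressor and the short-regression slope converges to B1 plus the bias term. A sketch with made-up coefficients (all names and numbers are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
b1, b2 = 1.0, 2.0

# x1 and x2 positively correlated, b2 > 0  ->  positive bias
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = b1 * x1 + b2 * x2 + rng.normal(size=n)

# Short regression omits x2; its composite error is u = b2*x2 + e
u = y - b1 * x1
X = np.column_stack([np.ones(n), x1])
b_hat = np.linalg.lstsq(X, y, rcond=None)[0][1]

rho = np.corrcoef(x1, u)[0, 1]
predicted = b1 + rho * u.std() / x1.std()  # OVB formula
print(np.isclose(b_hat, predicted))        # True: the formula matches exactly
```

In this design the bias equals b2*Cov(x1,x2)/Var(x1) = 2*0.5/1.25 = 0.8, so the short regression estimates roughly 1.8 instead of 1.0.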
F test distribution
As n -> infinity, qF ~ chi-squared(q); equivalently F ~ F(q, infinity)
F(a,b) in stata
F(numerator df, denominator df) = F(q, n-k-1)
What do k and q mean?
k = number of regressors without the constant, k+1 parameters
q = number of restrictions in Ho
Requirements for OVB
Corr(omitted var, included var) != 0 , and omitted var is a determinant of Y
Conditional mean independence
Conditional expectation of ui given X1i and X2i does not depend on X1i. Controlling for x2i, x1i can be treated as random
Log-lin
A 1-unit increase in X = a 100*β % change in Y
Lin-log
A 1% increase in X = a 0.01*β unit change in Y
Log-log
A 1% increase in X = a β% change in Y (elasticity)
E(rnormal())
0; rnormal() draws from the standard normal distribution (mean 0, variance 1)
Threats to internal validity
OVB, functional form misspecification, measurement error, sample selection, simultaneous causality
Other solutions than IV for OVB
Panel data deals with OVB, and RCTs can deal with OVB and simultaneous causality
Construct validity
How valid a test is according to theory
To see if an instrument is relevant
With one instrument (q = 1), look at the first-stage regression and check whether F = t² of the instrument > 10 (rule of thumb for a strong instrument)
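The F = t² relevance check can be sketched by simulating a first-stage regression and computing the instrument's t-statistic by hand (homoskedastic standard errors; all names and coefficients are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
z = rng.normal(size=n)                      # instrument
w = rng.normal(size=n)                      # included exogenous control
x = 0.4 * z + 0.3 * w + rng.normal(size=n)  # z is relevant for x

# First stage: regress x on a constant, z, and w
Xmat = np.column_stack([np.ones(n), z, w])
b = np.linalg.solve(Xmat.T @ Xmat, Xmat.T @ x)
resid = x - Xmat @ b
s2 = resid @ resid / (n - Xmat.shape[1])    # homoskedastic error variance
se = np.sqrt(s2 * np.linalg.inv(Xmat.T @ Xmat).diagonal())

t_z = b[1] / se[1]
first_stage_F = t_z ** 2                    # with one instrument, F = t^2
print(first_stage_F > 10)                   # True here: z is a strong instrument
```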
IV assumptions
E(ui | Wi) = 0
(Xi, Wi, Yi, Zi) iid
Large outliers unlikely (finite fourth moments)
Valid instruments available
Pros of linear probability model
Coefficients can be interpreted as usual (constant marginal effects), good linear approximation, unbiased & consistent
Cons of linear probability model
Probabilities aren't necessarily between (0,1), heteroskedastic, and error term can't be approximated by the normal distribution
Logit and probit errors
Homoskedastic
ML estimator
Consistent and asymptotically normally distributed
Best Linear Unbiased Estimator (BLUE)
LS is BLUE under the 5 Gauss-Markov assumptions
Asymptotically normal
Distribution of estimators approaches normality as sample size increases.
Unbiased
E(Bhat | X) = B (the expected value of the estimator equals the true parameter)
When analyzing the effect of coefficients
Remember to take the derivative, and always include CP!!
q = df true or false
true!
Unrestricted model
The model with all coefficients included (no restrictions imposed)