1/39
Vocabulary flashcards covering key terms and concepts from Lecture 4 on the simple linear regression model with one regressor.
Name | Mastery | Learn | Test | Matching | Spaced |
---|
No study sessions yet.
Econometrics
The scientific field that uses statistical methods to quantify economic relationships and test causal hypotheses.
Simple Linear Model (SLM)
Regression model Y = β0 + β1X + U with one regressor X, an intercept β0, slope β1, and error term U such that E(U|X)=0.
Independent and Identically Distributed (i.i.d.) Sample
A dataset {(Xi, Yi)} where each pair follows the same distribution as (X,Y) and observations are mutually independent.
Ordinary Least Squares (OLS)
Estimation technique that chooses β̂0 and β̂1 to minimise the sum of squared residuals.
OLS Slope Estimator (β̂1)
β̂1 = Σ(yi−ȳ)(xi−x̄) / Σ(xi−x̄)².
OLS Intercept Estimator (β̂0)
β̂0 = ȳ − β̂1x̄, where ȳ and x̄ are sample means of Y and X.
Population Slope (β1)
β1 = Cov(Y,X) / Var(X), the true linear relationship between X and Y in the population.
Population Intercept (β0)
β0 = E(Y) − β1E(X), the expected value of Y when X=0.
Residual (Ûi)
Difference between observed and fitted value: Ûi = Yi − β̂0 − β̂1Xi.
Estimator of σ² (σ̂²)
σ̂² = (1/(n−2)) Σ(yi − β̂0 − β̂1xi)²; unbiased under homoskedasticity.
Homoskedasticity (cHom)
Assumption that Var(U|X)=σ², i.e., constant error variance across all X.
Gauss-Markov Theorem
States that OLS gives the Best Linear Unbiased Estimator (BLUE) of β0 and β1 under the classical assumptions.
Conditional Unbiasedness
Property that E(β̂j|X)=βj for j=0,1; likewise E(σ̂²|X)=σ² under homoskedasticity.
Estimated Standard Error of β̂1
SE(β̂1)=√[σ̂² / Σ(xi−x̄)²].
Estimated Standard Error of β̂0
SE(β̂0)=√[ σ̂²(1/n + x̄² / Σ(xi−x̄)² ) ].
Predicted Value (ŷi)
ŷi = β̂0 + β̂1xi, the fitted value of Y for observation i.
Total Sum of Squares (TSS)
Σ(yi−ȳ)², measures total variation in Y.
Explained Sum of Squares (ESS)
Σ(ŷi−ȳ)², variation in Y explained by the regression.
R-squared (R²)
R² = ESS/TSS, proportion of variance in Y explained by X; lies between 0 and 1.
Root Mean Squared Error (RMSE)
Square root of the mean squared residuals; often reported as ‘Root MSE’ in software output.
Normality Assumption (N)
Assumption that U|X ~ N(0,σ²), enabling exact t-tests and confidence intervals.
t-distribution with (n−2) d.f.
Sampling distribution of (β̂j−βj)/SE(β̂j) under normality when two parameters are estimated.
Two-Sided t-Test for Slope
Test H0: β1=β1; reject if |(β̂1−β1)/SE(β̂1)| > t_{n−2,1−α/2}.
One-Sided t-Test for Slope
Test H0: β1≤β1* (or ≥) using critical value t{n−2,1−α} (or t{n−2,α}).
t-Test for Intercept
Analogous procedure using statistic (β̂0−β0*)/SE(β̂0).
Significance Level (α)
Probability of rejecting a true null hypothesis; common choices are 0.05 or 0.01.
Critical Value
Threshold from the t-distribution beyond which the null hypothesis is rejected.
Confidence Interval for β1
β̂1 ± t_{n−2,1−α/2}·SE(β̂1) gives a (1−α) coverage probability.
Confidence Interval for β0
β̂0 ± t_{n−2,1−α/2}·SE(β̂0).
Degrees of Freedom
n−k where k is the number of estimated parameters; equals n−2 in simple regression.
Scatterplot
Graph plotting observed (Xi,Yi) pairs; used to visualise the data and fitted line.
Regression Line
Graph of ŷ = β̂0 + β̂1X superimposed on a scatterplot.
F-Statistic
Overall test statistic for model significance; in simple regression F = t² for β̂1.
STATA
Statistical software package used to run regressions and produce output including coefficients, SEs, R², etc.
Location Model
Special case Y = μ + U; no regressors except intercept.
Sample Variance of X (Σ(xi−x̄)²)
Denominator in OLS slope and variance formulas; measures spread of X.
Coverage Probability
The probability that a confidence interval contains the true parameter value, e.g., 95% when α=0.05.
Covariance
Cov(Y,X) = E[(Y−EY)(X−EX)]; measures joint variability and appears in β1.
Variance
Var(X)=E[(X−EX)²]; denominator in β1 and key in standard error formulas.
Reproductive Property of Normal Distribution
Linear combinations of jointly normal variables are themselves normal; used to derive distribution of β̂.