Extensions of the Two-Variable Linear Regression Model — Detailed Study Notes

Regression Through the Origin (No‐Intercept Models)

  • Definition

    • Intercept parameter \beta_1 is constrained to be 0.

    • Estimated equation: \hat Y_i = \hat\beta_2 X_i.

  • Rationale / When used

    • Theoretical considerations dictate that Y=0 when X=0.

    • Example: risk–premium version of the Capital Asset Pricing Model (CAPM).

  • CAPM risk-premium form

    • E(R_i) - r_f = \beta_i\,[E(R_m) - r_f] (6.1.2)

    • E(R_i) = expected return on security i

    • E(R_m) = expected return on market portfolio (e.g., S&P 500)

    • r_f = risk-free rate (e.g., the return on 90-day Treasury bills)

    • \beta_i = systematic-risk (volatility) coefficient

      • \beta_i > 1 → aggressive; \beta_i < 1 → defensive.

  • Empirical market model (allowing non-zero \alpha_i)

    • R_i - r_f = \alpha_i + \beta_i (R_m - r_f) + u_i (6.1.4)

    • If CAPM holds, \alpha_i = 0 ⇒ regression through origin.

  • OLS estimation (origin model)

    • Population regression function (PRF): Y_i = \beta_2 X_i + u_i.

    • Sample regression function (SRF): \hat Y_i = \hat\beta_2 X_i.

    • Estimator and variance

      • \hat\beta_2 = \dfrac{\sum X_i Y_i}{\sum X_i^2} (6.1.5)

      • \operatorname{Var}(\hat\beta_2) = \dfrac{\sigma^2}{\sum X_i^2}, with \sigma^2 estimated by \hat\sigma^2 = \dfrac{\sum \hat u_i^2}{n-1} (only one parameter is estimated, so df = n-1).

  • Practical drawbacks

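    1. Residuals are no longer guaranteed to sum to 0 (\sum \hat u_i need not be zero), which complicates diagnostics.

    2. The conventional R^2 can be negative. Use the raw (non-mean-corrected) R^2, R^2_{raw} = \dfrac{(\sum X_i Y_i)^2}{\sum X_i^2 \sum Y_i^2}, which satisfies 0 \le R^2_{raw} \le 1.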

  • Use‐cases in Economics

    • Friedman’s permanent-income hypothesis (permanent C ∝ permanent Y).

    • Cost theory: variable cost ∝ output.

    • Monetarist models: inflation ∝ growth of money supply.
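
The origin-model formulas above follow from minimizing \sum (Y_i - \beta_2 X_i)^2, whose first-order condition is \sum X_i (Y_i - \hat\beta_2 X_i) = 0. A minimal sketch of the resulting estimator, its standard error, and the raw R^2 is given below. The data are synthetic and the variable names are illustrative assumptions, not taken from the notes.

```python
import random

# Synthetic data for illustration only (not a data set from the notes).
random.seed(0)
x = [random.uniform(-5.0, 5.0) for _ in range(50)]
y = [1.2 * xi + random.gauss(0.0, 1.0) for xi in x]
n = len(x)

# Slope of the regression through the origin: sum(X*Y) / sum(X^2).
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)
beta2_hat = sum_xy / sum_xx

# Residuals; without an intercept their sum is generally nonzero.
resid = [yi - beta2_hat * xi for xi, yi in zip(x, y)]
rss = sum(e * e for e in resid)

# Only one parameter is estimated, so the degrees of freedom are n - 1.
sigma2_hat = rss / (n - 1)
se_beta2 = (sigma2_hat / sum_xx) ** 0.5

# Raw (non-mean-corrected) R^2 = (sum XY)^2 / (sum X^2 * sum Y^2).
sum_yy = sum(yi * yi for yi in y)
raw_r2 = sum_xy ** 2 / (sum_xx * sum_yy)

print(f"slope = {beta2_hat:.4f}, se = {se_beta2:.4f}, raw R^2 = {raw_r2:.4f}")
print(f"sum of residuals = {sum(resid):.4f}")
```

The raw R^2 computed this way equals 1 - RSS / \sum Y_i^2, which is why it cannot leave the interval [0, 1].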

Illustration: Excess‐Return Data (Example 6.1)

  • Data: 240 UK monthly observations (1980–1999) on sector index excess return Y_t and market excess return X_t.

  • Regression through origin

    • \hat Y_t = 1.1555\,X_t, SE(\hat\beta_2)=0.0744 ⇒ t=15.53 (p≈0).

    • R^2=0.5003, SER = 5.549, DW ≈ 1.97.

  • Regression with intercept

    • \hat Y_t = -0.4475 + 1.1711\,X_t.

    • Intercept not significant (p≈0.219) ⇒ origin model plausible.
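
The t ratio reported for the origin model can be reproduced from the slope and its standard error. The sketch below does that arithmetic. The p-value uses a standard normal approximation, which is an assumption made here for simplicity (with 239 degrees of freedom the t distribution is very close to normal).

```python
import math

# Values reported in the notes for the regression through the origin.
beta2_hat = 1.1555
se_beta2 = 0.0744

# t ratio under the null hypothesis beta_2 = 0.
t_stat = beta2_hat / se_beta2

# Two-sided p-value from the standard normal approximation (assumption:
# the t distribution with 239 df is treated as approximately normal).
p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))

print(f"t = {t_stat:.2f}, approximate two-sided p-value = {p_value:.3g}")
```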

Rescaling & Units of Measurement (Section 6.2)

  • Suppose original model: Y_i = \beta_1 + \beta_2 X_i + u_i.

  • Define scaled variables

    • Y_i^* = w_1 Y_i, X_i^* = w_2 X_i (6.2.2–6.2.3)

    • u_i^* = w_1 u_i.

  • Regression in scaled units

    • Y_i^* = \beta_1^* + \beta_2^* X_i^* + u_i^* (6.2.4)

  • Relationships between original and scaled estimates (6.2.15–6.2.20)

    • \hat\beta_2^* = \dfrac{w_1}{w_2} \hat\beta_2

    • \hat\beta_1^* = w_1 \hat\beta_1

    • \hat\sigma^{*2} = w_1^2 \hat\sigma^2

    • \operatorname{Var}(\hat\beta_1^*) = w_1^2 \operatorname{Var}(\hat\beta_1)

    • \operatorname{Var}(\hat\beta_2^*) = \Bigl(\tfrac{w_1}{w_2}\Bigr)^2 \operatorname{Var}(\hat\beta_2)

    • R^2 is unit-invariant.

  • Empirical confirmation (GPDI & GDP, 1990–2005)

    1. Both in billions: \text{GPDI} = -926.09 + 0.2535\,\text{GDP} (6.2.21).

    2. Both in millions: intercept and its SE ×1000; slope and its SE unchanged (6.2.22).

    3. Y billions, X millions: slope and its SE ÷1000; intercept unchanged (6.2.23).

    4. Y millions, X billions: intercept, slope, and their SEs ×1000 (6.2.24).

    5. R^2=0.9648 throughout.
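
The scaling relationships can be checked numerically. The following sketch fits a two-variable OLS regression before and after rescaling and compares the estimates. The data are synthetic (they are not the GPDI and GDP series), and the helper `ols` is a hypothetical name introduced only for this illustration.

```python
import random

def ols(x, y):
    """Return (intercept, slope, r_squared) for a two-variable OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared

# Synthetic "billions" data for illustration only.
random.seed(1)
x = [random.uniform(6000.0, 12000.0) for _ in range(16)]
y = [-900.0 + 0.25 * xi + random.gauss(0.0, 50.0) for xi in x]

# Scale factors: Y re-expressed in millions, X left in billions.
w1, w2 = 1000.0, 1.0

b1, b2, r2 = ols(x, y)
b1_s, b2_s, r2_s = ols([w2 * xi for xi in x], [w1 * yi for yi in y])

print(f"original: intercept = {b1:.2f}, slope = {b2:.4f}, R^2 = {r2:.4f}")
print(f"scaled:   intercept = {b1_s:.2f}, slope = {b2_s:.4f}, R^2 = {r2_s:.4f}")
```

Changing `w1` and `w2` reproduces the other cases in the list: the slope always scales by w_1/w_2, the intercept by w_1, and R^2 stays the same.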

Regression on Standardized Variables ("Beta Regression")

  • Standardization

    • Y_i^{std}=\dfrac{Y_i-\bar Y}{s_Y}, X_i^{std}=\dfrac{X_i-\bar X}{s_X}.

    • Means = 0, SDs = 1.

  • Standardized regression

    • Y_i^{std}=\beta_2^* X_i^{std}+u_i^{std} (intercept = 0).

  • Interpretation

    • \beta_2^* = change (in SDs) of Y for one SD increase in X.

    • Facilitates comparison when variables measured in different units.
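
As a rough check on the standardized regression, the sketch below standardizes two synthetic series, fits the through-origin slope, and compares it with the ordinary slope multiplied by s_X / s_Y. The data and the helper name `standardize` are assumptions made for illustration.

```python
import random
import statistics

# Synthetic data for illustration only.
random.seed(2)
x = [random.uniform(0.0, 100.0) for _ in range(40)]
y = [5.0 + 0.3 * xi + random.gauss(0.0, 4.0) for xi in x]

def standardize(values):
    """Subtract the sample mean and divide by the sample SD (n - 1 divisor)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

zx, zy = standardize(x), standardize(y)

# Standardized variables have zero mean, so the fitted line passes
# through the origin and the no-intercept formula applies.
beta_std = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

# Ordinary slope on the original data, rescaled by s_X / s_Y.
x_bar, y_bar = statistics.mean(x), statistics.mean(y)
b2 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
      / sum((xi - x_bar) ** 2 for xi in x))
beta_from_raw = b2 * statistics.stdev(x) / statistics.stdev(y)

print(f"standardized slope = {beta_std:.4f}")
print(f"b2 * s_X / s_Y     = {beta_from_raw:.4f}")
```

In the two-variable case this standardized slope also equals the sample correlation between X and Y, so it lies between -1 and 1.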

Functional Forms of Two-Variable Models

(Remember: “linear model” means linear in parameters, not necessarily in variables.)

1. Log–Log (Double-Log) Model

  • Specification: \ln Y = \beta_1 + \beta_2 \ln X + u.

  • \beta_2 measures elasticity: \beta_2 = \dfrac{\partial \ln Y}{\partial \ln X}=\dfrac{\partial Y}{\partial X}\,\dfrac{X}{Y}.

  • Example (6.5.5): Durable‐goods expenditure

    • \ln(\text{EXPDUR}) = -7.5417 + 1.6266\,\ln(\text{PCEX}).

    • Elasticity ≈ 1.63 → 1 % ↑ in PCEX → 1.63 % ↑ in EXPDUR.

    • R^2=0.9695, both coefficients highly significant.
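
The elasticity reading of the log-log slope can be verified with the fitted equation. In this sketch the starting level of PCEX is a hypothetical value; because the elasticity is constant, the percentage response does not depend on it.

```python
import math

# Coefficients reported in the notes for the durable-goods example.
beta1, beta2 = -7.5417, 1.6266

def expdur(pcex):
    """Fitted durable-goods expenditure implied by the log-log equation."""
    return math.exp(beta1 + beta2 * math.log(pcex))

pcex0 = 4000.0          # hypothetical starting level of PCEX
pcex1 = pcex0 * 1.01    # a 1 % increase

pct_change = 100.0 * (expdur(pcex1) / expdur(pcex0) - 1.0)
print(f"1% rise in PCEX -> {pct_change:.3f}% rise in fitted EXPDUR")
```

The exact response differs slightly from the coefficient because the elasticity is a point (infinitesimal) measure, while a 1 % change is discrete.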

2. Semi-Log Models

a. Log–Lin (Growth) Model
  • \ln Y_t = \beta_1 + \beta_2 t + u_t.

  • Interpretation: \beta_2 = constant proportionate (relative) change in Y per time period; 100\,\beta_2 ≈ percentage growth rate per period.

  • Example (6.6.8): Services expenditure

    • \ln(\text{EXS}_t)=8.3226+0.00705 t ⇒ 0.705 % quarterly ≈ 2.82 % annual growth.

    • R^2=0.9919.
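
The growth-rate arithmetic can be sketched as follows. The slope gives the instantaneous rate, and the compound rate, computed here as exp(\beta_2) - 1, is shown alongside it for comparison. Only the coefficient comes from the notes; the variable names are illustrative.

```python
import math

# Slope reported in the notes for the services-expenditure example.
beta2 = 0.00705

# Instantaneous (point-in-time) quarterly growth rate, in percent.
inst_quarterly = 100.0 * beta2

# Compound quarterly growth rate, in percent: exp(beta2) - 1.
compound_quarterly = 100.0 * (math.exp(beta2) - 1.0)

# Annualized figures: simple multiplication by 4 versus compounding.
annual_simple = 4.0 * inst_quarterly
annual_compound = 100.0 * (math.exp(4.0 * beta2) - 1.0)

print(f"quarterly: instantaneous {inst_quarterly:.3f}%, compound {compound_quarterly:.4f}%")
print(f"annual:    simple {annual_simple:.2f}%, compound {annual_compound:.2f}%")
```

For growth rates this small the two measures nearly coincide, which is why the simple figure is usually adequate.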

b. Lin–Log Model
  • Y = \beta_1 + \beta_2 \ln X + u.

  • Interpretation (6.6.12–6.6.13): \beta_2 = \dfrac{\Delta Y}{\Delta \ln X} \approx \dfrac{\Delta Y}{\Delta X/X} ⇒ absolute change in Y for a given relative change in X; a 1 % change in X changes Y by about \beta_2/100 units.

  • Example (6.6.14): Indian food expenditure

    • \text{FoodExp}_i = -1283.912 + 257.270\,\ln(\text{TotalExp}_i).

    • 1 % ↑ in total expenditure → ≈ 2.57 rupees ↑ in food spending (257.270/100).

    • R^2=0.3769.
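
The rule of thumb of dividing the lin-log slope by 100 can be compared with the exact change implied by the fitted equation. The starting expenditure level below is a hypothetical value chosen for illustration.

```python
import math

# Coefficients reported in the notes for the food-expenditure example.
beta1, beta2 = -1283.912, 257.270

def food_exp(total_exp):
    """Fitted food expenditure (rupees) implied by the lin-log equation."""
    return beta1 + beta2 * math.log(total_exp)

total0 = 500.0  # hypothetical total expenditure, in rupees

# Exact change in fitted food expenditure for a 1 % rise in total expenditure.
exact_change = food_exp(total0 * 1.01) - food_exp(total0)

# Approximation used in the notes: beta2 / 100.
approx_change = beta2 / 100.0

print(f"exact: {exact_change:.3f} rupees, approximation: {approx_change:.3f} rupees")
```

The exact change equals \beta_2 \ln(1.01), which is independent of the starting level and slightly below \beta_2/100.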

3. Reciprocal Model

  • Y = \beta_1 + \beta_2 \dfrac{1}{X} + u (6.7.1).

  • Linear in parameters; nonlinear in variable.

  • As X \to \infty, second term → 0, so Y \to \beta_1 (asymptote).

  • Example (6.7.2): Child mortality vs per-capita GNP

    • \text{CM}_i = 81.794 + 27{,}237.17\,(1/\text{PCGNP}_i).

    • As PCGNP increases indefinitely, CM approaches its asymptotic floor of about 82 deaths per 1000.

    • R^2=0.4590, coefficients highly significant.
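
The asymptote property can be seen by evaluating the fitted equation at increasing income levels. The PCGNP values below are hypothetical inputs chosen only to show the pattern.

```python
# Coefficients reported in the notes for the child-mortality example.
beta1, beta2 = 81.794, 27237.17

def child_mortality(pcgnp):
    """Fitted child mortality (deaths per 1000) at a given per-capita GNP."""
    return beta1 + beta2 / pcgnp

# Hypothetical per-capita GNP levels, chosen only for illustration.
levels = [500.0, 1000.0, 5000.0, 20000.0]
values = [child_mortality(g) for g in levels]

for g, cm in zip(levels, values):
    print(f"PCGNP = {g:>8.0f} -> fitted CM = {cm:.2f}")
```

Each fitted value stays above \beta_1 and moves closer to it as PCGNP grows, since the term \beta_2 / X shrinks toward zero.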

4. Log-Reciprocal Model

  • \ln Y = \beta_1 - \beta_2 \dfrac{1}{X} + u.

  • Slope \dfrac{\partial Y}{\partial X} = \beta_2 \dfrac{1}{X^2} e^{\beta_1-\beta_2/X} = \beta_2 \dfrac{Y}{X^2};
    elasticity = \beta_2/X, so the slope varies with X and Y and the elasticity with X (see summary table, slide 31).
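
The slope and elasticity of the log-reciprocal model follow from the chain rule. A short derivation, ignoring the error term u for simplicity, is sketched here:

```latex
\begin{aligned}
Y &= e^{\beta_1 - \beta_2/X} \\
\frac{\partial Y}{\partial X}
  &= e^{\beta_1 - \beta_2/X}\cdot\frac{\beta_2}{X^2}
   = \beta_2\,\frac{Y}{X^2} \\
\text{elasticity}
  &= \frac{\partial Y}{\partial X}\cdot\frac{X}{Y}
   = \frac{\beta_2}{X}
\end{aligned}
```

With \beta_2 > 0 the slope is positive everywhere, and the elasticity falls as X rises.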

Comparative Interpretation of Coefficients (Slide 28)

  • Example using salary & sales:

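    • Linear: \beta_2=0.0155 ⇒ \$1 m sales ↑ → \$15.5 salary ↑ (taking salary in \$ thousands and sales in \$ millions, as the Lin-Log line implies).

    • Log-Log: \beta_2=0.257 ⇒ 1 % sales ↑ → ≈ 0.257 % salary ↑.

    • Log-Lin: \beta_2=1.5\times10^{-5} ⇒ \$1 m sales ↑ → ≈ 0.0015 % salary ↑ (100\,\beta_2).

    • Lin-Log: \beta_2=262.9 ⇒ 1 % sales ↑ → ≈ \$2.629k salary ↑ (\beta_2/100, in \$ thousands).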

Choosing an Appropriate Functional Form (Slide 32)

  1. Economic theory guidance (e.g., Phillips curve suggests nonlinear trade-off).

  2. Desired slope interpretation (absolute vs relative changes vs elasticities).

  3. Signs & magnitudes should align with a-priori expectations.

  4. Multiple forms may fit; compare via theory, statistical diagnostics, and interpretability, not just R^2.

  5. Do not over-emphasize goodness-of-fit; a higher R^2 does not guarantee correct specification.

  6. When ambiguous, apply Box–Cox transformations to empirically choose power/exponential forms.
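
For the last point, a minimal sketch of the Box–Cox transformation shows how one parameter nests the linear form (\lambda = 1) and the log form (\lambda = 0). The function name `box_cox` is a hypothetical helper written for this illustration; the sketch does not estimate \lambda, which in practice is chosen by comparing fits across candidate values.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam.

    Returns (y**lam - 1) / lam for lam != 0 and log(y) for lam == 0,
    which is the limit of the first expression as lam tends to zero.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive value")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

# lam = 1 leaves the variable linear (shifted by 1); lam = 0 gives the log.
for lam in (1.0, 0.5, 0.0):
    print(f"lambda = {lam}: transform of 5.0 = {box_cox(5.0, lam):.4f}")
```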

Practical & Ethical Considerations

  • Mis-specifying functional form can bias policy conclusions (e.g., elasticity over-/under-estimation).

  • Scaling and standardization choices alter numerical values; always report original units for clarity.

  • When forcing regression through origin, ensure theoretical justification; otherwise, if the true intercept is nonzero, the slope estimate is biased (a specification error).

  • For income/poverty studies, reciprocal models may exaggerate asymptotes—interpret with socioeconomic context.

Connections to Previous Material

  • Builds on OLS assumptions (Gauss–Markov) discussed earlier: unbiasedness, variance formulas, hypothesis tests.

  • Extends two-variable linear regression by:

    • Dropping intercept (origin models).

    • Applying linear transformations (scaling, standardization).

    • Allowing nonlinear variable transformations while retaining linearity in parameters.

  • Same inferential machinery (t, F, R^2, DW) applies after transformation, but interpretation changes.

Summary Checklist for Exam

✓ Can you derive \hat\beta_2 for a no-intercept model?
✓ Explain why R^2 may be negative and how the raw R^2 fixes it.
✓ Convert coefficients when data are re-expressed (billions ↔ millions).
✓ Interpret standardized-regression slope.
✓ Compute and interpret elasticity from log-log model.
✓ Distinguish growth (log-lin) vs Engel-type (lin-log) interpretations.
✓ Recognize asymptote property of reciprocal models.
✓ Choose functional form based on theory + diagnostics rather than maximal R^2.