Chapters 1-5 Notes

Panel Data: Introduction and Model Selection

Chapter 1: Introduction to Panel Data

  • Panel data combines cross-sectional and time series data.

    • Cross-sectional data: observations at a single point in time.

    • Time series data: observations over multiple time periods for a single unit.

    • Panel data: observations over multiple time periods for multiple units.

  • Advantages of panel data:

    • Accommodates questions that can't be addressed by cross-sectional data alone, such as changes and dynamics over time.

    • Allows for analysis of before-and-after scenarios.

    • Accounts for unobserved heterogeneity between units (e.g., countries in ASEAN having different economic characteristics), which reduces the bias that ignoring it would cause.

  • Types of panel data models:

    • Pooled OLS (Ordinary Least Squares): Assumes all observations are independent; ignores time and unit effects.

    • Fixed Effects: Each unit (not each observation) has its own intercept (e.g., $$\alpha_1, \alpha_2, \alpha_3$$) to account for individual heterogeneity. With 38 provinces, there would be 38 different intercepts.

    • Random Effects: Treats the unit-specific effects as random and folds them into the error term, acknowledging unobserved heterogeneity without estimating a separate intercept for each unit.
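
To make the difference between Pooled OLS and Fixed Effects concrete, here is a minimal sketch in plain Python. The data set is hypothetical (three units, three periods, one regressor) and was built so that each unit has its own intercept; the names `panel`, `pooled_slope`, and `within_fit` are illustrative and not taken from any library.

```python
from collections import defaultdict

# Hypothetical panel in long format: (unit, period, x, y).
# y was constructed as alpha_unit + 2 * x, with unit intercepts 1, 5 and 9.
panel = [
    ("A", 1, 1.0, 3.0), ("A", 2, 2.0, 5.0), ("A", 3, 3.0, 7.0),
    ("B", 1, 2.0, 9.0), ("B", 2, 3.0, 11.0), ("B", 3, 4.0, 13.0),
    ("C", 1, 3.0, 15.0), ("C", 2, 4.0, 17.0), ("C", 3, 5.0, 19.0),
]


def ols_slope(xs, ys):
    """Slope of a one-regressor OLS fit that includes an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(rows):
    """Pooled OLS: stack all observations and ignore which unit they belong to."""
    return ols_slope([r[2] for r in rows], [r[3] for r in rows])


def within_fit(rows):
    """Fixed Effects via the within transformation.

    Subtracts each unit's own mean from x and y, estimates the slope on the
    demeaned data, then recovers one intercept per unit.
    """
    groups = defaultdict(list)
    for unit, _period, x, y in rows:
        groups[unit].append((x, y))
    means = {
        u: (sum(x for x, _ in obs) / len(obs), sum(y for _, y in obs) / len(obs))
        for u, obs in groups.items()
    }
    dx = [x - means[u][0] for u, _p, x, _y in rows]
    dy = [y - means[u][1] for u, _p, _x, y in rows]
    slope = ols_slope(dx, dy)
    intercepts = {u: my - slope * mx for u, (mx, my) in means.items()}
    return slope, intercepts


fe_slope, fe_intercepts = within_fit(panel)
print("pooled slope:", pooled_slope(panel))
print("fixed-effects slope:", fe_slope)
print("unit intercepts:", fe_intercepts)
```

In this constructed example the unit intercepts rise together with x, so the pooled slope overstates the effect of x, while the within estimator recovers the slope of 2 used to build the data. Random Effects is not sketched here because it needs estimated variance components.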

Chapter 2: Pooled OLS and F-Test

  • Choosing the best model requires evaluating which assumptions hold.

  • Pooled Regression (OLS) is appropriate when the unit-specific effects are all zero, so that every unit shares one common intercept; this implies homogeneity across units.

    • This is a strong assumption, rarely met in reality. Examples might include identical twins or rigid, homogeneous goods.

    • In reality, people (e.g., Lisma, Rayhan) have different experiences and make different decisions.

  • Null hypothesis under which Pooled OLS is valid: $$\alpha_i = 0 \text{ for all units } i$$.

  • F-Test:

    • Used to statistically determine if Pooled OLS is appropriate.

    • Tests jointly whether the unit-specific effects are different from zero. Effectively, tests whether individual models (e.g., one for each of 100 respondents) would have different coefficients, in particular different intercepts.

    • If the null hypothesis (H0: no difference in coefficients) is not rejected, Pooled OLS is preferred.
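
A minimal sketch of how such an F-statistic can be computed, comparing the residual sum of squares (RSS) of Pooled OLS against that of Fixed Effects. It assumes one regressor, a balanced panel, and the common form F = [(RSS_pooled - RSS_FE) / (N - 1)] / [RSS_FE / (NT - N - K)]; the data and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical noisy panel in long format: (unit, period, x, y).
panel = [
    ("A", 1, 1.0, 3.1), ("A", 2, 2.0, 4.9), ("A", 3, 3.0, 7.2),
    ("B", 1, 2.0, 9.0), ("B", 2, 3.0, 11.2), ("B", 3, 4.0, 12.9),
    ("C", 1, 3.0, 14.8), ("C", 2, 4.0, 17.1), ("C", 3, 5.0, 19.0),
]


def ols_rss(xs, ys):
    """Residual sum of squares of a one-regressor OLS fit with an intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return syy - sxy ** 2 / sxx


def f_test_pooled_vs_fe(rows, k=1):
    """F-statistic for H0: all unit-specific effects are zero.

    k is the number of slope coefficients (1 in this sketch).
    Returns (f_stat, df_numerator, df_denominator).
    """
    groups = defaultdict(list)
    for unit, _period, x, y in rows:
        groups[unit].append((x, y))

    # Restricted model: one common intercept for every unit.
    rss_pooled = ols_rss([r[2] for r in rows], [r[3] for r in rows])

    # Unrestricted model: demeaning by unit is equivalent to one intercept per unit.
    dx, dy = [], []
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        dx += [x - mx for x, _ in obs]
        dy += [y - my for _, y in obs]
    rss_fe = ols_rss(dx, dy)

    df1 = len(groups) - 1              # N - 1 restrictions
    df2 = len(rows) - len(groups) - k  # NT - N - K
    f_stat = ((rss_pooled - rss_fe) / df1) / (rss_fe / df2)
    return f_stat, df1, df2


f_stat, df1, df2 = f_test_pooled_vs_fe(panel)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The statistic is then compared with the critical value of the F distribution with (df1, df2) degrees of freedom; a value above the critical value rejects H0 and speaks against Pooled OLS.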

Chapter 3: Fixed Effects vs. Random Effects and Hausman Test

  • If the F-test rejects the null hypothesis (i.e., the unit-specific effects are not all zero), either Fixed Effects (FE) or Random Effects (RE) is more appropriate than Pooled OLS.

  • Hausman Test:

    • Used to choose between FE and RE models.

    • Compares the coefficients from FE and RE models.

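    • Null Hypothesis (H0): The difference between the FE and RE coefficients is not systematic, i.e., the unit-specific effects are uncorrelated with the regressors.

    • If H0 is rejected, choose Fixed Effects, because RE is then inconsistent.

    • If H0 is not rejected, choose Random Effects, because it is then consistent and more efficient.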
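
For reference, the usual form of the Hausman statistic is written out below. The symbols are notation introduced here, not taken from the notes: $\hat\beta_{FE}$ and $\hat\beta_{RE}$ are the estimated coefficient vectors on the time-varying regressors, and $k$ is the number of coefficients being compared.

```latex
H = \left(\hat\beta_{FE} - \hat\beta_{RE}\right)'
    \left[\widehat{\mathrm{Var}}(\hat\beta_{FE}) - \widehat{\mathrm{Var}}(\hat\beta_{RE})\right]^{-1}
    \left(\hat\beta_{FE} - \hat\beta_{RE}\right)
    \;\overset{a}{\sim}\; \chi^2_k \quad \text{under } H_0
```

A large value of $H$ relative to the $\chi^2_k$ critical value means the two sets of coefficients differ systematically, which leads to rejecting H0.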

  • Fixed Effects as a