Heteroskedasticity + Time Series 2024 Review Questions


1

Which assumption is violated when heteroskedasticity occurs in a regression model? Explain how heteroskedasticity affects the variance of the errors and describe the relationship between the dispersion of the residuals and the values of the independent variables.

Heteroskedasticity is a violation of the constant error variance (homoscedasticity) assumption. It means the variance of the errors (residuals) in a model is not constant across all levels of the independent variable(s). The spread or dispersion of the residuals varies as a function of the values of the independent variables.

2

Can you provide an example of a real-world situation where heteroskedasticity might occur in a regression model, and explain why it happens in that context?

Income-expenditure data. Imagine you're analysing household income and how it relates to spending on luxury goods. Wealthier households tend to have more variable spending patterns, while lower-income households spend more uniformly, so as income rises the variability in spending also increases, leading to heteroskedasticity. The residuals (errors) from a regression model predicting spending based on income would show increasing dispersion as income rises, indicating that the variance of the errors is not constant across all levels of income.
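
A minimal simulation sketch of this scenario (the variable names and numbers are made up purely for illustration): the error standard deviation is set proportional to income, so the residual spread widens as income rises.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
income = rng.uniform(20, 200, size=n)        # hypothetical household income
errors = rng.normal(0, 0.1 * income)         # error spread grows with income
spending = 5 + 0.3 * income + errors         # hypothetical luxury spending

fit = sm.OLS(spending, sm.add_constant(income)).fit()
resid = fit.resid

# Residual dispersion in the bottom vs top income quartile
low = income < np.quantile(income, 0.25)
high = income > np.quantile(income, 0.75)
print(resid[low].std(), resid[high].std())   # noticeably larger for high incomes
```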

3

State how heteroskedasticity affects the efficiency of the coefficient estimates (𝛽)

They are no longer efficient, meaning they do not have the minimum variance among the class of linear unbiased estimators.

4

State how heteroskedasticity affects the unbiasedness and consistency of coefficient estimates (𝛽).

The estimated coefficients β remain unbiased and consistent in the presence of heteroskedasticity.

5

Briefly, describe how heteroskedasticity affects the estimation formula for the variance of 𝛽 and its implications for hypothesis testing and confidence intervals.

Heteroskedasticity leads to incorrect estimates of the variance of β, making standard errors unreliable. This inaccuracy affects hypothesis testing and confidence intervals.

6

Discuss one way in which you could test for heteroskedasticity using a graph

You could plot the residuals (or squared residuals) against the fitted values of the model and look for a systematic pattern in their spread, such as the residuals fanning out as the fitted values increase.
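
A minimal sketch of that graphical check on hypothetical simulated data (all names and numbers are illustrative): fit the model, then plot residuals against fitted values and look for a widening spread.

```python
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Illustrative data whose error variance grows with x
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, 300)
y = 2 + 1.5 * x + rng.normal(0, 0.5 * x)
fit = sm.OLS(y, sm.add_constant(x)).fit()

plt.scatter(fit.fittedvalues, fit.resid, alpha=0.5)
plt.axhline(0, color="grey", linewidth=1)
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()   # a fanning-out pattern suggests heteroskedasticity
```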

7

Consider the following regression model: Y𝑖 = 𝛽0 + 𝛽1 𝑥1𝑖 + 𝛽2 𝑥2𝑖 + 𝑢𝑖 where 𝑢𝑖 is the error term. You suspect that heteroskedasticity may be present in the model. Explain the steps involved in conducting the Breusch-Pagan test for heteroskedasticity using the LM approach. [7]

  • Estimate the original Regression Model: Regress Yi on X1i and X2i to obtain the residuals Ui

  • Calculate the Squared Residuals: Compute the squared residuals from the original regression

  • Auxiliary Regression: Regress u-squared on the independent variables x1i and x2i

  • Compute the Test Statistic: The test statistic for the BP test is computed as BP = n × R-squared, where R-squared is from the auxiliary regression and n is the number of observations

  • Decision Rule: Compare the BP statistic with the critical value from the chi-squared distribution with k degrees of freedom, where k is the number of independent variables in the auxiliary regression. If the BP statistic exceeds the critical value, reject the null hypothesis of homoskedasticity (a code sketch of these steps follows below)
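
A sketch of these steps on hypothetical simulated data for the two-regressor model in the question (the data-generating numbers are made up; statsmodels' het_breuschpagan helper reproduces the same LM statistic in one call).

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
n = 400
x1, x2 = rng.normal(size=n), rng.normal(size=n)
u = rng.normal(0, np.exp(0.5 * x1))                 # error variance depends on x1
y = 1 + 2 * x1 - 1 * x2 + u
X = sm.add_constant(np.column_stack([x1, x2]))

# Steps 1-2: original regression and squared residuals
resid = sm.OLS(y, X).fit().resid
u_hat_sq = resid ** 2

# Step 3: auxiliary regression of squared residuals on x1 and x2
aux = sm.OLS(u_hat_sq, X).fit()

# Step 4: BP = n * R-squared from the auxiliary regression
bp_stat = n * aux.rsquared
p_value = stats.chi2.sf(bp_stat, df=2)              # k = 2 auxiliary regressors

# Step 5: reject homoskedasticity if BP exceeds the chi-squared critical value
print(bp_stat, p_value, stats.chi2.ppf(0.95, df=2))
# statsmodels.stats.diagnostic.het_breuschpagan(resid, X) gives the same LM statistic
```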

8

You are given the following model: 𝑌𝑖 = 𝛽0 + 𝛽1𝑋1𝑖 + 𝛽2𝑋2𝑖 + 𝛽3𝑋3𝑖 + 𝑢𝑖 Outline the steps involved in conducting the White LM test. [8]

  • Estimate the original Model: Estimate the model given above using OLS to obtain the residuals

  • Create the Auxiliary Regression: Regress the squared residuals on the original independent variables, their squares, and their cross-products. Alternatively, regress the squared residuals on the fitted values Ŷi and Ŷi-squared

  • Calculate the Test Statistic: The White LM test statistic is calculated as n × R-squared, where n is the sample size and R-squared is the coefficient of determination from the auxiliary regression

  • Compare to the Critical Value: Compare the test statistic to the critical value from the chi-square distribution with degrees of freedom equal to the number of regressors in the auxiliary regression (excluding the intercept). If the test statistic is greater than the critical value, reject the null hypothesis of homoskedasticity (see the sketch below)
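
A sketch using statsmodels' het_white helper on hypothetical simulated data for a three-regressor model (the numbers are illustrative): the helper runs the auxiliary regression on the regressors, their squares and cross-products, and returns the LM statistic n × R-squared with its p-value.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

rng = np.random.default_rng(3)
n = 400
X_raw = rng.normal(size=(n, 3))
u = rng.normal(0, 1 + 0.8 * np.abs(X_raw[:, 0]))     # variance depends on X1
y = 1 + X_raw @ np.array([0.5, -0.3, 1.2]) + u

X = sm.add_constant(X_raw)
resid = sm.OLS(y, X).fit().resid

# het_white performs the auxiliary regression and returns the LM test results
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(resid, X)
print(lm_stat, lm_pvalue)   # small p-value -> reject homoskedasticity
```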

9

You are given the following model: 𝑌𝑖 = 𝛽0 + 𝛽1𝑋1𝑖 + 𝛽2𝑋2𝑖 + 𝛽3𝑋3𝑖 + 𝑢𝑖 Explain the purpose of the White LM test in the context of this model.

The White LM test is used to detect the presence of heteroskedasticity in the error terms of a regression model. In the context of this model, the test checks whether the variance of the error terms ui depends on one or more of the independent variables (X1i, X2i, X3i) or their combinations.

10

Suppose you perform the White LM test and find that the test statistic is significant. What does this imply about the error term in the model?

If the test statistic is significant, this implies that the error term ui in the model exhibits heteroskedasticity. Alternatively, this means that the variance of the error term is not constant across observations, which violates one of the key assumptions of OLS.

11

What are the potential consequences for your OLS estimates if heteroskedasticity is present?

The standard errors of the OLS estimates may be biased, leading to unreliable hypothesis tests and confidence intervals. Alternatively, if heteroskedasticity is present, the standard OLS assumptions are violated: the coefficient estimates remain unbiased but are no longer efficient, and the usual estimates of their standard errors are biased, affecting hypothesis tests and confidence intervals.

12

Suggest any two potential remedies if heteroskedasticity is detected in this model.

  1. Use Robust Standard Errors: Compute robust standard errors that are consistent in the presence of heteroskedasticity. These standard errors correct for the bias in the variance estimation without altering the coefficient estimates (see the sketch after this list).

  2. Transformation of Variables: Transform the dependent variable to stabilise the variance of the errors

  3. Weighted Least Squares (WLS): Apply WLS to give different weights to observations based on the variance of the errors, thereby correcting for heteroskedasticity

  4. Model Respecification: Consider adding interaction terms or higher-order terms to the model, if the heteroskedasticity arises from model misspecification
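
A minimal sketch of remedies 1 and 3 on hypothetical simulated data: OLS with heteroskedasticity-robust (HC) standard errors, and a WLS fit assuming, purely for illustration, that the error variance is proportional to x².

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1, 10, n)
y = 2 + 0.8 * x + rng.normal(0, 0.4 * x)
X = sm.add_constant(x)

# Remedy 1: keep the OLS coefficients but use robust (HC) standard errors
robust_fit = sm.OLS(y, X).fit(cov_type="HC1")

# Remedy 3: WLS, assuming Var(u|x) is proportional to x^2, so weights are 1/x^2
wls_fit = sm.WLS(y, X, weights=1.0 / x**2).fit()

print(robust_fit.bse)   # corrected standard errors for the OLS coefficients
print(wls_fit.bse)      # standard errors from the weighted fit
```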

13

Research the Goldfeld-Quandt test for heteroskedasticity. What steps are required to implement this test?

  1. Sort the Data: Begin by sorting the data based on the independent variable(s) suspected to cause heteroskedasticity. This ensures that changes in the error variance, if present, are easier to detect across different levels of the independent variables

  2. Divide the Data into Two Subsamples: Split the data into two subgroups. The idea is to compare the variance of the residuals in these two groups to detect any differences. Typically, the middle observations are excluded to sharpen the contrast between the low- and high-variance groups, although how many are dropped can vary depending on the size of the data.

  3. Run two separate regressions: Perform separate ordinary least squares regressions on each subgroup

  4. Compute the variance of the residuals for each subgroup: For each group, calculate the sum of squared residuals (SSR)

  5. Compute the test statistic: The Goldfeld-Quandt test statistic is the ratio of the residual sums of squares from the two regressions, conventionally with the larger one in the numerator: F = SSRlarger/SSRsmaller (adjusted for degrees of freedom if the subsamples differ in size)

  6. Compare with the critical value: The test statistic follows an F-distribution. Compare the computed statistic to the critical value. If the F-statistic is significantly large, reject the null hypothesis of homoscedasticity in favour of heteroskedasticity (a code sketch follows below).
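
A sketch of these steps on hypothetical simulated data (statsmodels also offers het_goldfeldquandt as a one-call alternative): sort by the suspect regressor, drop the middle observations, fit OLS on each subsample, and form the ratio of residual sums of squares.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(1, 10, n)
y = 1 + 2 * x + rng.normal(0, 0.5 * x)

# Step 1: sort by the variable suspected to drive the heteroskedasticity
order = np.argsort(x)
x_sorted, y_sorted = x[order], y[order]

# Step 2: drop the middle ~20% of observations and split the rest in two
drop = n // 5
low = slice(0, (n - drop) // 2)
high = slice((n + drop) // 2, n)

# Steps 3-4: separate OLS fits and their residual sums of squares
ssr_low = sm.OLS(y_sorted[low], sm.add_constant(x_sorted[low])).fit().ssr
ssr_high = sm.OLS(y_sorted[high], sm.add_constant(x_sorted[high])).fit().ssr

# Steps 5-6: F ratio (larger SSR over smaller) vs the F critical value
df = (n - drop) // 2 - 2
f_stat = max(ssr_low, ssr_high) / min(ssr_low, ssr_high)
print(f_stat, stats.f.ppf(0.95, df, df))
```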

14

Describe and illustrate the process of implementing a weighted least squares (WLS) regression when the variance of residuals is related to the explanatory variable 𝑋1. Specifically, explain how you would determine the appropriate weights for the WLS regression in this case.

Determine Weights: Suppose the variance of the residuals is hypothesised to be a function of 𝑋1, say Var(ui) = σ²·h(X1i), with the variance increasing in 𝑋1. The appropriate weight for each observation is the inverse of the square root of that function, 1/√h(X1i).

Perform WLS Regression: Transform the original model by multiplying both sides of the regression equation by the weight, so that the transformed error term has constant variance.

The WLS regression is then performed on the transformed model (equivalently, estimate the transformed equation by OLS; see the sketch below).
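
A minimal sketch, assuming purely for illustration that Var(ui | X1i) = σ²·X1i, so the weight is 1/√X1i: dividing the whole equation by √X1i and running OLS on the transformed variables gives the same coefficients as WLS with weights 1/X1i (statsmodels' weights are inversely proportional to the error variance).

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 200
x1 = rng.uniform(1, 10, n)
y = 3 + 1.2 * x1 + rng.normal(0, np.sqrt(x1))   # error variance proportional to x1

w = 1.0 / np.sqrt(x1)                            # weight applied to both sides
X = sm.add_constant(x1)

ols_transformed = sm.OLS(y * w, X * w[:, None]).fit()   # OLS on the transformed model
wls = sm.WLS(y, X, weights=1.0 / x1).fit()              # equivalent WLS fit

print(ols_transformed.params, wls.params)        # the coefficient estimates coincide
```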

15

State how the use of WLS affects the unbiasedness and efficiency of the coefficient estimates, and the implications for hypothesis testing.

Unbiasedness and Efficiency of Coefficient Estimates: The coefficient estimates in WLS are unbiased. WLS provides more efficient estimates of the regression coefficients compared to OLS when heteroskedasticity is present.

Implications for hypothesis tests: With WLS, the hypothesis tests are more reliable because the standard errors of the coefficient estimates are adjusted for heteroskedasticity.

16

Define a stochastic process.

A stochastic process is a collection of random variables indexed by time. It describes how a variable or set of variables changes randomly over time.

17

Compare and contrast a static model and an infinite distributed lag model in econometrics i.e. define each model and discuss how they handle the effects of independent variables on the dependent variable over time.

  • A static model examines the relationship between independent variables and a dependent variable at a single point in time. Alternatively, in a static model, the relationship is assumed to be instantaneous.

  • Assumes that the effect of an independent variable on the dependent variable is immediate and does not persist over time

  • An infinite distributed lag model accounts for the effects of independent variables on the dependent variable over multiple time periods. Alternatively, an infinite distributed lag model includes multiple lagged values of the independent variable, allowing it to capture the dynamic relationship and the cumulative effect of past values on the current value of the dependent variable.

  • Assumes that the effect of an independent variable can persist over time and includes lags of the independent variable in the model (see the sketch below).
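
A hypothetical illustration of the two specifications on simulated data (all names and numbers are made up): the static model regresses yt on the contemporaneous zt only, while a finite distributed lag version adds lagged values of z built with pandas' shift(). The sum of the current and lagged coefficients is the long-run propensity discussed in the later cards.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
T = 400
df = pd.DataFrame({"z": rng.normal(size=T)})
df["y"] = (1 + 0.5 * df["z"] + 0.3 * df["z"].shift(1)
           + 0.2 * df["z"].shift(2) + rng.normal(0, 0.5, T))
df["z_lag1"] = df["z"].shift(1)
df["z_lag2"] = df["z"].shift(2)
data = df.dropna()

static_fit = smf.ols("y ~ z", data=data).fit()                  # static model
fdl_fit = smf.ols("y ~ z + z_lag1 + z_lag2", data=data).fit()   # distributed lag model

print(static_fit.params["z"])                                   # immediate effect only
print(fdl_fit.params[["z", "z_lag1", "z_lag2"]].sum())          # long-run propensity
```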

18

Suppose you have the following two-variable time series regression model: 𝑌𝑡 = 𝛽0 + 𝛽1𝑋𝑡 + 𝜖𝑡 where 𝑌𝑡 is the dependent variable, 𝑋𝑡 is the explanatory variable, and 𝜖𝑡 is the error term. If you know that the error term 𝜖𝑡 in time t is correlated with the past value of the explanatory variable 𝑋𝑡−1, what potential issue might you be facing with the error term?

If the error term is correlated with the past value of the explanatory variable, this indicates a potential issue of omitted variable bias or autocorrelation in the error term.

19

How would you modify the model to account for this relationship?

To address this issue, you could modify the model to include the lagged value of the explanatory variable, Xt−1, as an additional regressor. This approach helps to capture the effect of Xt−1 on Yt and adjusts for the correlation between εt and Xt−1.

20

Given the equation 𝑦𝑡 = 𝛽0 + 𝛽1𝑧𝑡 + 𝛽2𝑧𝑡−1 + 𝛽3𝑧𝑡−2 + 𝛽4𝑧𝑡−3 + 𝑢𝑡, how should 𝛽1 be interpreted?

The coefficient β1 represents the immediate impact of a one-unit change in the independent variable Zt on the dependent variable Yt, holding all other factors constant.

21

Using a variable such as the monthly sales of a commodity, intuitively explain what serial correlation is in the context of time series data.

Serial correlation, also known as autocorrelation, in the context of time series data refers to the relationship between a variable's current value and its past values. Using the example of monthly sales of a commodity: this month's sales tend to be related to sales in previous months, so a month of unusually high sales is often followed by another strong month.

22

Assuming the values for β1, β2, β3, and β4 are as follows: β1=0.5, β2=0.3, β3=0.2, and β4=0.1, calculate the long-run propensity (LRP) and provide its interpretation.

The long-run propensity (LRP) is β1 + β2 + β3 + β4 = 0.5 + 0.3 + 0.2 + 0.1 = 1.1. It indicates that for every one-unit increase in the independent variable 𝑧t, the dependent variable 𝑦 is expected to increase by 1.1 units in the long run. This reflects the cumulative effect of the current value of 𝑧t and its effects from the past three periods on 𝑦, demonstrating the total impact distributed over time.

23

Given the equation ŷ𝑡 = 2.5 + 0.5𝑧𝑡 + 0.3𝑧𝑡−1 + 0.2𝑧𝑡−2 + 0.1𝑧𝑡−3, illustrate the lag distribution and analyze how this reflects the largest effect of 𝑧 on 𝑦 as well as its influence over time

The largest effect of 𝑧 on 𝑦 occurs with the current value 𝑧𝑡, which has the highest coefficient of 0.5. The contributions from the lagged values decrease progressively: 𝑧𝑡−1 has a contribution of 0.3, 𝑧𝑡−2 contributes 0.2, and 𝑧𝑡−3 contributes 0.1, so the influence of 𝑧 on 𝑦 declines steadily and dies out after three periods.

24

What is the difference between impact propensity and long-run propensity in time series analysis?

Impact propensity refers to the immediate effect of a change in an independent variable on a dependent variable within a time series framework. It focuses on how much the dependent variable will change as a result of a one-unit change in the independent variable at a specific point in time. Long-run propensity, on the other hand, refers to the cumulative effect of changes in the independent variable on the dependent variable over time. It considers both immediate and delayed impacts (lags) of the independent variable on the dependent variable.

25

What are the six key assumptions made about Ordinary Least Squares (OLS) regression in time series analysis when working with finite samples?

  • Linearity: The relationship between the dependent variable and the independent variables is linear.

  • No serial correlation (no autocorrelation): The error terms are independent of each other.

  • Homoscedasticity: The variance of the error terms is constant across all levels of the independent variables.

  • Normality of errors: The error terms are normally distributed.

  • No perfect multicollinearity: There is no perfect linear relationship among the independent variables.

  • Exogeneity: The independent variables are not correlated with the error term.

26

Given the equation log(𝑦𝑡) = 1.5 + 2.5 log(𝑧𝑡) + 1.3 log(𝑧𝑡−1) + 0.5 log(𝑧𝑡−2) − 0.7 log(𝑧𝑡−3) + 𝑢𝑡, how do we differentiate between short-run and long-run elasticities?

Short-run elasticity refers to the immediate effect of a change in the independent variable z on the dependent variable y at a specific time t. In this equation, the coefficient of log(𝑧𝑡) (which is 2.5) represents the short-run elasticity of y with respect to z. This means that a 1% increase in 𝑧𝑡 will result in a 2.5% increase in 𝑦𝑡 in the short run.

Long-run elasticity captures the cumulative effects of changes in z over multiple time periods on y. In this model, the long-run elasticity can be calculated by summing the coefficients of the current and lagged values of z: Long-Run Elasticity = 2.5 + 1.3 + 0.5 − 0.7 = 3.6. This implies that a 1% increase in z (considering its current value and the impacts from the previous three periods) leads to a 3.6% increase in y in the long run.