Quantitative Methods (Full)

Last updated 11:18 AM on 4/11/26


144 Terms

New cards

Why are linear models not ideal for use in economic situations?

Residuals from a linear trend model of economic data tend to be serially correlated.

Ideally, residual terms are unpredictable and uncorrelated.

New cards

What do you do if the value being modelled grows exponentially instead of linearly?

Take the natural log of the dependent variable (and exponentiate to get back to the original level):

ln(y_t) = b0 + b1·t + ε_t, so

y_t = e^(b0 + b1·t + ε_t)
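
A minimal sketch of the log-linear trend above, using made-up data that grows 5% per period. The helper `fit_line` and the names `b0` (intercept) and `b1` (slope) are illustrative labels, not from the notes.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1 * x (one regressor)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical series: starts at 100 and grows 5% per period.
t = list(range(1, 11))
y = [100 * 1.05 ** period for period in t]

# Regress ln(y) on t, then exponentiate to return to the level of y.
b0, b1 = fit_line(t, [math.log(value) for value in y])
forecast_t11 = math.exp(b0 + b1 * 11)
growth_rate = math.exp(b1) - 1  # constant per-period growth rate implied by b1
```

Because the data grow at a constant rate, ln(y) is a straight line in t, which is why the log-linear form fits where a linear trend would not.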

New cards

When would a linear model be appropriate, as opposed to a log-linear model?

When the variable grows by an approximately constant amount each period

New cards

When would a log-linear model be appropriate, as opposed to a linear model?

When the variable grows at an approximately constant rate (percentage) each period, i.e. exponential growth

New cards

What are the three requirements for a time series to be covariance stationary?

Constant and finite:
- expected value in all periods
- variance in all periods
- covariance with lagged values of itself (for any given lag) in all periods

New cards

What happens to time series without covariance stationarity?

Results are economically invalid
regression will lead to spurious results
Estimate of b will be biased
Hypothesis tests will be invalid

New cards

What is an autoregressive model?

Independent variables are lagged (historical) values of the dependent variable

New cards

within AR, what does it mean for a model to be incomplete?

information within the data that the model is not capturing

New cards

How do you correct an AR model with significant serial correlation (autocorrelation), i.e. an incomplete model?

Increase the number of lags until there is no significant autocorrelation
Testing for autocorrelation in an AR model:
Test the residual autocorrelations with a t-test

The Durbin-Watson test doesn’t work for AR models (it is usually used to test for serial correlation in other regressions)
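
A rough sketch of the t-test on residual autocorrelations. It assumes the usual approximation that the standard error of each residual autocorrelation is 1/√T, with T the number of residuals; the function names and the alternating residuals are made up for illustration.

```python
import math

def residual_autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    num = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    den = sum(d ** 2 for d in dev)
    return num / den

def autocorrelation_t_stat(residuals, lag):
    """t = autocorrelation / (1 / sqrt(T)), assuming a standard error of 1/sqrt(T)."""
    return residual_autocorrelation(residuals, lag) * math.sqrt(len(residuals))

# Hypothetical residuals that flip sign every period (T = 16).
residuals = [1.0, -1.0] * 8
t_lag1 = autocorrelation_t_stat(residuals, 1)
t_lag2 = autocorrelation_t_stat(residuals, 2)
```

A large |t| at any lag suggests the model is incomplete and another lag may be needed.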

New cards

When is a time series mean-reverting?

It falls when level above mean
It rises when level below mean

New cards

How does the formula change for the mean-reverting level of an AR(1) regression?

x_t = b0 + b1·x_(t-1)

At the mean-reverting level x_t = x_(t-1), so this becomes:

x_t = b0 / (1 - b1)
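
A small worked example of the mean-reverting level, assuming an AR(1) model with intercept `b0` and slope `b1` (illustrative names and made-up coefficient values).

```python
def mean_reverting_level(b0, b1):
    """Level at which the AR(1) forecast equals the current value."""
    if abs(b1) >= 1:
        raise ValueError("no finite mean-reverting level when |b1| >= 1")
    return b0 / (1 - b1)

def ar1_forecast(x_prev, b0, b1):
    """One-step-ahead forecast from x_t = b0 + b1 * x_(t-1)."""
    return b0 + b1 * x_prev

b0, b1 = 1.2, 0.4  # hypothetical coefficients
level = mean_reverting_level(b0, b1)      # 1.2 / 0.6, i.e. about 2.0
above = ar1_forecast(3.0, b0, b1)         # starts above the level, forecast is lower
below = ar1_forecast(1.0, b0, b1)         # starts below the level, forecast is higher
```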

New cards

Can you have multiple Dependent or Independent variables?

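In multiple regression there is one dependent variable and there can be two or more independent variables. Independent variables are those that can be manipulated, while the dependent variable is influenced by changes in the independent variables in a given study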

New cards

What are the 5 assumptions of regression?

Linearity - linear relationship between the dependent and independent variables
Homoskedasticity - constant variance of the error term
Error independence - observations are independent, so errors are uncorrelated across observations
Normality - residuals are normally distributed
Variable independence - no exact linear relationship between two or more independent variables

New cards

What would be the 5 violations of regression?

Nonlinearity
Heteroskedasticity
Serial correlation or autocorrelation
non-normality
Multicollinearity

New cards

What does the error term represent?

The stochastic or random part of the model, capturing any unexplained variation in the dependent variable due to randomness, measurement errors, or unobserved factors

New cards

What do the independent variables represent?

The deterministic part of the model, quantifying the observed relationship between the independent variables and the dependent variable

New cards

What is the coefficient of determination and what does it do?

Also known as R-squared, it measures the goodness of fit of an estimated regression to the data. It can also be defined as the proportion of the variation of the dependent variable that is explained by the independent variables.

New cards

Quick formula for R-Squared:

SSR / SST (think alphabet), where SSR is the regression (explained) sum of squares and SST is the total sum of squares
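
A quick sketch of the calculation, reading SSR as the regression (explained) sum of squares. The data and the fitted values (from an OLS line y = 0.5 + 1.6x on x = 1..4) are made up for illustration.

```python
def r_squared(actual, fitted):
    """R^2 = SSR / SST, with SSR the explained (regression) sum of squares."""
    y_bar = sum(actual) / len(actual)
    sst = sum((y - y_bar) ** 2 for y in actual)  # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)  # explained variation
    return ssr / sst

actual = [2.0, 4.0, 5.0, 7.0]
fitted = [2.1, 3.7, 5.3, 6.9]  # OLS fitted values for the hypothetical data
r2 = r_squared(actual, fitted)  # 12.8 / 13, roughly 0.985
```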

New cards

Do you want more or fewer variables for multiple linear regression?

Usually you want fewer, to avoid overfitting (Less is More). R-squared never falls as you add independent variables, even when they add no real explanatory power

New cards

Why is adjusted R² a little bit better than R²?

Doesn’t automatically go up with the addition of more independent variables

New cards

How do you determine what the addition of a new variable will have on Adjusted R-Squared?

If the new coefficient’s |t-stat| > 1.0, then adjusted R² will go up

If the new coefficient’s |t-stat| < 1.0, then adjusted R² will go down
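
The rule above follows from the adjusted R² formula, which penalises each extra variable. A minimal sketch with made-up numbers (n observations, k independent variables):

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - ((n - 1) / (n - k - 1)) * (1 - R^2)."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r2)

base = adjusted_r_squared(0.600, n=30, k=2)    # about 0.570
weak = adjusted_r_squared(0.605, n=30, k=3)    # weak new variable: about 0.559
strong = adjusted_r_squared(0.650, n=30, k=3)  # useful new variable: about 0.610
```

R² rises in both cases, but adjusted R² only rises when the new variable explains enough to outweigh the lost degree of freedom.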

New cards

What does a lower Akaike’s information criterion (AIC) indicate?

A lower AIC indicates a better-fitting model (you want it to be as low as possible)

New cards

What does Bayesian Information Criteria (BIC) indicate?

A lower BIC indicates a better-fitting model

New cards

When do we prefer AIC to BIC or vice versa?

We use AIC when the model is used for prediction; we use BIC if all we’re interested in is the best goodness of fit
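
A sketch comparing two hypothetical models. It assumes the SSE-based forms AIC = n·ln(SSE/n) + 2(k + 1) and BIC = n·ln(SSE/n) + ln(n)(k + 1); other texts define them through the log-likelihood, so treat these formulas as an assumption.

```python
import math

def aic(n, k, sse):
    """AIC = n * ln(SSE / n) + 2 * (k + 1) (assumed SSE-based form)."""
    return n * math.log(sse / n) + 2 * (k + 1)

def bic(n, k, sse):
    """BIC = n * ln(SSE / n) + ln(n) * (k + 1) (assumed SSE-based form)."""
    return n * math.log(sse / n) + math.log(n) * (k + 1)

# Hypothetical models fitted to the same 50 observations.
aic_a, bic_a = aic(50, 2, 120.0), bic(50, 2, 120.0)  # smaller model
aic_b, bic_b = aic(50, 4, 110.0), bic(50, 4, 110.0)  # larger model, lower SSE
```

With these numbers AIC prefers the larger model while BIC prefers the smaller one, because BIC's ln(n) penalty per coefficient is heavier than AIC's 2 once n is above about 7.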

New cards

In terms of Adjusted R², which model would we want to use?

Ideally the one with the highest Adjusted R² value, if only R² and Adjusted R² are given; the choice could change if AIC and BIC values are also given

New cards

What is the Coefficient?

The slope of the independent variable, and it represents the expected change in the dependent variable for a 1 unit change in the independent variable (Holding all other variables constant - this is really key to remember)

New cards

What does a coefficient of 0 mean?

The independent variable has no explanatory power for the dependent variable and can probably be excluded from the regression

New cards

What are the degrees of freedom?

For multiple regression: # of data points - # of regression coefficients

n - (k + 1), where n is the number of observations and k is the number of independent variables

New cards

What is the really key thing to remember about the coefficient of independent variables?

This is based on the change that would occur for a 1 unit change HOLDING ALL OTHER VARIABLES CONSTANT

New cards

How do we interpret the hypothesis test and rejecting/ not rejecting the null hypothesis?

If the absolute value of the calculated t-statistic > the t-critical value, we can reject the null hypothesis. If it is < t-critical, we cannot reject the null hypothesis

New cards

What is an unrestricted model?

A model that includes ALL the variables in the initial specification

New cards

What is a restricted (or nested) model?

Restricts the slope to 0, for one or more independent variables - not all of them are used. It is nested in the unrestricted model

New cards

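What is the criterion for the F-test for a joint test of slope coefficients?

Reject the null hypothesis (that the tested slope coefficients all equal 0) if the calculated F-statistic exceeds the critical F-value for the selected significance level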
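
A minimal sketch of the test statistic, assuming the standard restricted-versus-unrestricted form F = ((SSE_R - SSE_U) / q) / (SSE_U / (n - k - 1)), where q is the number of excluded variables and k is the number of independent variables in the unrestricted model. The inputs are made up.

```python
def joint_f_statistic(sse_restricted, sse_unrestricted, q, n, k):
    """F-statistic for testing that q slope coefficients are jointly zero."""
    numerator = (sse_restricted - sse_unrestricted) / q
    denominator = sse_unrestricted / (n - k - 1)
    return numerator / denominator

# Hypothetical: dropping 2 of 4 variables raises SSE from 120 to 150 (n = 50).
f_stat = joint_f_statistic(150.0, 120.0, q=2, n=50, k=4)  # 15 / (120 / 45) = 5.625
```

The result is then compared with the critical F-value for q and n - k - 1 degrees of freedom, taken from a table.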

New cards

What is Model Error

Error between a predicted value and the actual value for a dependent variable within the data set

New cards

What is sampling error?

errors created by forecasting independent variables for use in forecasting a dependent variable

New cards

What is a logistic regression (logit) model?

Represents the dependent variable as the natural logarithm of the odds, p / (1 - p), which confines the estimated probability to a range between 0 and 1
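
A short sketch of the log-odds transformation and its inverse, showing why the implied probability always stays between 0 and 1. Function names are illustrative.

```python
import math

def log_odds(p):
    """Logit: natural log of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))

def probability(z):
    """Inverse logit: maps any real-valued z back to a probability."""
    return 1 / (1 + math.exp(-z))

p_mid = probability(0.0)      # log odds of 0 means even odds, p = 0.5
p_low = probability(-10.0)    # very negative log odds, p close to 0
p_high = probability(10.0)    # very positive log odds, p close to 1
round_trip = probability(log_odds(0.8))
```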

New cards

When should a logistic regression (logit) model be used?

When the dependent variable is discrete (i.e. not continuous)

New cards

What is the stochastic part of a model?

The error term

New cards

What is the next step after estimating the regression model?

Analyse scatterplots of variables and residuals

New cards

What is the next step after analysing the scatterplots of variables and residuals?

Seeing if the regression assumptions are satisfied

New cards

What is the next step after seeing if the regression assumptions are satisfied (and they are)

Checking if the goodness of fit is satisfactory/ significant

New cards

What is the next step after seeing if the regression assumptions are satisfied (and they are not)

adjust the model

New cards

What is the next step after checking if the goodness of fit is satisfactory/ significant? (and they are)

Test with out-of-sample data

New cards

What is the next step after checking if the goodness of fit is satisfactory/ significant? (and they are not)

adjust the model

New cards

In terms of interpreting (scatterplot) relationships, do we want to have little or no correlation, negative correlation, or positive correlation?

We want to have little or no correlation between independent variables because it suggests low multicollinearity, which is a desirable characteristic. It tells us that each variable provides unique information, which leads to more stable and reliable coefficient estimates, simplifies model interpretation, and enhances performance by avoiding redundancy among predictors

New cards

Based on p-values, when would it be correct to reject the null hypothesis?

If the p-value for the independent variable is less than the level of significance. You should not reject the null if the p-value is greater than the level of significance

New cards

How many dummy variables should be used to incorporate qualitative independent variables into a regression model?

n - 1 dummy variables, where n is the number of categories
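
A small illustration with four quarters: one category is chosen as the baseline and gets no dummy, so four categories need three dummy variables. The `dummy_encode` helper is hypothetical.

```python
def dummy_encode(values, baseline):
    """Return (dummy names, rows of 0/1 flags), omitting the baseline category."""
    categories = sorted(set(values) - {baseline})
    rows = [[1 if value == c else 0 for c in categories] for value in values]
    return categories, rows

quarters = ["Q1", "Q2", "Q3", "Q4", "Q1"]
names, rows = dummy_encode(quarters, baseline="Q4")
# names -> ["Q1", "Q2", "Q3"]; a Q4 observation is the all-zero row
```

Using all four dummies alongside an intercept would create an exact linear relationship among the independent variables.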

New cards

If we had a concern that a model might have an artificially large R² and t-statistics that are understated, what regression assumption is likely violated?

Multicollinearity - standard errors for each coefficient become inflated which results in understated t-statistics, which in turn leads to coefficients being incorrectly classified as not statistically significant. It would also have inflated R² and F-statistic values, and seem to be a better fit than it actually is

New cards

When does multicollinearity occur?

when at least two independent variables are highly correlated

New cards

What does the standard error of the forecast do?

Quantifies the uncertainty around the prediction; it does NOT improve the forecast of the dependent variable

New cards

What is Model Specification

The set of variables included in the regression and the regression equation’s functional form

New cards

What does it mean to have a sound economic basis for your model?

Economic reasoning behind the choice of variables and their interactions

New cards

What does parsimony mean?

Less is more - each variable plays an essential role, additional variables don’t add spurious accuracy

New cards

What does good in-sample but bad out-of-sample performance mean?

This would be an example of overfitting: an overfit model explains the data used to fit it, but may not work well with data outside the set

New cards

What does appropriate functional form mean?

A model should incorporate non-linear forms, if appropriate

New cards

What is Homoskedasticity?

(The ONE you want) Constant variance and one assumption for valid regression

New cards

What is Heteroskedasticity?

(Not the one you want) Nonconstant variance and violates assumptions

New cards

What are the types of heteroskedasticity?

Unconditional (not a problem in linear regression) and Conditional (size of error terms is related to value of the independent variables, and is a problem in linear regression)

New cards

How do you detect Heteroskedasticity?

Breusch-Pagan (BP) test - one-tail chi-square test
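
A rough sketch of the BP statistic for a regression with one independent variable, assuming the common form BP = n × R², where R² comes from regressing the squared residuals on the independent variable. The residuals are made up, with their size growing with x.

```python
def r_squared_simple(xs, ys):
    """R^2 from regressing ys on a single regressor xs (with intercept)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

def breusch_pagan_statistic(xs, residuals):
    """BP = n * R^2 of the squared residuals regressed on x."""
    squared = [e ** 2 for e in residuals]
    return len(xs) * r_squared_simple(xs, squared)

xs = list(range(1, 11))
# Hypothetical residuals: alternating sign, magnitude proportional to x.
residuals = [0.1 * x * (1 if x % 2 else -1) for x in xs]
bp = breusch_pagan_statistic(xs, residuals)
# One-tail test: compare with the chi-square critical value for k = 1
# degree of freedom, which is 3.841 at the 5% level.
```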

New cards

What is positive serial correlation?

Residuals tend to go in groups which violates assumptions

New cards

What is negative serial correlation?

Residuals tend to bounce back and forth which violates assumption

New cards

Are coefficient estimates largely affected or unaffected for positive/ negative serial correlation?

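Largely unaffected for both, provided no independent variable is a lagged value of the dependent variable; it is the standard errors that are distorted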

New cards

In terms of serial correlation, what does an F-stat that is too large indicate? too small?

Too large means positive, too small means negative

New cards

Are standard errors too high or too low for positive/ negative serial correlation?

too low for positive, too high for negative

New cards

Are there more Type I or Type II errors for positive/ negative serial correlation

More Type I in positive, More Type II in negative

New cards

Is False significance/ false insignificance associated with positive/ negative serial correlation

False insignificance is associated with negative serial correlation, false significance with positive serial correlation

New cards

How to test for serial correlation?

Durbin-Watson Test and Breusch-Godfrey Test
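
A minimal sketch of the Durbin-Watson statistic, DW = Σ(e_t - e_(t-1))² / Σ e_t², on two made-up residual series. Values well below 2 point to positive serial correlation and values well above 2 to negative serial correlation.

```python
def durbin_watson(residuals):
    """Sum of squared changes in the residuals over the sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den

grouped = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # residuals move in runs
bouncing = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # residuals flip every period

dw_positive = durbin_watson(grouped)    # 4 / 6, well below 2
dw_negative = durbin_watson(bouncing)   # 20 / 6, well above 2
```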

New cards

Can serial correlation be eliminated?

No, serial correlation cannot be eliminated; the standard errors are simply adjusted to account for it

New cards

What is multicollinearity

Two or more independent variables are highly correlated with each other

New cards

What are the effects of multicollinearity

model estimates of dependent variable are unaffected
Standard errors of coefficients are too large: t-stat are too small

New cards

How do you detect multicollinearity?

Visually, it will look absolutely fine on the scatter plot.

However, a high R-squared and a significant F-stat combined with insignificant (very low) t-stats for all slope coefficients is evidence of multicollinearity.

Multicollinearity may still exist even when the F-stat is insignificant or the t-statistics are significant

New cards

What is the Variance Inflation Factor (VIF)?

A VIF exists for each independent variable in a multiple regression:

VIF_j = 1 / (1 - R_j²)

R_j² comes from regressing independent variable j against the other independent variables.

VIF > 5 warrants further investigation of the given independent variable

VIF > 10 indicates serious multicollinearity requiring correction
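
A small sketch applying the formula and the two thresholds. The R² inputs are made up; each would come from regressing one independent variable on the others.

```python
def vif(r2_on_other_variables):
    """VIF = 1 / (1 - R^2), R^2 from regressing one regressor on the rest."""
    return 1 / (1 - r2_on_other_variables)

def vif_flag(value):
    """Apply the VIF > 5 and VIF > 10 rules of thumb."""
    if value > 10:
        return "serious multicollinearity"
    if value > 5:
        return "investigate further"
    return "ok"

flags = [vif_flag(vif(r2)) for r2 in (0.50, 0.85, 0.95)]
# VIFs of about 2.0, 6.7 and 20.0 respectively
```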

New cards

How do you correct for multicollinearity?

Exclude one or more independent variables from the model until multicollinearity is no longer present, or
Use a different proxy for one of the variables
Increase the sample size

New cards

Where does serial correlation typically occur?

time-series data sets

New cards

What does the Breusch-Godfrey test check for?

Checks the regression for serial correlation

New cards

What does Variance Inflation Factor (VIF) test for?

Multicollinearity

New cards

What does the Breusch-Pagan test for?

Tests for conditional heteroskedasticity

New cards

What is a potential consequence of omitted variables?

Heteroskedasticity or serial correlation

New cards

What is a potential consequence of inappropriate variable form?

Heteroskedasticity

New cards

What is a potential consequence of inappropriate variable scaling?

Heteroskedasticity or Multicollinearity

New cards

What is a potential consequence of inappropriate data pooling?

heteroskedasticity or serial correlation

New cards

Can patterns in serially correlated residuals contain information that has the potential to be exploited?

yes

New cards

Does conditional or unconditional heteroskedasticity cause errors in statistical inference?

Conditional

New cards

What does good out-of-sample performance mean?

Model generalises well (low risk of overfitting or underfitting)

New cards

What are the examples of potentially influential data points? (not violations of assumptions)

High-leverage points
Outliers
Influential Observations

New cards

What is a high-leverage point?

An extreme value of an independent variable

New cards

What is an outlier?

An extreme value of a dependent variable

New cards

What is an influential observation?

An observation whose inclusion may significantly alter regression results

New cards

What is the Measure of Leverage?

Leverage measures the distance between the value of the i-th observation of that independent variable and the mean value of that variable across all n observations:

0 < Leverage < 1

New cards

How do you look for high-leverage points?

Measure of Leverage

New cards

What does a high measure of leverage mean? low?

The higher the leverage, the more distant the observation from the mean for the variable

New cards

How do we determine if a point has a high measure of leverage?

h > 3((k + 1) / n)
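
A sketch of the cutoff for a regression with a single independent variable (k = 1). It assumes the simple-regression leverage formula h_i = 1/n + (x_i - x̄)² / Σ(x_j - x̄)²; the data are made up, with one extreme x value.

```python
def leverage_simple(xs):
    """Leverage of each observation in a one-variable regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (x - x_bar) ** 2 / sxx for x in xs]

def high_leverage_indices(leverages, k):
    """Indices where h > 3 * (k + 1) / n."""
    n = len(leverages)
    cutoff = 3 * (k + 1) / n
    return [i for i, h in enumerate(leverages) if h > cutoff]

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]       # last value is far from the mean
h = leverage_simple(xs)
flagged = high_leverage_indices(h, k=1)     # cutoff is 3 * 2 / 10 = 0.6
```

Only the last observation (h of about 0.91) exceeds the cutoff. The leverages sum to k + 1, which is a handy sanity check.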

New cards

In what scenario may multicollinearity not be a major issue

If the goal of the analysis is to predict the dependent variable, rather than to understand the roles of the independent variables

New cards

What is my story prompt for remembering the regression process?

Eager Captains ESpecially Study Sailors Guarding The Buried Past

New cards

What is a studentized residual?

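The quotient from dividing a residual by an estimate of its standard deviation; a form of Student’s t-statistic, with the estimate of error varying between points

Studentized residual = e / s, where e is the residual and s is the estimate of its standard deviation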

New cards

What is Cook’s distance?

A measure of how much the estimated values of the regression change if observation i is deleted from the sample

New cards

What does it say about the observation if Cook’s Distance (D) is > 0.5

May be influential and merits further investigation

New cards

What does it say about the observation if Cook’s Distance (D) is > 1.0

Highly likely to be an influential data point

New cards

What does it say about the observation if Cook’s Distance (D) is

> 2 × (k/n)^0.5, where k is the number of independent variables and n is the number of observations

Highly likely to be an influential data point
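
The three Cook's distance rules of thumb can be combined into one check. This sketch takes the denominator in the square-root rule to be the number of observations n, which is an assumption; the function name and inputs are made up.

```python
import math

def cooks_distance_flag(d, k, n):
    """Classify an observation using the D > 0.5, D > 1 and D > 2*sqrt(k/n) rules."""
    if d > 1.0 or d > 2 * math.sqrt(k / n):
        return "highly likely influential"
    if d > 0.5:
        return "may be influential"
    return "not flagged"

flag_a = cooks_distance_flag(0.45, k=2, n=50)  # cutoff 2 * sqrt(0.04) = 0.4
flag_b = cooks_distance_flag(0.30, k=2, n=50)
flag_c = cooks_distance_flag(0.60, k=4, n=16)  # cutoff 2 * sqrt(0.25) = 1.0
```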


New cards

Does the measure of Leverage apply to Dependent or Independent variables?

Independent