Model Formulation
Intuition: How do different variables relate to an outcome?
Definition: The core assumption is a linear relationship between a dependent variable Y and one or more independent variables X_i
Components:
Y: Dependent variable
X_i: Independent variables (predictors)
B_0, B_i, \epsilon: Intercept, coefficients, and error term (residual)
Uses: The basis for models like CAPM, factor models, and many trading strategies
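A minimal numpy sketch of this formulation, simulating data from the model Y = B_0 + B_1 X_1 + B_2 X_2 + \epsilon (the coefficient values and noise scale below are illustrative assumptions, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
m = 500                               # number of observations
X1 = rng.normal(size=m)               # first predictor
X2 = rng.normal(size=m)               # second predictor
eps = rng.normal(scale=0.5, size=m)   # unobservable error term (residual)
b0, b1, b2 = 1.0, 2.0, -0.5           # illustrative intercept and coefficients
Y = b0 + b1 * X1 + b2 * X2 + eps      # dependent variable
```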
Ordinary Least Squares (OLS) Estimation: Matrix Form
Intuition: Picks the coefficients B that make the total squared error as small as possible
Definition: Finds the coefficients B that minimize the Residual Sum of Squares
Components:
Data matrix X
Response vector y
Closed-form solution built from transposes and inverses: \hat{B} = (X^T X)^{-1} X^T y
Uses: Produces the coefficient estimates \hat{B} that the linear regression model needs
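A numpy sketch of the normal equations \hat{B} = (X^T X)^{-1} X^T y on simulated data (solving the linear system rather than forming an explicit inverse, which is numerically safer):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 1000
# Data matrix X: intercept column plus two predictors
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
beta_true = np.array([1.0, 2.0, -0.5])            # illustrative true coefficients
y = X @ beta_true + rng.normal(scale=0.1, size=m)  # response vector y

# Solve (X^T X) beta = X^T y instead of inverting X^T X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With enough data and small noise, beta_hat lands very close to beta_true.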
OLS Assumption: Linearity
Intuition: A one-unit change in a predictor always moves Y by the same amount (its coefficient)
Definition: The model is linear in the parameters B
Components:
Coefficients B
Unobservable random error term \epsilon
Uses: Ensures the model is correctly specified; linearity is in the parameters, so transformed predictors (e.g., X^2) are still allowed
OLS Assumption: No multicollinearity
Intuition: There isn’t a strong relationship between any predictors
Definition: X^TX is invertible. There is no perfect linear relationship between predictors.
Components:
Data matrix X
Data matrix X transposed (^T)
Their product X^T X must be invertible
Uses: Prevents inflated standard errors and unstable coefficient estimates
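A small numpy illustration (simulated data, not from the source): when one predictor is nearly a copy of another, X^T X becomes ill-conditioned, which is exactly what inflates the variance of the OLS estimates.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 200
x1 = rng.normal(size=m)
x2 = x1 + rng.normal(scale=1e-3, size=m)          # x2 is almost a copy of x1

X_collinear = np.column_stack([x1, x2])           # near-perfect multicollinearity
X_independent = np.column_stack([x1, rng.normal(size=m)])

# Condition number of X^T X: huge when predictors are nearly collinear
cond_bad = np.linalg.cond(X_collinear.T @ X_collinear)
cond_ok = np.linalg.cond(X_independent.T @ X_independent)
```

cond_bad is orders of magnitude larger than cond_ok, signaling unstable coefficient estimates.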
OLS Assumption: Homoscedasticity
Intuition: The spread of the errors is the same no matter which observation we look at
Definition: The error variance is constant across all observations
Components:
Var(\epsilon_i | X) = \sigma^2 for every observation i
Uses: Often violated in finance, where high-return periods also have high volatility; OLS remains unbiased, but its standard errors become unreliable
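A quick simulated illustration of a violation (heteroscedasticity): if the error standard deviation grows with the predictor, the residual spread differs across the sample, breaking Var(\epsilon_i | X) = \sigma^2.

```python
import numpy as np

rng = np.random.default_rng(11)
m = 5000
x = rng.uniform(1, 10, size=m)
eps = rng.normal(scale=x, size=m)    # error sd proportional to x: heteroscedastic

# Compare residual spread in the low-x and high-x halves of the sample
low = eps[x < 5.5].std()
high = eps[x >= 5.5].std()
```

Under homoscedasticity the two spreads would match; here the high-x half is clearly noisier.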
OLS Assumption: No autocorrelation
Intuition: Error terms aren’t related to each other at all
Definition: Errors are uncorrelated across observations
Components:
Covariance between any two distinct error terms given X: Cov(\epsilon_i, \epsilon_j | X)
(1) is equal to zero for all i \neq j
Uses: Violations are common in time series data (e.g., momentum strategies), where errors persist from one period to the next
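A simulated sketch of this violation: AR(1) errors (each error is 0.8 times the previous one plus fresh noise, an illustrative choice) show a large lag-1 correlation, while i.i.d. errors do not.

```python
import numpy as np

rng = np.random.default_rng(3)
m = 2000
eps = np.empty(m)
eps[0] = rng.normal()
for t in range(1, m):                 # AR(1): e_t = 0.8 * e_{t-1} + u_t
    eps[t] = 0.8 * eps[t - 1] + rng.normal()

# Lag-1 correlation of the autocorrelated series vs an i.i.d. series
lag1_corr = np.corrcoef(eps[:-1], eps[1:])[0, 1]
iid = rng.normal(size=m)
lag1_iid = np.corrcoef(iid[:-1], iid[1:])[0, 1]
```

lag1_corr sits near 0.8 while lag1_iid hovers near zero; checking residual autocorrelation like this is the intuition behind tests such as Durbin-Watson.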
OLS Assumption: Normality (Maximum Likelihood Estimator)
Intuition: The OLS estimator is also the MLE when the errors follow a bell curve centered at 0
Definition: If we add the assumption that \epsilon ~ N(0, \sigma^2 ), the OLS estimator is also the MLE
Components:
First five assumptions (Linearity, exogenous, no multicollinearity, homoscedasticity, no autocorrelation)
Error term is normally distributed
Uses: Justifies the estimator \hat{B}; exact t-tests and F-tests rely on this assumption (without it they hold only approximately in large samples)
OLS Assumption: Strictly Exogenous
Intuition: Error term always has expected value of 0 no matter the value of the independent variables
Definition: E[\epsilon_i | X] = 0: the errors carry no information about the predictors (stronger than mere uncorrelatedness)
Components:
The expectation of each error term given the data set X is equal to zero
Uses: Frequently violated in finance (e.g., omitted variables, reverse causality), which leads to biased estimators
R^2
Intuition: How much variance in the result is explained by the data matrix
Definition: Proportion of the variance in Y that is predictable from X
Components:
1 - (RSS / TSS)
RSS: Residual Sum of Squares (variation in error between observed data and modeled values)
TSS: Total Sum of Squares (variation in the observed data)
Uses: Compare models with same number of predictors
Adjusted R^2
Intuition: How much variance in the result is explained by the data matrix, penalizing models that carry irrelevant predictors
Definition: Proportion of the variance in Y that is predictable from X, adjusted for the number of predictors
Components:
1 - ( (RSS / (m - p - 1)) / ( TSS / (m - 1) ) )
RSS and TSS
Number of predictors (p)
Uses: Compare models with different numbers of predictors; higher is better
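A numpy sketch (simulated data) computing both R^2 = 1 - RSS/TSS and the adjusted version, and showing why the adjustment matters: adding a junk predictor can never lower R^2, but adjusted R^2 penalizes it.

```python
import numpy as np

rng = np.random.default_rng(5)
m = 100
x = rng.normal(size=m)
y = 1.0 + 2.0 * x + rng.normal(size=m)

def fit_r2(X, y):
    """Fit OLS and return (R^2, adjusted R^2)."""
    n, k = X.shape
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    rss = float(np.sum((y - X @ beta) ** 2))     # Residual Sum of Squares
    tss = float(np.sum((y - y.mean()) ** 2))     # Total Sum of Squares
    p = k - 1                                    # predictors, excluding intercept
    r2 = 1 - rss / tss
    adj = 1 - (rss / (n - p - 1)) / (tss / (n - 1))
    return r2, adj

X_small = np.column_stack([np.ones(m), x])
X_big = np.column_stack([np.ones(m), x, rng.normal(size=m)])  # junk predictor added
r2_small, adj_small = fit_r2(X_small, y)
r2_big, adj_big = fit_r2(X_big, y)
```

R^2 never decreases when a predictor is added to a nested model, which is why adjusted R^2 is the fairer yardstick across model sizes.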
Standard Error (SE) of \beta_{i}
Intuition: How much the estimated coefficient would vary from sample to sample around the true value
Definition: Estimated standard deviation of a parameter estimate
Components:
Square root of variance of coefficient
Uses: Construct confidence intervals and perform hypothesis tests on individual coefficients
t-statistic \beta
Intuition: checks whether an individual coefficient is distinguishable from zero (i.e., whether the predictor has a real relationship with Y)
Definition: the ratio of a coefficient’s estimated value minus its hypothesized value (0) to its standard error
Components:
t = (\beta / SE(\beta))
Uses: test null hypothesis H0: \beta_{i} = 0. Follows a t-distribution with m-p-1 degrees of freedom
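A numpy sketch (simulated data) tying the last two cards together: SE(\beta_i) comes from the diagonal of \sigma^2 (X^T X)^{-1}, and t = \beta / SE(\beta) then tests H0: \beta_i = 0. One slope is truly nonzero and one is truly zero, so their t-statistics differ sharply.

```python
import numpy as np

rng = np.random.default_rng(6)
m = 500
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
beta_true = np.array([0.0, 1.0, 0.0])        # second slope is truly zero
y = X @ beta_true + rng.normal(size=m)

p = X.shape[1] - 1
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (m - p - 1)     # unbiased error-variance estimate
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))  # SE of each coefficient
t_stats = beta_hat / se                      # t = beta_hat / SE(beta_hat)
```

|t| is large for the real slope and small for the zero one, matching the t-distribution with m - p - 1 degrees of freedom under H0.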
F-statistic
Intuition: Does the regression model explain a meaningful amount of variation in the dependent variable compared to noise?
Definition: Ratio that compares explained variance per parameter to unexplained variance per remaining degree of freedom
Components:
Numerator: Explained variation per parameter, (TSS - RSS) / p
Denominator: Unexplained variation per remaining degree of freedom, RSS / (m - p - 1)
Uses: tests the null hypothesis that all slope coefficients are jointly equal to zero
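A numpy sketch (simulated data) of F = ((TSS - RSS)/p) / (RSS/(m - p - 1)); because the true slopes are nonzero here, the statistic comes out far above 1 and the joint null is clearly rejected.

```python
import numpy as np

rng = np.random.default_rng(7)
m, p = 200, 2
X = np.column_stack([np.ones(m), rng.normal(size=(m, p))])
y = X @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=m)  # nonzero true slopes

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
rss = np.sum((y - X @ beta_hat) ** 2)        # unexplained variation
tss = np.sum((y - y.mean()) ** 2)            # total variation
f_stat = ((tss - rss) / p) / (rss / (m - p - 1))
```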
Ridge Regression
Intuition: Improves prediction by shrinking coefficient magnitudes to reduce variance at the cost of introducing some bias
Definition: Regularized linear regression that minimizes squared errors plus an L2 penalty on the coefficients
Components:
Loss function: RSS measuring fit to the data
L2 penalty: Squared magnitude of coefficients that discourages large weights
Regularization parameter (lambda): Controls strength of coefficient shrinkage
Uses: Handles multicollinearity and improves out-of-sample performance in high-dimensional regressions
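A sketch of ridge's closed form, \hat{B} = (X^T X + \lambda I)^{-1} X^T y, on simulated near-collinear data (illustrative; the intercept is usually left unpenalized, which is sidestepped here by using mean-zero predictors and no intercept):

```python
import numpy as np

rng = np.random.default_rng(8)
m = 200
x1 = rng.normal(size=m)
x2 = x1 + rng.normal(scale=0.01, size=m)     # nearly collinear predictors
X = np.column_stack([x1, x2])
y = X @ np.array([1.0, 1.0]) + rng.normal(scale=0.5, size=m)

def ridge(X, y, lam):
    """Solve (X^T X + lam * I) beta = X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)      # lam = 0 recovers plain OLS: unstable here
beta_ridge = ridge(X, y, 10.0)   # L2 penalty shrinks the coefficients
```

The ridge solution always has a smaller norm than the OLS solution, and the shrinkage mostly suppresses the poorly identified collinear direction.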
Lasso Regression
Intuition: performs both shrinkage and variable selection by forcing some coefficients exactly to zero
Definition: regularized linear regression that minimizes squared errors plus an L1 penalty on the coefficients
Components:
Loss function: residual sum of squares capturing model fit
L1 penalty: Absolute values of coefficients that promote sparsity
Regularization parameter (lambda): Determines shrinkage and variable elimination
Uses: Feature selection when many predictors may be irrelevant
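A sketch of lasso via coordinate descent with soft-thresholding (a standard algorithm for the L1 penalty, though not named in the source). This simplified version assumes the columns of X are standardized so each column's squared norm equals m:

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values inside [-lam, lam] become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2m)||y - X beta||^2 + lam * ||beta||_1."""
    m, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            resid = y - X @ beta + X[:, j] * beta[j]   # partial residual for j
            rho = X[:, j] @ resid / m
            beta[j] = soft_threshold(rho, lam)
    return beta

rng = np.random.default_rng(9)
m = 300
X = rng.normal(size=(m, 5))
X = (X - X.mean(0)) / X.std(0)        # standardize columns
true_beta = np.array([2.0, 0.0, 0.0, -1.5, 0.0])   # most predictors irrelevant
y = X @ true_beta + rng.normal(scale=0.5, size=m)
beta_lasso = lasso_cd(X, y, lam=0.3)
```

Unlike ridge, the irrelevant coefficients are driven exactly to zero, which is the variable-selection property.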
Bias
Intuition: Error from approximating a real-world function with a simpler model
Definition: Error when the expected value of an estimator does not equal true parameter value
Components:
True parameter: The actual coefficient values generating the data
Estimator expectation: Average value of the estimated coefficients across samples
Model constraints: Assumptions or regularization that distort the estimator toward simpler models
Uses: Understand the bias-variance trade off and to justify regularization methods like ridge and lasso
Variance
Intuition: Error from model being too sensitive to training data
Definition: Expected squared deviation of a model’s prediction from its own average prediction across different training datasets
Components:
Training sample randomness: Different datasets drawn from the same process lead to different fitted models.
Estimator instability: Sensitivity of coefficients or prediction to changes in the data
Uses: Understand overfitting risk and to motivate regularization methods that stabilize model estimates
Bias-Variance Tradeoff
Intuition: finding the middle ground between how complex or simple a model should be
Definition: choosing model complexity to balance bias (underfitting) against variance (overfitting) and minimize total expected error
Components:
Complex model (high degree polynomial): low bias but high variance (overfitting)
Simpler model (OLS): high bias but low variance (underfitting)
Uses: Obtain the least amount of prediction error
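A simulation sketch of the tradeoff (the nonlinear ground truth, noise level, and polynomial degrees are illustrative): fitting a degree-1 and a degree-10 polynomial over many resampled training sets, then measuring bias and variance of the prediction at one test point.

```python
import numpy as np

rng = np.random.default_rng(10)

def true_f(x):
    return np.sin(2 * x)                     # nonlinear ground truth

def fit_predict(degree, x_test, n_sims=300, m=30):
    """Refit a degree-d polynomial on fresh data n_sims times; predict at x_test."""
    preds = np.empty(n_sims)
    for s in range(n_sims):
        x = rng.uniform(-1, 1, size=m)
        y = true_f(x) + rng.normal(scale=0.3, size=m)
        coefs = np.polyfit(x, y, degree)
        preds[s] = np.polyval(coefs, x_test)
    return preds

x0 = 0.5
simple = fit_predict(degree=1, x_test=x0)     # underfits: high bias, low variance
flexible = fit_predict(degree=10, x_test=x0)  # overfits: low bias, high variance

bias_simple = abs(simple.mean() - true_f(x0))
bias_flexible = abs(flexible.mean() - true_f(x0))
var_simple = simple.var()
var_flexible = flexible.var()
```

The simple model is systematically off (bias) but stable; the flexible model tracks the truth on average but swings wildly across training sets (variance). The best degree sits in between.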