These vocabulary flashcards cover random vectors and matrices, simple and multiple linear regression, goodness of fit measures, regression assumptions, and statistical inference as presented in the MAT 3522 lecture notes.
Random Vector
A vector whose components are random variables, such as $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}$.
Mean Vector
For a random vector $X$ with $p$ components, the mean vector is defined as $E(X) = \mu = \begin{pmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_p) \end{pmatrix}$.
Covariance
A measure of the direction of the linear relationship between two random variables $X$ and $Y$, defined by $\mathrm{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)]$ or, equivalently, $E(XY) - E(X)E(Y)$.
Positive Covariance (Cov(X,Y)>0)
Indicates that the variables X and Y tend to increase together.
Negative Covariance (Cov(X,Y)<0)
Indicates that one variable tends to increase while the other variable decreases.
Zero Covariance (Cov(X,Y)=0)
Indicates that there is no linear relationship between the two variables.
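The two covariance formulas and the three sign cases above can be checked numerically. The following is a minimal Python sketch, assuming made-up data and treating every data point as equally likely, so that each expectation becomes a plain average. The function names and data are illustrative and do not come from the lecture notes.

```python
def mean(values):
    """Average of a list, standing in for the expectation E(.)."""
    return sum(values) / len(values)


def cov_definition(xs, ys):
    """Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]."""
    mu_x, mu_y = mean(xs), mean(ys)
    return mean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])


def cov_shortcut(xs, ys):
    """Cov(X, Y) = E(XY) - E(X)E(Y)."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)


# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 5]    # tends to rise with x
y_down = [5, 4, 5, 4, 2]  # tends to fall as x rises

# Symmetric x values with y = x^2: a perfect but nonlinear relationship.
x_sym = [-2, -1, 0, 1, 2]
y_sq = [v * v for v in x_sym]

print("positive case:", cov_definition(x, y_up), cov_shortcut(x, y_up))
print("negative case:", cov_definition(x, y_down))
print("zero case:", cov_definition(x_sym, y_sq))
```

The last case is worth noting: the covariance is zero even though y is completely determined by x, which is why zero covariance only rules out a linear relationship.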
Variance-Covariance Matrix (Σ)
A symmetric matrix where the diagonal elements are variances ($\sigma_{ii} = \mathrm{Var}(X_i)$) and the off-diagonal elements are covariances ($\sigma_{ij} = \mathrm{Cov}(X_i, X_j)$).
Positive Semi-definite
A mathematical property that all covariance matrices possess: for any constant vector $a$, $a'\Sigma a = \mathrm{Var}(a'X) \ge 0$.
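As a rough illustration of the variance-covariance matrix and its positive semi-definite property, the sketch below builds Σ from three hypothetical data columns and evaluates the quadratic form a′Σa for a few arbitrary vectors a. It assumes the divide-by-n (population) convention, and checking a handful of vectors is only a spot check, not a proof.

```python
def mean(values):
    return sum(values) / len(values)


def cov(xs, ys):
    """Covariance with the divide-by-n convention (an assumption of this sketch)."""
    mu_x, mu_y = mean(xs), mean(ys)
    return mean([(x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)])


def covariance_matrix(columns):
    """Sigma[i][j] = Cov(X_i, X_j); the diagonal holds the variances."""
    return [[cov(ci, cj) for cj in columns] for ci in columns]


def quadratic_form(a, sigma):
    """a' Sigma a, which equals Var(a'X) and therefore cannot be negative."""
    p = len(a)
    return sum(a[i] * sigma[i][j] * a[j] for i in range(p) for j in range(p))


# Hypothetical observations of three variables X1, X2, X3.
data = [
    [1, 2, 3, 4, 5],
    [2, 4, 5, 4, 5],
    [5, 3, 4, 1, 2],
]
sigma = covariance_matrix(data)

is_symmetric = all(
    abs(sigma[i][j] - sigma[j][i]) < 1e-12 for i in range(3) for j in range(3)
)

# Arbitrary test vectors; each quadratic form should be >= 0.
test_vectors = [[1, 0, 0], [1, -1, 0], [1, 1, 1], [2, -3, 1]]
forms = [quadratic_form(a, sigma) for a in test_vectors]

for row in sigma:
    print(row)
print("symmetric:", is_symmetric)
print("quadratic forms:", forms)
```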
Simple Linear Regression
A statistical technique used to model the relationship between one response variable and one explanatory variable using a straight-line population model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$.
Intercept ($\beta_0$)
The predicted value of the response variable Y when the explanatory variable X is zero.
Slope ($\beta_1$)
The amount by which the predicted value of Y changes for every one-unit increase in the explanatory variable X (an increase if the slope is positive, a decrease if it is negative).
Estimated Regression Equation
The fitted model using sample data, given by $\hat{Y} = b_0 + b_1 X$, where $b_0$ and $b_1$ are the estimated intercept and slope.
Homoscedasticity
The regression assumption that the variance of the error terms remains constant, mathematically expressed as $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for all $i$.
Least Squares Estimation
A method of estimating regression coefficients by minimizing the sum of squared residuals, $SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$.
Residual ($e_i$)
The difference between the observed value and the predicted value, defined as $e_i = Y_i - \hat{Y}_i$. It measures the prediction error.
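Least squares estimation and residuals in simple linear regression can be sketched in a few lines of Python. The example assumes the standard closed-form solution of the minimization, b1 = Sxy / Sxx and b0 = Ȳ − b1·X̄, which the cards above do not state explicitly. The data and the function name are hypothetical.

```python
def fit_simple_regression(xs, ys):
    """Return (b0, b1) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b1 = s_xy / s_xx          # estimated slope
    b0 = y_bar - b1 * x_bar   # estimated intercept
    return b0, b1


# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(x, y)
fitted = [b0 + b1 * xi for xi in x]                 # Y-hat_i
residuals = [yi - fi for yi, fi in zip(y, fitted)]  # e_i = Y_i - Y-hat_i
sse = sum(e ** 2 for e in residuals)

print("b0 =", b0, "b1 =", b1)
print("residuals:", residuals)
print("SSE =", sse)
```

For a model with an intercept, the residuals from a least squares fit sum to zero (up to rounding), which makes a quick sanity check.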
Multiple Linear Regression
A statistical technique used to model the relationship between one response variable and two or more explanatory variables using the model $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + \varepsilon_i$.
Perfect Multicollinearity
A situation where explanatory variables are perfectly linearly related (e.g., $X_2 = 2X_1$), making it impossible to uniquely estimate the regression coefficients.
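One way to see why perfect multicollinearity blocks unique estimation is that the matrix X′X becomes singular, so it has no inverse. The sketch below, using hypothetical data and helper names, compares the determinant of X′X when X2 = 2·X1 against a case where the columns are not linearly related.

```python
def gram_matrix(columns):
    """X'X for a design matrix supplied as a list of its columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


ones = [1, 1, 1, 1, 1]              # intercept column
x1 = [1, 2, 3, 4, 5]
x2_collinear = [2 * v for v in x1]  # X2 = 2 * X1 exactly
x2_ok = [2, 4, 5, 4, 5]             # not a linear function of X1

det_collinear = det3(gram_matrix([ones, x1, x2_collinear]))
det_ok = det3(gram_matrix([ones, x1, x2_ok]))

print("det(X'X) with X2 = 2*X1:", det_collinear)
print("det(X'X) otherwise:", det_ok)
```

A zero determinant means the least squares equations have infinitely many solutions instead of one.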
Design Matrix (X)
In the matrix form $Y = X\beta + \varepsilon$, the matrix whose first column is all ones (for the intercept term) and whose remaining columns contain the explanatory variable data.
Least Squares Estimator (Matrix Form)
The formula used to estimate all regression coefficients simultaneously: $\hat{\beta} = (X'X)^{-1}X'Y$.
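The matrix-form estimator can be sketched without any external library. Instead of forming the inverse explicitly, this example solves the equivalent system (X′X)β̂ = X′Y by Gaussian elimination, which gives the same β̂ whenever X′X is invertible. The function names and data are hypothetical.

```python
def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (perfect multicollinearity?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - known) / aug[i][i]
    return z


def least_squares(X, y):
    """Return beta-hat for design matrix X (list of rows) and response y."""
    p = len(X[0])  # number of columns, including the column of ones
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical data: design matrix with a column of ones and one predictor.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
X = [[1, xi] for xi in x]

beta_hat = least_squares(X, y)
print("beta-hat:", beta_hat)
```

The same function accepts more predictor columns, for example rows of the form `[1, x1, x2]` for a model with two explanatory variables.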
Goodness of Fit
The extent to which the fitted regression equation adequately describes the relationship between the response and explanatory variables.
Total Sum of Squares (SST)
Measures the total variation present in the response variable, defined as $SST = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$.
Regression Sum of Squares (SSR)
Measures the variation in the response variable that is explained by the regression model, defined as $SSR = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$.
Error Sum of Squares (SSE)
Measures the variation not explained by the regression model, defined as $SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ or $\sum_{i=1}^{n} e_i^2$.
Coefficient of Determination ($R^2$)
The proportion of variation in the response variable explained by the regression model, calculated as $R^2 = \frac{SSR}{SST}$.
Adjusted Coefficient of Determination ($R^2_{adj}$)
A measure of model performance that rewards useful variables and penalizes unnecessary variables, defined as $R^2_{adj} = 1 - \frac{SSE/(n-k-1)}{SST/(n-1)}$.
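The goodness-of-fit quantities above can be computed directly from the observed and fitted values. The sketch below uses hypothetical data together with its least squares line (intercept 2.2 and slope 0.6 for this particular data set). The helper names are illustrative.

```python
def sums_of_squares(y, fitted):
    """Return (SST, SSR, SSE) for observed values y and fitted values."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sst, ssr, sse


def r_squared(sst, ssr):
    return ssr / sst


def adjusted_r_squared(sst, sse, n, k):
    """k is the number of explanatory variables (intercept not counted)."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))


# Hypothetical data and its least squares line.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6
fitted = [b0 + b1 * xi for xi in x]

sst, ssr, sse = sums_of_squares(y, fitted)
r2 = r_squared(sst, ssr)
r2_adj = adjusted_r_squared(sst, sse, n=len(y), k=1)

print("SST =", sst, "SSR =", ssr, "SSE =", sse)
print("R^2 =", r2, "adjusted R^2 =", r2_adj)
```

For a least squares fit with an intercept, SST = SSR + SSE, so R² can equivalently be written as 1 − SSE/SST. The adjusted value is smaller here because it accounts for the degrees of freedom used by the model.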
Residual Plot
A graph of residuals ($e_i$) against fitted values ($\hat{Y}_i$) used as a diagnostic tool for checking regression assumptions.
Heteroscedasticity
Non-constant variance of the error terms. It typically shows up in a residual plot as a funnel shape, indicating that the constant variance assumption has been violated.
t-Test for Slope
A hypothesis test of $H_0: \beta_1 = 0$ against $H_1: \beta_1 \neq 0$ to determine whether an explanatory variable significantly affects the response.
Mean Square Error (MSE)
An estimator of the error variance ($\sigma^2$), calculated as $MSE = \frac{SSE}{n-2}$ in simple regression or $MSE = \frac{SSE}{n-k-1}$ in multiple regression.
Overall F-Test
A test of the significance of the entire regression model, where $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ is tested against $H_1$: at least one $\beta_j \neq 0$, using the statistic $F = \frac{MSR}{MSE}$.
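The test statistics for the slope t-test and the overall F-test can be sketched for a simple regression as follows. Two standard formulas not spelled out on the cards are assumed here: the standard error of the slope, SE(b1) = sqrt(MSE / Sxx), and MSR = SSR / k. The data is hypothetical, and the sketch stops at the statistics, without p-values.

```python
import math

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n, k = len(x), 1  # k = number of explanatory variables

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * xi for xi in x]

sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ssr = sum((fi - y_bar) ** 2 for fi in fitted)

mse = sse / (n - k - 1)  # estimates the error variance sigma^2
msr = ssr / k            # assumed standard definition of MSR

se_b1 = math.sqrt(mse / s_xx)  # assumed standard error formula for the slope
t_stat = b1 / se_b1            # tests H0: beta_1 = 0
f_stat = msr / mse             # tests H0: all slopes are zero

print("MSE =", mse)
print("t =", t_stat)
print("F =", f_stat)
```

With a single explanatory variable the two tests agree, since the F statistic equals the square of the t statistic.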
Confidence Interval
A range of plausible values for an unknown population parameter, constructed using the general form: Estimate ± (Critical Value) × (Standard Error).
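As an illustration of the general form, the sketch below builds a 95% confidence interval for the slope of a simple regression. It assumes the standard error formula SE(b1) = sqrt(MSE / Sxx) and takes the critical value 3.182 from a standard t table (two-sided 95%, n − 2 = 3 degrees of freedom). The data is hypothetical.

```python
import math

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx         # estimate of the slope
b0 = y_bar - b1 * x_bar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

se_b1 = math.sqrt(mse / s_xx)  # standard error (assumed standard formula)
t_crit = 3.182                 # t table value: 0.975 quantile, 3 degrees of freedom

# Estimate +/- (Critical Value) x (Standard Error)
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1

print("95% CI for the slope:", (lower, upper))
```

For this data the interval contains zero, which matches the slope not being significant at the 5% level.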
Partial Effect
An interpretation of a regression coefficient in multiple regression representing the effect of one variable while holding all other variables constant.