This set of vocabulary flashcards covers core concepts of simple linear regression, including hypothesis testing, model parameters, assumptions, and diagnostics as presented by Dr. Jinkai Xue.
Null hypothesis (H0)
The status-quo claim about a parameter $\theta$ (e.g., $\theta = \theta_0$).
Alternative hypothesis (Ha)
What we’d conclude if we reject the null hypothesis (H0).
p-value
The probability—assuming H0 is true—of seeing a test-statistic value at least as extreme as the one we computed.
Confidence intervals (CIs)
Two numbers $(L, U)$, computed from the sample, such that $P(L \le \theta \le U) = 1 - \alpha$, where $1 - \alpha$ is the confidence level.
Dependent variable (y)
Also known as the response variable, it is the outcome being predicted in regression analysis.
Independent variable (x)
Also known as the predictor variable, it is used in regression to explain or predict changes in the dependent variable.
Simple Linear Regression Model
A probabilistic model expressed as $y = \beta_0 + \beta_1 x + \varepsilon$, where the deterministic part $E[y] = \beta_0 + \beta_1 x$ is the line of means and $\varepsilon$ is the random error.
$\beta_0$ (Beta-zero)
The true population parameter representing the y-intercept (the value of $E[y]$ when $x = 0$).
$\beta_1$ (Beta-one)
The true population parameter representing the slope (the change in $E[y]$ per unit change in $x$).
$\hat{\beta}_0$ and $\hat{\beta}_1$
The sample estimates of the population parameters computed from $n$ data points by the method of least squares.
Residual ($\hat{\varepsilon}_i$)
The observed counterpart of the unobserved random error ($\varepsilon_i$), calculated as the difference between the observed value and the fitted value: $\hat{\varepsilon}_i = y_i - \hat{y}_i$.
Homoscedasticity
The model assumption that the variance of the error term, $\mathrm{Var}(\varepsilon) = \sigma^2$, is the same for every value of $x$.
Method of Least Squares
A mathematical procedure that picks the line minimizing the sum of squared vertical residuals: $\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.
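A minimal sketch of the least-squares fit, assuming NumPy and a small made-up data set (the numbers are hypothetical and chosen only so the arithmetic is easy to check). It uses the standard closed-form estimates, slope = $S_{xy}/S_{xx}$ and intercept = $\bar{y} - \text{slope} \cdot \bar{x}$, then evaluates the SSE that the fitted line minimizes.

```python
import numpy as np

# Hypothetical toy data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

x_bar, y_bar = x.mean(), y.mean()
s_xx = np.sum((x - x_bar) ** 2)            # spread of the x values
s_xy = np.sum((x - x_bar) * (y - y_bar))   # co-variation of x and y

b1 = s_xy / s_xx           # least-squares slope estimate
b0 = y_bar - b1 * x_bar    # least-squares intercept estimate

y_hat = b0 + b1 * x                 # fitted values on the line
residuals = y - y_hat               # observed minus fitted
sse = np.sum(residuals ** 2)        # sum of squared errors

print(f"slope={b1:.3f}, intercept={b0:.3f}, SSE={sse:.3f}")
```

Any other line through the same points gives a larger SSE, which is one way to sanity-check the result by hand.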
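Residual Variance ($s^2$)
An unbiased estimator of $\sigma^2$ calculated by dividing the sum of squared errors (SSE) by the degrees of freedom ($n - 2$).
Standard Error of the Slope ($s_{\hat{\beta}_1}$)
The estimated standard deviation of the slope estimate, calculated as $s_{\hat{\beta}_1} = s/\sqrt{S_{xx}}$ with $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$; it shrinks when the residual standard deviation $s$ is small or when the $x$ values are widely spread out.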
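The sketch below shows how the residual variance and the standard error of the slope could be computed, assuming NumPy and the standard formula in which $s$ is divided by the square root of $S_{xx}$. The function name `slope_standard_error` and the data are hypothetical, introduced only for this illustration.

```python
import math
import numpy as np

def slope_standard_error(x, y):
    """Return (s2, se_slope) for a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    s_xx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / s_xx
    b0 = y.mean() - b1 * x.mean()
    sse = np.sum((y - (b0 + b1 * x)) ** 2)
    s2 = sse / (n - 2)                 # residual variance, n - 2 degrees of freedom
    se_slope = math.sqrt(s2 / s_xx)    # same as s / sqrt(S_xx)
    return s2, se_slope

# Hypothetical toy data, for illustration only.
s2, se_slope = slope_standard_error([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])

# Same responses with the x values spread twice as far apart.
_, se_wide = slope_standard_error([2, 4, 6, 8, 10], [2, 4, 5, 4, 5])
print(se_slope, se_wide)
```

Spreading the x values out leaves $s$ unchanged here but increases $S_{xx}$, so the second standard error comes out smaller.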
Pearson Correlation (r)
A scale-free measure of linear association ranging from −1 to 1; its sign matches the estimated slope $\hat{\beta}_1$.
Coefficient of Determination ($r^2$)
The fraction of the total variation in y that is explained by the linear relationship with x, ranging from 0 to 1.
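As a rough check on the two definitions above, the following sketch (assuming NumPy and hypothetical toy data) computes $r$ from the sums of squares and confirms that, in simple linear regression, $r^2$ equals $1 - \mathrm{SSE}/S_{yy}$, where $S_{yy}$ is the total variation in $y$.

```python
import numpy as np

# Hypothetical toy data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

s_xx = np.sum((x - x.mean()) ** 2)
s_yy = np.sum((y - y.mean()) ** 2)              # total variation in y
s_xy = np.sum((x - x.mean()) * (y - y.mean()))

r = s_xy / np.sqrt(s_xx * s_yy)                 # Pearson correlation

b1 = s_xy / s_xx                                # least-squares slope
b0 = y.mean() - b1 * x.mean()
sse = np.sum((y - (b0 + b1 * x)) ** 2)          # unexplained variation
r2_from_sse = 1.0 - sse / s_yy                  # explained fraction

print(f"r={r:.4f}, r^2={r**2:.4f}, 1-SSE/Syy={r2_from_sse:.4f}")
```

Because $r$ and the slope share the numerator $S_{xy}$ and their denominators are positive, they always have the same sign.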
Prediction Interval
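An interval for a single new observation of $y$ ($y_{\text{new}}$) at a given $x$; at the same $x$ and confidence level it is always wider than the confidence interval for the mean response $E[y]$, because it also accounts for the random error of the individual observation.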
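To see why the prediction interval is the wider of the two, the sketch below compares the half-widths using the usual textbook formulas: $t \cdot s\sqrt{1/n + (x_0 - \bar{x})^2/S_{xx}}$ for the mean response and $t \cdot s\sqrt{1 + 1/n + (x_0 - \bar{x})^2/S_{xx}}$ for a new observation. The function name, the data, and the hard-coded critical value (the tabulated $t_{0.025}$ for 3 degrees of freedom, about 3.182) are assumptions made for this example.

```python
import math
import numpy as np

def interval_half_widths(x, y, x0, t_crit):
    """Return (ci_half, pi_half) at x0 for a simple linear regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_bar = x.mean()
    s_xx = np.sum((x - x_bar) ** 2)
    b1 = np.sum((x - x_bar) * (y - y.mean())) / s_xx
    b0 = y.mean() - b1 * x_bar
    s = math.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
    # Uncertainty in the estimated line at x0.
    line_term = 1.0 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_crit * s * math.sqrt(line_term)        # mean response
    pi_half = t_crit * s * math.sqrt(1.0 + line_term)  # new observation
    return ci_half, pi_half

T_CRIT = 3.182  # assumed: tabulated t value for 95% confidence, 3 df

# Hypothetical toy data (n = 5, so n - 2 = 3 degrees of freedom).
ci_half, pi_half = interval_half_widths(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x0=3.0, t_crit=T_CRIT
)
print(f"CI half-width={ci_half:.3f}, PI half-width={pi_half:.3f}")
```

The extra 1 under the square root is the variance contribution of the new observation's own random error, so the prediction interval stays wider no matter how large the sample is.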
Extrapolation
The pitfall of using a fitted line to make predictions outside the sampled range of $x$; such predictions are unsupported by the data, because the linear relationship may not hold beyond the observed values.
Anscombe’s Quartet
A collection of four data sets with nearly identical summary statistics ($\hat{\beta}_0$, $\hat{\beta}_1$, and $r^2$) but completely different distributions, demonstrating the necessity of plotting data.
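A short sketch of this point, assuming NumPy and the quartet's values as they are commonly reproduced from Anscombe's 1973 paper (worth checking against a trusted copy before relying on them). The helper `fit_summary` is a hypothetical name. It fits each data set by least squares and prints the intercept, slope, and $r^2$ side by side.

```python
import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I, II, III
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                   7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                   6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
             5.25, 12.50, 5.56, 7.91, 6.89]),
}

def fit_summary(x, y):
    """Return (intercept, slope, r_squared) from a least-squares fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    s_xx = np.sum((x - x.mean()) ** 2)
    s_yy = np.sum((y - y.mean()) ** 2)
    s_xy = np.sum((x - x.mean()) * (y - y.mean()))
    slope = s_xy / s_xx
    intercept = y.mean() - slope * x.mean()
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return intercept, slope, r_squared

summaries = {name: fit_summary(x, y) for name, (x, y) in quartet.items()}
for name, (b0, b1, r2) in summaries.items():
    print(f"{name}: intercept={b0:.2f}, slope={b1:.2f}, r2={r2:.2f}")
```

The printed summaries agree to about two decimal places, yet scatterplots of the four sets look nothing alike, which is why a plot should accompany any fitted line.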