This collection of flashcards covers key vocabulary and concepts related to Linear Regression Analysis, including model structures, statistical tests, and measures of goodness-of-fit.
Population Regression Model
A regression model that represents the relationship between a dependent variable and independent variables in a population.
Marginal Effect
The change in the dependent variable resulting from a one-unit increase in an independent variable while holding other independent variables constant.
Sample Regression Function (SRF)
The predicted relationship used to estimate the dependent variable from independent variables based on a sample.
Sample Residual
The difference between the actual observed value and the predicted value from the sample regression function.
Minimizing Sum of Squared Residuals
The process in regression analysis to estimate parameters by minimizing the total of the squared differences between the observed and predicted values.
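A minimal Python sketch (the data values are made up for illustration, not taken from these cards) of how ordinary least squares estimates arise from minimizing the sum of squared residuals:

```python
import numpy as np

# toy sample: one explanatory variable plus an intercept
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column

# OLS solution: beta_hat minimizes sum((y - X @ beta)**2)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta_hat                # sample residuals
ssr = np.sum(residuals**2)                  # minimized sum of squared residuals
print(beta_hat, ssr)
```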
Goodness-of-Fit
A measure of how well the regression model predicts actual data.
Coefficient of Determination (R-squared)
A statistical measure that represents the proportion of the variance for the dependent variable that is explained by independent variables.
Adjusted R-squared
A modified version of R-squared that adjusts for the number of predictors in a model.
Unexplained Sum of Squares (USS)
The portion of the total variability in the observed data that is not explained by the independent variables.
Total Sum of Squares (TSS)
The total variation in the dependent variable around its mean; equal to the sum of the explained sum of squares and the unexplained sum of squares.
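A minimal Python sketch (illustrative numbers, not from these cards) tying together TSS, USS, R-squared, and adjusted R-squared:

```python
import numpy as np

y     = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # observed values
y_hat = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # fitted values from some regression
k     = 1                                      # number of slope coefficients
n     = len(y)

tss = np.sum((y - y.mean())**2)                # total sum of squares
uss = np.sum((y - y_hat)**2)                   # unexplained (residual) sum of squares

r2     = 1 - uss / tss                                # coefficient of determination
adj_r2 = 1 - (uss / (n - k - 1)) / (tss / (n - 1))    # penalizes extra predictors
print(r2, adj_r2)
```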
Statistical Hypothesis Test
A method for testing a hypothesis about a population parameter based on sample data.
Null Hypothesis (H0)
A statement indicating no effect or no difference, used as a starting point for statistical testing.
Alternative Hypothesis (H1)
The hypothesis that indicates the presence of an effect or difference.
Level of Significance
The probability of rejecting the null hypothesis when it is true, commonly denoted as alpha (α).
Rejection Rule
The criteria set for rejecting the null hypothesis based on the statistical test results.
F-Test
A statistical test used to assess the overall significance of a linear regression model.
F-Statistic
A ratio used in the F-Test to determine whether the variability explained by the model is significantly greater than unexplained variability.
Right-tailed Test
A statistical test where the critical region is on the right side of the distribution.
F-Distribution
The probability distribution of the F-statistic under the null hypothesis.
Critical Value
A threshold value that the test statistic must exceed to reject the null hypothesis.
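Tying together the F-test cards above, a minimal Python sketch (sample size, R-squared, and significance level are assumed values): compute the F-statistic and compare it with the right-tail critical value of the F-distribution.

```python
from scipy.stats import f

n, k  = 50, 3           # sample size and number of slope coefficients (assumed)
r2    = 0.42            # R-squared of the fitted model (assumed)
alpha = 0.05            # level of significance

f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))     # explained vs. unexplained variability
f_crit = f.ppf(1 - alpha, dfn=k, dfd=n - k - 1)  # right-tail critical value

reject_h0 = f_stat > f_crit                      # rejection rule for the right-tailed test
print(f_stat, f_crit, reject_h0)
```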
t-test
A statistical test used in regression to determine whether an individual coefficient differs significantly from a hypothesized value (typically zero); more generally, a test for a difference between two means.
Exclusion Restriction Test
A test determining whether specific independent variables can be omitted from the regression model without significantly affecting its explanatory power.
Unrestricted Regression Model
A regression model that includes all possible predictors without any constraints.
Restricted Regression Model
A regression model that imposes restrictions on certain coefficients, usually setting them to zero.
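A minimal Python sketch (the sums of squares and dimensions are made-up values) of the exclusion restriction test, comparing the unexplained sum of squares of the restricted and unrestricted models:

```python
from scipy.stats import f

uss_r, uss_ur = 120.0, 100.0   # USS of restricted and unrestricted models (assumed)
n, k, q       = 60, 5, 2       # observations, slopes in unrestricted model, restrictions

f_stat = ((uss_r - uss_ur) / q) / (uss_ur / (n - k - 1))
p_val  = 1 - f.cdf(f_stat, dfn=q, dfd=n - k - 1)
print(f_stat, p_val)           # small p-value: the excluded variables matter
```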
Structural Differences Test (Chow Test)
A statistical test to determine if the regression coefficients differ across two or more groups.
Pooled Regression
A regression model that combines data from different groups to evaluate overall relationships.
Separate Regressions
Separate linear regression models run on different subsets of the data.
Unexplained Sum of Squares (USS) Comparison
The process of comparing the USS from different regression models to assess structural differences.
Coefficient Equality Hypothesis
The hypothesis that regression coefficients are equal across different models or groups.
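A minimal Python sketch (hypothetical sums of squares and group sizes) of the Chow test described above: compare the USS of the pooled regression with the combined USS of the separate group regressions.

```python
from scipy.stats import f

uss_pooled   = 250.0           # USS from the pooled regression (assumed)
uss_1, uss_2 = 100.0, 110.0    # USS from the two separate regressions (assumed)
n1, n2, k    = 40, 35, 3       # group sizes and number of slope coefficients

params = k + 1                 # slopes plus intercept
f_stat = (((uss_pooled - (uss_1 + uss_2)) / params)
          / ((uss_1 + uss_2) / (n1 + n2 - 2 * params)))
p_val  = 1 - f.cdf(f_stat, dfn=params, dfd=n1 + n2 - 2 * params)
print(f_stat, p_val)           # small p-value: coefficients differ across groups
```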
R-squared Formula
R² = Explained Variation / Total Variation in y, or equivalently R² = 1 − USS/TSS.
Explanatory Variable
The independent variable(s) in regression that are used to predict the value of the dependent variable.
Dependent Variable
The outcome variable that is being predicted in regression analysis.
Independent Variable
The variable that is manipulated or changed to observe its effect on the dependent variable.
Parameter Estimation
The process of estimating the values of parameters in a regression model.
Variance
A measure of the dispersion of a set of values.
Statistical Significance
A determination that a relationship or effect is unlikely to be due to chance.
P-value
The probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true.
Type I Error
The error of rejecting a true null hypothesis (false positive).
Type II Error
The error of failing to reject a false null hypothesis (false negative).
Model Fit
How well a statistical model describes the data it is intended to represent.
Residual Analysis
An examination of the residuals to assess the fit of a regression model.
Regression Coefficient
A value that represents the relationship between a given independent variable and the dependent variable.
Standard Error of Estimate
A measure of the typical size of the residuals, indicating the accuracy of predictions made with a regression analysis.
Likelihood Ratio Test
A statistical test used to compare the goodness of fit of two models.
Multicollinearity
A situation in statistical models where two or more independent variables are highly correlated.
Heteroscedasticity
A condition in regression analysis where the variance of errors varies across observations.
Linearity Assumption
The assumption that the relationship between independent and dependent variables is linear.
Independence of Errors
The assumption that the residuals are statistically independent.
Normality of Errors
The assumption that the residuals are normally distributed.
Fisher's F-Test
A test to determine if there are significant differences between the variances of populations.
R-squared Adjustments
Adjustments to R-squared that account for the number of predictors in the model.
Parameter Significance Testing
The process of testing whether individual regression coefficients are significantly different from zero.
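A minimal Python sketch (the coefficient estimate, standard error, and dimensions are assumed values) of testing whether a single regression coefficient differs significantly from zero:

```python
from scipy.stats import t

beta_hat, se_beta = 0.85, 0.30   # estimated coefficient and its standard error (assumed)
n, k              = 50, 3        # sample size and number of slope coefficients (assumed)

t_stat = beta_hat / se_beta                          # H0: beta = 0
p_val  = 2 * (1 - t.cdf(abs(t_stat), df=n - k - 1))  # two-sided p-value
print(t_stat, p_val)
```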
Model Specification
The process of selecting the correct variables and functional form for a regression model.
Residuals
The differences between observed and predicted values in a regression model.
Total Variation
The overall variability in a dataset.
Explained Variation
The portion of the total variation that is accounted for by the model.
Dummy Variables
Binary variables used to represent categories in regression analysis.
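A minimal Python sketch (the column and category names are made up) showing one common way to construct dummy variables with pandas:

```python
import pandas as pd

df = pd.DataFrame({"region": ["north", "south", "south", "west"]})
dummies = pd.get_dummies(df["region"], prefix="region", drop_first=True)
print(dummies)   # one binary column per category, minus the dropped baseline
```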
Endogeneity
A situation in which an explanatory variable is correlated with the error term.
Outlier
An observation that lies an abnormal distance from other values in a dataset.
Influential Observation
An observation that significantly affects the estimate of the regression coefficients.
Statistical Software
Software programs used to perform statistical analyses, including regression analysis.
Bootstrap Methods
A resampling technique used to estimate the distribution of a statistic.
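A minimal Python sketch (toy data; 1,000 resamples is an arbitrary choice) of bootstrap resampling to approximate the sampling distribution of a slope estimate:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.2, 3.8, 6.1, 7.9, 10.3, 11.8])

slopes = []
for _ in range(1000):
    idx = rng.integers(0, len(x), size=len(x))      # resample observations with replacement
    X = np.column_stack([np.ones(len(x)), x[idx]])
    b, *_ = np.linalg.lstsq(X, y[idx], rcond=None)
    slopes.append(b[1])

print(np.std(slopes))   # bootstrap estimate of the slope's standard error
```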
Panel Data
Data that combines cross-sectional and time-series data.
Time-Series Analysis
Analysis of data points collected or recorded at specific time intervals.
Cross-Sectional Data
Data collected at a single point in time across multiple subjects.
Regression Diagnostics
Methods for checking the validity and appropriateness of a regression model.
Sample Size Determination
The process of calculating the number of observations required for a study.
Statistical Power
The probability that a statistical test will correctly reject a false null hypothesis.
Covariance
A measure of how much two random variables change together.
Correlation Coefficient
A statistical measure that describes the strength and direction of a relationship between two variables.
Predictive Modeling
The process of creating a statistical model to predict future outcomes.
Model Validation
The process of assessing the performance of a model using new data.
Variance Inflation Factor (VIF)
A measure of how much the variance of a regression coefficient is inflated due to multicollinearity.
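A minimal Python sketch (random toy data) of the variance inflation factor: regress each predictor on the others and compute VIF_j = 1 / (1 − R²_j).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)   # make column 2 nearly collinear with column 0

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(X)), others])
    b, *_ = np.linalg.lstsq(Z, X[:, j], rcond=None)
    resid = X[:, j] - Z @ b
    r2_j = 1 - resid @ resid / np.sum((X[:, j] - X[:, j].mean())**2)
    print(j, 1 / (1 - r2_j))                     # a large VIF signals multicollinearity
```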
Principal Component Analysis (PCA)
A statistical technique used to reduce the dimensionality of data.
Chow Test
A test for structural breaks in regression analysis.
Residual plots
Graphs that plot residuals (typically on the y-axis) against predicted values or another variable (on the x-axis), used to check regression assumptions.
Autocorrelation
The correlation of a variable, or of regression errors, with lagged values of itself.
Cross-Validation
A technique for assessing how the results of a statistical analysis will generalize to an independent dataset.
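A minimal Python sketch (synthetic data; five folds is an arbitrary choice) of k-fold cross-validation for a linear regression model using scikit-learn:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=100)

scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print(scores.mean())   # average out-of-sample R-squared across folds
```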
Machine Learning Regression
A subset of machine learning techniques focused on predicting numerical outcomes.
Data Transformation
The process of converting data from one format or structure into another to meet the assumptions of a model.