These flashcards cover key concepts, definitions, and important terms related to linear regression analysis as discussed in the Business Analytics lecture.
Simple Linear Regression Model
A model that estimates how a dependent variable is related to a single independent variable.
Dependent Variable
The variable being predicted in a regression model.
Independent Variable
The variable used to predict the value of the dependent variable.
Multiple Linear Regression Model
A regression model that estimates the relationship between a dependent variable and two or more independent variables.
Least Squares Method
A statistical method used to determine the best-fitting line by minimizing the sum of squared residuals.
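As a minimal sketch of the least squares method for a single independent variable, the snippet below computes the intercept and slope from the standard closed-form formulas. The function name `fit_simple_ols` and the small data set are made up for illustration and do not come from the lecture.

```python
# Minimal sketch: closed-form least squares for one independent variable.
# The data below are invented purely for illustration.

def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = fit_simple_ols(x, y)

# Residual = observed value minus predicted value, one per observation.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)

print(f"intercept={b0:.2f}, slope={b1:.2f}, SSE={sse:.2f}")
```

For this toy data the fitted line is roughly y = 2.2 + 0.6x, and the residuals sum to zero (up to rounding), as they do for any least squares line that includes an intercept.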
Coefficient of Determination (R²)
A statistical measure of the proportion of variance in the dependent variable that is explained by the independent variable(s) in the model.
Error Term
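The unobserved random component of a regression model that accounts for the variability in the dependent variable not explained by the independent variable(s). Its sample counterpart, the difference between an observed value and a predicted value, is the residual.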
Extrapolation
Predicting values of the dependent variable outside the range of the observed data (the experimental region); this is risky because the estimated relationship may not hold there.
Residuals
The differences between the observed and predicted values of the dependent variable, one for each observation.
Dummy Variables
Binary variables created to represent categorical variables in regression analysis.
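A small sketch of how dummy variables might be built by hand. It assumes the common convention of using k - 1 dummies for a categorical variable with k levels, with the omitted level serving as the baseline; the helper `make_dummies` and the region labels are hypothetical.

```python
# Minimal sketch: encode a categorical variable as 0/1 dummy variables.
# Assumes the common k-1 convention: the baseline level gets all zeros.

def make_dummies(values, baseline):
    """Return (levels, rows), with one 0/1 column per non-baseline level."""
    levels = sorted(set(values) - {baseline})
    rows = [[1 if v == level else 0 for level in levels] for v in values]
    return levels, rows

regions = ["north", "south", "west", "south"]  # made-up example data
levels, rows = make_dummies(regions, baseline="north")

for region, row in zip(regions, rows):
    print(region, dict(zip(levels, row)))
```

Each coefficient on a dummy variable is then read as the difference from the baseline level, holding the other independent variables fixed.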
Hypothesis Testing
A statistical method that uses sample data to evaluate a hypothesis about a population parameter.
Statistical Significance
A determination that an effect or relationship is likely not due to random chance.
ANOVA
Analysis of variance, a statistical technique used to determine if there are significant differences between the means of three or more groups.
Model Fitting
The process of adjusting a model so it best represents the observed data.
Curvilinear Relationship
A relationship between two variables that is not a straight line, often represented as a curve.
Interaction Term
A variable that represents the combined effect of two or more independent variables on the dependent variable.
Piecewise Linear Regression
A type of regression that allows for different linear relationships in different intervals of the independent variable.
Overfitting
A modeling error that occurs when a model is too complex and captures noise instead of the underlying pattern.
Confidence Interval
An estimated range of values that is likely to include an unknown population parameter.
Prediction Interval
An estimate of the range of possible values for a new observation based on the regression model.
Multicollinearity
A situation in which two or more independent variables in a regression model are highly correlated.
Standard Error
An estimate of the standard deviation of a statistic's sampling distribution, indicating how much the sample statistic is expected to vary around the population parameter.
Statistical Software
Programs used to perform statistical analyses and computations.
Regression Output
The results from a regression analysis, including coefficients, standard errors, and significance levels.
Model Selection Procedures
Methods used to determine which variables to include in a regression model.
Quadratic Regression Model
A regression model that includes the square of an independent variable to account for curvilinear relationships.
Inference
The process of drawing conclusions about a population based on sample data.
Big Data
Extremely large data sets that may be analyzed computationally to reveal patterns and trends.
Basic Regression Equation
An equation that expresses a dependent variable as a function of one or more independent variables.
Experimental Region
The range of values for independent variables in the data used to estimate the regression model.
Hypotheses
Statements that can be tested statistically in a regression analysis.
Sums of Squares
Measures of variation in the data; they include the total, regression, and error sums of squares.
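The relationship among the three sums of squares can be written out as follows. The notation is an assumption, not taken from the lecture: $y_i$ is an observed value, $\hat{y}_i$ the predicted value, and $\bar{y}$ the sample mean of the dependent variable.

$$
\underbrace{\sum_{i=1}^{n} (y_i - \bar{y})^2}_{\text{SST}}
= \underbrace{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}_{\text{SSR}}
+ \underbrace{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}_{\text{SSE}},
\qquad
R^2 = \frac{\text{SSR}}{\text{SST}} = 1 - \frac{\text{SSE}}{\text{SST}}
$$

The decomposition holds for a least squares fit that includes an intercept.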
Statistical Procedures
Step-by-step methodologies used to analyze data and draw conclusions.
Residual Sum of Squares (RSS)
The sum of the squares of residuals, used to measure the discrepancy between observed and predicted values.
Estimate of Variance
An estimate of the variance of the error term, computed from the residuals as the error sum of squares divided by its degrees of freedom.
Normal Distribution
A probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence.
Explanation of Variability
The portion of variance in a dependent variable that can be explained by the independent variable(s) in a regression model.
Residual Plot
A graphical representation that shows the residuals on the Y-axis and fitted values on the X-axis.
Statistical Independence
The condition where the occurrence of one event does not affect the probability of another event.
Sampling Distribution
The probability distribution of a given statistic based on a random sample.
Quadratic Terms
Variables in a model that represent the square of independent variables, allowing for curvature in the regression line.
Variable Selection
The process of selecting the most important variables to include in a regression model.
Significance Level (α)
The threshold for determining if a result is statistically significant; commonly set at 0.05.
t Test
A statistical test used to compare the means of one or two groups to determine if they are significantly different.
F Test
A statistical test used to compare two variances to determine if they are significantly different.
Identification of Relationships
The process of discovering relationships between variables based on statistical analyses.
Explanatory Variable
Another term for an independent variable, used to explain variations in the dependent variable.
Statistical Relationship
A relationship that indicates a statistical correlation or association between variables.
Random Sample
A subset of individuals chosen from a larger population in such a way that each individual has an equal chance of being selected.
Numerical Methods
Algorithms that use numerical approximation for the solutions of mathematical problems.
Confidence Level
The long-run proportion of confidence intervals, constructed by the same procedure from repeated samples, that would contain the true value of the parameter; commonly set at 95%.
Sampling Error
The difference between the sample statistic and the actual population parameter.
Regression Coefficient
A measure of the change in the dependent variable associated with a one-unit change in an independent variable.
Data Exploration
The process of analyzing data sets to summarize their main characteristics, often using visual methods.
Statistical Analysis
The science of collecting and interpreting data through various mathematical techniques.
Independent Variable Assessment
The evaluation of the effects that different independent variables have on the dependent variable.
Regression Diagnostics
Procedures used to check the validity of regression models.
Statistical Theory
The framework consisting of principles and mathematical foundations for statistical practice.
Computational Statistics
A branch of statistics that focuses on algorithms and computational methods that facilitate statistical analysis.
Resistance to Outliers
The ability of a statistical measure or method to not be unduly affected by extreme values.
Error Variance
The variability in the dependent variable that cannot be explained by the independent variable(s) in the model.
Adjusted R²
A modified version of R² that has been adjusted for the number of predictors in the model.
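One standard way to write the adjustment, using assumed notation where $n$ is the number of observations and $p$ is the number of independent variables (predictors):

$$
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
$$

Because the penalty grows with $p$, adding a predictor raises adjusted R² only if it improves the fit by more than would be expected by chance.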
Observed Frequency
The count of how many times an outcome is observed in the sample.
Bivariate Analysis
An analysis of two variables to understand their relationship.
Statistical Model
A mathematical representation of observed data and the relationships between the variables.
Characteristic Curve
A plot that visualizes the relationship between two variables in regression analysis.
Model Validation
The process of checking if the statistical model is applicable to the data.
Random Assignment
Assigning participants to groups by chance, so that each participant has an equal chance of being placed in any group.
Error Analysis
The study of the types and causes of errors in analytical datasets.
Regression Assumptions
The basic requirements that a regression model must satisfy for valid results.
Non-linear Regression
A type of regression analysis where the relationship between the variables is modeled as a non-linear equation.
Dependent Observations
Measurements of the same subject taken over time; these are not independent of one another.
Testing Variability
Investigating the extent to which measures of data vary.
Null Hypothesis
The hypothesis that there is no effect or no difference, used as a default or starting assumption.
Statistical Reporting
Presenting the results of statistical analyses in a clear and interpretable format.