Comprehensive vocabulary flashcards covering Business Analytics concepts including regression, classification, model evaluation, ensemble methods, and artificial intelligence.
Regression
A predictive method used for numeric prediction where the dependent variable is a continuous variable.
Classification
A predictive method used for categorical prediction where the dependent variable is a categorical variable, most commonly a binary one.
Holdout Method
A method for evaluating and choosing among models by splitting data into a training set and a test set, normally in a 70/30 or 80/20 ratio; each model is fit on the training set and scored on the test set.
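As an illustration of the split described above, here is a minimal sketch using only the Python standard library. The function name `holdout_split` and its parameters are hypothetical, chosen for this example.

```python
import random

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows, then split them into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# 70/30 split of ten example rows
train, test = holdout_split(range(10), test_fraction=0.3)
```

Shuffling before splitting matters when the rows are stored in a meaningful order, such as by date or by outcome.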
Cross Validation
A technique where data is split into equal-sized chunks (folds) to perform the holdout method multiple times and take the average of the results.
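The fold logic can be sketched as follows. This is an illustrative example, not a library API: `k_fold_indices` is a hypothetical helper, and the "model" is deliberately trivial (it predicts the mean of the training outcomes) so that the averaging step stays visible.

```python
def k_fold_indices(n_rows, k):
    """Yield (train_idx, test_idx) pairs; each fold serves as the test set once."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
fold_errors = []
for train_idx, test_idx in k_fold_indices(len(y), k=3):
    # Stand-in model: predict the mean of the training outcomes
    prediction = sum(y[i] for i in train_idx) / len(train_idx)
    mse = sum((y[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
    fold_errors.append(mse)

# The cross-validated error is the average over the folds
cv_error = sum(fold_errors) / len(fold_errors)
```

In practice the rows are usually shuffled before being cut into folds; that step is omitted here for brevity.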
Dependent Variable
The outcome variable that the model explains and predicts using the independent variables.
Independent Variables
The variables used as predictors to estimate the value of or relationship with the dependent variable.
Simple Regression
A regression analysis using only one independent variable, represented by the formula $y = \beta_0 + \beta_1 x + \epsilon$.
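One common way to estimate the intercept and slope is ordinary least squares. The sketch below assumes that estimation method (the card itself does not name one), and the function name `fit_simple_regression` is hypothetical.

```python
def fit_simple_regression(x, y):
    """Return (b0, b1), the least-squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept
    return b0, b1

# Example data generated exactly from y = 1 + 2x, with no error term
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
b0, b1 = fit_simple_regression(x, y)
```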
Multiple Regression
A regression analysis involving more than one independent variable, represented by the formula $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \epsilon$.
Interaction Term
A term added to a regression model to capture the joint effect of two or more variables, such as $\beta_3 (x_1 \times x_2)$.
R-Squared ($R^2$)
Also called the coefficient of determination, it represents the percentage of variation in the dependent variable explained by the independent variables.
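The "percentage of variation explained" can be computed as one minus the ratio of unexplained variation to total variation. A small sketch, with made-up example numbers:

```python
def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.5, 3.5, 3.5]  # hypothetical model predictions
r2 = r_squared(y_true, y_pred)  # total variation 5.0, unexplained 1.0, so 0.8
```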
Standard Error
A regression statistic, similar to a standard deviation, that measures the typical size of the model's prediction errors.
Binary Variable
A variable that takes values of 1 or 0, often converted from a categorical variable and interpreted as a 'yes' or 'no' effect.
Extrapolation
A limitation of regression referring to the inability to accurately predict outcomes outside the range of variables used in fitting the model.
Logistic Regression
A regression technique for binary dependent variables that provides a probability as a prediction outcome; if the probability is above a threshold, the outcome is predicted as 1.
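The probability-then-threshold step can be sketched with a single predictor. The coefficients below are hypothetical values chosen for illustration, not fitted from data, and the function names are likewise made up.

```python
import math

def predict_probability(x, b0, b1):
    """Logistic function applied to the linear combination b0 + b1 * x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def predict_class(x, b0, b1, threshold=0.5):
    """Predict 1 when the probability is above the threshold, else 0."""
    return 1 if predict_probability(x, b0, b1) > threshold else 0

b0, b1 = -4.0, 1.0  # assumed coefficients
high = predict_class(6, b0, b1)  # probability about 0.88, so class 1
low = predict_class(2, b0, b1)   # probability about 0.12, so class 0
```

Raising the threshold makes the model predict 1 less often, which typically trades sensitivity for specificity.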
K-Nearest Neighbors (KNN)
A prediction method that finds the k data points nearest to a new observation by calculating distance and predicts from them, using a majority vote for classification or an average for numeric outcomes.
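A minimal classification sketch of the idea, assuming Euclidean distance and a majority vote among the neighbors. The function name `knn_classify` and the data are invented for this example.

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = [0, 0, 0, 1, 1, 1]
near_first_group = knn_classify(points, labels, (1.5, 1.5))
near_second_group = knn_classify(points, labels, (6.5, 6.5))
```

Because the method relies on distance, variables measured on very different scales are usually standardized first.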
Accuracy
A performance metric calculated as the number of correct classifications divided by the number of total classifications.
Sensitivity
A classification metric measuring how many times the model predicted 1 among all true 1s, calculated as $\frac{TP}{TP+FN}$.
Specificity
A classification metric measuring how many times the model predicted 0 among all true 0s, calculated as $\frac{TN}{TN+FP}$.
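Accuracy, sensitivity, and specificity all come from the same four counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). A short sketch with made-up labels:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for labels coded as 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + tn + fp + fn)  # 7 / 10
sensitivity = tp / (tp + fn)                # 3 / 4
specificity = tn / (tn + fp)                # 4 / 6
```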
Supervised Data Mining
The process of exploring patterns in data specifically with a target variable, such as in regression models.
Decision Tree
A model that splits data into groups (Regression Tree for numeric outcomes or Classification Tree for binary outcomes) using nodes and edges.
Overfitting Prevention
Setting parameters such as max_depth or min_samples_split so that a decision tree stops growing before it fits noise in the data.
Ensemble Methods
Techniques that combine multiple prediction models so that their individual mistakes cancel out, working best when models are different and errors are independent.
Bagging (Bootstrap Aggregation)
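An ensemble method where multiple models of the same type are trained on different random samples of the data, drawn with replacement (bootstrap samples), and their predictions are averaged or, for classification, combined by majority vote.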
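A rough sketch of the procedure for a numeric outcome. The base model here is a one-nearest-neighbor predictor, chosen only because it is short to write; bagging is more often applied to decision trees. All names and data are hypothetical.

```python
import random

def nearest_neighbor_predict(sample, x_query):
    """Base model: return the outcome of the sampled row whose x is closest."""
    nearest = min(sample, key=lambda row: abs(row[0] - x_query))
    return nearest[1]

def bagged_predict(rows, x_query, n_models=50, seed=0):
    """Average the predictions of base models fit on bootstrap samples."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: same size as the data, drawn with replacement
        sample = [rng.choice(rows) for _ in rows]
        predictions.append(nearest_neighbor_predict(sample, x_query))
    return sum(predictions) / len(predictions)

rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]  # (x, y) pairs
prediction = bagged_predict(rows, x_query=3.0)
```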
Random Forest
An extension of bagging that builds many decision trees, each using a different random sample of the data and a random subset of the independent variables at each split.
Boosting
An ensemble method that sequentially creates models to fix the mistakes of previous ones, adapting to points with large errors.
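One way to make the sequential idea concrete is to have each new model fit the residuals (the current errors) of the models before it. The sketch below does this with one-split regression trees ("stumps") and squared error, which is a simple form of gradient boosting; other boosting variants instead reweight the data points. The names, data, and settings are assumptions for illustration.

```python
def fit_stump(x, residuals):
    """Find the single split on x that best fits the residuals."""
    best = None
    for split in sorted(set(x))[:-1]:  # skip the largest x so both sides are non-empty
        left = [r for xi, r in zip(x, residuals) if xi <= split]
        right = [r for xi, r in zip(x, residuals) if xi > split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    return best[1:]

def boost(x, y, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly add a stump fit to the residuals."""
    predictions = [sum(y) / len(y)] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        split, left_mean, right_mean = fit_stump(x, residuals)
        predictions = [
            pi + learning_rate * (left_mean if xi <= split else right_mean)
            for xi, pi in zip(x, predictions)
        ]
    return predictions

x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]
boosted = boost(x, y)

baseline_sse = sum((yi - sum(y) / len(y)) ** 2 for yi in y)
boosted_sse = sum((yi - pi) ** 2 for yi, pi in zip(y, boosted))
```

Points with large errors have large residuals, so they pull the next stump toward them, which is how the ensemble adapts to its earlier mistakes.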
Artificial Intelligence
A system capable of simulating human intelligence and thought processes, including pattern recognition, computer vision, and machine learning.
Machine Learning
A subset of AI focused on making predictions by finding patterns in examples without being given specific human instructions.
Deep Learning
A powerful machine learning technique using neural networks with multiple hidden layers, particularly useful for unstructured data like images and text.