Flashcards covering key terms and definitions from the foundations of machine learning concepts and techniques.
Machine Learning
Using algorithms to learn patterns from data and make predictions.
Supervised Learning
Learning using labeled data (input + known output).
Unsupervised Learning
Learning patterns from unlabeled data.
Classification
Predicting categories, such as spam vs not spam.
Regression
Predicting continuous values like height or price.
Feature Matrix (X)
Input variables used to make predictions.
Target Vector (y)
Output variable being predicted.
Linear Regression
Models relationship between variables using a straight line.
Least Squares Method
Minimizes squared error between predicted and actual values.
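As a concrete illustration of least squares, a one-feature linear regression fit can be sketched in plain Python (the data points below are made up for the example):

```python
def fit_line(xs, ys):
    """Return slope and intercept minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    slope = num / den
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
# these points lie exactly on y = 2x + 1, so slope = 2.0, intercept = 1.0
```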
Error (Residual)
Difference between actual and predicted value.
Mean Squared Error (MSE)
Average of squared errors.
Root Mean Squared Error (RMSE)
Square root of MSE; expresses typical prediction error in the target's original units.
R² (R-Squared)
Proportion of the target's variability explained by the model.
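The three error metrics above can be sketched directly from their definitions (example values are illustrative only):

```python
import math

def mse(actual, predicted):
    """Average of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Square root of MSE, in the target's original units."""
    return math.sqrt(mse(actual, predicted))

def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot
```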
Training Set
Data used to build the model.
Validation Set
Data used to tune and evaluate during training.
Test Set
Final dataset to evaluate model performance.
Overfitting
Model performs well on training data but poorly on new data.
Cross-Validation (CV)
Repeatedly splitting data into folds to evaluate model performance.
K-Fold Cross Validation
Splitting data into k groups and rotating validation sets.
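A minimal sketch of how k-fold splitting rotates the validation fold (index bookkeeping only; training and scoring a model would happen inside the loop):

```python
def k_fold_indices(n, k):
    """Yield (train_indices, val_indices) for each of k folds over n samples."""
    # distribute any remainder across the first n % k folds
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i not in val]
        yield train, val
        start += size
```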
Missing Data
Data points with no value (NaN) that must be handled.
Imputation
Filling missing values using mean, median, or most frequent value.
Mean Imputation
Replace missing values with average.
Median Imputation
Replace missing values with median (better for skewed data).
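Mean and median imputation can be sketched in a few lines, with `None` standing in for a missing (NaN) entry:

```python
import statistics

def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]
```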
Standardization (Z-score)
Scale data to mean = 0 and SD = 1.
Normalization (Min-Max Scaling)
Scale values between 0 and 1.
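Both scaling methods follow directly from their definitions; a plain-Python sketch:

```python
import statistics

def standardize(values):
    """Z-score: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]

def min_max(values):
    """Rescale values linearly into the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```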
Feature Scaling
Ensuring all features are on a similar scale to improve models.
K-Nearest Neighbors (KNN)
Classifies a point by majority vote among its k nearest neighbors.
K Value
Number of neighbors used for classification.
Distance (Norm)
Measure of similarity between data points.
Curse of Dimensionality
As the number of features grows, data become sparse and distance-based methods degrade.
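A minimal KNN classifier ties the terms above together: Euclidean distance as the norm, and k as the number of voting neighbors (the training points below are made up):

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (point, label) pairs. Classify query by majority
    vote among its k nearest neighbors under Euclidean distance."""
    dists = sorted((math.dist(point, query), label) for point, label in train)
    top_labels = [label for _, label in dists[:k]]
    return Counter(top_labels).most_common(1)[0][0]

train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((5, 6), "b")]
```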
Naive Bayes
Probabilistic classifier using Bayes’ theorem.
Bayes Theorem
Calculates probability of a class given features.
Prior Probability
Initial probability of a class.
Likelihood
Probability of features given class.
Posterior
Final probability after considering evidence.
Naive Assumption
Features are independent given class.
Joint Probability
Probability of multiple events occurring together.
Conditional Probability
Probability of one event given another.
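The probability terms above combine in Bayes' theorem; a small sketch where, under the naive assumption, the joint likelihood is just the product of per-feature likelihoods (the numbers in the test are illustrative):

```python
def posterior(prior, likelihoods, evidence):
    """Bayes' theorem: P(class | features) =
    P(class) * P(features | class) / P(features).
    Naive assumption: P(features | class) factors into a
    product of independent per-feature likelihoods."""
    joint_likelihood = 1.0
    for l in likelihoods:
        joint_likelihood *= l
    return prior * joint_likelihood / evidence
```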
Support Vector Machines (SVM)
Finds optimal boundary (hyperplane) separating classes.
Hyperplane
Decision boundary.
Margin
Distance between boundary and closest data points.
Support Vectors
Points closest to boundary.
Hard Margin
No misclassification allowed.
Soft Margin
Allows some errors.
Kernel
Function defining shape of decision boundary.
RBF Kernel
Radial basis function kernel; produces nonlinear (curved) decision boundaries.
Regularization Parameter (C)
Controls the tradeoff between a wide margin and allowing misclassifications.
Decision Tree
Model that splits data based on features.
Entropy
Measure of randomness/impurity.
Information Gain
Reduction in entropy after a split.
Gini Index
Measure of impurity used in trees.
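Both impurity measures used for tree splits can be computed directly from class counts:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())
```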
Bagging (Bootstrap Aggregation)
Combines multiple models, each trained on a random bootstrap sample (drawn with replacement).
Random Forest
Ensemble of decision trees.
Boosting
Sequentially improving weak models.
AdaBoost
Adjusts weights to focus on errors.
Gradient Boosting
Builds models based on previous errors.
Learning Rate
Scales each new model's contribution; smaller values learn more slowly but often generalize better.
Hyperparameter
Parameter set before training that controls model behavior.
Grid Search
Testing multiple hyperparameter values to find the best one.
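Grid search is just an exhaustive loop over hyperparameter combinations; a minimal sketch where `evaluate` is a hypothetical stand-in for training a model and scoring it on a validation set:

```python
import itertools

def grid_search(param_grid, evaluate):
    """Try every combination in param_grid (dict of name -> list of values)
    and return the parameters with the best validation score."""
    best_score, best_params = float("-inf"), None
    keys = list(param_grid)
    for combo in itertools.product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, combo))
        score = evaluate(params)  # hypothetical: fit + score on validation data
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score
```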
Regularization
Prevent overfitting by penalizing large coefficients.
Ridge Regression (L2)
Shrinks coefficients evenly.
Lasso Regression (L1)
Can reduce coefficients to zero.
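The shrinkage effect of regularization can be seen in the one-feature, no-intercept ridge case, where the penalized least-squares problem has a closed form:

```python
def ridge_slope(xs, ys, alpha):
    """One-feature ridge regression without intercept: minimizing
    sum((y - w*x)^2) + alpha * w^2 gives w = sum(x*y) / (sum(x^2) + alpha).
    Larger alpha shrinks the coefficient toward zero."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)
```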
Time Series Data
Data ordered over time.
Trend
Long-term direction.
Seasonality
Repeating pattern.
Noise
Random variation.
Lag Feature
Previous value used for prediction.
Moving Average
Average of past values to smooth data.
Autocorrelation
Correlation of a variable with its past values.
Stationarity
Statistical properties remain constant over time.
Differencing
Subtracting previous values to remove trend.
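The three time-series transformations above (lag features, moving averages, differencing) are short list operations in plain Python:

```python
def lag(series, k=1):
    """Lag feature: the value from k steps earlier (None where unavailable)."""
    return [None] * k + series[:-k]

def moving_average(series, window):
    """Average of the last `window` values, smoothing out noise."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def difference(series):
    """First difference: subtract the previous value to remove trend."""
    return [b - a for a, b in zip(series, series[1:])]
```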
Autoregressive Model (AR)
Uses past values to predict future.
Moving Average Model (MA)
Uses past forecast errors to predict future values.
ARIMA Model
Combines autoregression (AR), differencing (I, integrated), and moving average (MA) components.
Ontologies
Structured representation of knowledge.
Ontology Evaluation
Assessing quality of ontology.
Accuracy (Ontology)
Correctness of representation.
Consistency
No contradictions in ontology.
Completeness
Covers domain fully.
Clarity
Easy to understand.
Adaptability
Can be extended.
Semantic Web
Web of linked structured data.
Web Ontology Language (OWL)
A formal language for representing ontologies.
SKOS
Simple Knowledge Organization System; a vocabulary for representing concept schemes such as thesauri and taxonomies.