Machine Learning (ML)
The study of algorithms that learn patterns from data instead of following explicitly programmed rules
ACM
Association for Computing Machinery
ACM Turing Award
The ACM's highest honor, often called the “Nobel Prize of Computing”
Alan Mathison Turing (1912–1954)
Father of theoretical computer science and artificial intelligence who articulated the mathematical foundation and limits of computing
Supervised Learning
A type of machine learning where both input and output are given (labeled data)
Unsupervised Learning
A type of machine learning where only input is given and no labels (unlabeled data)
Support Vector Machine (SVM)
A supervised learning model that classifies data by finding the optimal hyperplane
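As a quick illustration (not part of the deck), here is a minimal sketch of training an SVM classifier, assuming scikit-learn and its bundled Iris dataset:

```python
# Minimal SVM sketch (assumes scikit-learn; Iris is just an example dataset).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)                       # labeled data: features X, labels y
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear")                              # finds the maximum-margin hyperplane
clf.fit(X_train, y_train)                               # supervised: learns from (input, output) pairs
print("test accuracy:", clf.score(X_test, y_test))
```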
K-means Clustering
An unsupervised learning algorithm that groups data into clusters based on similarity
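A minimal K-means sketch, again assuming scikit-learn; the synthetic blobs data stands in for real unlabeled data:

```python
# K-means sketch (assumes scikit-learn; synthetic data for illustration only).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)   # unlabeled input only

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])         # cluster assignment for the first 10 points
print(km.cluster_centers_)     # learned cluster centroids
```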
Feature
An independent variable; an input attribute (column) used by a model
Observation (Row)
An example or data point in the dataset
Accuracy
(TP+TN)/(TP+TN+FP+FN) – proportion of correct predictions
Precision
TP/(TP+FP) – proportion of predicted positives that are truly positive
Recall (Sensitivity/TPR)
TP/(TP+FN) – proportion of actual positives correctly predicted
Specificity
TN/(TN+FP) – proportion of actual negatives correctly predicted
F1 Score
2 * (Precision * Recall) / (Precision + Recall) – harmonic mean of precision and recall
Confusion Matrix
A table showing counts of TP, FP, FN, and TN for a classifier's predictions versus the actual labels
True Positive (TP)
Correctly predicted positive cases
False Positive (FP)
A negative case incorrectly predicted as positive
False Negative (FN)
A positive case incorrectly predicted as negative
True Negative (TN)
Correctly predicted negative cases
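The metrics defined above follow directly from the four confusion-matrix counts. A small sketch in plain Python, with made-up counts for illustration:

```python
# Hypothetical confusion-matrix counts, for illustration only.
TP, FP, FN, TN = 40, 10, 5, 45

accuracy    = (TP + TN) / (TP + TN + FP + FN)
precision   = TP / (TP + FP)
recall      = TP / (TP + FN)           # sensitivity / true positive rate
specificity = TN / (TN + FP)
f1          = 2 * precision * recall / (precision + recall)

print(accuracy, precision, recall, specificity, f1)
```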
ROC Curve
A graphical plot of True Positive Rate (TPR) vs. False Positive Rate (FPR) at various thresholds
AUC (Area Under Curve)
A measure of the overall performance of a classifier across thresholds
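A short sketch of computing an ROC curve and AUC, assuming scikit-learn; the labels and scores below are made-up values:

```python
# ROC curve and AUC sketch (assumes scikit-learn; data is illustrative).
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]                      # actual labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]     # predicted scores/probabilities

fpr, tpr, thresholds = roc_curve(y_true, y_score)       # TPR vs. FPR at each threshold
print("AUC:", roc_auc_score(y_true, y_score))
```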
Cross-validation
A resampling method (e.g., 5-fold) to evaluate model performance on different data splits
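A 5-fold cross-validation sketch, assuming scikit-learn; the breast-cancer dataset and logistic-regression pipeline are only illustrative choices:

```python
# 5-fold cross-validation sketch (assumes scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

scores = cross_val_score(model, X, y, cv=5)   # accuracy on each of the 5 held-out folds
print(scores, scores.mean())
```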
LASSO (Least Absolute Shrinkage and Selection Operator)
A regression-based method for feature selection
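A LASSO feature-selection sketch, assuming scikit-learn; the diabetes dataset and the alpha value are illustrative:

```python
# LASSO feature-selection sketch (assumes scikit-learn; alpha is an illustrative value).
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso

X, y = load_diabetes(return_X_y=True)
lasso = Lasso(alpha=0.1).fit(X, y)

# The L1 penalty shrinks some coefficients exactly to zero;
# the features with nonzero coefficients are the selected ones.
selected = [i for i, coef in enumerate(lasso.coef_) if coef != 0]
print("selected feature indices:", selected)
```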
Concrete Autoencoder (CAE)
A deep learning method for feature selection using neural networks
Feature Extraction
Process of creating a new, smaller set of features that captures the most useful information from the original data
Principal Component Analysis (PCA)
A dimensionality reduction method for feature extraction
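A PCA feature-extraction sketch, assuming scikit-learn and the Iris dataset for illustration:

```python
# PCA feature-extraction sketch (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)             # 4 original features
pca = PCA(n_components=2)                     # extract 2 new composite features
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                        # (150, 2)
print(pca.explained_variance_ratio_)          # variance captured by each component
```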
Autoencoder (AE)
A neural network that compresses and reconstructs data for feature extraction
Pan-cancer Analysis
Studying multiple cancer types to find common biomarkers using machine learning
Stochastic Behavior of CAE
The variability in results due to randomness in training runs of Concrete Autoencoder
Feature Selection
Process of filtering out irrelevant or redundant features while keeping a subset of the original features
Unstructured Data
Data without a predefined structure (e.g., text data), often converted into structured form like Bag-of-Words
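A sketch of turning unstructured text into a structured Bag-of-Words matrix, assuming scikit-learn's CountVectorizer; the sentences are made up:

```python
# Bag-of-Words sketch (assumes scikit-learn; documents are illustrative).
from sklearn.feature_extraction.text import CountVectorizer

docs = ["machine learning learns from data",
        "deep learning is machine learning"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)            # sparse matrix of word counts

print(vectorizer.get_feature_names_out())     # the vocabulary (structured feature names)
print(X.toarray())                            # each row: word counts for one document
```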
Binary Classification (𝑦 ∈ {−1, 1} or {0, 1})
Task of predicting one of two classes
t-SNE Hyperparameter: Perplexity
Effective number of nearest neighbors considered by t-SNE (recommended 5–50; must be less than the number of samples)
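A t-SNE sketch showing where the perplexity hyperparameter is set, assuming scikit-learn and the Iris dataset (150 samples, so perplexity=30 satisfies the constraint):

```python
# t-SNE perplexity sketch (assumes scikit-learn; perplexity must be < number of samples).
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X, _ = load_iris(return_X_y=True)             # 150 samples

embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
print(embedding.shape)                        # (150, 2)
```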
Scatter Plot
A plot showing relationships between two variables (e.g., Iris dataset)
Box Plot
A visualization of a distribution based on the five-number summary: minimum, Q1, median, Q3, maximum
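A sketch of the five-number summary that a box plot draws, assuming NumPy; the values are made up:

```python
# Five-number summary sketch (assumes NumPy; data is illustrative).
import numpy as np

data = np.array([2, 4, 4, 5, 7, 8, 9, 11, 13, 15])
q1, median, q3 = np.percentile(data, [25, 50, 75])
print(data.min(), q1, median, q3, data.max())   # min, Q1, median, Q3, max
```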
Classification
Predicting categorical labels (e.g., cancer type) using input features