Machine Learning - Cap 5610 - Midterm

1
New cards

Machine Learning (ML)

Learn from data

2
New cards

ACM

Association for Computing Machinery

3
New cards

ACM Turing Award

Equivalent to the “Nobel Prize of Computing”

4
New cards

Alan Mathison Turing (1912–1954)

Father of theoretical computer science and artificial intelligence who articulated the mathematical foundation and limits of computing

5
New cards

Supervised Learning

A type of machine learning where both input and output are given (labeled data)

6
New cards

Unsupervised Learning

A type of machine learning where only input is given and no labels (unlabeled data)

7
New cards

Support Vector Machine (SVM)

A supervised learning model that classifies data by finding the optimal hyperplane

8
New cards

K-means Clustering

An unsupervised learning algorithm that groups data into clusters based on similarity

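Since the deck contrasts supervised (SVM) and unsupervised (k-means) learning, here is a minimal sketch of both on the same inputs, assuming scikit-learn and its bundled Iris data; it is illustrative, not the course's own code.

```python
# Supervised vs. unsupervised learning on Iris (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: both inputs X and labels y are given.
svm = SVC(kernel="linear").fit(X, y)
print("SVM training accuracy:", svm.score(X, y))

# Unsupervised: only inputs X are given; k-means infers 3 clusters.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("Cluster assignments:", km.labels_[:10])
```
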
9
New cards

Feature

An independent variable

10
New cards

Observation (Row)

An example or data point in the dataset

11
New cards

Accuracy

(TP+TN)/(TP+TN+FP+FN) – proportion of correct predictions

12
New cards

Precision

TP/(TP+FP) – proportion of predicted positives that are truly positive

13
New cards

Recall (Sensitivity/TPR)

TP/(TP+FN) – proportion of actual positives correctly predicted

14
New cards

Specificity

TN/(TN+FP) – proportion of actual negatives correctly predicted

15
New cards

F1 Score

2 * (Precision * Recall) / (Precision + Recall) – harmonic mean of precision and recall

16
New cards

Confusion Matrix

A table showing counts of TP, FP, FN, and TN for a classifier’s predictions

17
New cards

True Positive (TP)

Correctly predicted positive cases

18
New cards

False Positive (FP)

Incorrectly predicted as positive

19
New cards

False Negative (FN)

Incorrectly predicted as negative

20
New cards

True Negative (TN)

Correctly predicted negative cases

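The metrics in cards 11–20 can be computed directly from a confusion matrix. A hedged sketch, assuming scikit-learn; the labels are toy values, not course data:

```python
# Confusion-matrix-based metrics (scikit-learn assumed).
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# ravel() flattens the 2x2 matrix in scikit-learn's order: TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy    = (tp + tn) / (tp + tn + fp + fn)
precision   = tp / (tp + fp)
recall      = tp / (tp + fn)          # sensitivity / TPR
specificity = tn / (tn + fp)
f1          = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, specificity, f1)
```
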
21
New cards

ROC Curve

A graphical plot of True Positive Rate (TPR) vs. False Positive Rate (FPR) at various thresholds

22
New cards

AUC (Area Under Curve)

A measure of the overall performance of a classifier across thresholds

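A minimal sketch of the ROC curve and AUC from predicted scores, assuming scikit-learn and matplotlib are installed; the scores are illustrative, not from a real model:

```python
# ROC curve (TPR vs. FPR across thresholds) and its AUC.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3]

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # one point per threshold
auc = roc_auc_score(y_true, y_score)               # area under that curve

plt.plot(fpr, tpr, label=f"AUC = {auc:.2f}")
plt.plot([0, 1], [0, 1], "--")                     # chance line
plt.xlabel("False Positive Rate"); plt.ylabel("True Positive Rate")
plt.legend(); plt.show()
```
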
23
New cards

Cross-validation

A resampling method (e.g., 5-fold) to evaluate model performance on different data splits

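A minimal 5-fold cross-validation sketch, assuming scikit-learn; the model and dataset are stand-ins:

```python
# 5-fold cross-validation: train/test on 5 different data splits.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("Per-fold accuracy:", scores, "mean:", scores.mean())
```
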
24
New cards

LASSO (Least Absolute Shrinkage and Selection Operator)

A regression-based method for feature selection; its L1 penalty shrinks some coefficients exactly to zero, discarding those features

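A sketch of LASSO-based feature selection, assuming scikit-learn; the diabetes dataset and alpha value are illustrative choices:

```python
# LASSO feature selection: nonzero coefficients mark the kept features.
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)
X = StandardScaler().fit_transform(X)   # LASSO is scale-sensitive

lasso = Lasso(alpha=0.5).fit(X, y)      # alpha controls shrinkage strength
selected = np.flatnonzero(lasso.coef_)  # indices of surviving features
print("Selected feature indices:", selected)
```
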
25
New cards

Concrete Autoencoder (CAE)

A deep learning method for feature selection using neural networks

26
New cards

Feature Extraction

Process of creating a new, smaller set of features that captures the most useful information from the original data

27
New cards

Principal Component Analysis (PCA)

A dimensionality reduction method for feature extraction

28
New cards

Autoencoder (AE)

A neural network that compresses and reconstructs data for feature extraction

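A minimal autoencoder sketch in PyTorch (an assumption; the course may use a different framework). The bottleneck compresses the input and the decoder reconstructs it; the bottleneck activations serve as the extracted features:

```python
# Autoencoder: compress (encode) then reconstruct (decode) the input.
import torch
import torch.nn as nn

autoencoder = nn.Sequential(
    nn.Linear(20, 4),   # encoder: 20 input features -> 4-dim bottleneck
    nn.ReLU(),
    nn.Linear(4, 20),   # decoder: reconstruct the 20 original features
)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)
x = torch.randn(64, 20)            # toy batch; real data would go here

for _ in range(100):               # minimize reconstruction error
    loss = nn.functional.mse_loss(autoencoder(x), x)
    opt.zero_grad(); loss.backward(); opt.step()
print("final reconstruction MSE:", loss.item())
```
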
29
New cards

Pan-cancer Analysis

Studying multiple cancer types to find common biomarkers using machine learning

30
New cards

Stochastic Behavior of CAE

The variability in results due to randomness in training runs of Concrete Autoencoder

31
New cards
Data Representation (Computer Science)
Data is represented as 0 and 1 in general computing
32
New cards
Data Representation (Machine Learning)
Data is represented as vectors and matrices
33
New cards
Scalar
A single number, represented by a lowercase italic letter (e.g., 𝑥)
34
New cards
Vector
An ordered array of numbers
35
New cards
Matrix
A 2-D array of numbers identified by two indices
36
New cards
Tensor
An array with more than two axes
37
New cards
Categorical Variable
Discrete/qualitative variable
38
New cards
Nominal Variable
Categorical variable with two or more categories that have no intrinsic order
39
New cards
Ordinal Variable
Categorical variable with two or more categories that can be ordered or ranked
40
New cards
Continuous Variable
A variable that takes values on a continuous scale
41
New cards
Sample
An item to process (classify or cluster)
42
New cards
Feature Vector
An N-dimensional vector of numerical features representing a sample
43
New cards

Feature Selection

Process of filtering out irrelevant or redundant features while keeping a subset of the original features

44
New cards
Structured Data
Data organized in rows and columns (e.g., spreadsheets or relational tables)
45
New cards

Unstructured Data

Data without a predefined structure (e.g., text data), often converted into structured form like Bag-of-Words

46
New cards
Input Vector (𝑥𝑖)
Independent variable representing the ith sample
47
New cards
Response Variable (𝑦)
Dependent variable representing the outcome
48
New cards

Binary Classification (𝑦 ∈ {−1, 1} or {0, 1})

Task of predicting one of two classes

49
New cards
Multi-label Classification (𝑦 ∈ ℤ)
Task of predicting multiple discrete labels
50
New cards
Regression (𝑦 ∈ ℝ)
Predicting a continuous value
51
New cards
Principal Component Analysis (PCA)
A dimensionality reduction technique that transforms high-dimensional data into uncorrelated variables (principal components) capturing maximum variance
52
New cards
Principal Component (PC)
A linear combination of original variables that explains variance in data
53
New cards
PC1
Principal component explaining the largest variance in the dataset
54
New cards
PC2
Principal component explaining the next largest variance
55
New cards
Scree Plot
A plot showing the variance explained by each principal component
56
New cards
Step 1 of PCA
Standardization and centering data so variables share the same scale
57
New cards
Step 2 of PCA
Compute covariance matrix to identify relationships between variables
58
New cards
Step 3 of PCA
Eigen decomposition to identify eigenvectors (principal components) and eigenvalues (variance explained)
59
New cards
Eigenvector
A vector indicating a direction of maximum variance in data
60
New cards
Eigenvalue
A scalar indicating the amount of variance explained by its corresponding eigenvector
61
New cards
Step 4 of PCA
Select significant principal components to create a feature vector
62
New cards
Step 5 of PCA
Project data onto the new feature vector space (dimensionality reduction)
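
The five PCA steps above, sketched explicitly with NumPy (an assumption; scikit-learn's PCA wraps the same math) on toy data:

```python
# PCA from scratch, mirroring Steps 1-5 above.
import numpy as np

X = np.random.rand(100, 5)                    # toy data: 100 samples, 5 vars
Xc = (X - X.mean(axis=0)) / X.std(axis=0)     # Step 1: standardize and center
C = np.cov(Xc, rowvar=False)                  # Step 2: covariance matrix
eigvals, eigvecs = np.linalg.eigh(C)          # Step 3: eigendecomposition
order = np.argsort(eigvals)[::-1]             # sort PCs by variance explained
W = eigvecs[:, order[:2]]                     # Step 4: keep top 2 PCs
Z = Xc @ W                                    # Step 5: project (reduce dim)
print("explained variance ratio:", eigvals[order[:2]] / eigvals.sum())
```
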
63
New cards
Applications of PCA
Data visualization, dimensionality reduction, and feature extraction
64
New cards
t-SNE (t-Distributed Stochastic Neighbor Embedding)
A nonlinear dimensionality reduction technique for visualizing high-dimensional data in 2D or 3D
65
New cards
t-SNE Purpose
Creates a visual “map” of high-dimensional data to reveal patterns and clusters
66
New cards
t-SNE Strength
Preserves local structure of data
67
New cards
Difference PCA vs. t-SNE
PCA preserves global structure (deterministic) while t-SNE preserves local structure (stochastic)
68
New cards
KL Divergence in t-SNE
Measures the difference between probability distributions in original and reduced dimensions
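
For reference, the objective t-SNE minimizes is the KL divergence between the high-dimensional similarity distribution P and its low-dimensional counterpart Q (the algorithm's standard cost function, stated here rather than taken from this deck):

```latex
\mathrm{KL}(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```
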
69
New cards
Model Parameters
Values determined using the training dataset
70
New cards
Hyperparameters
Values set before training that control the learning process (e.g., t-SNE perplexity, number of k-means clusters)
71
New cards
t-SNE Hyperparameter: Components
Dimension of the embedded space
72
New cards

t-SNE Hyperparameter: Perplexity

Effective number of nearest neighbors (recommended 5–50; must be less than the number of samples)

73
New cards
t-SNE Hyperparameter: Iterations
Number of optimization steps (≥250 recommended)
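
A sketch tying the three hyperparameters above to scikit-learn's TSNE (assumed available); note the stochasticity, so different random seeds can give different embeddings:

```python
# t-SNE embedding of Iris into 2D (scikit-learn assumed).
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X, _ = load_iris(return_X_y=True)
emb = TSNE(
    n_components=2,   # dimension of the embedded space
    perplexity=30,    # effective number of neighbors (5-50, < n_samples)
    random_state=0,   # fixes the stochastic optimization
    # iterations default to 1000 (named n_iter in older scikit-learn,
    # max_iter in newer releases)
).fit_transform(X)
print(emb.shape)      # (150, 2)
```
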
74
New cards

Scatter Plot

A plot showing relationships between two variables (e.g., Iris dataset)

75
New cards
Line Plot
Visualization showing how one variable changes with respect to another
76
New cards
Histogram
Approximate representation of the distribution of numerical data (introduced by Karl Pearson)
77
New cards
Density Plot
Plot showing proportions of values in a distribution
78
New cards
Bar Plot
Visualization effective for categorical data with fewer than 10 categories
79
New cards

Box Plot

Visualization showing distributions with quartiles and medians (five-number summary: min, Q1, median, Q3, max)

80
New cards
Violin Plot
Visualization combining box plot and density plot to show distribution and probability density
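
A quick sketch of the distribution plots above using matplotlib (assumed); the data is a toy normal sample, not the course's Iris or IC50 data:

```python
# Histogram, box plot, and violin plot of the same sample.
import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(size=200)
fig, axes = plt.subplots(1, 3, figsize=(9, 3))
axes[0].hist(data, bins=20);  axes[0].set_title("Histogram")
axes[1].boxplot(data);        axes[1].set_title("Box plot")     # 5-number summary
axes[2].violinplot(data);     axes[2].set_title("Violin plot")  # box + density
plt.tight_layout(); plt.show()
```
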
81
New cards
LN_IC50
Log-normalized IC50, representing a tumor cell line’s sensitivity or resistance to a drug dose
82
New cards
Low IC50
Indicates tumor cells are more sensitive to the drug (better response)
83
New cards
High IC50
Indicates tumor cells are resistant to the drug (worse response)
84
New cards
Regression Models
Predict a continuous target variable such as drug response (IC50)
85
New cards
Five-Fold Cross Validation
Model evaluation method splitting data into 5 folds for training/testing
86
New cards
MAE (Mean Absolute Error)
Average of absolute differences between predictions and true values
87
New cards
MSE (Mean Squared Error)
Average of squared differences between predictions and true values
88
New cards
RMSE (Root Mean Squared Error)
Square root of MSE, expressed in the same units as the target variable
89
New cards
R² (Coefficient of Determination)
Proportion of variance explained by the model
90
New cards
PCC (Pearson Correlation Coefficient)
Measure of linear correlation between predicted and actual values
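
The regression metrics above, computed on toy predictions, assuming scikit-learn and SciPy are available:

```python
# MAE, MSE, RMSE, R^2, and PCC on toy values.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from scipy.stats import pearsonr

y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5,  0.0, 2.0, 8.0])

mae  = mean_absolute_error(y_true, y_pred)
mse  = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)                       # RMSE = sqrt(MSE)
r2   = r2_score(y_true, y_pred)
pcc  = pearsonr(y_true, y_pred)[0]        # Pearson correlation coefficient
print(mae, mse, rmse, r2, pcc)
```
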
91
New cards
CatBoost
Gradient boosting decision tree model shown to perform best for Docetaxel regression
92
New cards
SHAP (SHapley Additive exPlanations)
Explainable AI method for feature importance attribution
93
New cards
Global Feature Importance (Regression)
Average of absolute SHAP values across all samples
94
New cards
Local Feature Importance (Regression)
SHAP values for an individual sample prediction
95
New cards
Top 10 Genes (Docetaxel)
Most important genomic features identified by SHAP for drug response
96
New cards
Force Plot (SHAP)
Visualization showing how each feature contributes positively or negatively to prediction
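
A hedged sketch of SHAP attribution for a tree model, assuming the `shap` package; the random-forest model and diabetes data stand in for the course's Docetaxel setup:

```python
# Global vs. local SHAP feature importance for a tree model.
import numpy as np
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)     # one value per feature per sample

global_importance = np.abs(shap_values).mean(axis=0)  # global: mean |SHAP|
local_importance = shap_values[0]                     # local: one sample
print("Top feature index:", global_importance.argmax())
```
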
97
New cards

Classification

Predicting categorical labels (e.g., cancer type) using input features

98
New cards
Five-Fold Cross Validation (Classification)
Evaluation method where classifier is trained/tested across 5 splits
99
New cards
Evaluation Metrics (Classification)
Accuracy, precision, recall, specificity, F1 score, and AUC
100
New cards
LightGBM
Gradient boosting framework identified as best classifier in experiments
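
An end-to-end sketch matching the classification setup above: a LightGBM classifier under five-fold cross-validation, assuming the `lightgbm` package; the breast-cancer dataset is a stand-in for the course's genomic features:

```python
# LightGBM classifier evaluated with 5-fold CV (lightgbm + scikit-learn assumed).
from lightgbm import LGBMClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
scores = cross_val_score(LGBMClassifier(random_state=0), X, y,
                         cv=5, scoring="accuracy")
print("Five-fold accuracies:", scores.round(3), "mean:", scores.mean().round(3))
```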