AML SA1

Last updated 9:27 PM on 6/14/26

40 Terms

What evaluation metric would you most likely avoid in an imbalanced binary classification problem?

Accuracy
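
To see why accuracy misleads on imbalanced data, here is a minimal sketch. The 95/5 class split and the always-negative "model" are hypothetical toy values chosen for illustration, not taken from the course material.

```python
# Hypothetical toy data: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class.
y_pred = [0] * 100

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0

print(accuracy(y_true, y_pred))  # 0.95
print(recall(y_true, y_pred))    # 0.0
```

The model looks 95% accurate while finding none of the positives, which is why recall, precision, or AUC are usually preferred here.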

Dropout is used to increase model complexity.

False

Adam optimizer uses both first and second moment estimates.

True

L1 regularization uses which mathematical approach?

Absolute value of weights
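
As a rough sketch, the L1 penalty can be compared with the L2 penalty (sum of squared weights). The function names, the weight vector, and the strength `lam` are illustrative assumptions, not part of the card set.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso-style) penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (Ridge-style) penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

# Hypothetical weight vector and regularization strength.
weights = [0.5, -2.0, 0.0, 3.0]
lam = 0.1

print(l1_penalty(weights, lam))
print(l2_penalty(weights, lam))
```

Either penalty is added to the training loss. The L2 term grows quadratically, so it punishes large weights hardest, while the L1 term tends to push small weights to exactly zero.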

Regularization is most helpful when:

The model is overfitting

Training loss should always be lower than test loss in a good model.

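True as a rule of thumb, though not strictly always: regularization such as dropout is active only during training and can push training loss above test loss.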

A confusion matrix provides insight into:

Classification performance

What type of validation helps reduce model variance by rotating the validation set?

K-fold Cross Validation
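
The rotation of the validation set can be sketched with plain index lists. The helper name `k_fold_indices` is hypothetical, and this version does not shuffle, which real pipelines usually do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

for train_idx, val_idx in k_fold_indices(10, 5):
    print("validate on", val_idx, "train on", train_idx)
```

Averaging the validation score over all k folds gives a steadier estimate than a single train/validation split.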

Mini-batch gradient descent typically converges faster than batch gradient descent.

True

When is learning rate scheduling particularly useful?

When training loss plateaus

Batch Gradient Descent differs from Mini-Batch in that it:

Uses the entire dataset to compute a single update
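
A small sketch of the difference, assuming a one-parameter model y = w * x fitted with squared error. The data, learning rate, and helper names are made up for illustration.

```python
def batches(data, batch_size):
    """Split data into consecutive chunks of at most batch_size items."""
    for i in range(0, len(data), batch_size):
        yield data[i:i + batch_size]

def run_epoch(data, w, lr, batch_size):
    """One pass over the data; returns the new weight and the number of updates."""
    n_updates = 0
    for batch in batches(data, batch_size):
        # Mean gradient of (w*x - y)^2 with respect to w over this batch.
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad
        n_updates += 1
    return w, n_updates

# Hypothetical data generated with a true weight of 3.
data = [(x, 3.0 * x) for x in [1.0, 2.0, 3.0, 4.0]]

w_batch, n_batch = run_epoch(data, 0.0, 0.01, batch_size=len(data))
w_mini, n_mini = run_epoch(data, 0.0, 0.01, batch_size=2)
print(n_batch, n_mini)  # 1 2
print(w_batch, w_mini)
```

With the same single pass over the data, the mini-batch run makes two updates instead of one and ends closer to the true weight in this toy case, which is the intuition behind mini-batch often converging faster.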

What does increasing the dropout rate typically do?

Reduces overfitting by randomly disabling more neurons, so the network cannot rely on any single one

In gradient descent, a smaller learning rate generally leads to:

Slower, more stable convergence

AUC measures the model's ability to classify correctly at various thresholds.

True

Which technique disables random neurons during training?

Dropout
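
A minimal sketch of dropout on a list of activations. It assumes the common "inverted dropout" variant, which rescales surviving units during training; the function name and values are illustrative, and `rate` must be below 1.

```python
import random

def dropout(activations, rate, training=True, rng=random):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        # At inference time every unit is kept unchanged.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # fixed seed so the run is repeatable
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, training=False))
```

Dividing by the keep probability keeps the expected activation the same in training and inference, so no extra scaling is needed at test time.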

The area under the ROC curve indicates:

Discriminative ability of a model
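
One way to make "discriminative ability" concrete: AUC equals the probability that a randomly chosen positive is scored above a randomly chosen negative. The sketch below computes it by brute-force pair counting; the labels and scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, independent of any single decision threshold.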

Regularization increases the model's training accuracy.

False

Feature engineering is not part of the ML pipeline.

False

The Adam optimizer is considered superior to vanilla SGD because it:

Adapts learning rates and includes momentum
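
A scalar sketch of one Adam update, following the commonly published form with bias correction. The default hyperparameters are the usual ones; the toy objective f(w) = (w - 3)^2 and the larger learning rate of 0.05 are assumptions for illustration.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single parameter; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # first moment: momentum-like average
    v = beta2 * v + (1 - beta2) * grad * grad  # second moment: average squared gradient
    m_hat = m / (1 - beta1 ** t)               # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Minimize the toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
print(w)
```

The first moment supplies the momentum, and dividing by the root of the second moment gives each parameter its own effective learning rate.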

What does early stopping monitor to determine when to halt training?

Validation performance
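
A sketch of the usual patience-based rule: stop once validation loss has failed to improve for a set number of consecutive epochs. The function name, the `patience` and `min_delta` parameters, and the loss values are illustrative assumptions.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the epoch index at which training stops (last epoch if never triggered)."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss        # new best validation loss: reset the counter
            bad_epochs = 0
        else:
            bad_epochs += 1    # no improvement this epoch
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1

# Hypothetical validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.64]
print(early_stopping_epoch(val_losses))  # 4
```

Here the best loss occurs at epoch 2, and training halts at epoch 4 after two epochs without improvement. In practice the weights from the best epoch are usually restored.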

R-squared is a metric used in classification.

False

Cross-validation helps detect if a model is underfitting.

True

Which loss function penalizes larger errors more significantly?

MSE
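
The effect of squaring can be checked on two hypothetical sets of predictions with the same total absolute error: four errors of 1 versus a single error of 4.

```python
def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [1.0, 2.0, 3.0, 4.0]
y_small = [2.0, 3.0, 4.0, 5.0]    # four errors of size 1
y_outlier = [1.0, 2.0, 3.0, 8.0]  # one error of size 4

print(mae(y_true, y_small), mae(y_true, y_outlier))  # 1.0 1.0
print(mse(y_true, y_small), mse(y_true, y_outlier))  # 1.0 4.0
```

MAE rates both sets of predictions the same, while MSE is four times larger for the one with the single large error.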

What is the primary trade-off involved in setting a learning rate too high?

Risk of overshooting the minimum
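
Overshooting is easy to reproduce on the toy objective f(w) = w^2, whose gradient is 2w. The two learning rates below are arbitrary illustrative choices.

```python
def gradient_descent(lr, steps=10, w=1.0):
    """Run plain gradient descent on f(w) = w^2 and return the final w."""
    for _ in range(steps):
        w -= lr * 2 * w  # gradient of w^2 is 2w
    return w

small_lr = gradient_descent(lr=0.1)  # each step multiplies w by 0.8
large_lr = gradient_descent(lr=1.1)  # each step multiplies w by -1.2

print(abs(small_lr))  # shrinks toward the minimum at 0
print(abs(large_lr))  # grows: every step jumps past the minimum
```

With the small rate the iterate approaches 0 slowly but steadily. With the large rate each update lands farther away on the other side of the minimum, so the run diverges.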

A regularization technique that results in feature selection is:

L1 regularization (Lasso)

Validation loss is often used to trigger early stopping.

True

Which of the following describes a characteristic of supervised learning?

It predicts outputs using labeled datasets

A low learning rate can result in slow but stable convergence.

True

ReLU is a commonly used loss function in classification.

False

Overfitting usually occurs when the model is too simple.

False

What is the role of the test set in model development?

To estimate real-world performance

Which of the following optimizers maintains a running average of past squared gradients?

RMSProp
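
A scalar sketch of the RMSProp update in its commonly described form. The function name, the hyperparameter values, and the toy objective f(w) = w^2 are assumptions for illustration.

```python
import math

def rmsprop_step(w, grad, sq_avg, lr=0.01, decay=0.9, eps=1e-8):
    """One RMSProp update for a single parameter."""
    # Running (exponentially decayed) average of past squared gradients.
    sq_avg = decay * sq_avg + (1 - decay) * grad * grad
    # Scale the step by the root of that average.
    w = w - lr * grad / (math.sqrt(sq_avg) + eps)
    return w, sq_avg

# A few steps on the toy objective f(w) = w^2, whose gradient is 2w.
w, sq_avg = 1.0, 0.0
for _ in range(5):
    w, sq_avg = rmsprop_step(w, 2 * w, sq_avg)
print(w, sq_avg)
```

Parameters with consistently large gradients get smaller effective steps, and those with small gradients get larger ones.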

What component in optimization helps reduce oscillation and improve directionality?

Momentum
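
A sketch of classical (heavy-ball) momentum for a single parameter. The exact formulation varies between libraries, so treat this form, the function name, and the hyperparameters as assumptions.

```python
def momentum_step(w, grad, velocity, lr=0.1, beta=0.9):
    """One SGD-with-momentum update for a single parameter."""
    # The velocity accumulates past gradients, decayed by beta each step.
    velocity = beta * velocity + grad
    w = w - lr * velocity
    return w, velocity

# Gradients that flip sign largely cancel in the velocity (less oscillation),
# while gradients that agree build it up (faster progress in that direction).
v = 0.0
for g in [1.0, -1.0, 1.0, -1.0]:
    _, v = momentum_step(0.0, g, v)
print(v)  # small: alternating gradients cancel

v = 0.0
for g in [1.0, 1.0, 1.0, 1.0]:
    _, v = momentum_step(0.0, g, v)
print(v)  # larger: consistent gradients accumulate
```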

In classification tasks, which metric is most concerned with minimizing false negatives?

Recall

Which metric is best suited for regression tasks?

Mean Absolute Error

In the machine learning workflow, what is the main purpose of feature engineering?

Enhance data representation for better learning

The main goal of optimization is to increase accuracy on the test set.

False

RMSProp improves over SGD by:

Scaling learning rates by past gradient magnitudes

RMSProp maintains a history of past gradients.

True (as a running average of past squared gradients)

What is the primary reason for using L2 regularization?

It penalizes large weights to reduce overfitting