intro to ml



Last updated 3:06 PM on 5/12/26


15 Terms

1
New cards

1. In supervised learning, using dropout as a regularization technique during training helps prevent overfitting by temporarily removing certain connections in the model.

true
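As a quick illustration of why this is true, here is a minimal pure-Python sketch of inverted dropout (the function name and the 1/(1-p) scaling convention are illustrative assumptions, not from this card):

```python
import random

def dropout(x, p=0.5, training=True, rng=random):
    """Inverted dropout (illustrative sketch): during training, zero each
    unit with probability p and scale survivors by 1/(1-p); at test time,
    the input passes through unchanged."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]
```

At inference (`training=False`) nothing is removed, so the removal is indeed temporary: it happens only during training passes.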

2
New cards

For learning models that employ gradient descent, adding a regularization term will require a change to the gradient used by the update rule.

true
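To see why the gradient changes, consider a one-feature ridge (L2-regularized) regression; this sketch and its λ convention are illustrative assumptions:

```python
def grad_ridge(w, xs, ys, lam):
    """Gradient of 0.5 * sum((w*x - y)^2) + lam * w^2 for a one-feature
    linear model (illustrative). The L2 term contributes an extra 2*lam*w,
    so the gradient used by the update rule changes whenever lam != 0."""
    data_grad = sum((w * x - y) * x for x, y in zip(xs, ys))
    return data_grad + 2 * lam * w
```

With `lam = 0` the gradient is the unregularized data gradient; any nonzero `lam` shifts it by `2*lam*w`.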

3
New cards

MAE gives higher penalties to larger errors compared to MSE

false
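A quick check of why this is false: squaring in MSE makes large errors cost disproportionately more, while MAE grows only linearly (minimal sketch):

```python
def mae(errors):
    """Mean absolute error of a list of residuals."""
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    """Mean squared error of a list of residuals."""
    return sum(e * e for e in errors) / len(errors)
```

Doubling an error doubles its MAE contribution but quadruples its MSE contribution, so it is MSE, not MAE, that penalizes large errors more heavily.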

4
New cards

A feedforward neural network with one hidden layer containing a sufficient number of neurons can approximate any continuous function on a closed and bounded interval, given appropriate activation functions.

true

5
New cards

L2 is a regularization technique that is applied only during the training phase of supervised learning, while all neurons are utilized during the testing phase.

true

6
New cards

Which of the following algorithms are supervised learning methods?

Support Vector Machines (SVM)

Linear Regression

Gradient Boosting Machines

7
New cards

Which of the following statements about autoencoders and PCA are not true?

  • PCA is a supervised learning method, whereas autoencoders are always unsupervised.

(PCA does not require labeled data: it uses eigenvectors/eigenvalues of the covariance matrix and is purely unsupervised.)

8
New cards

Which of the following are not the characteristics of the Naive Bayes classifier?

  • It requires large amounts of training data to perform well.

  • It is sensitive to the correlation between features.

  • It doesn't perform well with a small amount of training data, especially in text classification tasks.

9
New cards

Which of the following statements are not true about the softmax layer?

  • The softmax layer is typically applied to hidden layers in deep networks to introduce nonlinearity.

  • The softmax function ensures that all outputs are non-negative and bounded between -1 and 1.
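Both bullets can be checked directly: softmax outputs are strictly between 0 and 1 (not between -1 and 1) and sum to 1, which is why softmax is used as an output layer for class probabilities rather than as a hidden-layer nonlinearity. A minimal sketch with the usual max-subtraction stability trick:

```python
import math

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating,
    then normalize so the outputs form a probability distribution."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]
```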

10
New cards

Which of the following statements are true about the sigmoid activation function? (8 pts)

Sigmoid outputs values between 0 and 1, making it suitable for binary classification tasks.

Sigmoid can cause the vanishing gradient problem because its gradients become very small for large input values.
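Both statements are easy to verify numerically: the sigmoid's derivative is s(x)(1 - s(x)), which peaks at 0.25 at x = 0 and collapses toward zero for large |x| (a minimal sketch):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # derivative s(x) * (1 - s(x)): at most 0.25, vanishing for large |x|
    s = sigmoid(x)
    return s * (1.0 - s)
```

The tiny gradient at large |x| is exactly the vanishing-gradient behavior the card describes.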

11
New cards

Cross-entropy loss:

Is commonly used for classification

Measures the difference between true labels and predicted probabilities

Is minimized when predicted probabilities match the true distribution

Can be used with softmax outputs
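These properties can be demonstrated with a tiny sketch of H(p, q) = -Σ p_i log q_i (illustrative; one-hot true labels are assumed in the usage below):

```python
import math

def cross_entropy(true_dist, pred_dist):
    """H(p, q) = -sum_i p_i * log(q_i); terms with p_i == 0 contribute
    nothing, so they are skipped to avoid log(0)."""
    return -sum(p * math.log(q) for p, q in zip(true_dist, pred_dist) if p > 0)
```

A prediction closer to the true distribution gives a lower loss, and a perfect match drives the loss to its minimum (zero for one-hot labels).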

12
New cards

Which of the following are effective strategies for handling class imbalance in classification tasks?

Applying a class-balanced loss function to give higher penalties for misclassifying minority class instances.

Using SMOTE (Synthetic Minority Over-sampling Technique) to create synthetic instances of the minority class by interpolating between existing minority samples.

Utilizing undersampling of the majority class to reduce its representation in the dataset, thereby balancing class distributions.
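The first strategy can be sketched as a class-weighted negative log-likelihood for a binary classifier; the function name and weight values below are illustrative assumptions:

```python
import math

def weighted_nll(labels, probs, class_weights):
    """Class-weighted negative log-likelihood (illustrative): labels are
    0/1, probs are predicted P(y=1), and class_weights maps each class to
    its penalty weight. Upweighting the minority class makes its
    misclassifications cost proportionally more."""
    total = 0.0
    for y, p in zip(labels, probs):
        total += -class_weights[y] * math.log(p if y == 1 else 1.0 - p)
    return total / len(labels)
```

With the same predictions, raising the minority class's weight scales up the penalty for getting its instances wrong, which is the point of a class-balanced loss.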

13
New cards

Which of the following are true about standard scaling of features?

Standard scaling transforms the data by shifting it to have a mean of zero and a standard deviation of one.

Standard scaling helps prevent the model from being biased toward features with larger magnitudes.
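Both statements are straightforward to verify with a minimal z-score sketch (population standard deviation assumed):

```python
import math

def standard_scale(xs):
    """Z-score scaling: subtract the mean, divide by the (population)
    standard deviation, yielding mean 0 and standard deviation 1."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / std for x in xs]
```

After scaling, every feature lives on the same unit scale, so no single feature dominates purely because of its magnitude.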

14
New cards

Which of the following statements about linear PCA is correct? (select one) (7 pts)

The sum of the eigenvalues produced by PCA equals the total variance of the dataset.

The eigenvector with the largest eigenvalue is the direction along which the projection of the data has the highest variance.
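Both statements can be checked on a tiny 2-D example: the eigenvalues of the covariance matrix sum to its trace, which is the total variance. The dataset below is a hypothetical illustration:

```python
import math

def eig2x2_sym(a, b, d):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, d]],
    largest first, via the trace/determinant quadratic formula."""
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 + disc, tr / 2.0 - disc

# hypothetical 2-D dataset (illustrative values)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0]
mx, my = sum(xs) / 4, sum(ys) / 4
cxx = sum((x - mx) ** 2 for x in xs) / 4           # var(x)
cyy = sum((y - my) ** 2 for y in ys) / 4           # var(y)
cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / 4
l1, l2 = eig2x2_sym(cxx, cxy, cyy)                 # PCA eigenvalues
```

Here `l1 + l2` equals `var(x) + var(y)`, the total variance, and `l1` (the largest eigenvalue) is the variance captured along the first principal direction.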

15
New cards