Writing PyTorch Code for Binary Classification

Description and Tags

Breast Cancer Dataset

8 Terms

You have a NumPy dataset for breast cancer classification and want to create training and test sets. You also need to ensure the data types are float32 for PyTorch. How do you do it?

We load the breast cancer dataset, cast X and Y to float32 (NumPy defaults to float64, while PyTorch layers use float32 parameters by default, and mixing the two raises a dtype error), and then split into train/test sets using train_test_split.
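
A minimal sketch of this step, assuming the data comes from scikit-learn's load_breast_cancer (the card does not say where the NumPy arrays come from). The test_size, random_state, and stratify values are illustrative choices, not part of the card.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load features and labels as NumPy arrays (float64 features, integer labels).
X, Y = load_breast_cancer(return_X_y=True)

# Cast both to float32 so they match PyTorch's default parameter dtype.
X = X.astype(np.float32)
Y = Y.astype(np.float32)

# Hold out 20% of the samples for testing (illustrative split settings).
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42, stratify=Y
)

print(X_train.shape, X_test.shape, X_train.dtype)
```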

You want your neural network to process scaled data for better performance. How do you standardize the training set and then apply the same transformation to the test set?

fit_transform learns the scaling parameters (mean, std) from the training data and scales it, while transform applies those same parameters to the test data without re-fitting, so no test-set statistics leak into preprocessing.
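
A short sketch of the fit/transform split, assuming scikit-learn's StandardScaler is the scaler in question (the card names only the two methods). The random arrays below are stand-ins for the real training and test features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in data; in the flashcard set these would be the breast cancer splits.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=5.0, scale=2.0, size=(100, 3)).astype(np.float32)
X_test = rng.normal(loc=5.0, scale=2.0, size=(20, 3)).astype(np.float32)

scaler = StandardScaler()

# Learn mean and std from the training data only, then scale it.
X_train_scaled = scaler.fit_transform(X_train)

# Reuse the training mean and std on the test data; no re-fitting here.
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_, scaler.scale_)
```

The test set will generally not have exactly zero mean and unit variance after this, which is expected: it is scaled with the training statistics.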

You need to define a simple feed-forward network to output a probability for binary classification. Which PyTorch modules do you use?

We use nn.Linear for a single-layer perceptron. The output layer has 1 node because this is a binary classification task, and nn.Sigmoid squashes that output to a probability between 0 and 1.
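
One way to wire these two modules together is with nn.Sequential, sketched below. The container choice and the n_features value of 30 (the feature count of scikit-learn's breast cancer dataset) are assumptions; the card only names nn.Linear and nn.Sigmoid.

```python
import torch
from torch import nn

n_features = 30  # assumption: number of input feature columns

# One linear layer mapping all features to a single output,
# followed by a sigmoid that turns that output into a probability.
model = nn.Sequential(
    nn.Linear(n_features, 1),
    nn.Sigmoid(),
)

# Quick shape check on a random batch of 4 samples.
batch = torch.randn(4, n_features)
probs = model(batch)
print(probs.shape)
```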

You have NumPy arrays for your features and labels, but PyTorch requires tensors for training. How do you convert them properly for a binary classification task?

We convert arrays to tensors using torch.tensor().
We use .view(-1, 1) on labels to reshape them from (N,) to (N, 1), one row per sample with a single column, so they match the shape of the model’s output.
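
A small sketch of the conversion. The random arrays are placeholders for the real float32 features and labels, and the `_t` suffix on tensor names is just a naming choice made here.

```python
import numpy as np
import torch

# Placeholder arrays standing in for the float32 training data.
X_train = np.random.rand(8, 30).astype(np.float32)
Y_train = np.random.randint(0, 2, size=8).astype(np.float32)

# torch.tensor copies the data and keeps the float32 dtype.
X_train_t = torch.tensor(X_train)

# Reshape labels from (8,) to (8, 1) to line up with the model output.
Y_train_t = torch.tensor(Y_train).view(-1, 1)

print(X_train_t.shape, Y_train_t.shape, Y_train_t.dtype)
```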

You want to set up a loss function and an optimizer for binary cross-entropy classification using PyTorch. Which ones should you choose and how?

nn.BCELoss is the standard loss for binary classification when the model outputs probabilities, which is why it pairs with a final nn.Sigmoid.
optim.Adam is a commonly used optimizer for many neural network tasks because it adapts the learning rate for each parameter.
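
A minimal setup sketch. The learning rate of 0.001 is an assumed value (it is also Adam's default), and the model is a stand-in with 30 input features.

```python
import torch
from torch import nn, optim

# Stand-in model: one linear layer plus sigmoid, 30 input features assumed.
model = nn.Sequential(nn.Linear(30, 1), nn.Sigmoid())

# Binary cross-entropy on probabilities in [0, 1].
criterion = nn.BCELoss()

# Adam over all model parameters; lr is an illustrative value.
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Sanity check: predicting 0.5 for a positive label gives a loss of ln(2).
loss = criterion(torch.tensor([[0.5]]), torch.tensor([[1.0]]))
print(loss.item())
```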

You need to write a training loop for 50 epochs that prints out the loss at each iteration. Which steps must occur inside each epoch?

1. optimizer.zero_grad(): Clear any accumulated gradients from previous steps.
2. Forward pass: Get predictions from the model.
3. Calculate loss: Compare predictions to true labels.
4. loss.backward(): Calculate gradients for all parameters.
5. optimizer.step(): Update the model parameters based on the gradients.
6. Print/Save: Keep track of the loss to monitor performance.
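
The steps above can be sketched as a full-batch loop. The synthetic data, the fixed seed, and the learning rate are assumptions made so the example runs on its own; with the real dataset the tensors would come from the earlier cards.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Synthetic stand-in data: the label depends on the sign of the first feature.
X_train_t = torch.randn(64, 30)
Y_train_t = (X_train_t[:, :1] > 0).float()  # shape (64, 1)

model = nn.Sequential(nn.Linear(30, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

losses = []
for epoch in range(50):
    optimizer.zero_grad()                # 1. clear old gradients
    y_pred = model(X_train_t)            # 2. forward pass
    loss = criterion(y_pred, Y_train_t)  # 3. compute loss
    loss.backward()                      # 4. backpropagate
    optimizer.step()                     # 5. update parameters
    losses.append(loss.item())           # 6. record and print the loss
    print(f"Epoch {epoch + 1:2d}/50  loss={loss.item():.4f}")
```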

You suspect your model might be overfitting after 50 epochs. How can you monitor or reduce overfitting in your current setup?

Monitor:
1. Compare training loss vs. test loss after each epoch; a training loss that keeps falling while the test loss rises is the typical sign of overfitting.
2. Use metrics such as accuracy on both training and test sets.
Reduce:
1. Implement regularization (e.g., add a dropout layer).
2. Use early stopping (stop training when the held-out loss stops improving, ideally measured on a separate validation split so the test set stays untouched).
3. Collect more data or perform data augmentation if possible.
Explanation:
Monitoring train/test performance reveals whether the model memorizes the training set at the expense of test accuracy (overfitting). Techniques like dropout, early stopping, or additional data often help.
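
A hedged sketch combining three of these ideas: per-epoch loss comparison, a dropout layer, and early stopping. Everything specific here is an assumption for illustration: the synthetic data, the hidden layer of 16 units, the dropout rate of 0.2, the patience of 5 epochs, and the use of a separate validation split in place of the test set.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Synthetic stand-in data with some label noise, split into train/validation.
X = torch.randn(200, 30)
Y = (X[:, :1] + 0.5 * torch.randn(200, 1) > 0).float()
X_train_t, Y_train_t = X[:150], Y[:150]
X_val_t, Y_val_t = X[150:], Y[150:]

# A small network with a dropout layer as regularization.
model = nn.Sequential(
    nn.Linear(30, 16), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(16, 1), nn.Sigmoid(),
)
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

patience, stale, best_val = 5, 0, float("inf")
history = []  # (train_loss, val_loss) per epoch
for epoch in range(50):
    model.train()  # dropout active during training
    optimizer.zero_grad()
    train_loss = criterion(model(X_train_t), Y_train_t)
    train_loss.backward()
    optimizer.step()

    model.eval()  # dropout disabled for evaluation
    with torch.no_grad():
        val_loss = criterion(model(X_val_t), Y_val_t).item()
    history.append((train_loss.item(), val_loss))

    # Early stopping: quit after `patience` epochs without improvement.
    if val_loss < best_val:
        best_val, stale = val_loss, 0
    else:
        stale += 1
        if stale >= patience:
            break

print(f"stopped after {len(history)} epochs, best val loss {best_val:.4f}")
```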

After training, how would you evaluate your model on the test set to get an accuracy score?

with torch.no_grad(): disables gradient tracking, which makes inference faster and lighter on memory.
We threshold the output at 0.5 to get 0/1 predictions.
Then we compute the fraction of correct labels for accuracy.
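
These three steps might look like the sketch below. The untrained model and random test tensors are placeholders so the snippet runs standalone; the model.eval() call is an addition not mentioned on the card, included because it matters once a model contains layers such as dropout.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholders for the trained model and the real test tensors.
model = nn.Sequential(nn.Linear(30, 1), nn.Sigmoid())
X_test_t = torch.randn(20, 30)
Y_test_t = torch.randint(0, 2, (20, 1)).float()

model.eval()
with torch.no_grad():
    probs = model(X_test_t)          # probabilities in [0, 1]
    preds = (probs >= 0.5).float()   # threshold at 0.5 to get 0/1 labels
    accuracy = (preds == Y_test_t).float().mean().item()

print(f"Test accuracy: {accuracy:.3f}")
```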
