
I have some annotated images that I want to use to train a binary image classifier, but I have been having issues creating the dataset and actually getting a test model to train. Each image either belongs to a certain class or it does not, so I want to set up a binary classification dataset/model using PyTorch. I have some questions:

  1. should labels be float or long?
  2. what shape should my labels be?
  3. I am using the resnet18 model from torchvision.models; should my final (softmax) layer have one or two outputs?
  4. what shape should my target be during training if my batch size is 200?
  5. what shape should my outputs be?

Thanks in advance


1 Answer


Binary classification is slightly different from multi-class classification: in the multi-class case your model predicts a vector of "logits" per sample and uses softmax to convert the logits to probabilities, whereas in the binary case the model predicts a single scalar "logit" per sample and uses the sigmoid function to convert it to a class probability.

In PyTorch the softmax and the sigmoid are "folded" into the loss layer (for numerical-stability reasons), and therefore there are different cross-entropy loss layers for the two cases: nn.BCEWithLogitsLoss for the binary case (with sigmoid) and nn.CrossEntropyLoss for the multi-class case (with softmax).

In your case you want to use the binary version (with sigmoid): nn.BCEWithLogitsLoss.
Your labels should therefore be of type torch.float32 (the same float type as the output of the network), not integers. You should have a single label per sample; if your batch size is 200, both the target and the model's output should have shape (200, 1).
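
For reference, here is a minimal sketch (not part of the original answer, using torchvision's resnet18 and dummy data) of how the pieces fit together:

    import torch
    import torch.nn as nn
    from torchvision import models

    # Replace the final fully-connected layer with a single-output head
    # so the network produces one logit per sample.
    model = models.resnet18()  # randomly initialized weights
    model.fc = nn.Linear(model.fc.in_features, 1)

    criterion = nn.BCEWithLogitsLoss()  # applies sigmoid internally

    batch_size = 200
    images = torch.randn(batch_size, 3, 224, 224)            # dummy input batch
    targets = torch.randint(0, 2, (batch_size, 1)).float()   # shape (200, 1), dtype float32

    logits = model(images)              # shape (200, 1), raw logits
    loss = criterion(logits, targets)   # input and target shapes must match
    loss.backward()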


I'll leave it as an exercise to show that training a model with two outputs and cross-entropy+softmax is equivalent to a single output+sigmoid ;)
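
A quick numerical illustration of that exercise (my own sketch, not the answerer's): with two logits (z0, z1), the softmax probability of class 1 equals the sigmoid of the difference z1 - z0.

    import torch

    z = torch.randn(5, 2)                          # two logits per sample
    p_softmax = torch.softmax(z, dim=1)[:, 1]      # P(class=1) via softmax
    p_sigmoid = torch.sigmoid(z[:, 1] - z[:, 0])   # same probability via sigmoid

    print(torch.allclose(p_softmax, p_sigmoid))    # True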


4 Comments

Thanks a lot for this; I followed the above steps and the network runs now. However, my loss does not change at all for some reason. Any idea why that might happen?
@MohamedMoustafa there can be many reasons for this. Try changing the learning rate.
I managed to fix the issue. I was putting a sigmoid at the end of my fully connected layers (I edited the resnet18 from torchvision). Removing the sigmoid from the network seemed to fix the issue.
@MohamedMoustafa using nn.BCEWithLogitsLoss makes an explicit sigmoid in the network redundant.
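
The pattern described in these comments, as a hedged sketch reusing the variables from the earlier example: keep the network's output as raw logits during training, and apply sigmoid only when you need probabilities.

    # Training: no sigmoid inside the model; BCEWithLogitsLoss applies it itself.
    logits = model(images)
    loss = criterion(logits, targets)

    # Inference: apply sigmoid explicitly to get probabilities, then threshold.
    with torch.no_grad():
        probs = torch.sigmoid(model(images))   # values in (0, 1)
        preds = (probs > 0.5).float()          # hard 0/1 predictions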
