I have read a similar question here: 1 neuron BCE loss VS 2 neurons CE loss, which suggests there is no difference between softmax cross-entropy loss and binary cross-entropy loss when choosing between two categories, since we can use the complementary probability of a Bernoulli distribution, $q = 1 - p$.
In practice, there is a difference because of the different activation functions: BCE loss uses a sigmoid activation, whereas CE loss uses a softmax activation. In general, $\mathrm{CE}(\mathrm{Softmax}(X), Y) \neq \mathrm{BCE}(\mathrm{Sigmoid}(X_0), Y_0)$, where $X, Y \in \mathbb{R}^{1\times 2}$ are the predictions (logits) and labels respectively; since $\mathrm{Softmax}(X)_0 = \mathrm{Sigmoid}(X_0 - X_1)$, the two only coincide when the second logit is zero. The other nuance is the number of neurons in the final layer.
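
To make this concrete, here is a minimal sketch (assuming PyTorch; the logit values are made up) showing that CE on two logits differs from BCE on the first logit alone, but matches BCE on the logit *difference*:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.3, -0.4]])   # made-up pair of logits, shape (1, 2)
y_class = torch.tensor([0])       # CE target: class index
y_prob = torch.tensor([1.0])      # BCE target: probability of class 0

ce = F.cross_entropy(x, y_class)  # applies softmax internally
bce_first = F.binary_cross_entropy(torch.sigmoid(x[:, 0]), y_prob)
bce_diff = F.binary_cross_entropy(torch.sigmoid(x[:, 0] - x[:, 1]), y_prob)

print(ce.item())         # ~0.168
print(bce_first.item())  # ~0.241 -> differs from CE
print(bce_diff.item())   # ~0.168 -> matches CE: Softmax(X)_0 = Sigmoid(X_0 - X_1)
```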
Although we could discard the output of the second neuron after applying softmax during inference, since it is just the complement of the first, the two designs do not have the same number of parameters: with 40 neurons in the hidden layer followed by 2 in the final layer, we get 80 weight and 2 bias parameters, whereas the single-neuron (sigmoid) output yields only 40 + 1. (See the sketch below for a quick check.)
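
A quick way to verify these counts, again assuming PyTorch and the 40-unit hidden layer mentioned above (the layer names are just for illustration):

```python
import torch.nn as nn

two_neuron_head = nn.Linear(40, 2)  # 40*2 weights + 2 biases = 82 parameters
one_neuron_head = nn.Linear(40, 1)  # 40*1 weights + 1 bias   = 41 parameters

print(sum(p.numel() for p in two_neuron_head.parameters()))  # 82
print(sum(p.numel() for p in one_neuron_head.parameters()))  # 41
```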
If you were to design a neural network and had to choose between the two, what would you advise? I suspect the results would be similar but not exactly equal; if that is not the case, the author of the linked question might have implemented it incorrectly. Still, I'm not quite sure what the difference amounts to in practice.
