
I'm a beginner who just started to study deep learning. I recently learned that in a feedforward neural network with a binary output and a Bernoulli distribution, the output of the sigmoid function represents the probability that the label is 1. I'm curious why it cannot be the other way round (the probability of the label being 0). Is it just a convention?

Comments:

  • Related...you don't even have to use the 0/1 convention. – Commented Jul 24, 2024 at 16:27
  • The post helped greatly. Thanks! – Commented Jul 24, 2024 at 23:31

1 Answer


It is a convention.

Ultimately, what matters is that the objective function (likely the log-likelihood in your context) is computed consistently with the convention you have chosen.

That said, it is best to follow the convention most people have adopted, to reduce the risk of miscommunication or misinterpretation.
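To make the point concrete, here is a minimal NumPy sketch (with made-up logits and labels, not from any particular model) showing that flipping the convention leaves the loss unchanged, as long as the log-likelihood is written against the same convention:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-1.2, 0.4, 2.0])   # hypothetical logits from the network
y = np.array([0.0, 1.0, 1.0])    # true labels

# Convention A: the sigmoid output is P(y = 1).
p1 = sigmoid(z)
loss_a = -np.mean(y * np.log(p1) + (1 - y) * np.log(1 - p1))

# Convention B: the sigmoid output is P(y = 0). Since
# sigmoid(-z) = 1 - sigmoid(z), this is the same model with
# negated logits, and the loss must pair P(y = 0) with the
# event "y == 0" rather than "y == 1".
p0 = sigmoid(-z)
loss_b = -np.mean((1 - y) * np.log(p0) + y * np.log(1 - p0))

print(loss_a, loss_b)  # identical up to floating-point error
```

Because `sigmoid(-z) = 1 - sigmoid(z)`, the two conventions differ only in a sign flip of the logits; the fitted probabilities and the optimum are the same either way.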

