SegNet CamVid dataset training classes mismatch?

Question

This refers to the CamVid dataset and the SegNet tutorial that uses it: http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

I'm confused about how the model is supposed to be trained on 11 classes when the ground-truth segmentation has 12, including the void class (label 0). How can the network predict the 11 classes correctly if it is trained on labels with 12 classes?

Also, how does the network know it shouldn't predict the void class 0, and how is it ensured that such a prediction neither increases nor decreases the loss?

My guess is that the loss calculation involves a weighting function, i.e. the void class has a weight of 0 and the other classes have something like 1.0. Could anyone confirm this?

Faur · Accepted Answer · 2018-01-05 08:38:31Z

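You should predict 11 channels and compute the loss only over the 11 valid classes. The void class (technically $1-void\_class$) is used to mask the loss, i.e. you should split the 12-channel label you have into an 11-channel target and a 1-channel mask. Pixels where the mask is 0 then contribute nothing to the loss or its gradient, so the network is never rewarded or penalized for whatever it outputs there.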
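To make the split concrete, here is a minimal pure-Python sketch of a masked cross-entropy. It is not taken from the SegNet tutorial code. It assumes the label is a 12-channel one-hot vector per pixel with void in channel 0 (as described in the question), and the names `split_label` and `masked_cross_entropy` are made up for illustration.

```python
import math

VOID_CHANNEL = 0  # assumption: void is channel 0 of the 12-channel one-hot label


def split_label(one_hot_12):
    """Split one pixel's 12-channel one-hot label into (11-channel target, mask)."""
    mask = 1.0 - one_hot_12[VOID_CHANNEL]  # 0 for void pixels, 1 otherwise
    target = one_hot_12[:VOID_CHANNEL] + one_hot_12[VOID_CHANNEL + 1:]
    return target, mask


def log_softmax(scores):
    """Numerically stable log-softmax over one pixel's class scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over non-void pixels.

    logits: per-pixel lists of 11 scores (one per valid class).
    labels: per-pixel 12-channel one-hot lists.
    """
    total, n_valid = 0.0, 0.0
    for scores, one_hot in zip(logits, labels):
        target, mask = split_label(one_hot)
        log_p = log_softmax(scores)
        pixel_loss = -sum(t * lp for t, lp in zip(target, log_p))
        total += mask * pixel_loss  # void pixels add nothing
        n_valid += mask             # and are excluded from the average
    return total / n_valid if n_valid > 0 else 0.0
```

With a one-hot split, the target of a void pixel is all zeros, so its cross-entropy term is already zero; the mask mainly keeps void pixels out of the normalization. Many frameworks offer the same effect through an "ignore label" option on the loss, which may be simpler than splitting the label by hand.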
