
I am building computer vision models on data that exists on a continuum. For example, imagine I'm trying to do semantic segmentation on cars. Some of the labels are distinct, like "chipped paint", but others exist on a continuum, like "not dirty", "a little dirty", "moderately dirty", or "filthy". I can create descriptions of each label, for example:

  • "a little dirty" means having few visible stains or dust
  • "moderately dirty" means having multiple stains and a significant amount of dust.

But this doesn't really resolve the ambiguity, and I'm curious what the best practices are here. One option is to use the discrete classes above, but then if a region is halfway between "a little dirty" and "moderately dirty", how should I label those pixels?


1 Answer


One option is to treat this as a standard multiclass classification task. After training a model, you can look for systematically confused classes, for example by inspecting the per-class confusion matrix, and use that to decide whether to merge some of your classes or define them more distinctly.
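As a minimal sketch of that check, the snippet below computes a row-normalized confusion matrix over flattened per-pixel predictions with scikit-learn and reports which class each label is most often confused with. The class names and the `.npy` files are hypothetical placeholders for your own validation masks and predictions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

CLASSES = ["not dirty", "a little dirty", "moderately dirty", "filthy"]

# y_true, y_pred: 1-D arrays of per-pixel class indices (flattened masks).
# File names are placeholders for whatever format your pipeline uses.
y_true = np.load("val_labels.npy").ravel()
y_pred = np.load("val_predictions.npy").ravel()

cm = confusion_matrix(y_true, y_pred, labels=list(range(len(CLASSES))))
# Row-normalize so each row shows where pixels of that true class end up.
cm_norm = cm / cm.sum(axis=1, keepdims=True)

for i, name in enumerate(CLASSES):
    off_diag = [(cm_norm[i, j], CLASSES[j]) for j in range(len(CLASSES)) if j != i]
    rate, other = max(off_diag)
    print(f"{name}: {cm_norm[i, i]:.2f} correct, most confused with {other} ({rate:.2f})")
```

If two adjacent dirtiness levels account for most of each other's errors, that is a signal they may be worth merging or re-specifying.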

Another option, after training a standard multiclass classifier, is to use an asymmetric loss function to evaluate the resulting model (e.g. for tuning its hyperparameters). The evaluation loss could penalize a prediction of "filthy" more heavily than a prediction of "moderately dirty" when the annotated label is "a little dirty".
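Here is one possible sketch of such an evaluation loss, assuming the four dirtiness labels are mapped to ordered integer indices (0 = "not dirty" through 3 = "filthy"). The quadratic distance penalty is just one choice of weighting; any penalty that grows with ordinal distance would serve the same purpose.

```python
import numpy as np

def ordinal_eval_loss(y_true, y_pred):
    """Mean penalty where a prediction k levels away from the truth costs k**2.

    Predicting "filthy" (3) for a pixel annotated "a little dirty" (1) costs 4,
    while predicting "moderately dirty" (2) costs only 1.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean((y_true - y_pred) ** 2))

# Example: compare two candidate models on the same annotated pixels.
labels      = [1, 1, 2, 3]   # "a little dirty", ..., "filthy"
model_a_out = [2, 1, 2, 3]   # one prediction off by one level
model_b_out = [3, 1, 2, 3]   # one prediction off by two levels
print(ordinal_eval_loss(labels, model_a_out))  # 0.25
print(ordinal_eval_loss(labels, model_b_out))  # 1.0
```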

A third option, and the most "proper" one, is to use ordinal classification models. In my experience, however, this is mathematically more elegant but in practice does not give better results than a properly applied standard multiclass classifier.
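For completeness, below is a hedged sketch of what an ordinal segmentation head might look like in PyTorch, using the common cumulative-binary decomposition (each pixel gets K-1 binary targets of the form "is it dirtier than level k?"). The head, helper names, and constants are illustrative assumptions and would need to be wired into an actual segmentation backbone.

```python
import torch
import torch.nn as nn

K = 4  # ordered classes: not dirty < a little dirty < moderately dirty < filthy

def to_cumulative_targets(labels):
    """Map integer labels (B, H, W) to K-1 binary targets (B, K-1, H, W)."""
    thresholds = torch.arange(1, K).view(1, K - 1, 1, 1)
    return (labels.unsqueeze(1) >= thresholds).float()

class OrdinalHead(nn.Module):
    """Replace a segmentation model's K-way logits with K-1 threshold logits."""
    def __init__(self, in_channels):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, K - 1, kernel_size=1)

    def forward(self, features):
        return self.conv(features)  # (B, K-1, H, W) logits for P(y > k)

# Train the threshold logits against the cumulative targets.
criterion = nn.BCEWithLogitsLoss()

def predict_class(logits):
    """Predicted class = number of thresholds the pixel passes."""
    return (torch.sigmoid(logits) > 0.5).sum(dim=1)  # (B, H, W) values in 0..K-1
```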
