Deep supervised models have an unprecedented capacity to absorb large quantities of training data. Hence, training on all available datasets appears to be a feasible approach towards accurate semantic segmentation models with graceful degradation in unusual scenes. Unfortunately, different datasets often use incompatible labels. For instance, the Cityscapes road class subsumes all pixels on driving surfaces, while Vistas defines separate classes for road markings, zebra crossings, etc. Such inconsistencies pose a major obstacle to successful multi-domain learning. We address this challenge by proposing a principled technique for learning with incompatible labeling policies. Unlike recent related work, our technique allows seamless training on datasets with overlapping classes. Consequently, it can learn visual concepts which are not represented as a separate class in any of the individual datasets. We evaluate our method on a collection of seven semantic segmentation datasets across four different domains. The results exceed the state of the art in multi-domain semantic segmentation.