Since the rise of deep learning, many computer vision tasks have advanced significantly. The downside of deep learning, however, is that it is very data-hungry. For segmentation problems in particular, training a deep neural network requires dense supervision in the form of pixel-perfect image labels, which are very costly to produce. In this paper, we present a new loss function that trains a segmentation network with only a small subset of pixel-perfect labels while taking advantage of weakly annotated training samples in the form of cheap bounding-box labels. Unlike recent works that rely on box-to-mask proposal generators, our loss trains the network to learn label uncertainty within the bounding box, which can be leveraged to perform online bootstrapping (i.e., transforming the boxes into segmentation masks) while the network is being trained. We evaluated our method on binary segmentation tasks as well as a multi-class segmentation task (CityScapes vehicles and persons). We trained each task on a dataset composed of only 18% pixel-perfect and 82% bounding-box labels, and compared the results to a baseline model trained on a fully pixel-perfect dataset. For the binary segmentation tasks, our method achieves an IoU score that is ~98.33% of the baseline model's, while for the multi-class task it reaches 97.12% of the baseline (77.5 vs. 79.8 mIoU).
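
The relative figure quoted for the multi-class task can be checked directly from the two mIoU values given above. The snippet below is only a minimal sketch of that arithmetic; the helper name `relative_score` is a hypothetical name chosen for illustration and does not come from the paper.

```python
def relative_score(method_score: float, baseline_score: float) -> float:
    """Return the method's score as a percentage of the baseline's score.

    Hypothetical helper for illustration; not part of the paper's code.
    """
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return 100.0 * method_score / baseline_score


# mIoU values reported in the abstract for the multi-class task.
multi_class = relative_score(77.5, 79.8)
print(f"{multi_class:.2f}%")  # prints 97.12%
```

The per-task IoU values behind the ~98.33% figure for the binary tasks are not stated in the abstract, so they are not reproduced here.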