The Jaccard index, also known as the intersection-over-union (IoU) score, is commonly used to evaluate image segmentation results. Compared with per-pixel losses, it has favorable perceptual qualities, is scale invariant (which gives appropriate weight to small objects), and counts false negatives appropriately. We present a method for directly optimizing the mean intersection-over-union loss in neural networks, in the context of semantic image segmentation, based on the convex Lovász extension of submodular losses. The loss is shown to perform better with respect to the Jaccard index than the traditionally used cross-entropy loss. We show quantitative and qualitative differences between optimizing the Jaccard index per image and optimizing it over an entire dataset. We evaluate the impact of our method in a semantic segmentation pipeline and show substantially improved intersection-over-union segmentation scores on the Pascal VOC and Cityscapes datasets using state-of-the-art deep learning segmentation architectures.
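As an illustration of the quantities named in the abstract, the following is a minimal NumPy sketch, not the authors' implementation, of the Jaccard index for a single foreground class and of a Lovász-extension surrogate of the corresponding Jaccard loss. The function names (`jaccard_index`, `lovasz_grad`, `lovasz_jaccard_loss`) and the choice of absolute probability error as the per-pixel error are assumptions made for this example; it handles one class on a flattened image and omits averaging over classes or images.

```python
import numpy as np


def jaccard_index(gt, pred):
    """Intersection over union of two binary masks (flattened arrays)."""
    gt = np.asarray(gt, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    union = np.logical_or(gt, pred).sum()
    if union == 0:
        # Both masks empty: treat as a perfect match (a convention, assumed here).
        return 1.0
    return np.logical_and(gt, pred).sum() / union


def lovasz_grad(gt_sorted):
    """Increments of the Jaccard loss as pixels are added to the error set
    in the given order (gt_sorted holds the ground truth in that order)."""
    gt_sorted = np.asarray(gt_sorted, dtype=float)
    total_fg = gt_sorted.sum()
    # If the first k pixels are mispredicted, these are the resulting
    # intersection and union sizes, for k = 1..p.
    intersection = total_fg - np.cumsum(gt_sorted)
    union = total_fg + np.cumsum(1.0 - gt_sorted)
    jaccard_loss = 1.0 - intersection / union
    # Discrete derivative: difference between consecutive set-function values.
    grad = jaccard_loss.copy()
    grad[1:] = jaccard_loss[1:] - jaccard_loss[:-1]
    return grad


def lovasz_jaccard_loss(probs, gt):
    """Lovász extension of the Jaccard loss for one foreground class.

    probs: predicted foreground probabilities in [0, 1], flattened.
    gt:    binary ground-truth mask, flattened.
    """
    probs = np.asarray(probs, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if probs.size == 0:
        return 0.0
    errors = np.abs(gt - probs)               # per-pixel error (assumed form)
    order = np.argsort(-errors, kind="stable")  # largest errors first
    return float(np.dot(errors[order], lovasz_grad(gt[order])))


if __name__ == "__main__":
    gt = np.array([1, 1, 0, 0, 1])
    hard_pred = np.array([1, 0, 1, 0, 1])
    soft_pred = np.array([0.9, 0.4, 0.6, 0.1, 0.8])

    print("Jaccard loss (hard):  ", 1.0 - jaccard_index(gt, hard_pred))
    print("Lovász surrogate (hard):", lovasz_jaccard_loss(hard_pred, gt))
    print("Lovász surrogate (soft):", lovasz_jaccard_loss(soft_pred, gt))
```

For hard (0/1) predictions the surrogate should agree with the discrete Jaccard loss up to floating-point rounding, which is what makes it an extension of that loss; for soft predictions it is a piecewise-linear function of the per-pixel errors and can therefore be optimized with gradient-based methods.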