Deep learning in the presence of noisy annotations has been studied extensively in classification, but far less in segmentation. In this work, we study the learning dynamics of deep segmentation networks trained on inaccurately annotated data. We observe a phenomenon previously reported in the context of classification: the networks tend to first fit the clean pixel-level labels during an "early-learning" phase, before eventually memorizing the false annotations. In contrast to classification, however, memorization in segmentation does not arise simultaneously for all semantic categories. Inspired by these findings, we propose a new method for segmentation from noisy annotations with two key elements. First, we detect the beginning of the memorization phase separately for each category during training, which allows us to adaptively correct the noisy annotations and thereby exploit early learning. Second, we incorporate a regularization term that enforces consistency across scales to boost robustness against annotation noise. Our method outperforms standard approaches on a medical-imaging segmentation task where noise is synthesized to mimic human annotation errors. It is also robust to the realistic noisy annotations that arise in weakly-supervised semantic segmentation, achieving state-of-the-art results on PASCAL VOC 2012.
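To make the first element concrete, below is a minimal sketch of how per-category memorization onset might be detected. The class name `PerClassMemorizationDetector`, the per-class agreement statistic, and the plateau heuristic are illustrative assumptions, not the paper's exact procedure; the abstract states only that the onset is detected separately for each category during training.

```python
class PerClassMemorizationDetector:
    """Hypothetical sketch: track, for each class, how well predictions
    agree with the given (possibly noisy) annotations, and flag the end
    of early learning when that agreement curve largely plateaus."""

    def __init__(self, num_classes, window=3, rel_threshold=0.05):
        self.window = window                 # epochs to look back
        self.rel_threshold = rel_threshold   # relative-gain cutoff (assumed)
        self.history = [[] for _ in range(num_classes)]
        self.onset_epoch = [None] * num_classes

    def update(self, per_class_agreement):
        """per_class_agreement[c]: fraction of pixels annotated as class c
        that the model currently predicts as class c (one value per epoch)."""
        for c, acc in enumerate(per_class_agreement):
            h = self.history[c]
            h.append(acc)
            if self.onset_epoch[c] is None and len(h) > self.window:
                gain = h[-1] - h[-1 - self.window]
                rel_gain = gain / max(h[-1 - self.window], 1e-8)
                # Agreement has essentially stopped improving: assume early
                # learning has ended and memorization begins for class c.
                if rel_gain < self.rel_threshold:
                    self.onset_epoch[c] = len(h)
        return self.onset_epoch
```

Once an onset epoch is recorded for a class, the noisy annotations for that class could from then on be replaced by the model's own confident predictions, exploiting what was captured during early learning.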
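The cross-scale consistency term could be sketched as follows in PyTorch. The function name `multiscale_consistency_loss`, the 0.5 scale factor, and the use of a KL divergence are assumptions made for illustration; the abstract says only that predictions are encouraged to be consistent across scales.

```python
import torch
import torch.nn.functional as F

def multiscale_consistency_loss(model, images, scale=0.5):
    """Hypothetical sketch: penalize disagreement between the model's
    predictions on the full-resolution input and on a rescaled copy."""
    # Logits at the original resolution: N x C x H x W.
    logits_full = model(images)

    # Logits on a downscaled copy of the same batch.
    small = F.interpolate(images, scale_factor=scale,
                          mode="bilinear", align_corners=False)
    logits_small = model(small)

    # Upsample the low-resolution logits back to the original size.
    logits_up = F.interpolate(logits_small, size=logits_full.shape[-2:],
                              mode="bilinear", align_corners=False)

    # KL divergence between the two softmax predictions.
    log_p_full = F.log_softmax(logits_full, dim=1)
    p_up = F.softmax(logits_up, dim=1)
    return F.kl_div(log_p_full, p_up, reduction="batchmean")
```

Because a penalty of this kind compares the model with itself rather than with the annotations, it remains informative even where the labels are wrong, which is plausibly what makes cross-scale consistency useful against annotation noise.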