This paper studies semi-supervised learning of semantic segmentation, in which only a small portion of the training images are labeled and the rest remain unlabeled. The unlabeled images are usually assigned pseudo labels for use in training, which often risks degrading performance because of confirmation bias toward errors in the pseudo labels. We present a novel method that resolves this chronic issue of pseudo labeling. At the heart of our method lies the error localization network (ELN), an auxiliary module that takes an image and its segmentation prediction as input and identifies pixels whose pseudo labels are likely to be wrong. ELN makes semi-supervised learning robust against inaccurate pseudo labels by disregarding label noise during training, and it can be naturally integrated with self-training and contrastive learning. Moreover, we introduce a new learning strategy for ELN that simulates plausible and diverse segmentation errors during its training to enhance its generalization. Our method is evaluated on PASCAL VOC 2012 and Cityscapes, where it outperforms all existing methods in every evaluation setting.
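To make the idea of disregarding label noise concrete, the following is a minimal sketch of a pseudo-label loss that skips pixels flagged as likely errors. It is not the authors' implementation: the abstract does not say how the ELN output enters the loss, so the per-pixel error probability map `error_prob`, the cut-off `threshold`, and the function name `masked_pseudo_label_loss` are all assumptions made for illustration.

```python
import numpy as np


def masked_pseudo_label_loss(logits, pseudo_labels, error_prob, threshold=0.5):
    """Cross-entropy averaged over pixels whose pseudo labels are not flagged.

    logits:        (C, H, W) class scores from a segmentation network
    pseudo_labels: (H, W) integer pseudo labels
    error_prob:    (H, W) hypothetical ELN output, read here as the
                   probability that the pseudo label at a pixel is wrong
    threshold:     assumed cut-off; pixels at or above it are ignored
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    # Per-pixel negative log-likelihood of the pseudo label.
    nll = -np.take_along_axis(log_probs, pseudo_labels[None, ...], axis=0)[0]

    # Keep only pixels that are not flagged as likely errors.
    keep = error_prob < threshold
    if not keep.any():
        return 0.0  # every pixel was flagged, so nothing contributes
    return float(nll[keep].mean())


# Toy example: 2 classes, a 1x2 image. The second pixel has a confident
# prediction that disagrees with its pseudo label and is flagged as an error.
logits = np.array([[[0.0, 10.0]], [[0.0, 0.0]]])
pseudo_labels = np.array([[0, 1]])
error_prob = np.array([[0.1, 0.9]])
loss = masked_pseudo_label_loss(logits, pseudo_labels, error_prob)
```

In this toy case only the first pixel contributes to the loss, so the large penalty from the flagged second pixel never reaches the gradient. A hard threshold is only one possible choice; a soft weighting by `1 - error_prob` would be an equally plausible reading of the abstract.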