Self-training via pseudo labeling is a conventional, simple, and popular pipeline for leveraging unlabeled data. In this work, we first construct a strong self-training baseline (namely ST) for semi-supervised semantic segmentation by injecting strong data augmentations (SDA) into unlabeled images, which alleviates overfitting to noisy pseudo labels and decouples the similar predictions of the teacher and student. With this simple mechanism, our ST outperforms all existing methods without any bells and whistles, e.g., iterative re-training. Inspired by these impressive results, we thoroughly investigate SDA and provide some empirical analysis. Nevertheless, incorrect pseudo labels are still prone to accumulate and degrade performance. To this end, we further propose an advanced self-training framework (namely ST++) that performs selective re-training by prioritizing reliable unlabeled images based on holistic prediction-level stability. Concretely, several model checkpoints are saved during the first-stage supervised training, and the discrepancy among their predictions on an unlabeled image serves as a measure of its reliability. Our image-level selection preserves holistic contextual information for learning, and we demonstrate that it is more suitable for segmentation than common pixel-wise selection. As a result, ST++ further boosts the performance of our ST. Code is available at https://github.com/LiheYoung/ST-PlusPlus.
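The selective re-training step can be sketched as follows: score each unlabeled image by how much the pseudo labels from successive checkpoints agree, then re-train first on the most stable images. This is a minimal illustrative sketch, not the released implementation; in particular, plain per-pixel agreement is used here as a simplified stand-in for the mask-level stability measure, and all function names are hypothetical.

```python
import numpy as np

def pixel_agreement(p1, p2):
    """Fraction of pixels where two predicted label maps agree."""
    return float((p1 == p2).mean())

def image_stability(checkpoint_preds):
    """Holistic stability score for one unlabeled image.

    `checkpoint_preds` is a list of label maps predicted by model
    checkpoints saved during first-stage supervised training.
    We compare each earlier checkpoint's prediction against the
    final checkpoint's prediction; higher agreement suggests the
    pseudo label is more reliable.
    """
    final = checkpoint_preds[-1]
    return float(np.mean(
        [pixel_agreement(p, final) for p in checkpoint_preds[:-1]]
    ))

def select_reliable(preds_per_image, ratio=0.5):
    """Rank unlabeled images by stability and keep the top `ratio`.

    Returns the (sorted) indices of the selected reliable images,
    which would be pseudo-labeled and used for re-training first.
    """
    scores = [image_stability(preds) for preds in preds_per_image]
    order = np.argsort(scores)[::-1]          # most stable first
    k = max(1, int(len(order) * ratio))
    return sorted(order[:k].tolist())
```

For example, an image whose pseudo masks are identical across all checkpoints scores 1.0, while one whose earlier masks disagree with the final mask scores lower and is deferred to a later re-training round.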