Recently, self-training and active learning have been proposed to alleviate this problem. Self-training can improve model accuracy by exploiting massive unlabeled data, but with limited or imbalanced training data it generates noisy pseudo labels, and without human guidance the resulting models are suboptimal. Active learning can select the most informative samples for annotation, but model accuracy cannot be improved further because the massive unlabeled data remain unused; moreover, when the domain gap is large, the probability of querying suboptimal samples rises, increasing annotation cost. This paper proposes an iterative loop learning method combining Self-Training and Active Learning (STAL) for domain adaptive semantic segmentation. The method first applies self-training to learn from massive unlabeled data, improving model accuracy and providing a more reliable selection model for active learning. It then uses the sample selection strategy of active learning to introduce human annotations that correct errors accumulated during self-training. Iterating this loop achieves the best performance at minimal labeling cost. Extensive experiments show that our method establishes state-of-the-art performance on the GTAV to Cityscapes and SYNTHIA to Cityscapes tasks, improving over the previous best method by 4.9% mIoU and 5.2% mIoU, respectively. Code will be available.
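The iterative loop described above can be illustrated with a minimal sketch. This is not the paper's implementation: the toy 1-D "task", the threshold model, the confidence measure, and all names (`train`, `confidence`, `BUDGET_PER_ROUND`) are hypothetical stand-ins chosen only to show the alternation between a self-training step (pseudo-labeling confident unlabeled samples) and an active-learning step (querying an oracle on the least confident ones).

```python
import random

random.seed(0)

# Toy 1-D setup (hypothetical): the true label is 1 if x > 0.5; the
# "oracle" plays the role of the human annotator in active learning.
oracle = lambda x: int(x > 0.5)
unlabeled = [random.random() for _ in range(200)]
labeled = [(x, oracle(x)) for x in [0.1, 0.9]]  # tiny initial labeled set

def train(data):
    """Fit a 1-D threshold 'model': midpoint between the class means."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def confidence(t, x):
    """Distance from the decision threshold as a confidence proxy."""
    return abs(x - t)

BUDGET_PER_ROUND = 5  # annotation budget per loop iteration (assumed)
for _ in range(3):
    t = train(labeled)
    # Self-training step: pseudo-label only the confident unlabeled samples.
    pseudo = [(x, int(x > t)) for x in unlabeled if confidence(t, x) > 0.3]
    t = train(labeled + pseudo)
    # Active-learning step: query the oracle on the least confident samples,
    # correcting the model where self-training is most likely to be wrong.
    unlabeled.sort(key=lambda x: confidence(t, x))
    queried, unlabeled = unlabeled[:BUDGET_PER_ROUND], unlabeled[BUDGET_PER_ROUND:]
    labeled += [(x, oracle(x)) for x in queried]

final_t = train(labeled)
acc = sum(int(x > final_t) == oracle(x) for x in unlabeled) / len(unlabeled)
```

In this sketch only 17 samples ever receive oracle labels, while the remaining unlabeled pool is exploited through pseudo labels; the active-learning queries concentrate the labeling budget near the decision boundary, where pseudo labels are least trustworthy.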