Recently, self-training and active learning have been proposed to alleviate this problem. Self-training can improve model accuracy by exploiting massive unlabeled data, but when the labeled training data are limited or imbalanced it generates noisy pseudo labels, and without human guidance the resulting models are suboptimal. Active learning can select more informative samples for human intervention, but it cannot improve model accuracy further because the massive unlabeled data remain unused, and when the domain gap is large the probability of querying suboptimal samples rises, increasing annotation cost. This paper proposes an iterative loop learning method combining Self-Training and Active Learning (STAL) for domain adaptive semantic segmentation. The method first uses self-training to learn from massive unlabeled data, improving model accuracy and providing a more reliable model for active-learning sample selection. It then applies the active-learning selection strategy so that human annotation corrects the errors accumulated during self-training. Iterating this loop achieves the best performance at minimal labeling cost. Extensive experiments show that our method establishes state-of-the-art performance on the GTAV to Cityscapes and SYNTHIA to Cityscapes tasks, improving over the previous best method by 4.9% mIoU and 5.2% mIoU, respectively. The code is available at https://github.com/licongguan/STAL.
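The alternation described above can be sketched as a minimal, self-contained toy loop. This is an illustrative assumption, not the paper's actual pipeline: the 1-D threshold "model", the confidence margin, the uncertainty-based query rule, and the per-round budget are all stand-ins for the segmentation networks and selection strategy used in STAL.

```python
# Toy sketch of an iterative self-training + active-learning loop.
# All design choices here (threshold model, margin-based confidence,
# uncertainty sampling) are illustrative assumptions, not the paper's method.

def train(samples):
    """Fit a 1-D threshold classifier: midpoint between the class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(thr, x):
    return 1 if x >= thr else 0

def stal_loop(labeled, unlabeled, oracle, rounds=3, budget=2, margin=1.0):
    thr = train(labeled)
    for _ in range(rounds):
        # Self-training: pseudo-label the confident unlabeled samples
        # (far from the decision threshold) and retrain on them.
        pseudo = [(x, predict(thr, x)) for x in unlabeled
                  if abs(x - thr) > margin]
        thr = train(labeled + pseudo)
        # Active learning: a human (oracle) annotates the most uncertain
        # samples, correcting self-training at minimal label cost.
        queried = sorted(unlabeled, key=lambda x: abs(x - thr))[:budget]
        labeled = labeled + [(x, oracle(x)) for x in queried]
        unlabeled = [x for x in unlabeled if x not in queried]
        thr = train(labeled)
    return thr

if __name__ == "__main__":
    labeled = [(-3.0, 0), (4.0, 1)]          # tiny initial labeled set
    unlabeled = [-2.5, -1.0, 0.5, 1.2, 3.0, -0.2]
    oracle = lambda x: 1 if x >= 0 else 0    # ground-truth annotator
    thr = stal_loop(labeled, unlabeled, oracle)
    print(predict(thr, 3.0), predict(thr, -3.0))
```

The loop mirrors the abstract's two phases: each round first exploits unlabeled data via pseudo labels, then spends a small annotation budget on the samples the current model is least sure about.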