Deep convolutional neural networks have considerably improved state-of-the-art results for semantic segmentation. Nevertheless, even modern architectures lack the ability to generalize well to a test dataset that originates from a different domain. To avoid the costly annotation of training data for unseen domains, unsupervised domain adaptation (UDA) attempts to provide efficient knowledge transfer from a labeled source domain to an unlabeled target domain. Previous work has mainly focused on minimizing the discrepancy between the two domains by using adversarial training or self-training. While adversarial training may fail to align the correct semantic categories, as it minimizes the discrepancy between the global distributions, self-training raises the question of how to provide reliable pseudo-labels. To align the correct semantic categories across domains, we propose a contrastive learning approach that adapts category-wise centroids across domains. Furthermore, we extend our method with self-training, where we use a memory-efficient temporal ensemble to generate consistent and reliable pseudo-labels. Although both contrastive learning and self-training (CLST) through temporal ensembling enable knowledge transfer between two domains, it is their combination that leads to a symbiotic structure. We validate our approach on two domain adaptation benchmarks: GTA5 $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Cityscapes. Our method achieves results better than or comparable to the state-of-the-art. We will make the code publicly available.
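The two ingredients named above can be illustrated with a minimal sketch. The code below is a hypothetical simplification, not the paper's implementation: `category_centroids` averages feature embeddings per semantic class (the quantities aligned across domains by the contrastive loss), and `update_temporal_ensemble` maintains an exponential moving average of class probabilities, whose argmax yields the temporally smoothed pseudo-labels. All function names and the EMA momentum `alpha` are assumptions for illustration.

```python
import numpy as np

def category_centroids(features, labels, num_classes):
    """Per-class feature centroids (illustrative sketch).

    features: (N, D) array of feature embeddings
    labels:   (N,) integer class labels in [0, num_classes)
    Returns a (num_classes, D) array; classes absent from
    `labels` keep a zero centroid.
    """
    centroids = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            centroids[c] = features[mask].mean(axis=0)
    return centroids

def update_temporal_ensemble(ensemble, probs, alpha=0.9):
    """One EMA step over predicted class probabilities.

    ensemble: running (N, C) probability estimate
    probs:    current (N, C) model output
    The pseudo-label for each sample is the argmax of the
    returned ensemble, which changes less abruptly than the
    per-step prediction.
    """
    return alpha * ensemble + (1.0 - alpha) * probs
```

Compared with keeping every past prediction, storing only the running average makes the ensemble's memory cost constant in the number of training steps, which is the memory-efficiency property the abstract refers to.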