Unsupervised domain adaptation (UDA) for semantic segmentation has been well studied in recent years. However, most existing works largely neglect local regional consistency across domains and are less robust to changes in outdoor environments. In this paper, we propose a novel, fully end-to-end trainable approach, called regional contrastive consistency regularization (RCCR), for domain-adaptive semantic segmentation. Our core idea is to pull together similar regional features extracted from the same location of two views of an image, i.e., the original image and its augmented counterpart, while pushing apart features from different locations of the two views. We propose a region-wise contrastive loss with two sampling strategies to realize effective regional consistency. In addition, we present momentum projection heads, where the teacher projection head is the exponential moving average of the student. Finally, a memory bank mechanism is designed to learn more robust and stable region-wise features under varying environments. Extensive experiments on two common UDA benchmarks, i.e., GTAV to Cityscapes and SYNTHIA to Cityscapes, demonstrate that our approach outperforms state-of-the-art methods.
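The pull/push idea above can be sketched as a region-wise InfoNCE-style contrastive loss, with corresponding regions of the two views as positive pairs and all other region pairs as negatives, plus the EMA update for the teacher projection head. This is a minimal NumPy illustration; the function names, shapes, and the single-matrix stand-in for the projection head are assumptions for exposition, not the paper's implementation.

```python
import numpy as np

def region_contrastive_loss(student_regions, teacher_regions, tau=0.1):
    """Region-wise contrastive loss (InfoNCE-style sketch).

    student_regions, teacher_regions: (R, D) arrays of region embeddings
    from the augmented and original views; row r of each array is assumed
    to come from the same spatial location, so the positives lie on the
    diagonal of the similarity matrix and all off-diagonal pairs act as
    negatives (different locations are pushed apart).
    """
    s = student_regions / np.linalg.norm(student_regions, axis=1, keepdims=True)
    t = teacher_regions / np.linalg.norm(teacher_regions, axis=1, keepdims=True)
    logits = s @ t.T / tau                                # (R, R) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                    # cross-entropy on diagonal

def ema_update(teacher_params, student_params, momentum=0.99):
    """Momentum (EMA) update for the teacher projection head parameters."""
    return momentum * teacher_params + (1 - momentum) * student_params
```

In practice the region embeddings would come from pooling feature-map patches through the student and teacher projection heads, and the memory bank would supply additional negatives beyond the current batch.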