Semantic segmentation is an important task that lets intelligent vehicles understand their environment. Current deep learning methods require large amounts of labeled data for training. Manual annotation is expensive, whereas simulators can provide accurate annotations at low cost. However, a semantic segmentation model trained on simulator data degrades significantly when applied to real-world scenes. Unsupervised domain adaptation (UDA) for semantic segmentation, which aims to reduce the domain gap and improve performance on the target domain, has recently attracted increasing research attention. In this paper, we propose a novel two-stage entropy-based UDA method for semantic segmentation. In stage one, we design a threshold-adaptive unsupervised focal loss to regularize the predictions in the target domain; it has a mild gradient neutralization mechanism and mitigates the problem that hard samples are barely optimized in entropy-based methods. In stage two, we introduce a data augmentation method named cross-domain image mixing (CIM) to bridge the semantic knowledge of the two domains. Our method achieves state-of-the-art mIoUs of 58.4% and 59.6% on SYNTHIA-to-Cityscapes and GTA5-to-Cityscapes, respectively, using DeepLabV2, and competitive performance using the lightweight BiSeNet.
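To make the two ingredients of the abstract more concrete, the NumPy sketch below shows (a) the per-pixel prediction entropy that entropy-based UDA methods typically regularize on target images, and (b) one plausible form of cross-domain mixing. Everything here is an assumption for illustration: the abstract does not give the formula of the threshold-adaptive unsupervised focal loss (it is not reproduced here), nor does it specify how CIM selects or combines pixels. The class-wise pasting of source pixels onto a target image, and the function names `softmax`, `pixel_entropy`, and `mix_cross_domain`, are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def pixel_entropy(probs, eps=1e-12):
    """Normalized Shannon entropy per pixel.

    probs: array of shape (C, H, W) with class probabilities.
    Returns an (H, W) map in roughly [0, 1]; 1 means a uniform prediction.
    """
    num_classes = probs.shape[0]
    ent = -(probs * np.log(probs + eps)).sum(axis=0)
    return ent / np.log(num_classes)

def mix_cross_domain(src_img, src_lbl, tgt_img, tgt_pseudo_lbl, classes):
    """Assumed mixing rule: paste source pixels of the chosen classes
    onto the target image, and merge labels the same way.

    src_img, tgt_img: (3, H, W); src_lbl, tgt_pseudo_lbl: (H, W).
    """
    mask = np.isin(src_lbl, classes)                     # (H, W) boolean
    mixed_img = np.where(mask[None], src_img, tgt_img)   # broadcast over channels
    mixed_lbl = np.where(mask, src_lbl, tgt_pseudo_lbl)
    return mixed_img, mixed_lbl, mask
```

In a training loop, the mean of `pixel_entropy` over target pixels would serve as a plain entropy-minimization term, and the mixed image and label pair would be fed to the segmentation network as an additional training sample; both uses are sketches under the assumptions stated above.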