Domain adaptive semantic segmentation methods commonly use stage-wise training, consisting of a warm-up and a self-training stage. However, this popular approach still faces several challenges in each stage: for warm-up, the widely adopted adversarial training often yields limited performance gains due to blind feature alignment; for self-training, finding proper categorical thresholds is very tricky. To alleviate these issues, we first propose to replace the adversarial training in the warm-up stage with a novel symmetric knowledge distillation module that accesses only the source domain data and makes the model domain generalizable. Surprisingly, this domain-generalizable warm-up model brings a substantial performance improvement, which can be further amplified by our proposed cross-domain mixture data augmentation technique. Then, for the self-training stage, we propose a threshold-free dynamic pseudo-label selection mechanism to ease the aforementioned threshold problem and better adapt the model to the target domain. Extensive experiments demonstrate that our framework achieves remarkable and consistent improvements over prior art on popular benchmarks. Code and models are available at https://github.com/fy-vision/DiGA