In this work we challenge the common approach of using a one-to-one mapping ('translation') between the source and target domains in unsupervised domain adaptation (UDA). Instead, we rely on stochastic translation to capture inherent translation ambiguities. This allows us to (i) train more accurate target networks by generating multiple outputs conditioned on the same source image, leveraging both accurate translation and data augmentation for appearance variability, (ii) impute robust pseudo-labels for the target data by averaging the predictions of a source network on multiple translated versions of a single target image and (iii) train and ensemble diverse networks in the target domain by modulating the degree of stochasticity in the translations. We report improvements over strong recent baselines, leading to state-of-the-art UDA results on two challenging semantic segmentation benchmarks.
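Mechanism (ii), imputing pseudo-labels by averaging a source network's predictions over several stochastic translations of one target image, can be sketched as follows. This is a minimal illustration only: `stochastic_translate` and `source_net_predict` are hypothetical stand-ins (the paper's actual translator and segmentation network are not specified here), with noise injection playing the role of translation stochasticity.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_translate(image, rng):
    # Hypothetical stand-in for a stochastic target-to-source translator:
    # each call yields a different plausible translation of the same input.
    # Random appearance noise serves as a toy model of that variability.
    return image + rng.normal(scale=0.1, size=image.shape)

def source_net_predict(image):
    # Hypothetical stand-in for a source-domain segmentation network:
    # returns per-pixel class probabilities (two classes for illustration).
    p1 = 1.0 / (1.0 + np.exp(-image))  # treat intensity as a class-1 logit
    return np.stack([1.0 - p1, p1], axis=-1)

def impute_pseudo_labels(target_image, n_samples=8, rng=rng):
    # Average the source network's predictions over multiple stochastic
    # translations of the same target image, then take the per-pixel argmax
    # as the pseudo-label.
    probs = np.mean(
        [source_net_predict(stochastic_translate(target_image, rng))
         for _ in range(n_samples)],
        axis=0,
    )
    return probs.argmax(axis=-1), probs

target_image = rng.normal(size=(4, 4))
pseudo_labels, avg_probs = impute_pseudo_labels(target_image)
```

Averaging over translations marginalizes out the translator's stochasticity, so pixels whose predicted class flips across translations end up with low-confidence averaged probabilities, which makes the resulting pseudo-labels more robust than a single-translation prediction.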