Standard unsupervised domain adaptation methods adapt models from a source to a target domain using labeled source data and unlabeled target data jointly. In model adaptation, on the other hand, access to the labeled source data is prohibited, i.e., only the source-trained model and unlabeled target data are available. We investigate normal-to-adverse condition model adaptation for semantic segmentation, whereby image-level correspondences are available in the target domain. The target set consists of unlabeled pairs of adverse- and normal-condition street images taken at GPS-matched locations. Our method -- CMA -- leverages such image pairs to learn condition-invariant features via contrastive learning. In particular, CMA encourages features in the embedding space to be grouped according to their condition-invariant semantic content and not according to the condition under which respective inputs are captured. To obtain accurate cross-domain semantic correspondences, we warp the normal image to the viewpoint of the adverse image and leverage warp-confidence scores to create robust, aggregated features. With this approach, we achieve state-of-the-art semantic segmentation performance for model adaptation on several normal-to-adverse adaptation benchmarks, such as ACDC and Dark Zurich. We also evaluate CMA on a newly procured adverse-condition generalization benchmark and report favorable results compared to standard unsupervised domain adaptation methods, despite the comparative handicap of CMA due to source data inaccessibility. Code is available at https://github.com/brdav/cma.
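The two mechanisms named in the abstract, confidence-weighted feature aggregation and a contrastive objective over cross-condition pairs, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function names `aggregate_patches` and `info_nce`, the non-overlapping patch grid, the use of in-batch negatives, and the temperature value are all assumptions made for illustration.

```python
import numpy as np


def aggregate_patches(feats, conf, patch):
    """Confidence-weighted average pooling (hypothetical helper).

    feats: (C, H, W) feature map; conf: (H, W) warp-confidence in [0, 1].
    Returns one embedding per non-overlapping patch x patch cell, shape
    (N, C), plus the total confidence mass of each cell, shape (N,).
    """
    c, h, w = feats.shape
    gh, gw = h // patch, w // patch
    f = feats[:, :gh * patch, :gw * patch].reshape(c, gh, patch, gw, patch)
    m = conf[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch)
    weighted = (f * m[None]).sum(axis=(2, 4))      # (C, gh, gw)
    mass = m.sum(axis=(1, 3))                      # (gh, gw)
    emb = weighted / np.maximum(mass, 1e-8)        # avoid division by zero
    return emb.reshape(c, -1).T, mass.reshape(-1)


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to unit Euclidean length."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def info_nce(anchors, positives, valid, temperature=0.1):
    """InfoNCE loss with in-batch negatives (an assumption of this sketch).

    Row i of `anchors` (adverse-condition patch) should match row i of
    `positives` (warped normal-condition patch); every other row acts as
    a negative. `valid` masks out anchors whose warp is unreliable.
    """
    if not np.any(valid):
        return 0.0
    a, p = l2_normalize(anchors), l2_normalize(positives)
    logits = a @ p.T / temperature                          # (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_anchor = -np.diag(log_prob)
    return float(per_anchor[valid].mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    normal_warped = rng.normal(size=(16, 8, 8))   # stand-in for warped normal-image features
    adverse = normal_warped + 0.05 * rng.normal(size=(16, 8, 8))
    conf = rng.uniform(size=(8, 8))               # stand-in for warp-confidence scores

    pos, mass = aggregate_patches(normal_warped, conf, patch=2)
    anc, _ = aggregate_patches(adverse, conf, patch=2)
    print("contrastive loss:", info_nce(anc, pos, valid=mass > 0.5))
```

Weighting each pixel by its warp confidence before pooling means that poorly aligned pixels contribute little to a patch embedding, and patches whose total confidence is low can be dropped from the loss entirely through the `valid` mask.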