Alongside recent advances in semantic segmentation, many domain adaptation methods have been proposed to overcome the domain gap between training and deployment environments. However, most previous studies use limited combinations of source and target datasets, and domain adaptation techniques have never been thoroughly evaluated on a more challenging and diverse set of target domains. This work presents DRIV100, a new multi-domain dataset for benchmarking domain adaptation techniques on in-the-wild road-scene videos collected from the Internet. The dataset consists of pixel-level annotations for 100 videos selected to cover diverse scenes and domains based on two criteria: human subjective judgment and an anomaly score computed using an existing road-scene dataset. We provide multiple manually labeled ground-truth frames for each video, enabling a thorough evaluation of video-level domain adaptation in which each video independently serves as the target domain. Using the dataset, we quantify the domain adaptation performance of state-of-the-art methods and clarify both the potential of domain adaptation techniques and the novel challenges they face. The dataset is available at https://doi.org/10.5281/zenodo.4389243.