Unsupervised domain adaptation (UDA) aims to adapt a model trained on the source domain (e.g., synthetic data) to the target domain (e.g., real-world data) without requiring further annotations on the target domain. This work focuses on UDA for semantic segmentation, as real-world pixel-wise annotations are particularly expensive to acquire. Since UDA methods for semantic segmentation are usually GPU-memory intensive, most previous methods operate only on downscaled images. We question this design, as low-resolution predictions often fail to preserve fine details. The alternative of training with random crops of high-resolution images alleviates this problem but falls short in capturing long-range, domain-robust context information. Therefore, we propose HRDA, a multi-resolution training approach for UDA, which combines, via a learned scale attention, the strengths of small high-resolution crops to preserve fine segmentation details and large low-resolution crops to capture long-range context dependencies, while maintaining a manageable GPU memory footprint. HRDA enables adapting small objects and preserving fine segmentation details. It significantly improves the state-of-the-art performance by 5.5 mIoU for GTA-to-Cityscapes and 4.9 mIoU for Synthia-to-Cityscapes, resulting in unprecedented 73.8 and 65.8 mIoU, respectively. The implementation is available at https://github.com/lhoyer/HRDA.
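To illustrate the multi-resolution idea, the following is a minimal sketch of fusing a large low-resolution context crop with a small high-resolution detail crop through a learned scale attention. The module names, the 1x1 attention head on the logits, the sigmoid-weighted fusion, and the assumption that the segmentor outputs logits at input resolution are simplifications for illustration, not the authors' exact design; the actual method is in the linked repository.

```python
# Minimal sketch of multi-resolution fusion with a learned scale attention.
# Hypothetical module and parameter names; see https://github.com/lhoyer/HRDA
# for the official implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiResFusion(nn.Module):
    def __init__(self, segmentor: nn.Module, num_classes: int):
        super().__init__()
        # Shared encoder-decoder producing class logits at input resolution (assumed).
        self.segmentor = segmentor
        # 1x1 conv predicting a per-pixel weight for the high-resolution branch (assumed design).
        self.scale_attn = nn.Conv2d(num_classes, 1, kernel_size=1)

    def forward(self, lr_crop, hr_crop, hr_box):
        """lr_crop: large context crop, downscaled by factor 2;
        hr_crop: small full-resolution detail crop;
        hr_box: (top, left, h, w) location of the HR crop in full-resolution coordinates."""
        lr_logits = self.segmentor(lr_crop)   # long-range context predictions
        hr_logits = self.segmentor(hr_crop)   # fine-detail predictions
        # Upsample the context predictions back to the full-resolution frame.
        lr_up = F.interpolate(lr_logits, scale_factor=2,
                              mode='bilinear', align_corners=False)
        t, l, h, w = hr_box
        lr_region = lr_up[:, :, t:t + h, l:l + w]  # region overlapping the HR crop
        # Learned scale attention decides, per pixel, how much to trust the HR branch.
        attn = torch.sigmoid(self.scale_attn(lr_region))
        fused = (1 - attn) * lr_region + attn * hr_logits
        return fused, lr_up
```

The fused logits cover the detail crop with both fine structures and surrounding context, while the memory cost stays bounded because neither branch ever processes the full high-resolution image at once.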