In recent years, several efforts have aimed to improve the robustness of vision models to domains and environments unseen during training. An important practical problem arises when models are deployed in a new geography that is under-represented in the training dataset, which poses a direct challenge to fair and inclusive computer vision. In this paper, we study the problem of geographic robustness and make three main contributions. First, we introduce GeoNet, a large-scale dataset for geographic adaptation containing benchmarks across diverse tasks such as scene recognition (GeoPlaces), image classification (GeoImNet) and universal adaptation (GeoUniDA). Second, we investigate the nature of the distribution shifts typical of geographic adaptation and hypothesize that the major sources of domain shift are significant variations in scene context (context shift), object design (design shift) and label distribution (prior shift) across geographies. Third, we conduct an extensive evaluation of several state-of-the-art unsupervised domain adaptation algorithms and architectures on GeoNet, showing that they do not suffice for geographic adaptation, and that large-scale pre-training using large vision models also does not lead to geographic robustness. Our dataset is publicly available at https://tarun005.github.io/GeoNet.