For object detectors to be deployed robustly across a wide range of scenarios, they should adapt to shifts in the input distribution without the need to constantly annotate new data. This has motivated research in Unsupervised Domain Adaptation (UDA) algorithms for detection. UDA methods learn to adapt from labeled source domains to unlabeled target domains by inducing alignment between detector features from the source and target domains. Yet there is no consensus on which features to align or how to align them. In our work, we propose a framework that generalizes the different components commonly used by UDA methods, laying the ground for an in-depth analysis of the UDA design space. Specifically, we propose a novel UDA algorithm, ViSGA, a direct implementation of our framework, that leverages the best design choices and introduces a simple but effective method to aggregate features at the instance level based on visual similarity before inducing group alignment via adversarial training. We show that similarity-based grouping and adversarial training together allow our model to focus on coarsely aligning feature groups, without being forced to match all instances across loosely aligned domains. Finally, we examine the applicability of ViSGA to the setting where labeled data are gathered from different sources. Experiments show that our method not only outperforms previous single-source approaches on Sim2Real and Adverse Weather, but also generalizes well to the multi-source setting.
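To make the idea of similarity-based aggregation concrete, the following is a minimal sketch and not the ViSGA implementation. It assumes instance-level features are available as rows of a NumPy array and groups them greedily by cosine similarity. The function name `group_by_similarity`, the greedy assignment rule, and the `threshold` value are all hypothetical choices made for illustration. In a full pipeline, the resulting group-level features (rather than every individual instance) would be passed to a domain discriminator trained adversarially, which is not shown here.

```python
import numpy as np


def group_by_similarity(features, threshold=0.8):
    """Greedily group instance features by cosine similarity.

    Hypothetical helper for illustration only. Each instance joins the
    most similar existing group if the cosine similarity to that group's
    direction is at least `threshold`; otherwise it starts a new group.
    Returns (labels, group_means).
    """
    features = np.asarray(features, dtype=np.float64)
    if features.shape[0] == 0:
        return np.empty(0, dtype=int), np.empty((0,) + features.shape[1:])

    # Normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    labels = np.full(len(unit), -1, dtype=int)
    sums = []  # running sum of unit vectors for each group

    for i, v in enumerate(unit):
        best, best_sim = -1, threshold
        for g, s in enumerate(sums):
            sim = float(v @ s) / max(float(np.linalg.norm(s)), 1e-12)
            if sim >= best_sim:
                best, best_sim = g, sim
        if best == -1:
            sums.append(v.copy())
            labels[i] = len(sums) - 1
        else:
            sums[best] += v
            labels[i] = best

    # One aggregated feature per group: the mean of its member instances.
    group_means = np.stack(
        [features[labels == g].mean(axis=0) for g in range(len(sums))]
    )
    return labels, group_means


# Toy usage: four 2-D "instance features" forming two visually similar pairs.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels, group_means = group_by_similarity(feats, threshold=0.8)
```

Aligning the rows of `group_means` instead of each row of `feats` is what allows a coarse, group-level match between domains without forcing every instance to find a counterpart.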