Invariant approaches have been remarkably successful in tackling the problem of domain generalization, where the objective is to perform inference on data distributions different from those used in training. In our work, we investigate whether it is possible to leverage domain information from the unseen test samples themselves. We propose a domain-adaptive approach consisting of two steps: (a) we first learn a discriminative domain embedding from unsupervised training examples, and (b) we use this domain embedding as supplementary information to build a domain-adaptive model that takes both the input and its domain into account when making predictions. For unseen domains, our method simply uses a few unlabelled test examples to construct the domain embedding, which enables adaptive classification on any unseen domain. Our approach achieves state-of-the-art performance on various domain generalization benchmarks. In addition, we introduce the first real-world, large-scale domain generalization benchmark, Geo-YFCC, containing 1.1M samples over 40 training, 7 validation, and 15 test domains, orders of magnitude larger than prior work. We show that existing approaches either do not scale to this dataset or underperform the simple baseline of training a model on the union of data from all training domains. In contrast, our approach achieves a significant improvement.
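The two steps above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the domain embedding is computed or how it is fed to the classifier, so the sketch assumes the embedding is the mean feature vector of a few unlabelled samples and that the classifier is a linear scorer over the input concatenated with that embedding. The names `domain_embedding`, `adaptive_scores`, `predict`, and `featurize` are hypothetical.

```python
from statistics import fmean
from typing import Callable, List, Sequence

Vector = List[float]


def domain_embedding(unlabelled: Sequence[Vector],
                     featurize: Callable[[Vector], Vector]) -> Vector:
    """Step (a), assumed form: average the features of a few unlabelled samples."""
    feats = [featurize(x) for x in unlabelled]
    # zip(*feats) iterates over feature dimensions; fmean averages each one.
    return [fmean(column) for column in zip(*feats)]


def adaptive_scores(x: Vector, dom: Vector,
                    weights: Sequence[Vector]) -> Vector:
    """Step (b), assumed form: linear class scores over [input ; domain embedding]."""
    joint = x + dom  # list concatenation, not element-wise addition
    return [sum(w * v for w, v in zip(row, joint)) for row in weights]


def predict(x: Vector, dom: Vector, weights: Sequence[Vector]) -> int:
    """Return the index of the highest-scoring class."""
    scores = adaptive_scores(x, dom, weights)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy usage: an identity featurizer and two unlabelled samples from a new domain.
emb = domain_embedding([[1.0, 3.0], [3.0, 5.0]], lambda v: v)

# Hypothetical weights: class 0 looks at the input, class 1 at the domain embedding.
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

same_input = [1.0, 0.0]
label_in_new_domain = predict(same_input, emb, W)
label_in_zero_domain = predict(same_input, [0.0, 0.0], W)
```

In this toy setup the same input receives different labels under the two domain embeddings, which is the behaviour the abstract calls adaptive classification. A real system would replace the identity featurizer and hand-set weights with learned networks.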