Generalizing machine learning models trained on a set of source domains to unseen target domains with different statistics is a challenging problem. While many approaches have been proposed to solve this problem, they use only source data during training and do not take advantage of the fact that a single target example is available at inference time. Motivated by this, we propose a method that uses the target sample during inference beyond mere classification. Our method has three components: (i) a label-preserving feature or metric transformation on source data such that the source samples are clustered in accordance with their class irrespective of their domain; (ii) a generative model trained on these features; and (iii) a label-preserving projection of the target point onto the source-feature manifold during inference, obtained by solving an optimization problem over the input space of the generative model using the learned metric. Finally, the projected target is fed to the classifier. Since the projected target feature comes from the source manifold and has the same label as the real target by design, the classifier is expected to perform better on it than on the true target. We demonstrate that our method outperforms state-of-the-art Domain Generalization methods on multiple datasets and tasks.
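The inference-time projection in component (iii) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the paper's implementation: the generative model is replaced by a toy linear map `generate`, the learned metric is replaced by squared Euclidean distance, and the names `project_to_manifold`, `W`, `b`, `lr`, and `steps` are hypothetical. The sketch only shows the general idea of searching the generator's input space for the point whose output is closest to the target feature.

```python
import random

def generate(W, b, z):
    """Toy linear 'generator' (assumed stand-in): maps latent z to W z + b."""
    return [sum(w * zj for w, zj in zip(row, z)) + bi for row, bi in zip(W, b)]

def project_to_manifold(W, b, target, steps=500, lr=0.05, seed=0):
    """Gradient descent over the generator input z to minimise
    ||generate(z) - target||^2 (squared Euclidean distance is an assumed
    placeholder for the learned metric)."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(len(W[0]))]
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(W, b, z), target)]
        # Gradient of ||W z + b - t||^2 with respect to z is 2 W^T residual.
        grad = [2.0 * sum(W[i][j] * residual[i] for i in range(len(W)))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z, generate(W, b, z)

# The toy "source manifold" is the plane x3 = 0 inside a 3-D feature space.
W = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]
target = [0.5, -1.0, 2.0]  # a target feature lying off the manifold

z_star, projected = project_to_manifold(W, b, target)
```

In this toy setting the projected point is the closest point on the plane to the target, and it is this projected feature, rather than the raw target feature, that would be passed to the classifier. With a nonlinear generator and a learned metric the same loop would need automatic differentiation and would not be guaranteed to reach a global optimum.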