The success of supervised learning hinges on the assumption that the training and test data come from the same underlying distribution, which is often not valid in practice due to potential distribution shift. In light of this, most existing methods for unsupervised domain adaptation focus on achieving domain-invariant representations and small source-domain error. However, recent works have shown that this is not sufficient to guarantee good generalization on the target domain, and is in fact provably detrimental under label distribution shift. Furthermore, in many real-world applications it is often feasible to obtain a small amount of labeled data from the target domain and use it to facilitate model training with source data. Inspired by the above observations, in this paper we propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA). First, we provide a finite-sample bound for both classification and regression problems under Semi-DA. The bound suggests a principled way to obtain target generalization, i.e., by aligning both the marginal and conditional distributions across domains in feature space. Motivated by this, we then introduce the LIRR algorithm for jointly \textbf{L}earning \textbf{I}nvariant \textbf{R}epresentations and \textbf{R}isks. Finally, extensive experiments are conducted on both classification and regression tasks, which demonstrate that LIRR consistently achieves state-of-the-art performance and significant improvements compared with methods that only learn invariant representations or invariant risks.
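To make the two alignment terms suggested by the bound concrete, below is a minimal, illustrative sketch of a combined objective: a source classification risk, a marginal-alignment penalty (here a crude feature-mean distance standing in for adversarial or MMD alignment), and an invariant-risk penalty (the gap between source risk and the risk on the few labeled target examples). The function name, the specific penalty choices, and the weights `lam` and `mu` are all assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def lirr_style_loss(src_feats, tgt_feats, src_logits, src_labels,
                    tgt_logits, tgt_labels, lam=1.0, mu=1.0):
    """Toy Semi-DA objective (illustrative only):
    source risk + lam * marginal alignment + mu * risk-invariance gap."""
    def ce(logits, labels):
        # softmax cross-entropy, numerically stabilized
        z = logits - logits.max(axis=1, keepdims=True)
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(labels)), labels].mean()

    src_risk = ce(src_logits, src_labels)
    tgt_risk = ce(tgt_logits, tgt_labels)  # from the few labeled target points
    # crude marginal alignment: distance between mean features of the domains
    marginal = np.linalg.norm(src_feats.mean(axis=0) - tgt_feats.mean(axis=0))
    # crude conditional/risk alignment: penalize differing risks across domains
    risk_gap = abs(src_risk - tgt_risk)
    return src_risk + lam * marginal + mu * risk_gap
```

When source and target batches coincide, both penalty terms vanish and the objective reduces to the plain source risk, matching the intuition that a perfectly aligned pair of domains needs no extra regularization.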