This paper is concerned with data-driven unsupervised domain adaptation, where it is not known in advance how the joint distribution changes across domains, i.e., which factors or modules of the data distribution remain invariant and which change. To automate domain adaptation with multiple source domains, we propose to use a graphical model as a compact encoding of how the joint distribution changes; this model can be learned from data, and domain adaptation is then cast as a problem of Bayesian inference on the graphical model. Such a graphical model distinguishes between the invariant and the changing modules of the distribution and specifies the properties of the changes across domains, which serve as prior knowledge about the changing modules when deriving the posterior of the target variable $Y$ in the target domain. This provides an end-to-end framework for domain adaptation, in which additional knowledge about how the joint distribution changes, if available, can be directly incorporated to improve the graphical representation. We discuss how causality-based domain adaptation can be put under this umbrella. Experimental results on both synthetic and real data demonstrate the efficacy of the proposed framework for domain adaptation. The code is available at https://github.com/mgong2/DA_Infer.
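To make the idea of "domain adaptation as Bayesian inference" concrete, the following is a minimal toy sketch, not the method implemented in the paper or in the DA_Infer repository. It assumes a single changing module: the class-conditional $P(X \mid Y, \theta)$ is Gaussian with a domain-specific shift $\theta$, while $P(Y)$ is invariant. The shifts observed in the source domains define a prior over $\theta$; unlabeled target data update that prior to a posterior, which is then marginalized out to obtain the posterior of $Y$ in the target domain. The function name `target_posterior_y`, the Gaussian forms, and the grid approximation are all illustrative assumptions.

```python
import numpy as np


def gauss_pdf(x, mean, std=1.0):
    """Density of a univariate Gaussian, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def target_posterior_y(x_target, source_shifts, class_means=(-2.0, 2.0),
                       p_y=(0.5, 0.5), grid=None):
    """Toy posterior of Y in the target domain (hypothetical helper).

    Assumed model: P(Y) is invariant; X | Y=y, theta ~ N(class_means[y] + theta, 1),
    where theta is the changing, domain-specific parameter.
    """
    x = np.asarray(x_target, dtype=float)
    shifts = np.asarray(source_shifts, dtype=float)
    means = np.asarray(class_means, dtype=float)
    prior_y = np.asarray(p_y, dtype=float)
    if grid is None:
        grid = np.linspace(-5.0, 5.0, 401)  # discretized values of theta

    # Prior over theta: a Gaussian fitted to the shifts seen in the source domains.
    prior_std = max(shifts.std(ddof=1), 1e-3)
    log_prior = np.log(gauss_pdf(grid, shifts.mean(), prior_std) + 1e-300)

    # lik[t, i, y] = p(x_i | Y=y, theta_t)
    lik = gauss_pdf(x[None, :, None], means[None, None, :] + grid[:, None, None])
    # marg[t, i] = p(x_i | theta_t), marginalizing over the unknown target labels
    marg = (lik * prior_y).sum(axis=2) + 1e-300

    # Posterior over theta given the unlabeled target data, normalized on the grid.
    log_post = log_prior + np.log(marg).sum(axis=1)
    log_post -= log_post.max()
    post_theta = np.exp(log_post)
    post_theta /= post_theta.sum()

    # p(y | x_i, theta_t), then average over the posterior of theta.
    resp = lik * prior_y / marg[:, :, None]
    post_y = np.einsum("t,tiy->iy", post_theta, resp)
    return post_y, grid, post_theta


# Usage on synthetic data: the target domain has a true shift of 1.0.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=200)
x_tgt = np.array([-2.0, 2.0])[y_true] + 1.0 + rng.normal(size=200)
post_y, grid, post_theta = target_posterior_y(x_tgt, [0.5, 1.5, 0.8, 1.2])
y_pred = post_y.argmax(axis=1)
```

In this sketch the graphical structure (which module changes) is fixed by hand; the framework described above instead learns that structure, and the properties of the changes, from the source-domain data.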