Leveraging labelled data from multiple source domains to enable prediction in a target domain without labels is a significant yet challenging problem. To address it, we introduce the framework DAPDAG (\textbf{D}omain \textbf{A}daptation via \textbf{P}erturbed \textbf{DAG} Reconstruction) and propose to learn an auto-encoder that performs inference on population statistics given features and reconstructs a directed acyclic graph (DAG) as an auxiliary task. The underlying DAG structure over the observed variables is assumed to be invariant, while their conditional distributions are allowed to vary across domains, driven by a latent environmental variable $E$. The encoder serves as an inference device for $E$, while the decoder reconstructs each observed variable conditioned on its graphical parents in the DAG and on the inferred $E$. We train the encoder and decoder jointly in an end-to-end manner and conduct experiments on synthetic and real datasets with mixed variable types. Empirical results demonstrate that reconstructing the DAG benefits the approximate inference. Furthermore, our approach achieves performance competitive with other benchmarks in prediction tasks and adapts better, especially when the target domain differs significantly from the source domains.
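To make the encoder/decoder split concrete, the following is a minimal NumPy sketch of a single forward pass. It is not the authors' implementation: the linear maps, the mean-pooling encoder, the fixed upper-triangular adjacency matrix `A`, and all names (`encode_environment`, `decode`, `W_enc`, `W_x`, `W_e`) are assumptions made for illustration. In the framework described above, the graph is reconstructed during training rather than given.

```python
import numpy as np

rng = np.random.default_rng(0)


def encode_environment(X, W_enc):
    """Infer one latent E for a domain from all of its samples.

    Mean-pooling over samples is an assumed stand-in for inference on
    population statistics; it makes E invariant to the sample order.
    """
    return np.tanh(X.mean(axis=0) @ W_enc)  # shape (e_dim,)


def decode(X, E, A, W_x, W_e):
    """Reconstruct each variable from its DAG parents and the inferred E.

    A[i, j] = 1 means X_i is a parent of X_j, so masking W_x with A
    lets column j of the output depend only on the parents of X_j.
    """
    parent_part = X @ (A * W_x)  # shape (n, d)
    env_part = E @ W_e           # shape (d,), broadcast over samples
    return parent_part + env_part


# Hypothetical sizes: d observed variables, e_dim latent dims, n samples.
d, e_dim, n = 4, 2, 50
A = np.triu(np.ones((d, d)), k=1)  # strictly upper-triangular, hence acyclic
W_enc = rng.normal(size=(d, e_dim))
W_x = rng.normal(size=(d, d))
W_e = rng.normal(size=(e_dim, d))

X = rng.normal(size=(n, d))        # features of one domain
E = encode_environment(X, W_enc)   # domain-level latent variable
X_hat = decode(X, E, A, W_x, W_e)  # per-variable reconstruction
```

In this sketch the first variable has no parents, so its reconstruction depends on `E` alone, and perturbing a variable with no children leaves every reconstruction unchanged for a fixed `E`. This is the sense in which the graph constrains the decoder.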