Multi-source domain adaptation (MSDA) learns to predict labels for target-domain data in the setting where data from multiple source domains are labelled and data from the target domain are unlabelled. Most methods for this task focus on learning representations that are invariant across domains. However, their success relies heavily on the assumption that the label distribution remains the same across domains, which may not hold in real-world problems. In this paper, we propose a new and more flexible assumption, termed \textit{latent covariate shift}, in which a latent content variable $\mathbf{z}_c$ and a latent style variable $\mathbf{z}_s$ are introduced in the generative process: the marginal distribution of $\mathbf{z}_c$ changes across domains, while the conditional distribution of the label given $\mathbf{z}_c$ remains invariant. We show that, although completely identifying the proposed latent causal model is challenging, the latent content variable can be identified up to scaling by using its dependence on the labels from the source domains, together with the identifiability conditions of nonlinear ICA. This motivates a novel method for MSDA that learns the invariant label distribution conditional on the latent content variable, instead of learning invariant representations. Empirical evaluation on simulated and real data demonstrates the effectiveness of the proposed method.
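As a sketch, the latent covariate shift assumption stated above can be written out explicitly. The domain index $u$ and the label symbol $y$ are notation introduced here for illustration only; they do not appear in the text above. The two conditions are

\[
p_u(\mathbf{z}_c) \ \text{may vary with } u,
\qquad
p_u(y \mid \mathbf{z}_c) = p(y \mid \mathbf{z}_c) \ \text{for every } u .
\]

Marginalizing over the content variable then gives

\[
p_u(y) = \int p(y \mid \mathbf{z}_c)\, p_u(\mathbf{z}_c)\, d\mathbf{z}_c ,
\]

so the label distribution $p_u(y)$ can differ from one domain to another even though the conditional $p(y \mid \mathbf{z}_c)$ is shared. This is the case in which the assumption of a consistent label distribution across domains fails, while the latent covariate shift assumption can still hold.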