Multi-source domain adaptation (MSDA) aims to predict labels on target-domain data in the setting where data from multiple source domains are labelled and data from the target domain are unlabelled. To handle this problem, most methods focus on learning invariant representations across domains. However, their success relies heavily on the assumption that the label distribution remains unchanged across domains. To relax this assumption, we propose a new one, latent covariate shift, in which the marginal distribution of a latent content variable changes across domains while the conditional distribution of the label given the latent content remains invariant. We introduce a latent style variable to complement the latent content variable, together forming a latent causal graph that models the data- and label-generating process. We show that although the latent style variable is unidentifiable due to the transitivity property of the latent space, the latent content variable can be identified up to simple scaling under some mild conditions. This motivates a novel method for MSDA that learns the invariant label distribution conditional on the latent content variable, instead of learning invariant representations. Empirical evaluations on simulated and real data demonstrate the effectiveness of the proposed method compared with many state-of-the-art methods based on invariant representations.
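The latent covariate shift assumption can be illustrated with a minimal simulation sketch. All names and distributional choices below (Gaussian latents, a logistic labeling function, the mixing of content and style into observations) are hypothetical simplifications for illustration, not the paper's actual model: each domain shifts only the marginal of the latent content variable, while the label is generated from content by the same rule in every domain.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_domain(mu_c, n=1000):
    """Sample one domain under a toy latent covariate shift model.

    mu_c (hypothetical parameter): the domain-specific mean of the
    latent content variable, so the marginal p(z_c) varies by domain.
    """
    z_c = rng.normal(mu_c, 1.0, size=n)   # content: marginal shifts with domain
    z_s = rng.normal(0.0, 1.0, size=n)    # style: complements content
    # p(y | z_c) is the same function in every domain (invariant):
    # here, a logistic rule thresholded at 0.5, i.e. y = 1 iff z_c > 0.
    y = (1.0 / (1.0 + np.exp(-2.0 * z_c)) > 0.5).astype(int)
    # Observed data mixes content and style, so raw features are not
    # directly comparable across domains.
    x = np.stack([z_c + 0.5 * z_s, z_s], axis=1)
    return x, y

# Two source domains differing only in the content marginal:
# the label marginal p(y) then differs across domains, even though
# p(y | z_c) is shared -- the setting invariant-representation
# methods implicitly rule out.
x1, y1 = sample_domain(mu_c=-1.0)
x2, y2 = sample_domain(mu_c=+1.0)
```

Note that `y1.mean()` and `y2.mean()` diverge (roughly 0.16 vs 0.84 under these means), showing how a shifted content marginal induces label shift while the labeling mechanism itself stays invariant.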