Unsupervised domain adaptation (UDA) enables cross-domain learning without target domain labels by transferring knowledge from a labeled source domain whose distribution differs from that of the target. However, UDA is not always successful, and several accounts of "negative transfer" have been reported in the literature. In this work, we prove a simple lower bound on the target domain error that complements the existing upper bound. Our bound shows that minimizing the source domain error and the marginal distribution mismatch is insufficient to guarantee a reduction in the target domain error, because the mismatch between the induced labeling functions may increase. This insufficiency is further illustrated through simple distributions for which the same UDA approach succeeds, fails, or succeeds and fails with equal probability. Motivated by this insufficiency, we propose novel data poisoning attacks that fool UDA methods into learning representations that produce large target domain errors. We evaluate the effect of these attacks on popular UDA methods, using benchmark datasets on which these methods have previously been shown to be successful. Our results show that poisoning can significantly decrease the target domain accuracy, dropping it to almost 0% in some cases, with the addition of only 10% poisoned data in the source domain. The failure of these UDA methods demonstrates their limitations in guaranteeing cross-domain generalization, consistent with our lower bound. Thus, evaluating UDA methods in adversarial settings such as data poisoning provides a better sense of their robustness to data distributions unfavorable for UDA.
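The insufficiency described above can be made concrete with a toy sketch. This is not the construction or the bound from the paper; it is a minimal illustration under stated assumptions: both domains share the same uniform marginal on [-1, 1], and the labeling functions `f_source`, `f_target_same`, and `f_target_flip` are hypothetical names chosen for this example. A hypothesis with zero source error and perfectly matched marginals can still have either zero or maximal target error, depending only on how the target labeling function relates to the source one.

```python
import random

def make_domain(n, label_fn, seed):
    """Draw n points uniformly from [-1, 1] and label them with label_fn."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return xs, [label_fn(x) for x in xs]

def error(h, xs, ys):
    """Fraction of points on which hypothesis h disagrees with the labels."""
    return sum(h(x) != y for x, y in zip(xs, ys)) / len(xs)

# Hypothetical labeling functions, chosen only for illustration.
f_source = lambda x: int(x > 0)        # source: positive side is class 1
f_target_same = lambda x: int(x > 0)   # target labeling agrees with the source
f_target_flip = lambda x: int(x <= 0)  # target labeling disagrees everywhere

# A hypothesis that fits the source labeling exactly.
h = lambda x: int(x > 0)

# Same marginal distribution in both domains; only the labeling differs.
xs_s, ys_s = make_domain(1000, f_source, seed=0)
xs_t, ys_same = make_domain(1000, f_target_same, seed=1)
ys_flip = [f_target_flip(x) for x in xs_t]

source_err = error(h, xs_s, ys_s)
target_err_same = error(h, xs_t, ys_same)
target_err_flip = error(h, xs_t, ys_flip)

print(source_err, target_err_same, target_err_flip)  # prints: 0.0 0.0 1.0
```

Since the target labels are unobserved in UDA, the two target cases cannot be told apart from the source error and the marginal mismatch alone, which is the gap that a labeling function mismatch term accounts for.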