Unsupervised Domain Adaptation (UDA) aims to bridge the gap between a source domain, where labelled data are available, and a target domain represented only by unlabelled data. While domain-invariant representations have dramatically improved the adaptability of models, guaranteeing their transferability remains a challenging problem. This paper addresses this problem by using active learning to annotate a small budget of target data. Although this setup, called Active Domain Adaptation (ADA), deviates from UDA's standard setup, a wide range of practical applications face this situation. To this end, we introduce \textit{Stochastic Adversarial Gradient Embedding} (SAGE), a framework that makes a triple contribution to ADA. First, we select for annotation the target samples that are most likely to improve the representations' transferability, by measuring the variation, before and after annotation, of the gradient of the transferability loss. Second, we increase sampling diversity by promoting different gradient directions. Third, we introduce a novel training procedure for actively incorporating target samples when learning invariant representations. SAGE rests on solid theoretical ground and is validated on various UDA benchmarks against several baselines. Our empirical investigation demonstrates that SAGE combines the best of uncertainty and diversity sampling, and substantially improves representation transferability.
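The sampling idea sketched in the abstract, combining gradient-based informativeness with diversity over gradient directions, is in the spirit of gradient-embedding active learning. The following is a minimal illustrative sketch, not the SAGE algorithm itself: it builds BADGE-style gradient embeddings (gradient of the loss with respect to last-layer weights under the model's pseudo-label) and selects a diverse batch via k-means++-style seeding, which favours large-gradient samples pointing in different directions. All function names (`gradient_embeddings`, `kmeanspp_select`) are hypothetical.

```python
import numpy as np

def gradient_embeddings(probs, features):
    """Hypothetical BADGE-style embedding (not SAGE itself): for each sample,
    the gradient of cross-entropy w.r.t. last-layer weights under the
    pseudo-label is (p - onehot(argmax p)) outer-product features."""
    n, _ = probs.shape
    residual = probs.copy()
    residual[np.arange(n), probs.argmax(axis=1)] -= 1.0  # p - onehot(y_hat)
    return (residual[:, :, None] * features[:, None, :]).reshape(n, -1)

def kmeanspp_select(emb, budget, rng):
    """k-means++ seeding over gradient embeddings: start from the
    largest-gradient sample, then draw each next sample with probability
    proportional to its squared distance from the set already chosen,
    promoting diverse gradient directions."""
    chosen = [int(np.argmax((emb ** 2).sum(axis=1)))]
    d2 = ((emb - emb[chosen[0]]) ** 2).sum(axis=1)
    while len(chosen) < budget:
        idx = int(rng.choice(len(emb), p=d2 / d2.sum()))
        chosen.append(idx)
        d2 = np.minimum(d2, ((emb - emb[idx]) ** 2).sum(axis=1))
    return chosen
```

In an ADA loop, the selected indices would be sent for annotation and the labelled target samples folded back into training of the invariant representation.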