Successful unsupervised domain adaptation (UDA) is guaranteed only under strong assumptions such as covariate shift and overlap between input domains. The latter is often violated in high-dimensional applications such as image classification which, despite this challenge, continues to serve as inspiration and benchmark for algorithm development. In this work, we show that access to side information about examples from the source and target domains can help relax these assumptions and increase sample efficiency in learning, at the cost of collecting a richer variable set. We call this domain adaptation by learning using privileged information (DALUPI). Tailored for this task, we propose a simple two-stage learning algorithm inspired by our analysis and a practical end-to-end algorithm for multi-label image classification. In a suite of experiments, including an application to medical image analysis, we demonstrate that incorporating privileged information in learning can reduce errors in domain transfer compared to classical learning.