Distribution shift between training (source) and test (target) datasets is a common problem encountered in machine learning applications. One approach to resolving this issue is Unsupervised Domain Adaptation (UDA), which transfers knowledge from a label-rich source domain to an unlabeled target domain. Outliers present in either the source or target dataset can introduce additional challenges when using UDA in practice. In this paper, the $\alpha$-divergence is used as a measure to minimize the discrepancy between the source and target distributions while inheriting its robustness, which can be tuned with a single parameter $\alpha$, as the prominent feature of this measure. It is shown that other well-known divergence-based UDA techniques can be derived as special cases of the proposed method. Furthermore, a theoretical upper bound is derived for the loss in the target domain in terms of the source loss and the initial $\alpha$-divergence between the two domains. The robustness of the proposed method is validated through experiments on several benchmark datasets in open-set and partial UDA setups, where extra classes present in the target and source datasets, respectively, are treated as outliers.
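For reference, one common parameterization of the $\alpha$-divergence between a source density $p_s$ and a target density $p_t$ (the paper's exact convention may differ) is
\[
D_\alpha(p_s \,\|\, p_t) \;=\; \frac{1}{\alpha(1-\alpha)}\left(1 - \int p_s(x)^{\alpha}\, p_t(x)^{1-\alpha}\, dx\right), \qquad \alpha \in \mathbb{R}\setminus\{0,1\},
\]
whose limits $\alpha \to 1$ and $\alpha \to 0$ recover the forward and reverse Kullback-Leibler divergences $\mathrm{KL}(p_s \,\|\, p_t)$ and $\mathrm{KL}(p_t \,\|\, p_s)$, respectively, illustrating how other divergence-based criteria arise as special cases of a single tunable family.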