In semi-supervised domain adaptation, a few labeled samples per class in the target domain guide the features of the remaining target samples to aggregate around them. However, the trained model cannot produce a highly discriminative feature representation for the target domain because the training data is dominated by labeled samples from the source domain. This can lead to a disconnect between the labeled and unlabeled target samples, as well as misalignment between the unlabeled target samples and the source domain. In this paper, we propose a novel approach called Cross-domain Adaptive Clustering to address this problem. To achieve both inter-domain and intra-domain adaptation, we first introduce an adversarial adaptive clustering loss that groups the features of unlabeled target data into clusters and performs cluster-wise feature alignment across the source and target domains. We further apply pseudo labeling to unlabeled samples in the target domain and retain only the pseudo-labels with high confidence. Pseudo labeling increases the number of "labeled" samples in each class in the target domain, and thus produces a more robust and powerful cluster core for each class to facilitate adversarial learning. Extensive experiments on benchmark datasets, including DomainNet, Office-Home and Office, demonstrate that our proposed approach achieves state-of-the-art performance in semi-supervised domain adaptation.
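The pseudo-labeling step described above (predict a label for each unlabeled target sample, then keep only the confident predictions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `softmax` and `select_pseudo_labels`, the use of the maximum softmax probability as the confidence score, and the threshold value of 0.95 are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.95):
    """Keep only confident predictions on unlabeled target samples.

    batch_logits: one list of class logits per unlabeled sample.
    threshold: hypothetical confidence cutoff (not taken from the paper).
    Returns a list of (sample_index, pseudo_label, confidence) tuples.
    """
    selected = []
    for i, logits in enumerate(batch_logits):
        probs = softmax(logits)
        # The pseudo-label is the class with the highest predicted probability.
        label = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[label]
        # Discard low-confidence predictions so they do not pollute training.
        if confidence >= threshold:
            selected.append((i, label, confidence))
    return selected


if __name__ == "__main__":
    # Three unlabeled samples, three classes: the second one is ambiguous.
    logits = [[8.0, 0.0, 0.0], [1.0, 1.2, 0.9], [0.0, 0.0, 6.0]]
    for index, label, confidence in select_pseudo_labels(logits):
        print(f"sample {index}: pseudo-label {label} (confidence {confidence:.3f})")
```

In this toy batch, the first and third samples pass the cutoff and would be added to the pool of "labeled" target samples for their class, while the ambiguous second sample is left unlabeled.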