Unsupervised domain adaptation (UDA) aims to transfer knowledge learned from a labeled source domain to a different, unlabeled target domain. Most existing UDA methods focus on learning domain-invariant feature representations, at either the domain level or the category level, using convolutional neural network (CNN)-based frameworks. A fundamental problem for category-level UDA is the production of pseudo labels for target-domain samples, which are usually too noisy for accurate domain alignment and inevitably compromise UDA performance. With the success of the Transformer in various tasks, we find that the cross-attention in the Transformer is robust to noisy input pairs and yields better feature alignment, so in this paper we adopt the Transformer for the challenging UDA task. Specifically, to generate accurate input pairs, we design a two-way center-aware labeling algorithm to produce pseudo labels for target samples. Along with the pseudo labels, a weight-sharing triple-branch Transformer framework is proposed, which applies self-attention for source/target feature learning and cross-attention for source-target domain alignment. This design explicitly enforces the framework to learn discriminative domain-specific and domain-invariant representations simultaneously. The proposed method is dubbed CDTrans (cross-domain transformer), and it provides one of the first attempts to solve UDA tasks with a pure Transformer solution. Extensive experiments show that our proposed method achieves the best performance on the Office-Home, VisDA-2017, and DomainNet datasets.
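To make the weight-sharing triple-branch idea concrete, below is a minimal sketch, not the authors' implementation: a single shared attention module is applied as self-attention on source features, self-attention on target features, and cross-attention on source-target pairs built from the pseudo labels. The class name `TripleBranchBlock`, the feature dimension, and the token shapes are illustrative assumptions.

```python
# Minimal sketch of a weight-sharing triple-branch attention block,
# assuming paired source/target token sequences of the same shape.
import torch
import torch.nn as nn


class TripleBranchBlock(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # One attention module whose weights are shared by all three
        # branches (source, target, and cross source-target).
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, src_feat: torch.Tensor, tgt_feat: torch.Tensor):
        # Source branch: self-attention over source tokens (domain-specific).
        src_out, _ = self.attn(src_feat, src_feat, src_feat)
        # Target branch: self-attention over target tokens (domain-specific).
        tgt_out, _ = self.attn(tgt_feat, tgt_feat, tgt_feat)
        # Cross branch: source queries attend to target keys/values,
        # encouraging domain-invariant, aligned features.
        cross_out, _ = self.attn(src_feat, tgt_feat, tgt_feat)
        return src_out, tgt_out, cross_out


if __name__ == "__main__":
    block = TripleBranchBlock(dim=256, num_heads=8)
    src = torch.randn(4, 197, 256)  # (batch, tokens, dim) for source images
    tgt = torch.randn(4, 197, 256)  # pseudo-labeled target images paired with source
    s, t, c = block(src, tgt)
    print(s.shape, t.shape, c.shape)
```

In this sketch, the weight sharing is simply the reuse of one `nn.MultiheadAttention` across the three calls; in practice the pairs fed to the cross branch would be formed by the two-way center-aware pseudo-labeling step described in the abstract.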