Domain Adaptation (DA) aims to transfer knowledge learned from a source domain with ample labeled data to a target domain with only unlabeled data. Most existing DA studies learn domain-invariant feature representations for both domains by minimizing the domain gap with convolutional neural networks. Recently, vision transformers have significantly improved performance on multiple vision tasks. Built on vision transformers, in this paper we propose a Bidirectional Cross-Attention Transformer (BCAT) for DA to further improve performance. In the proposed BCAT, the attention mechanism extracts implicit source and target mixup feature representations to narrow the domain discrepancy. Specifically, we design a weight-sharing quadruple-branch transformer with a bidirectional cross-attention mechanism to learn domain-invariant feature representations. Extensive experiments demonstrate that the proposed BCAT model outperforms existing state-of-the-art convolution-based and transformer-based DA methods on four benchmark datasets.
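To make the core idea concrete, the following is a minimal NumPy sketch of bidirectional cross-attention between the two domains: source tokens query target keys/values and vice versa, so each output mixes features from both domains. The single-head, unbatched form, the shared projection matrices, and all variable names here are illustrative assumptions, not the paper's actual quadruple-branch implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_feats, kv_feats, w_q, w_k, w_v):
    # Queries come from one domain; keys and values from the other,
    # so outputs implicitly mix the two domains' features.
    q = q_feats @ w_q
    k = kv_feats @ w_k
    v = kv_feats @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product
    return softmax(scores) @ v

rng = np.random.default_rng(0)
d = 8
# One set of projection weights, reused for both directions
# (mimicking weight sharing across branches).
w_q, w_k, w_v = (rng.standard_normal((d, d)) for _ in range(3))
src = rng.standard_normal((4, d))  # source-domain tokens
tgt = rng.standard_normal((4, d))  # target-domain tokens

# Bidirectional: source attends to target, and target to source.
src2tgt = cross_attention(src, tgt, w_q, w_k, w_v)
tgt2src = cross_attention(tgt, src, w_q, w_k, w_v)
print(src2tgt.shape, tgt2src.shape)  # (4, 8) (4, 8)
```

Each output row is a convex combination of the other domain's value vectors, which is why the abstract describes these representations as an implicit source/target mixup.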