Optimizing the performance of classifiers on samples from unseen domains remains a challenging problem. While most existing studies on domain generalization focus on learning domain-invariant feature representations, multi-expert frameworks have been proposed as an alternative solution and have demonstrated promising performance. However, current multi-expert learning frameworks fail to fully exploit source domain knowledge during inference, resulting in sub-optimal performance. In this work, we propose to adapt Transformers to dynamically decode source domain knowledge for domain generalization. Specifically, we build one domain-specific local expert per source domain and one domain-agnostic feature branch that serves as the query. A Transformer encoder encodes all domain-specific features as source domain knowledge in memory. In the Transformer decoder, the domain-agnostic query interacts with this memory through cross-attention, and source domains similar to the input contribute more to the attention output. Source domain knowledge is thus dynamically decoded to support inference on inputs from unseen domains, which enables the proposed method to generalize well. Evaluated on three domain generalization benchmarks, the proposed method outperforms state-of-the-art methods.
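To make the encode-then-decode mechanism concrete, the following is a minimal PyTorch sketch of the architecture described above. It is an illustration under stated assumptions, not the paper's implementation: the class name `DomainKnowledgeDecoder`, the linear expert heads, the shared backbone producing a `feat_dim`-dimensional feature, and all layer sizes are hypothetical choices for exposition.

```python
import torch
import torch.nn as nn

class DomainKnowledgeDecoder(nn.Module):
    """Sketch: per-source-domain expert features are encoded as memory,
    and a domain-agnostic query decodes them via cross-attention."""

    def __init__(self, num_domains, feat_dim=512, n_heads=8, n_layers=2):
        super().__init__()
        # One local expert per source domain, plus a domain-agnostic query branch.
        self.experts = nn.ModuleList(
            [nn.Linear(feat_dim, feat_dim) for _ in range(num_domains)]
        )
        self.query_branch = nn.Linear(feat_dim, feat_dim)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=feat_dim, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=n_layers)
        dec_layer = nn.TransformerDecoderLayer(
            d_model=feat_dim, nhead=n_heads, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=n_layers)

    def forward(self, backbone_feat):
        # backbone_feat: (B, feat_dim), from an assumed shared backbone.
        # One token per source domain: (B, num_domains, feat_dim).
        expert_feats = torch.stack(
            [expert(backbone_feat) for expert in self.experts], dim=1
        )
        # Encode domain-specific features as source domain knowledge (memory).
        memory = self.encoder(expert_feats)
        # Domain-agnostic query: (B, 1, feat_dim).
        query = self.query_branch(backbone_feat).unsqueeze(1)
        # Decoder cross-attention weights source domains by similarity to the
        # input, dynamically decoding source knowledge for this sample.
        out = self.decoder(query, memory)
        return out.squeeze(1)  # (B, feat_dim), fed to the final classifier

# Usage example: 3 source domains, batch of 4 backbone features.
model = DomainKnowledgeDecoder(num_domains=3)
feats = torch.randn(4, 512)
fused = model(feats)  # (4, 512)
```

In this sketch, a test sample from an unseen domain receives a feature that is a similarity-weighted mixture of the source-domain experts, which is the dynamic decoding behavior the abstract describes.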