Exemplar-based image translation is the task of generating images in a desired style while conditioning on a given input image. Most current methods learn only the correspondence between the two input domains and do not mine the information within each domain. In this paper, we propose a more general learning approach that treats the features of the two domains as a whole and learns both inter-domain correspondence and potential intra-domain information interactions. Specifically, we propose a Cross-domain Feature Fusion Transformer (CFFT) to learn inter- and intra-domain feature fusion. Built on CFFT, the proposed CFFT-GAN performs well on exemplar-based image translation. Moreover, by cascading CFFT modules, CFFT-GAN can decouple and fuse features from multiple domains. We conduct extensive quantitative and qualitative experiments on several image translation tasks, and the results demonstrate the superiority of our approach over state-of-the-art methods. Ablation studies show the importance of the proposed CFFT, and application experiments reflect the potential of our method.
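The idea of treating two domains' features "as a whole" can be illustrated with a toy sketch. This is not the CFFT architecture, whose details the abstract does not give. It only assumes that tokens from both domains are concatenated into one sequence and passed through a single-head self-attention with identity projections, so that each token's attention splits into an intra-domain and an inter-domain part. The names `joint_attention`, `split_attention_mass`, `cond_tokens`, and `exemplar_tokens` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(tokens_a, tokens_b):
    """Self-attention over the concatenation of two domains' tokens.

    Assumption: single head, identity query/key/value projections.
    Returns (fused_tokens, attention_weights).
    """
    tokens = tokens_a + tokens_b  # both domains form one sequence
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in tokens:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    # Each fused token is an attention-weighted average of all tokens.
    fused = [
        [sum(w * v[d] for w, v in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return fused, weights


def split_attention_mass(weights, n_a):
    """Split each token's attention into (intra-domain, inter-domain) mass."""
    result = []
    for i, row in enumerate(weights):
        mass_a = sum(row[:n_a])
        mass_b = sum(row[n_a:])
        result.append((mass_a, mass_b) if i < n_a else (mass_b, mass_a))
    return result


# Toy 2-D features: two "condition" tokens and three "exemplar" tokens.
cond_tokens = [[1.0, 0.0], [0.0, 1.0]]
exemplar_tokens = [[1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]

fused, weights = joint_attention(cond_tokens, exemplar_tokens)
masses = split_attention_mass(weights, len(cond_tokens))
for intra, inter in masses:
    print(f"intra={intra:.3f} inter={inter:.3f}")
```

Because the softmax runs over the joint sequence, every token attends both to its own domain and to the other one in a single step. A pure cross-attention layer would, by contrast, only produce the inter-domain part.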