The end-to-end image fusion framework has achieved promising performance, with dedicated convolutional networks aggregating multi-modal local appearance. However, existing CNN-based fusion approaches neglect long-range dependencies, which hinders balanced image-level perception in complex-scene fusion. In this paper, we therefore propose an infrared and visible image fusion algorithm based on a lightweight transformer module and adversarial learning. Motivated by the transformer's capacity for global interaction, we employ it to learn effective global fusion relations. In particular, shallow features extracted by a CNN interact within the proposed transformer fusion module, refining the fusion relationship over the spatial scope and across channels simultaneously. In addition, adversarial learning is introduced into the training process to improve output discrimination by imposing competitive consistency with the inputs, reflecting the specific characteristics of infrared and visible images. Experimental results demonstrate the effectiveness of the proposed modules, showing clear improvements over the state of the art and establishing a novel paradigm of transformer and adversarial learning for the fusion task.
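The core idea of attending over both the spatial scope and the channel dimension of CNN shallow features can be sketched as follows. This is a minimal illustration, not the paper's actual module: all function names are hypothetical, and the real fusion module would use learned query/key/value projections, normalization, and multi-head attention rather than raw dot-product attention on unprojected features.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention over token rows
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def transformer_fuse(feat_ir, feat_vis):
    """Fuse two (N, C) shallow feature maps, N = H*W flattened pixels.

    Spatial branch: pixel tokens of both modalities attend to each other.
    Channel branch: the same attention applied to transposed tokens,
    so channels attend to channels.
    (Hypothetical sketch; projections and normalization omitted.)
    """
    tokens = np.concatenate([feat_ir, feat_vis], axis=0)   # (2N, C)
    spatial = attention(tokens, tokens, tokens)            # spatial interaction
    channel = attention(tokens.T, tokens.T, tokens.T).T    # channel interaction
    fused = spatial + channel                              # combine both scopes
    n = feat_ir.shape[0]
    return 0.5 * (fused[:n] + fused[n:])                   # merge modality halves
```

In this sketch, concatenating the infrared and visible pixel tokens lets the spatial branch model cross-modal long-range dependencies directly, which is exactly the interaction that a purely convolutional fusion stage lacks.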