By exploiting large kernel decomposition and attention mechanisms, convolutional neural networks (CNNs) can compete with transformer-based methods in many high-level computer vision tasks. However, owing to their advantage in long-range modeling, transformers with self-attention still dominate low-level vision, including the super-resolution task. In this paper, we propose a CNN-based multi-scale attention network (MAN), which consists of multi-scale large kernel attention (MLKA) and a gated spatial attention unit (GSAU), to improve the performance of convolutional SR networks. Within our MLKA, we rectify LKA with multi-scale and gating schemes to obtain abundant attention maps at various granularity levels, thereby jointly aggregating global and local information and avoiding potential blocking artifacts. In GSAU, we integrate the gate mechanism and spatial attention to remove the unnecessary linear layer and aggregate informative spatial context. To confirm the effectiveness of our designs, we evaluate MAN at multiple complexities by simply stacking different numbers of MLKA and GSAU modules. Experimental results show that our MAN can achieve varied trade-offs between state-of-the-art performance and computational cost. Code is available at https://github.com/icandle/MAN.
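To make the two building blocks named above concrete, the following is a minimal PyTorch sketch of MLKA and GSAU written only from the description in this abstract. The kernel sizes, dilation rates, channel splits, and helper names (LKA, MLKA, GSAU) are illustrative assumptions, not the authors' exact implementation; the official code is at the repository linked above.

```python
import torch
import torch.nn as nn


class LKA(nn.Module):
    """Large kernel attention: a large kernel decomposed into a depth-wise
    conv, a depth-wise dilated conv, and a point-wise conv (sketch)."""
    def __init__(self, channels, kernel=7, dilation=3):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 2 * dilation - 1,
                            padding=dilation - 1, groups=channels)
        self.dw_d = nn.Conv2d(channels, channels, kernel, dilation=dilation,
                              padding=(kernel // 2) * dilation, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return self.pw(self.dw_d(self.dw(x)))


class MLKA(nn.Module):
    """Multi-scale LKA (assumed layout): channels are split into groups, each
    group uses an LKA of a different scale, and every attention map is gated
    by a depth-wise convolved copy of its group before recombination."""
    def __init__(self, channels, scales=((7, 3), (9, 4), (11, 5))):
        super().__init__()
        assert channels % len(scales) == 0
        self.split = channels // len(scales)
        self.lkas = nn.ModuleList(LKA(self.split, k, d) for k, d in scales)
        self.gates = nn.ModuleList(
            nn.Conv2d(self.split, self.split, 3, padding=1, groups=self.split)
            for _ in scales)
        self.proj_in = nn.Conv2d(channels, channels, 1)
        self.proj_out = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        groups = torch.split(self.proj_in(x), self.split, dim=1)
        # gated multi-scale attention maps modulate each group of features
        out = [lka(g) * gate(g) for g, lka, gate in zip(groups, self.lkas, self.gates)]
        return self.proj_out(torch.cat(out, dim=1))


class GSAU(nn.Module):
    """Gated spatial attention unit (assumed layout): one half of the expanded
    features becomes a spatial attention map via a depth-wise conv and gates
    the other half, replacing the second linear layer of a plain MLP block."""
    def __init__(self, channels):
        super().__init__()
        self.proj_in = nn.Conv2d(channels, channels * 2, 1)
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.proj_out = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        attn, value = self.proj_in(x).chunk(2, dim=1)
        return self.proj_out(self.dw(attn) * value)


# quick shape check: both blocks preserve the input resolution
x = torch.randn(1, 48, 32, 32)
print(MLKA(48)(x).shape, GSAU(48)(x).shape)  # torch.Size([1, 48, 32, 32]) twice
```

In this sketch a MAN-style block would simply stack MLKA and GSAU with residual connections; varying how many such blocks are stacked gives the different complexity/performance trade-offs mentioned above.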