Recently, Transformer-based image restoration networks have achieved promising improvements over convolutional neural networks due to parameter-independent global interactions. To lower computational cost, existing works generally limit self-attention computation within non-overlapping windows. However, each group of tokens is always drawn from a dense area of the image. This is considered a dense attention strategy, since token interactions are restricted to dense regions. Obviously, this strategy can result in restricted receptive fields. To address this issue, we propose the Attention Retractable Transformer (ART) for image restoration, which presents both dense and sparse attention modules in the network. The sparse attention module allows tokens from sparse areas to interact and thus provides a wider receptive field. Furthermore, the alternating application of dense and sparse attention modules greatly enhances the representation ability of the Transformer while providing retractable attention on the input image. We conduct extensive experiments on image super-resolution, denoising, and JPEG compression artifact reduction tasks. Experimental results validate that our proposed ART outperforms state-of-the-art methods on various benchmark datasets both quantitatively and visually. We provide code and models at https://github.com/gladzhang/ART.
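To make the distinction concrete, the following is a minimal sketch (not the authors' implementation) of how dense and sparse attention differ only in how tokens are grouped before window-wise self-attention is applied. The window size w and sampling interval s are illustrative parameters chosen here for demonstration.

```python
import torch

def dense_groups(x, w):
    """Partition a (B, H, W, C) feature map into non-overlapping w x w windows.
    Each group contains tokens from one contiguous (dense) image region."""
    B, H, W, C = x.shape
    x = x.view(B, H // w, w, W // w, w, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, w * w, C)

def sparse_groups(x, s):
    """Group tokens sampled at a fixed interval s along each spatial axis.
    Each group gathers tokens spread across the whole image (sparse region),
    so self-attention within the group sees an image-wide receptive field."""
    B, H, W, C = x.shape
    x = x.view(B, H // s, s, W // s, s, C)
    # Put the within-interval indices first so every group collects one token
    # from each s x s cell, i.e. tokens spaced s pixels apart over the image.
    return x.permute(0, 2, 4, 1, 3, 5).reshape(-1, (H // s) * (W // s), C)

# Illustrative usage: a 1x8x8x16 feature map.
# Dense: 4 groups of 4x4=16 neighboring tokens.
# Sparse: 4 groups of 16 tokens sampled every 2 pixels across the image.
x = torch.randn(1, 8, 8, 16)
print(dense_groups(x, w=4).shape)   # torch.Size([4, 16, 16])
print(sparse_groups(x, s=2).shape)  # torch.Size([4, 16, 16])
```

In both cases standard multi-head self-attention can then be applied within each group; only the grouping rule changes, which is what allows the two modules to be alternated throughout the network.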