Recently developed transformer networks have achieved impressive performance in image denoising by exploiting self-attention (SA) in images. However, existing methods mostly compute SA within a relatively small window because of its quadratic complexity, which limits their ability to model long-range image information. In this paper, we propose the spatial-frequency attention network (SFANet) to enhance the network's ability to exploit long-range dependencies. In the spatial attention module (SAM), we adopt dilated SA to model long-range dependencies. In the frequency attention module (FAM), we exploit more global information via the Fast Fourier Transform (FFT), designing a window-based frequency channel attention (WFCA) block to effectively model deep frequency features and their dependencies. To make our module applicable to images of different sizes and to keep the model consistent between training and inference, we apply window-based FFT with a set of fixed window sizes. In addition, channel attention is computed on both the real and imaginary parts of the Fourier spectrum, which further improves restoration performance. The proposed WFCA block can effectively model long-range image dependencies with acceptable complexity. Experiments on multiple denoising benchmarks demonstrate the leading performance of SFANet.
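To make the window-based frequency channel attention idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It splits a feature map into fixed-size windows, applies a 2D FFT per window, gates the real and imaginary parts with separate channel-attention weights, and transforms back. Several details are assumptions made for illustration: the function names (`wfca_sketch`, `channel_gate`), the squeeze-and-excitation style gate, the use of mean magnitude as the channel descriptor (the plain mean of the imaginary part of a real signal's spectrum is zero), the random weights, and the requirement that the image height and width be multiples of the window size.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_gate(part, w1, w2):
    """Scale each channel of `part` (C, nH, w, nW, w) by a learned-style gate.

    Assumption: a squeeze-and-excitation style gate built from the mean
    magnitude per channel; the weights w1 (R, C) and w2 (C, R) are hypothetical.
    """
    desc = np.abs(part).mean(axis=(1, 2, 3, 4))   # (C,) channel descriptor
    hidden = np.maximum(w1 @ desc, 0.0)           # (R,) ReLU bottleneck
    gate = sigmoid(w2 @ hidden)                   # (C,) values in (0, 1)
    return part * gate[:, None, None, None, None]


def wfca_sketch(x, window, params):
    """Window-based FFT with channel attention on real and imaginary parts.

    x: real array of shape (C, H, W); H and W must be multiples of `window`
    (an assumption of this sketch; padding would be needed otherwise).
    """
    C, H, W = x.shape
    if H % window or W % window:
        raise ValueError("H and W must be multiples of the window size")

    # Split into non-overlapping windows: axes 2 and 4 index inside a window.
    win = x.reshape(C, H // window, window, W // window, window)
    spec = np.fft.fft2(win, axes=(2, 4))

    # Separate channel attention for the real and imaginary parts.
    real = channel_gate(spec.real, *params["real"])
    imag = channel_gate(spec.imag, *params["imag"])

    # Per-channel scaling keeps the spectrum Hermitian, so the inverse
    # transform is real up to rounding error; drop the residual imaginary part.
    out = np.fft.ifft2(real + 1j * imag, axes=(2, 4)).real
    return out.reshape(C, H, W)


rng = np.random.default_rng(0)
C, R = 4, 2  # channels and bottleneck width (illustrative values)
params = {
    part: (rng.standard_normal((R, C)), rng.standard_normal((C, R)))
    for part in ("real", "imag")
}

# The same fixed window size works for inputs of different sizes.
print(wfca_sketch(rng.standard_normal((C, 16, 16)), 8, params).shape)  # (4, 16, 16)
print(wfca_sketch(rng.standard_normal((C, 24, 32)), 8, params).shape)  # (4, 24, 32)
```

Because the window size is fixed, the FFT always operates on blocks of the same shape regardless of the input resolution, which is the property the abstract relies on to keep training and inference consistent.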