We present an effective and efficient method that explores the properties of Transformers in the frequency domain for high-quality image deblurring. Our method is motivated by the convolution theorem, which states that the correlation or convolution of two signals in the spatial domain is equivalent to their element-wise product in the frequency domain. This inspires us to develop an efficient frequency domain-based self-attention solver (FSAS) that estimates the scaled dot-product attention with an element-wise product operation instead of matrix multiplication in the spatial domain. In addition, we note that simply using the naive feed-forward network (FFN) in Transformers does not generate good deblurred results. To overcome this problem, we propose a simple yet effective discriminative frequency domain-based FFN (DFFN), in which we introduce a gating mechanism based on the Joint Photographic Experts Group (JPEG) compression algorithm to discriminatively determine which low- and high-frequency information of the features should be preserved for latent clear image restoration. We integrate the proposed FSAS and DFFN into an asymmetric network based on an encoder-decoder architecture, where the FSAS is used only in the decoder module for better image deblurring. Experimental results show that the proposed method performs favorably against state-of-the-art approaches. Code will be available at https://github.com/kkkls/FFTformer.
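The convolution theorem that motivates FSAS can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes a single-channel 8×8 patch and circular (periodic) boundary handling, and the function names `correlation_fft` and `correlation_direct` are hypothetical, chosen only for illustration. It compares the cross-correlation of two patches obtained from an element-wise product in the frequency domain against a direct evaluation in the spatial domain.

```python
import numpy as np


def correlation_fft(q, k):
    """Circular cross-correlation via an element-wise product in the frequency domain."""
    Q = np.fft.fft2(q)
    K = np.fft.fft2(k)
    # Correlation (rather than convolution) uses the complex conjugate of one spectrum.
    return np.real(np.fft.ifft2(Q * np.conj(K)))


def correlation_direct(q, k):
    """The same circular cross-correlation, evaluated shift by shift in the spatial domain."""
    h, w = q.shape
    out = np.empty((h, w))
    for m in range(h):
        for n in range(w):
            # np.roll with a negative shift gives shifted[i, j] = q[(i + m) % h, (j + n) % w].
            shifted = np.roll(q, shift=(-m, -n), axis=(0, 1))
            out[m, n] = np.sum(shifted * k)
    return out


# Illustrative random patches (assumed sizes; not data from the paper).
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 8))
k = rng.standard_normal((8, 8))

max_abs_diff = np.max(np.abs(correlation_fft(q, k) - correlation_direct(q, k)))
print("max abs difference:", max_abs_diff)
```

The two results agree up to floating-point rounding. For a patch with N pixels, the direct evaluation costs O(N^2) operations, whereas the FFT-based route costs O(N log N), which is the efficiency argument behind replacing the spatial-domain matrix multiplication with a frequency-domain element-wise product.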