Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers, which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model, SwinIR, for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by \textbf{up to 0.14$\sim$0.45 dB}, while the total number of parameters can be reduced by \textbf{up to 67\%}.
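The three-part data flow described above can be illustrated with a minimal sketch. It shows only the composition (shallow features, a chain of residual blocks, reconstruction) and is not the SwinIR implementation: the names `residual_block`, `build_pipeline` and `toy_layer` are hypothetical, features are plain lists of floats, and `toy_layer` is a placeholder standing in for a Swin Transformer layer, not window attention.

```python
from typing import Callable, List, Sequence

Features = List[float]
Layer = Callable[[Features], Features]


def residual_block(layers: Sequence[Layer]) -> Layer:
    """Wrap a stack of layers with a skip connection: out = x + layers(x)."""
    def block(x: Features) -> Features:
        y = x
        for layer in layers:
            y = layer(y)
        # Residual connection: add the block input back to the stack output.
        return [a + b for a, b in zip(x, y)]
    return block


def build_pipeline(shallow: Layer, blocks: Sequence[Layer], reconstruct: Layer) -> Layer:
    """Compose shallow extraction, deep extraction (chained blocks), reconstruction."""
    def model(image: Features) -> Features:
        features = shallow(image)      # shallow feature extraction
        for block in blocks:           # deep feature extraction
            features = block(features)
        return reconstruct(features)   # high-quality image reconstruction
    return model


def toy_layer(x: Features) -> Features:
    """Placeholder for a Swin Transformer layer (assumption: simply halves values)."""
    return [0.5 * v for v in x]


def identity(x: Features) -> Features:
    """Placeholder for the shallow and reconstruction stages."""
    return list(x)


# Two residual blocks, each holding two placeholder layers.
blocks = [residual_block([toy_layer, toy_layer]) for _ in range(2)]
model = build_pipeline(identity, blocks, identity)
output = model([1.0, 2.0])
```

With these placeholders each block maps x to x + 0.25x, so the skip connection keeps the input signal intact even when the stacked layers contribute little. In a real model the placeholders would be replaced by learned convolutional and Swin Transformer layers.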