Restoring images of snow scenes captured in severe weather is a difficult task. Snow introduces complex degradations that clutter the clean image and shift its distribution. Previous CNN-based methods struggle to remove snow completely because their local inductive biases lack global modeling ability. In this paper, we apply the vision transformer to the task of snow removal from a single image. First, we propose a parallel network architecture split along the channel dimension, in which local feature refinement and global information modeling are performed separately, and we use a channel shuffle operation to combine their respective strengths and enhance network performance. Second, we propose the MSP module, which uses multi-scale average pooling to aggregate information at different sizes and simultaneously performs multi-scale projection self-attention within multi-head self-attention, improving the representation ability of the model under degradations of different scales. Finally, we design a simple, lightweight local capture module that refines the local capture capability of the model. We conduct extensive experiments comparing our method with previous snow removal methods on three snow scene datasets. The results show that our method surpasses the state-of-the-art methods with fewer parameters and less computation. On the CSD test set, we improve PSNR by 1.99 dB and SSIM by 0.03. On the SRRS and Snow100K datasets, we improve PSNR by 2.47 dB and 1.62 dB, respectively, over the TransWeather approach, and SSIM by 0.03. In visual comparisons, our MSP-Former also produces better results than existing methods, supporting the usability of our method.
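The channel split and channel shuffle described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the even two-way split, and the two stub branches (`local_stub`, `global_stub`) are assumptions made for illustration, since the abstract does not specify the split ratio or the internals of either branch.

```python
import numpy as np


def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave the channels of a (C, H, W) array across `groups`."""
    c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # View channels as (groups, C // groups), swap those two axes, flatten back.
    return (
        x.reshape(groups, c // groups, h, w)
        .transpose(1, 0, 2, 3)
        .reshape(c, h, w)
    )


def parallel_block(x, local_branch, global_branch):
    """Split channels in half, run one branch per half, then shuffle."""
    local_in, global_in = np.split(x, 2, axis=0)
    merged = np.concatenate(
        [local_branch(local_in), global_branch(global_in)], axis=0
    )
    return channel_shuffle(merged, groups=2)


# Hypothetical stand-ins for the two branches (assumptions, not the paper's modules).
def local_stub(t):
    return t  # placeholder for local feature refinement


def global_stub(t):
    # Placeholder for global modeling: add each channel's mean over all positions.
    return t + t.mean(axis=(1, 2), keepdims=True)


# Four channels, where channel k is filled with the value k.
features = np.arange(4, dtype=np.float64).reshape(4, 1, 1) * np.ones((4, 2, 2))
out = parallel_block(features, local_stub, global_stub)
```

After the shuffle, channels from the local and global halves alternate, so any later layer that groups neighboring channels sees features from both branches.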
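To put the reported PSNR gains in context, the sketch below computes PSNR with the standard definition, 10·log10(MAX² / MSE). The abstract does not state the evaluation protocol, so the value range of [0, 1] (`max_value=1.0`) and the function name `psnr` are assumptions.

```python
import numpy as np


def psnr(restored, reference, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    restored = np.asarray(restored, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if restored.shape != reference.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((restored - reference) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Because PSNR is logarithmic in the mean squared error, a 1.99 dB gain on a single image pair corresponds to the MSE shrinking by a factor of about 10^0.199 ≈ 1.58, that is, to roughly 63% of its previous value.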