Transformers are beneficial for image denoising because they model long-range dependencies and thus overcome the limitations imposed by convolutional inductive biases. However, directly applying a transformer structure to noise removal is challenging because its complexity grows quadratically with spatial resolution. In this paper, we propose an efficient Dual-branch Deformable Transformer (DDT) denoising network that captures local and global interactions in parallel. We divide features into patches of a fixed size in the local branch and into a fixed number of patches in the global branch. In addition, we apply a deformable attention operation in both branches, which helps the network focus on more informative regions and further reduces computational complexity. We conduct extensive experiments on real-world and synthetic denoising tasks, and the proposed DDT achieves state-of-the-art performance at a significantly lower computational cost.
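The two partitioning schemes can be illustrated with a minimal sketch. This is not the paper's implementation; the function names and the NumPy reshape/transpose layout are illustrative assumptions. The local branch uses a fixed patch size, so the number of windows grows with resolution, while the global branch uses a fixed number of patches, so each patch covers a larger area at higher resolution:

```python
import numpy as np

def local_partition(x, patch_size=8):
    # Local branch: non-overlapping windows of a FIXED spatial size;
    # the number of windows grows with the input resolution.
    b, c, h, w = x.shape
    x = x.reshape(b, c, h // patch_size, patch_size, w // patch_size, patch_size)
    return x.transpose(0, 2, 4, 1, 3, 5).reshape(-1, c, patch_size, patch_size)

def global_partition(x, num_patches=8):
    # Global branch: a FIXED number of patches per side;
    # each patch covers a larger area as the resolution grows.
    b, c, h, w = x.shape
    ph, pw = h // num_patches, w // num_patches
    x = x.reshape(b, c, num_patches, ph, num_patches, pw)
    return x.transpose(0, 2, 4, 1, 3, 5).reshape(-1, c, ph, pw)

x = np.random.randn(1, 32, 128, 128)
print(local_partition(x).shape)   # (256, 32, 8, 8): more windows at higher resolution
print(global_partition(x).shape)  # (64, 32, 16, 16): same patch count, larger patches
```

Attention applied within each local window is resolution-independent in cost, while attention across the fixed set of global patches keeps the token count constant, which is what keeps the overall complexity from growing quadratically.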