Cross Aggregation Transformer for Image Restoration - Zhuanzhi Papers

March 23, 2023

Cross Aggregation Transformer for Image Restoration

Zheng Chen, Yulun Zhang, Jinjin Gu, Yongbing Zhang, Linghe Kong, Xin Yuan

From arXiv. Accepted to NeurIPS 2022. Code is available at https://github.com/zhengchen1999/CAT

Recently, the Transformer architecture has been introduced into image restoration to replace the convolutional neural network (CNN), with surprising results. Because global attention makes the Transformer computationally expensive, some methods use local square windows to limit the scope of self-attention. However, these methods lack direct interaction among different windows, which limits the establishment of long-range dependencies. To address this issue, we propose a new image restoration model, Cross Aggregation Transformer (CAT). The core of our CAT is the Rectangle-Window Self-Attention (Rwin-SA), which applies horizontal and vertical rectangle window attention in different heads in parallel to expand the attention area and aggregate features across different windows. We also introduce the Axial-Shift operation for interaction between different windows. Furthermore, we propose the Locality Complementary Module to complement the self-attention mechanism; it incorporates the inductive bias of CNN (e.g., translation invariance and locality) into the Transformer, enabling global-local coupling. Extensive experiments demonstrate that our CAT outperforms recent state-of-the-art methods on several image restoration applications. The code and models are available at https://github.com/zhengchen1999/CAT.
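
The window geometry described in the abstract can be illustrated with a small pure-Python sketch. It is not the authors' implementation (see the repository linked above for that). It only counts which pixels share a window with a query pixel, to show why pairing a horizontal and a vertical rectangle covers more area than one square window with the same number of tokens. The function names, the 16x16 feature map, the 2x8 and 4x4 window sizes, and the cyclic offset used as a stand-in for a shift are all assumptions made for illustration; the paper's actual Axial-Shift operation may differ.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]


def window_members(pixel: Pixel, height: int, width: int,
                   win_h: int, win_w: int,
                   shift_h: int = 0, shift_w: int = 0) -> Set[Pixel]:
    """Return every pixel that shares a (win_h x win_w) window with `pixel`.

    The feature map is tiled by non-overlapping windows. A non-zero shift
    cyclically offsets the tiling (an illustrative assumption, not
    necessarily the paper's exact shift operation).
    """
    row, col = pixel
    # Index of the window that contains the query pixel after the offset.
    win_row = ((row + shift_h) % height) // win_h
    win_col = ((col + shift_w) % width) // win_w
    return {
        (r, c)
        for r in range(height)
        for c in range(width)
        if ((r + shift_h) % height) // win_h == win_row
        and ((c + shift_w) % width) // win_w == win_col
    }


def cross_area(pixel: Pixel, height: int, width: int,
               short: int, long: int) -> Set[Pixel]:
    """Union of the horizontal and vertical rectangle windows of `pixel`.

    This models two groups of heads, one attending inside a horizontal
    (short x long) window and one inside a vertical (long x short) window.
    """
    horizontal = window_members(pixel, height, width, short, long)
    vertical = window_members(pixel, height, width, long, short)
    return horizontal | vertical


if __name__ == "__main__":
    H = W = 16
    query = (5, 5)
    square = window_members(query, H, W, 4, 4)
    cross = cross_area(query, H, W, 2, 8)
    shifted = window_members(query, H, W, 2, 8, shift_h=1, shift_w=4)
    print("square window pixels:", len(square))
    print("horizontal + vertical pixels:", len(cross))
    print("pixels newly reachable after the offset:",
          len(shifted - window_members(query, H, W, 2, 8)))
```

Each rectangle holds the same 16 tokens as the 4x4 square, so the per-head attention cost is comparable, while the two orientations together reach pixels that no single square window of that size contains. Offsetting the tiling changes which pixels are grouped, which is the general idea behind letting neighbouring windows exchange information.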
