ConvFormer: Closing the Gap Between CNN and Vision Transformers - Zhuanzhi Papers


Vision · CNN · Performer · Transformation · Extensibility

September 16, 2022

ConvFormer: Closing the Gap Between CNN and Vision Transformers

Zimian Wei,Hengyue Pan,Xin Niu,Dongsheng Li

Vision transformers have shown excellent performance in computer vision tasks. However, their (local) self-attention mechanism is computationally expensive. By comparison, CNNs are more efficient and come with built-in inductive bias. Recent work shows that CNNs can compete with vision transformers by adopting their architecture design and training protocols. Nevertheless, existing methods either ignore multi-level features or lack dynamic properties, leading to sub-optimal performance. In this paper, we propose a novel attention mechanism named MCA, which captures different patterns of input images using multiple kernel sizes and enables input-adaptive weights with a gating mechanism. Based on MCA, we present a neural network named ConvFormer. ConvFormer adopts the general architecture of vision transformers, while replacing the (local) self-attention mechanism with our proposed MCA. Extensive experimental results demonstrate that ConvFormer outperforms vision transformers (ViTs) and convolutional neural networks (CNNs) of similar size in various tasks. For example, ConvFormer-S and ConvFormer-L achieve state-of-the-art top-1 accuracy of 82.8% and 83.6%, respectively, on the ImageNet dataset. Moreover, ConvFormer-S outperforms Swin-T by 1.5 mIoU on ADE20K and by 0.9 bounding box AP on COCO, with a smaller model size. Code and models will be available.
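The abstract names two ingredients of MCA, multiple kernel sizes and an input-dependent gate, but does not give its formulation. The following is only a toy 1D sketch of those two ideas in plain Python, not the authors' MCA: the function names, the box-filter kernels, the averaging of branches, and the sigmoid gate are all assumptions made for illustration.

```python
import math

def conv1d_same(signal, kernel):
    """Zero-padded 'same' 1D correlation with an odd-length kernel."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_multi_kernel(signal, kernels, gate_weight=1.0, gate_bias=0.0):
    """Average branches with different kernel sizes, then gate by the input.

    Hypothetical stand-in for "multiple kernel sizes + gating"; the real
    MCA layer is not specified in the abstract.
    """
    # One branch per kernel size, each seeing a different receptive field.
    branches = [conv1d_same(signal, k) for k in kernels]
    mixed = [sum(vals) / len(kernels) for vals in zip(*branches)]
    # The gate is computed from the input itself, so the effective
    # weighting changes with the input (input-adaptive).
    gate = [sigmoid(gate_weight * x + gate_bias) for x in signal]
    return [g * m for g, m in zip(gate, mixed)]

# Assumed example: box filters of sizes 1, 3 and 5 applied to an impulse.
kernels = [[1.0], [1.0 / 3] * 3, [1.0 / 5] * 5]
signal = [0.0, 0.0, 1.0, 0.0, 0.0]
out = gated_multi_kernel(signal, kernels)
doubled = gated_multi_kernel([2.0 * x for x in signal], kernels)
```

Because the gate depends on the input, doubling the input does not simply double the output, which is the kind of dynamic behavior a fixed convolution lacks.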
