Degenerate Swin to Win: Plain Window-based Transformer without Sophisticated Operations


November 25, 2022


The success of Transformers in natural language processing has motivated researchers in the computer vision community to build Vision Transformers. Compared with Convolutional Neural Networks (CNNs), a Vision Transformer has a larger receptive field, which lets it capture long-range dependencies. However, this large receptive field comes at a high computational cost. Window-based Vision Transformers emerged to improve efficiency: they partition an image into several local windows and compute self-attention within each window. To recover the global receptive field, window-based Vision Transformers have devoted considerable effort to cross-window communication, developing several sophisticated operations for it. In this work, we examine the necessity of the key design element of Swin Transformer, shifted window partitioning. We find that a simple depthwise convolution is sufficient for effective cross-window communication. Specifically, when the depthwise convolution is present, the shifted window configuration in Swin Transformer yields no additional performance improvement. We therefore degenerate the Swin Transformer to a plain Window-based (Win) Transformer by discarding the sophisticated shifted window partitioning. The proposed Win Transformer is conceptually simpler and easier to implement than Swin Transformer. Meanwhile, the Win Transformer consistently outperforms Swin Transformer on multiple computer vision tasks, including image recognition, semantic segmentation, and object detection.
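The cross-window claim in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: uniform averaging inside each window stands in for window attention, the 3x3 box kernel and the 8x8 single-channel grid are arbitrary choices, and applying the convolution before the window step is an assumption made only for illustration. All function names are hypothetical.

```python
def window_mean(x, ws):
    """Stand-in for window attention: average values inside each ws x ws window."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for top in range(0, h, ws):
        for left in range(0, w, ws):
            cells = [(i, j) for i in range(top, top + ws)
                     for j in range(left, left + ws)]
            mean = sum(x[i][j] for i, j in cells) / len(cells)
            for i, j in cells:
                out[i][j] = mean
    return out


def depthwise_conv3x3(x, kernel):
    """3x3 filter on a single channel with zero padding.

    A depthwise convolution applies one such filter per channel;
    this toy example has only one channel.
    """
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def touched_windows(x, ws):
    """Indices (row, col) of the windows that contain any nonzero value."""
    return {(i // ws, j // ws)
            for i, row in enumerate(x)
            for j, v in enumerate(row) if v != 0.0}


H = W = 8   # toy feature-map size (assumption)
WS = 4      # window size (assumption)

# A single impulse at the bottom-right corner of the top-left window.
impulse = [[0.0] * W for _ in range(H)]
impulse[3][3] = 1.0

box = [[1.0 / 9.0] * 3 for _ in range(3)]  # arbitrary 3x3 kernel

plain = window_mean(impulse, WS)
with_dw = window_mean(depthwise_conv3x3(impulse, box), WS)

print(sorted(touched_windows(plain, WS)))    # [(0, 0)]
print(sorted(touched_windows(with_dw, WS)))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Without the convolution, the impulse never leaves its own window. With it, the signal reaches all four windows after a single window step, which is the kind of cross-window communication that shifted window partitioning is otherwise used to provide.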

