从 Tok 分化观点提高愿景转换器的效率</s> (Making Vision Transformers Efficient from A Token Sparsification View) - 专知论文

会员服务 ·

0

词元分析器 · Vision · 稀疏化 · 模型评估 · Networking ·

2023 年 3 月 15 日

Making Vision Transformers Efficient from A Token Sparsification View

翻译：从 Tok 分化观点提高愿景转换器的效率

Shuning Chang,Pichao Wang,Ming Lin,Fan Wang,David Junhao Zhang,Rong Jin,Mike Zheng Shou

from arxiv, Accepted by CVPR2023

The quadratic computational complexity to the number of tokens limits the practical applications of Vision Transformers (ViTs). Several works propose to prune redundant tokens to achieve efficient ViTs. However, these methods generally suffer from (i) dramatic accuracy drops, (ii) application difficulty in the local vision transformer, and (iii) non-general-purpose networks for downstream tasks. In this work, we propose a novel Semantic Token ViT (STViT), for efficient global and local vision transformers, which can also be revised to serve as backbone for downstream tasks. The semantic tokens represent cluster centers, and they are initialized by pooling image tokens in space and recovered by attention, which can adaptively represent global or local semantic information. Due to the cluster properties, a few semantic tokens can attain the same effect as vast image tokens, for both global and local vision transformers. For instance, only 16 semantic tokens on DeiT-(Tiny,Small,Base) can achieve the same accuracy with more than 100% inference speed improvement and nearly 60% FLOPs reduction; on Swin-(Tiny,Small,Base), we can employ 16 semantic tokens in each window to further speed it up by around 20% with slight accuracy increase. Besides great success in image classification, we also extend our method to video recognition. In addition, we design a STViT-R(ecover) network to restore the detailed spatial information based on the STViT, making it work for downstream tasks, which is powerless for previous token sparsification methods. Experiments demonstrate that our method can achieve competitive results compared to the original networks in object detection and instance segmentation, with over 30% FLOPs reduction for backbone.

翻译：等离子计算复杂度限制了视觉变异器( ViTs) 的实际应用。数个作品提议将冗余的符号刻录为多余的符号, 以高效的 ViTs 。但是, 这些方法一般会受到以下因素的影响:( 一) 急剧的精确度下降, (二) 本地视觉变异器的应用困难, (三) 下游任务的非通用网络。在这项工作中, 我们建议为高效的全球和地方视觉变异器( STVT) 建立一个新型的Semanic Token ViT (STVT) (STVT) (STVit), 也可以修改为下游任务的主干线。语义符号代表集中心, 并且它们通过在空间中存储下流的图像变异性图像标记, 并且通过关注来初始化地显示全球或本地的语系信息。由于集特性, 少数语系符号标记可以达到与全球和本地变异性变异的图像值。例如, 在DeT- (Ti, Smary Stal) 格式变异性变现中, 我们的Stary Stal Stal Stal Stalifilt) 可以变换了20 的方法可以进一步变现到我们变现为Sestr 。</s>

0

相关内容

词元分析器

词元分析器

百篇论文纵览大型语言模型最新研究进展

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

【ACL2022】一种基于三阶张量同构的高效实体对齐译码算法, An Effective and Efficient Entity Alignment Decoding Algorithm via Third-Order Tensor Isomorphism

【ACL2022】一种基于三阶张量同构的高效实体对齐译码算法, An Effective and Efficient Entity Alignment Decoding Algorithm via Third-Order Tensor Isomorphism

专知会员服务

13+阅读 · 2022年3月24日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于分子靶点的玉米孤雌生殖单倍体诱导性状高效检测技术研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于分布式视觉的移动机器人大规模室内环境认知方法研究

国家自然科学基金

3+阅读 · 2012年12月31日

表面等离激元增强宽光谱InGaN太阳能电池研究

国家自然科学基金

0+阅读 · 2012年12月31日

PEDOT:PSS/碳纳米管/石墨烯量子点复合有序纳米纤维阵列的制备及热电性能

国家自然科学基金

0+阅读 · 2012年12月31日

Skutterudite/AgSbTe2系纳米复合热电材料研究

国家自然科学基金

0+阅读 · 2012年12月31日

番茄广谱胁迫蛋白SlUSP介导抗灰霉病的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

氧化物热电材料的低温热电输运性质

国家自然科学基金

0+阅读 · 2011年12月31日

Cayley图的匹配可扩性和semi-Cayley图的谱

国家自然科学基金

0+阅读 · 2011年12月31日

高压提高Ga-Ge基笼型物热电效率的第一性原理研究

国家自然科学基金

0+阅读 · 2009年12月31日

ViT-Calibrator: Decision Stream Calibration for Vision Transformer

Arxiv

0+阅读 · 2023年5月5日

OctFormer: Octree-based Transformers for 3D Point Clouds

Arxiv

0+阅读 · 2023年5月4日

VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges

Arxiv

11+阅读 · 2022年12月26日

A Survey on Vision Transformer

Arxiv

17+阅读 · 2022年2月23日

A Survey of Visual Transformers

Arxiv

39+阅读 · 2021年11月11日

Recent Advances of Continual Learning in Computer Vision: An Overview

Recent Advances of Continual Learning in Computer Vision: An Overview

Arxiv

22+阅读 · 2021年9月23日

A Survey of Quantization Methods for Efficient Neural Network Inference

Arxiv

22+阅读 · 2021年6月21日

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Arxiv

28+阅读 · 2021年6月16日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Arxiv

16+阅读 · 2020年3月30日

VIP会员

文章信息

相关主题

词元分析器

相关VIP内容

百篇论文纵览大型语言模型最新研究进展

百篇论文纵览大型语言模型最新研究进展

专知会员服务

70+阅读 · 2023年3月31日

【2022新书】高效深度学习，Efficient Deep Learning Book

【2022新书】高效深度学习，Efficient Deep Learning Book

专知会员服务

125+阅读 · 2022年4月21日

【ACL2022】一种基于三阶张量同构的高效实体对齐译码算法, An Effective and Efficient Entity Alignment Decoding Algorithm via Third-Order Tensor Isomorphism

【ACL2022】一种基于三阶张量同构的高效实体对齐译码算法, An Effective and Efficient Entity Alignment Decoding Algorithm via Third-Order Tensor Isomorphism

专知会员服务

13+阅读 · 2022年3月24日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

最新《Transformers模型》教程，64页ppt

最新《Transformers模型》教程，64页ppt

专知会员服务

324+阅读 · 2020年11月26日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

166+阅读 · 2020年3月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【新书】《知识图谱与大语言模型的协同应用》，544页pdf

军事通信系统：安全行动的支柱

《缓解大语言模型（LLMs）幻觉：面向应用的检索增强生成（RAG）、推理与智能体系统综述》

【新书】机器学习系统，2620页pdf

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

【推荐】ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

机器学习研究会

20+阅读 · 2017年12月17日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

相关论文

ViT-Calibrator: Decision Stream Calibration for Vision Transformer

Arxiv

0+阅读 · 2023年5月5日

OctFormer: Octree-based Transformers for 3D Point Clouds

Arxiv

0+阅读 · 2023年5月4日

VQA and Visual Reasoning: An Overview of Recent Datasets, Methods and Challenges

Arxiv

11+阅读 · 2022年12月26日

A Survey on Vision Transformer

Arxiv

17+阅读 · 2022年2月23日

A Survey of Visual Transformers

Arxiv

39+阅读 · 2021年11月11日

Recent Advances of Continual Learning in Computer Vision: An Overview

Recent Advances of Continual Learning in Computer Vision: An Overview

Arxiv

22+阅读 · 2021年9月23日

A Survey of Quantization Methods for Efficient Neural Network Inference

Arxiv

22+阅读 · 2021年6月21日

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better

Arxiv

28+阅读 · 2021年6月16日

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Arxiv

13+阅读 · 2021年1月5日

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

L^2-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks

Arxiv

16+阅读 · 2020年3月30日

相关基金

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于分子靶点的玉米孤雌生殖单倍体诱导性状高效检测技术研究

国家自然科学基金

0+阅读 · 2013年12月31日

基于分布式视觉的移动机器人大规模室内环境认知方法研究

国家自然科学基金

3+阅读 · 2012年12月31日

表面等离激元增强宽光谱InGaN太阳能电池研究

国家自然科学基金

0+阅读 · 2012年12月31日

PEDOT:PSS/碳纳米管/石墨烯量子点复合有序纳米纤维阵列的制备及热电性能

国家自然科学基金

0+阅读 · 2012年12月31日

Skutterudite/AgSbTe2系纳米复合热电材料研究

国家自然科学基金

0+阅读 · 2012年12月31日

番茄广谱胁迫蛋白SlUSP介导抗灰霉病的分子机制

国家自然科学基金

0+阅读 · 2012年12月31日

氧化物热电材料的低温热电输运性质

国家自然科学基金

0+阅读 · 2011年12月31日

Cayley图的匹配可扩性和semi-Cayley图的谱

国家自然科学基金

0+阅读 · 2011年12月31日

高压提高Ga-Ge基笼型物热电效率的第一性原理研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员