Vision transformers have achieved significant improvements on various vision tasks, but their quadratic interactions between tokens substantially reduce computational efficiency. Recently, many pruning methods have been proposed to remove redundant tokens and obtain efficient vision transformers. However, existing studies mainly focus on token importance, preserving locally attentive tokens while completely ignoring global token diversity. In this paper, we emphasize the importance of diverse global semantics and propose an efficient token decoupling and merging method that jointly considers token importance and diversity for token pruning. Based on the class token attention, we decouple the tokens into attentive and inattentive ones. In addition to preserving the most discriminative local tokens, we merge similar inattentive tokens and match homogeneous attentive tokens to maximize token diversity. Despite its simplicity, our method achieves a promising trade-off between model complexity and classification accuracy. On DeiT-S, our method reduces the FLOPs by 35% with only a 0.2% accuracy drop. Notably, by maintaining token diversity, our method can even improve the accuracy of DeiT-T by 0.1% while reducing its FLOPs by 40%.
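As a rough illustration of the attention-based decoupling step described above, the sketch below keeps the patch tokens most attended by the class token and fuses the remaining inattentive ones into a single token weighted by their class attention. This is a minimal, hypothetical implementation, not the paper's exact method: the function name `prune_tokens`, the `keep_ratio` parameter, and the single-token fusion are assumptions for illustration, and the paper's additional steps (merging similar inattentive tokens and matching homogeneous attentive tokens to preserve diversity) are not shown here.

```python
import torch

def prune_tokens(tokens, cls_attn, keep_ratio=0.65):
    """Sketch of class-attention-based token decoupling (hypothetical helper).

    tokens:   (B, N, C) patch tokens, class token excluded
    cls_attn: (B, N)    attention from the class token to each patch token
    """
    B, N, C = tokens.shape
    num_keep = max(1, int(N * keep_ratio))

    # Rank tokens by class-token attention; split into attentive / inattentive.
    idx = cls_attn.argsort(dim=1, descending=True)
    keep_idx, drop_idx = idx[:, :num_keep], idx[:, num_keep:]

    batch = torch.arange(B, device=tokens.device).unsqueeze(1)
    kept = tokens[batch, keep_idx]       # (B, num_keep, C) attentive tokens
    dropped = tokens[batch, drop_idx]    # (B, N - num_keep, C) inattentive tokens
    drop_w = cls_attn[batch, drop_idx]   # (B, N - num_keep) their attention weights

    # Fuse inattentive tokens into one token, weighted by class attention,
    # so their information is merged rather than discarded outright.
    drop_w = drop_w / drop_w.sum(dim=1, keepdim=True).clamp(min=1e-6)
    fused = (dropped * drop_w.unsqueeze(-1)).sum(dim=1, keepdim=True)

    return torch.cat([kept, fused], dim=1)  # (B, num_keep + 1, C)
```

With `keep_ratio=0.65`, roughly a third of the patch tokens are removed per pruning stage, which is in line with the reported FLOPs reductions, though the exact pruning schedule is defined by the method itself.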