The Lottery Ticket Hypothesis for Vision Transformers - 专知论文 (Zhuanzhi Papers)

INFORMS · Transformation · Vision · Random initialization · SimPLe

November 2, 2022

The Lottery Ticket Hypothesis for Vision Transformers

Xuan Shen,Zhenglun Kong,Minghai Qin,Peiyan Dong,Geng Yuan,Xin Meng,Hao Tang,Xiaolong Ma,Yanzhi Wang

The conventional lottery ticket hypothesis (LTH) claims that a dense neural network contains a sparse subnetwork which, paired with a proper random initialization, can be trained from scratch to almost the same accuracy as its dense counterpart; this subnetwork is called the winning ticket. The LTH, however, has scarcely been studied for vision transformers (ViTs). In this paper, we first show that existing methods struggle to find the conventional winning ticket at the weight level of ViTs. Then, inspired by the input dependence of ViTs, we generalize the LTH for ViTs to input images consisting of image patches. That is, there exists a subset of input image patches such that a ViT trained from scratch on only this subset achieves accuracy similar to that of a ViT trained on all image patches. We call this subset of input patches the winning tickets; they represent a significant amount of the information in the input. Furthermore, we present a simple yet effective method to find the winning tickets in input patches for various types of ViT, including DeiT, LV-ViT, and Swin Transformers. More specifically, we use a ticket selector to generate the winning tickets based on the informativeness of patches. For comparison, we also build a randomly selected subset of patches, and the experiments show a clear difference between the performance of models trained with winning tickets and models trained with randomly selected subsets.
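
The patch-level idea in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the ticket selector measures informativeness, so the code below substitutes per-patch pixel variance as a stand-in score; that score, the function names, and the `keep_ratio` parameter are all assumptions made for illustration, not the authors' method. The sketch only shows how a score-based subset and a random subset of the same size would be drawn from one image.

```python
import random
from statistics import pvariance


def split_into_patches(image, patch):
    """Split a 2D grayscale image (list of rows) into non-overlapping
    patch x patch blocks, returned in row-major order as flat lists."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            block = [image[r][c]
                     for r in range(top, top + patch)
                     for c in range(left, left + patch)]
            patches.append(block)
    return patches


def informativeness(block):
    # Stand-in score (assumption): pixel variance of the patch.
    # The paper's ticket selector is not specified in the abstract.
    return pvariance(block)


def select_winning_tickets(patches, keep_ratio):
    """Return sorted indices of the highest-scoring patches."""
    k = max(1, round(len(patches) * keep_ratio))
    ranked = sorted(range(len(patches)),
                    key=lambda i: informativeness(patches[i]),
                    reverse=True)
    return sorted(ranked[:k])


def select_random_subset(patches, keep_ratio, seed=0):
    """Return sorted indices of a random subset of the same size (baseline)."""
    k = max(1, round(len(patches) * keep_ratio))
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(patches)), k))


# Toy 4x4 image split into four 2x2 patches.
image = [
    [0, 0, 1, 9],
    [0, 0, 5, 3],
    [2, 2, 7, 7],
    [2, 2, 7, 8],
]
patches = split_into_patches(image, patch=2)
tickets = select_winning_tickets(patches, keep_ratio=0.5)    # [1, 3]
baseline = select_random_subset(patches, keep_ratio=0.5)     # two random indices
```

In this toy image the two flat patches (indices 0 and 2) score zero, so the score-based selection keeps patches 1 and 3. In an actual experiment, a ViT would be trained once on the selected patches and once on the random subset, and the two accuracies compared.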
