OVO: One-shot Vision Transformer Search with Online distillation


Distillation · Transformation · Student Networks · Vision · Online

December 28, 2022


Zimian Wei,Hengyue Pan,Xin Niu,Dongsheng Li

From arXiv. arXiv admin note: substantial text overlap with arXiv:2107.00651 by other authors.

Pure transformers have recently shown great potential for vision tasks. However, their accuracy on small or medium-sized datasets is not satisfactory. Although some existing methods introduce a CNN as a teacher to guide training by distillation, the gap between the teacher and student networks leads to sub-optimal performance. In this work, we propose a new One-shot Vision transformer search framework with Online distillation, namely OVO. OVO samples sub-nets for both the teacher and student networks to obtain better distillation results. Benefiting from online distillation, thousands of sub-nets in the supernet are well trained without extra fine-tuning or retraining. In experiments, OVO-Ti achieves 73.32% top-1 accuracy on ImageNet and 75.2% on CIFAR-100.
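
The two ideas in the abstract, sampling a teacher and a student sub-net from one supernet and training the student against the teacher's softened predictions, can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the abstract specifies neither the search space nor the loss, so `SEARCH_SPACE`, `sample_subnet`, `distillation_loss`, and the `alpha` and `temperature` values are all assumptions, and the loss shown is the standard temperature-scaled knowledge-distillation objective.

```python
import math
import random

# Hypothetical search space; the abstract does not list the real choices.
SEARCH_SPACE = {
    "depth": [12, 13, 14],
    "embed_dim": [192, 216, 240],
    "num_heads": [3, 4],
    "mlp_ratio": [3.5, 4.0],
}


def sample_subnet(rng):
    """Pick one value per dimension to define a sub-net of the supernet."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Weighted sum of cross-entropy on the label and KL(teacher || student).

    Assumes moderate logit values, so no student probability underflows to 0.
    """
    hard = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)
    # The temperature**2 factor keeps the soft-term gradient scale comparable
    # to the hard term, as in standard knowledge distillation.
    return (1 - alpha) * hard + alpha * temperature ** 2 * soft


rng = random.Random(0)
teacher_cfg = sample_subnet(rng)  # sub-net acting as teacher in this step
student_cfg = sample_subnet(rng)  # sub-net acting as student in this step

# Stand-in logits; in real training they would come from forward passes of
# the two sampled sub-nets on the same batch.
loss = distillation_loss([2.0, 0.5, -1.0], [1.5, 1.0, -0.5], label=0)
print(teacher_cfg, student_cfg, round(loss, 4))
```

In an online setting both sub-nets would be updated within the same training run rather than distilling from a frozen, pre-trained teacher; the sketch only shows how one step's loss could be formed.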


