Transformers have recently been adapted from the natural language processing community as a promising substitute for convolution-based neural networks in visual learning tasks. However, their supremacy degrades given an insufficient amount of training data (e.g., ImageNet). To bring them into practical utility, we propose a novel distillation-based method to train vision transformers. Unlike previous works, which provide only heavyweight convolution-based teachers, we introduce lightweight teachers with different architectural inductive biases (e.g., convolution and involution) to co-advise the student transformer. The key insight is that teachers with different inductive biases attain different knowledge despite being trained on the same dataset, and such diverse knowledge compounds and boosts the student's performance during distillation. Equipped with this cross inductive bias distillation method, our vision transformers (termed CivT) outperform all previous transformers of the same architecture on ImageNet.
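To make the co-advising idea concrete, below is a minimal PyTorch sketch of a two-teacher distillation objective under our own assumptions: the function name, the soft-label (KL-divergence) formulation, the temperature, the equal weighting of the two teachers, and the blend factor `alpha` are all illustrative, and the paper's exact mechanism (e.g., how each teacher's signal attaches to the student's outputs) may differ.

```python
import torch
import torch.nn.functional as F

def cross_inductive_bias_distill_loss(student_logits, labels,
                                      conv_teacher_logits, inv_teacher_logits,
                                      temperature=3.0, alpha=0.5):
    """Illustrative sketch: blend supervised loss with soft distillation
    from a convolution-based and an involution-based teacher.
    All hyperparameter values here are assumptions, not the paper's."""
    # Standard cross-entropy against ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label KL divergence against each teacher's predictions,
    # scaled by T^2 as is conventional for temperature-based distillation.
    log_p = F.log_softmax(student_logits / temperature, dim=-1)
    kl_conv = F.kl_div(
        log_p, F.softmax(conv_teacher_logits / temperature, dim=-1),
        reduction="batchmean") * temperature ** 2
    kl_inv = F.kl_div(
        log_p, F.softmax(inv_teacher_logits / temperature, dim=-1),
        reduction="batchmean") * temperature ** 2

    # Average the two teachers' advice, then blend with the supervised term.
    return (1 - alpha) * ce + alpha * 0.5 * (kl_conv + kl_inv)
```

Averaging the two teacher terms (rather than picking one) is what lets the complementary inductive biases contribute equally to the student's training signal in this sketch.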