Knowledge distillation has been used to transfer knowledge learned by a sophisticated model (teacher) to a simpler model (student). This technique is widely used to compress model complexity. However, in most applications the compressed student model suffers from an accuracy gap with its teacher. We propose extracurricular learning, a novel knowledge distillation method that bridges this gap by (1) modeling the student and teacher output distributions; (2) sampling examples from an approximation to the underlying data distribution; and (3) matching the student and teacher output distributions over this extended set, including uncertain samples. We conduct rigorous evaluations on regression and classification tasks and show that, compared to standard knowledge distillation, extracurricular learning reduces the gap by 46% to 68%. This leads to major accuracy improvements over empirical risk minimization-based training for various recent neural network architectures: a 16% regression error reduction on the MPIIGaze dataset, a +3.4% to +9.1% improvement in top-1 classification accuracy on the CIFAR100 dataset, and a +2.9% top-1 improvement on the ImageNet dataset.
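To make the three steps concrete, the following is a minimal sketch of one training step in this spirit, not the paper's exact formulation: it assumes a classification setting, a hypothetical `sampler` callable that draws inputs from an approximation to the data distribution (e.g., a fitted generative model), and standard KL-based matching of softened student and teacher outputs over the extended batch.

```python
import torch
import torch.nn.functional as F

def extracurricular_distillation_step(student, teacher, x_real, sampler,
                                      optimizer, n_extra=64, temperature=4.0):
    """Hedged sketch of a distillation step over an extended sample set.

    `sampler` is a hypothetical callable approximating the underlying data
    distribution; `n_extra` extra inputs are drawn from it and concatenated
    with the real batch before matching student and teacher outputs.
    """
    x_extra = sampler(n_extra)                   # samples from the approximate data distribution
    x = torch.cat([x_real, x_extra], dim=0)      # extended set: real + sampled inputs

    with torch.no_grad():
        t_logits = teacher(x)                    # teacher predictions, held fixed
    s_logits = student(x)

    # Match softened output distributions (standard KD-style KL objective).
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the sampled inputs play the role of the "extended set including uncertain samples": they expose the student to regions of input space beyond the training data, where the teacher's output distribution is still available as a supervisory signal.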