Unlike existing knowledge distillation methods that focus on baseline settings, where the teacher models and training strategies are not as strong and competitive as state-of-the-art approaches, this paper presents a method dubbed DIST to distill better from a stronger teacher. We empirically find that the discrepancy between the predictions of the student and a stronger teacher tends to be fairly severe. As a result, exactly matching the predictions with KL divergence would disturb the training and make existing methods perform poorly. In this paper, we show that simply preserving the relations between the predictions of the teacher and the student suffices, and we propose a correlation-based loss to explicitly capture the intrinsic inter-class relations from the teacher. Besides, considering that different instances have different semantic similarities to each class, we also extend this relational match to the intra-class level. Our method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes, and training strategies, and that it consistently achieves state-of-the-art performance on image classification, object detection, and semantic segmentation tasks. Code is available at: https://github.com/hunto/DIST_KD .
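For intuition, below is a minimal PyTorch sketch of a correlation-based relational loss of this kind: the inter-class term correlates each instance's predicted class distribution between teacher and student, and the intra-class term correlates each class's scores across the batch. The helper names, the temperature `tau`, and the equal weighting of the two terms are illustrative assumptions, not the exact loss or hyperparameters of the official DIST_KD implementation.

```python
import torch
import torch.nn.functional as F


def pearson_corr(a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Row-wise Pearson correlation between two 2-D tensors of equal shape."""
    a = a - a.mean(dim=-1, keepdim=True)
    b = b - b.mean(dim=-1, keepdim=True)
    a = a / (a.norm(dim=-1, keepdim=True) + eps)
    b = b / (b.norm(dim=-1, keepdim=True) + eps)
    return (a * b).sum(dim=-1)


def relational_kd_loss(student_logits: torch.Tensor,
                       teacher_logits: torch.Tensor,
                       tau: float = 1.0) -> torch.Tensor:
    """Sketch of a correlation-based distillation loss (hypothetical weighting).

    student_logits, teacher_logits: (B, C) raw logits for a batch.
    """
    p_s = F.softmax(student_logits / tau, dim=-1)  # (B, C) student probabilities
    p_t = F.softmax(teacher_logits / tau, dim=-1)  # (B, C) teacher probabilities

    # Inter-class: preserve the relation among classes within each instance (rows).
    inter_loss = (1.0 - pearson_corr(p_s, p_t)).mean()
    # Intra-class: preserve the relation among instances within each class (columns).
    intra_loss = (1.0 - pearson_corr(p_s.t(), p_t.t())).mean()

    return inter_loss + intra_loss
```

Because Pearson correlation is invariant to per-row shift and scale, the student is only asked to preserve the teacher's relative ranking and spread of predictions rather than to reproduce them exactly, which is the relaxation the abstract argues for when the teacher is much stronger.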