Knowledge Distillation (Confidence-Aware Multi-Teacher Knowledge Distillation) - Zhuanzhi Papers


February 11, 2022

Confidence-Aware Multi-Teacher Knowledge Distillation


Hailin Zhang,Defang Chen,Can Wang

From arXiv, 5 pages, 4 figures, 4 tables, submitted to ICASSP 2022

Knowledge distillation was initially introduced to utilize additional supervision from a single teacher model when training the student model. To boost student performance, some recent variants attempt to exploit diverse knowledge sources from multiple teachers. However, existing studies mainly integrate knowledge from diverse sources by averaging over multiple teacher predictions or combining them with various other label-free strategies, which may mislead the student in the presence of low-quality teacher predictions. To tackle this problem, we propose Confidence-Aware Multi-teacher Knowledge Distillation (CA-MKD), which adaptively assigns a sample-wise reliability to each teacher prediction with the help of ground-truth labels, so that teacher predictions close to the one-hot labels receive large weights. In addition, CA-MKD incorporates intermediate layers to stabilize the knowledge transfer process. Extensive experiments show that CA-MKD consistently outperforms all compared state-of-the-art methods across various teacher-student architectures.
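The weighting idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below makes an assumption: each teacher's weight for a sample is a softmax over the negative cross-entropy between that teacher's prediction and the ground-truth label. The function names (`teacher_weights`, `weighted_kd_loss`) and the temperature value are hypothetical, and the intermediate-layer term is left out. This is not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Cross-entropy between a predicted distribution and a one-hot label."""
    return -math.log(max(probs[label], 1e-12))


def teacher_weights(teacher_logits, label):
    """Per-sample teacher weights (assumed form, not taken from the paper).

    A teacher whose prediction is closer to the one-hot label has a lower
    cross-entropy and therefore receives a larger weight.
    """
    losses = [cross_entropy(softmax(logits), label) for logits in teacher_logits]
    return softmax([-loss for loss in losses])


def weighted_kd_loss(student_logits, teacher_logits, label, temperature=4.0):
    """Weighted sum of soft cross-entropies between each teacher and the student."""
    weights = teacher_weights(teacher_logits, label)
    student_probs = softmax(student_logits, temperature)
    log_student = [math.log(max(p, 1e-12)) for p in student_probs]
    loss = 0.0
    for weight, logits in zip(weights, teacher_logits):
        teacher_probs = softmax(logits, temperature)
        soft_ce = -sum(t * ls for t, ls in zip(teacher_probs, log_student))
        loss += weight * soft_ce
    return loss, weights


# One sample with true class 0: the first teacher is confident and correct,
# the second teacher is confident and wrong.
teachers = [[5.0, 0.1, 0.2], [0.1, 4.0, 0.3]]
student = [1.0, 0.5, 0.2]
loss, weights = weighted_kd_loss(student, teachers, label=0)
print(weights, loss)
```

With plain averaging, both teachers would contribute equally to the student's target for this sample. Under the assumed weighting, the teacher that agrees with the ground-truth label dominates.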


