知识蒸馏中“良好”数据增加的“良好”数据增加——统计视角 (What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective) - 专知论文

会员服务 ·

0

统计量 · 知识 (knowledge) · Better · 蒸馏 · Performer ·

2023 年 2 月 21 日

What Makes a "Good" Data Augmentation in Knowledge Distillation -- A Statistical Perspective

翻译：知识蒸馏中“良好”数据增加的“良好”数据增加——统计视角

Huan Wang,Suhas Lohit,Mike Jones,Yun Fu

from arxiv, Camera Ready of NeurIPS'22. Code: https://github.com/MingSun-Tse/Good-DA-in-KD

Knowledge distillation (KD) is a general neural network training approach that uses a teacher model to guide the student model. Existing works mainly study KD from the network output side (e.g., trying to design a better KD loss function), while few have attempted to understand it from the input side. Especially, its interplay with data augmentation (DA) has not been well understood. In this paper, we ask: Why do some DA schemes (e.g., CutMix) inherently perform much better than others in KD? What makes a "good" DA in KD? Our investigation from a statistical perspective suggests that a good DA scheme should reduce the covariance of the teacher-student cross-entropy. A practical metric, the stddev of teacher's mean probability (T. stddev), is further presented and well justified empirically. Besides the theoretical understanding, we also introduce a new entropy-based data-mixing DA scheme, CutMixPick, to further enhance CutMix. Extensive empirical studies support our claims and demonstrate how we can harvest considerable performance gains simply by using a better DA scheme in knowledge distillation.

翻译：知识蒸馏( KD) 是一种使用教师模型指导学生模型的一般神经网络培训方法。现有的工作主要是从网络输出方面研究KD( 试图设计更好的KD损失函数), 虽然很少有人试图从输入方面了解KD 。特别是, 它与数据增强( DA) 的相互作用还没有得到很好理解。在本文中, 我们问: 为什么一些DA( 例如, CutMix) 计划本身比KD 中的其他计划效果要好得多? 是什么使得KD 中的DA“ 好”? 我们从统计角度进行的调查表明, 良好的DA 计划应该减少教师- 学生跨元素的共变。一个实用的衡量标准, 教师潜在概率( T. stdddev) 的标准, 进一步被提出, 并有充分的经验依据。除了理论上的理解, 我们还引入一个新的基于 entropy 数据混合的DA 计划, CutMixPick 来进一步加强 CutMix 。广泛的实证性研究支持我们的索赔, 并证明我们如何通过更好地运用DA 计划取得相当大的业绩成果。

0

相关内容

统计量

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

牛磺酸抑制AS肉鸡右心肥大过程中calpains介导细胞凋亡作用的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

控释VEGF/NT-3脊髓脱细胞支架在SCI模型中的血管化及神经再生研究

国家自然科学基金

0+阅读 · 2013年12月31日

Notch信号通路参与家蚕胚胎发育分子机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

靶向LDH-A能量代谢对T细胞急性淋巴细胞白血病的抗白血病效应及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

调控Beclin1/Atg7等介导的自噬对胰岛ε/β细胞分化失衡的影响及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

TGF-β1-RhoA/ROCK1信号通路在高糖诱导肾小球内皮细胞-间充质细胞转分化中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

Curcumin双向调控HO-1/HO-2协同抑制Aβeme复合物防治AD的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

TGF-β28608;活Myocardin家族诱导骨髓间充质干细胞分化的研究

国家自然科学基金

0+阅读 · 2008年12月31日

Pygo2介导 Wnt信号的分子机理和在乳腺/乳腺癌干细胞增殖中的作用

国家自然科学基金

0+阅读 · 2008年12月31日

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Knowledge Graph Embedding: A Survey from the Perspective of Representation Spaces

Arxiv

18+阅读 · 2022年11月7日

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Arxiv

21+阅读 · 2022年9月21日

Data-Free Knowledge Transfer: A Survey

Arxiv

21+阅读 · 2021年12月31日

Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation

Arxiv

12+阅读 · 2021年12月16日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

Semi-supervised Medical Image Segmentation through Dual-task Consistency

Arxiv

14+阅读 · 2020年9月9日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommendation

Arxiv

11+阅读 · 2019年6月13日

VIP会员

文章信息

相关主题

知识 (knowledge)

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

75+阅读 · 2022年6月28日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

社交网络上议题社群的公共焦虑研究，中国人民大学新闻学院塔娜讲师，第八届全国社会媒体处理大会SMP2019

专知会员服务

15+阅读 · 2019年10月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【CMU博士论文】数据驱动决策中的激励、信息与不确定性

DGP双粒度提示框架：图增强大模型助力欺诈检测

【ICCV2025】ESSENTIAL：用于视频类增量学习的情景记忆与语义记忆整合

唯快不破：大型语言模型高效架构综述

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

强化学习三篇论文避免遗忘等

强化学习三篇论文避免遗忘等

CreateAMind

20+阅读 · 2019年5月24日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Capsule Networks解析

Capsule Networks解析

机器学习研究会

11+阅读 · 2017年11月12日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

相关论文

Understanding and Constructing Latent Modality Structures in Multi-modal Representation Learning

Arxiv

11+阅读 · 2023年3月10日

Knowledge Graph Embedding: A Survey from the Perspective of Representation Spaces

Arxiv

18+阅读 · 2022年11月7日

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Arxiv

21+阅读 · 2022年9月21日

Data-Free Knowledge Transfer: A Survey

Arxiv

21+阅读 · 2021年12月31日

Activation Modulation and Recalibration Scheme for Weakly Supervised Semantic Segmentation

Arxiv

12+阅读 · 2021年12月16日

K-AID: Enhancing Pre-trained Language Models with Domain Knowledge for Question Answering

Arxiv

15+阅读 · 2021年9月22日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

Semi-supervised Medical Image Segmentation through Dual-task Consistency

Arxiv

14+阅读 · 2020年9月9日

On Feature Normalization and Data Augmentation

On Feature Normalization and Data Augmentation

Arxiv

15+阅读 · 2020年2月25日

Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommendation

Arxiv

11+阅读 · 2019年6月13日

相关基金

牛磺酸抑制AS肉鸡右心肥大过程中calpains介导细胞凋亡作用的研究

国家自然科学基金

0+阅读 · 2015年12月31日

Resveratrol联合MSCs移植对阿尔茨海默鼠的干预效果及Sirt1分子信号的介导作用

国家自然科学基金

0+阅读 · 2014年12月31日

控释VEGF/NT-3脊髓脱细胞支架在SCI模型中的血管化及神经再生研究

国家自然科学基金

0+阅读 · 2013年12月31日

Notch信号通路参与家蚕胚胎发育分子机制的研究

国家自然科学基金

0+阅读 · 2013年12月31日

靶向LDH-A能量代谢对T细胞急性淋巴细胞白血病的抗白血病效应及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

调控Beclin1/Atg7等介导的自噬对胰岛ε/β细胞分化失衡的影响及其机制

国家自然科学基金

0+阅读 · 2012年12月31日

TGF-β1-RhoA/ROCK1信号通路在高糖诱导肾小球内皮细胞-间充质细胞转分化中的作用

国家自然科学基金

0+阅读 · 2011年12月31日

Curcumin双向调控HO-1/HO-2协同抑制Aβeme复合物防治AD的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

TGF-β28608;活Myocardin家族诱导骨髓间充质干细胞分化的研究

国家自然科学基金

0+阅读 · 2008年12月31日

Pygo2介导 Wnt信号的分子机理和在乳腺/乳腺癌干细胞增殖中的作用

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员