Self-distillation exploits non-uniform soft supervision derived from the network itself during training and improves performance without any runtime cost. However, its training overhead is often overlooked, even though reducing the time and memory cost of training is increasingly important in the era of giant models. This paper proposes an efficient self-distillation method named Zipf's Label Smoothing (Zipf's LS), which uses the on-the-fly predictions of a network to generate soft supervision that conforms to a Zipf distribution, without using any contrastive samples or auxiliary parameters. Our idea comes from an empirical observation: when a network is duly trained, the output values of its final softmax layer, after being sorted by magnitude and averaged across samples, follow a distribution reminiscent of Zipf's law in the word-frequency statistics of natural languages. By enforcing this property at the sample level and throughout the whole training period, we find that prediction accuracy can be greatly improved. Using ResNet50 on the INAT21 fine-grained classification dataset, our technique achieves a +3.61% accuracy gain over the vanilla baseline and a further +0.88% gain over previous label smoothing and self-distillation strategies. The implementation is publicly available at https://github.com/megvii-research/zipfls.
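To make the idea concrete, below is a minimal PyTorch sketch of the mechanism described above, not the authors' released implementation (see https://github.com/megvii-research/zipfls for that): it builds a soft target whose non-target probabilities follow a Zipf-like law, p_k proportional to 1/k, over the rank order of the network's own on-the-fly predictions, and adds it as a distillation term next to the usual cross-entropy. The function names and the weighting factor `alpha` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def zipf_soft_targets(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Zipf-distributed soft labels over non-target classes, ranked by prediction magnitude."""
    # Rank of every class under the current prediction (0 = most confident class).
    ranks = logits.argsort(dim=1, descending=True).argsort(dim=1)
    weights = 1.0 / (ranks.float() + 1.0)           # Zipf weights 1/(rank + 1)
    weights.scatter_(1, targets.unsqueeze(1), 0.0)  # exclude the ground-truth class
    return weights / weights.sum(dim=1, keepdim=True)


def zipf_ls_loss(logits: torch.Tensor, targets: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    """Cross-entropy plus a soft term pulling non-target predictions toward the Zipf prior."""
    ce = F.cross_entropy(logits, targets)
    with torch.no_grad():                           # the soft target is treated as a constant
        zipf = zipf_soft_targets(logits, targets)
    log_probs = F.log_softmax(logits, dim=1)
    # Cross-entropy against the Zipf prior (equivalent to a KL term up to a constant).
    distill = -(zipf * log_probs).sum(dim=1).mean()
    return ce + alpha * distill
```

In a training loop, one would replace `F.cross_entropy(model(x), y)` with `zipf_ls_loss(model(x), y)`. The released implementation may derive the class ranking and the loss weighting differently; consult the repository for the exact formulation.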