Deep neural networks are often overparameterized and may not easily achieve good generalization. Adversarial training has shown effectiveness in improving generalization by regularizing the change of loss under adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm conducts adversarial weight perturbation, encouraging the model to converge to a flat minimum. SAM finds a common adversarial weight perturbation per batch. Although per-instance adversarial weight perturbations are stronger adversaries and can potentially lead to better generalization performance, their computational cost is very high, making it impractical to use per-instance perturbations efficiently in SAM. In this paper, we tackle this efficiency bottleneck and propose sharpness-aware minimization with dynamic reweighting ({\delta}-SAM). Our theoretical analysis shows that it is possible to approximate the stronger per-instance adversarial weight perturbations using reweighted per-batch weight perturbations. {\delta}-SAM dynamically reweights the perturbation within each batch according to theoretically principled weighting factors, serving as a good approximation to per-instance perturbation. Experiments on various natural language understanding tasks demonstrate the effectiveness of {\delta}-SAM.
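To make the idea concrete, below is a minimal sketch of a single SAM-style training step with a per-example reweighting of the batch perturbation, written in PyTorch. The abstract does not specify the weighting factors, so the rule used here (weights proportional to normalized per-example losses) is a hypothetical stand-in for the paper's theoretically principled factors; `reweighted_sam_step`, `rho`, and the surrounding setup are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F


def reweighted_sam_step(model, optimizer, inputs, targets, rho=0.05):
    """One training step: build a reweighted batch perturbation, then update."""
    # 1) Per-example losses at the current (clean) weights.
    logits = model(inputs)
    per_example_loss = F.cross_entropy(logits, targets, reduction="none")

    # 2) Dynamic reweighting of the batch (assumed rule: normalized losses).
    with torch.no_grad():
        weights = per_example_loss / per_example_loss.sum().clamp_min(1e-12)

    # 3) Gradient of the reweighted loss defines the shared perturbation direction.
    optimizer.zero_grad()
    (weights.detach() * per_example_loss).sum().backward()

    # 4) SAM's inner ascent step: w <- w + rho * g / ||g||.
    grad_norm = torch.norm(
        torch.stack([p.grad.norm(p=2) for p in model.parameters() if p.grad is not None]),
        p=2,
    )
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                perturbations.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append(e)

    # 5) Gradient at the perturbed weights, then restore weights and update.
    optimizer.zero_grad()
    F.cross_entropy(model(inputs), targets).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), perturbations):
            if e is not None:
                p.sub_(e)  # undo the perturbation before the optimizer step
    optimizer.step()
```

In this sketch the reweighting only changes the direction of the single per-batch perturbation; the intended effect, as described above, is to bias that shared direction toward what per-instance perturbations would do, without paying the per-instance computational cost.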