Deep neural networks are often overparameterized and may not generalize well. Adversarial training has proven effective at improving generalization by regularizing the change in loss under adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm applies adversarial weight perturbation, encouraging the model to converge to a flat minimum. SAM finds a common adversarial weight perturbation per batch. Although per-instance adversarial weight perturbations are stronger adversaries and can potentially lead to better generalization performance, their computational cost is very high, making them impractical to use efficiently within SAM. In this paper, we tackle this efficiency bottleneck and propose sharpness-aware minimization with dynamic reweighting (delta-SAM). Our theoretical analysis shows that the stronger per-instance adversarial weight perturbations can be approximated by reweighted per-batch weight perturbations. delta-SAM dynamically reweights the perturbation within each batch according to theoretically principled weighting factors, serving as a good approximation to per-instance perturbation. Experiments on various natural language understanding tasks demonstrate the effectiveness of delta-SAM.
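For context, the quantities involved can be sketched in standard SAM notation (the symbols below are ours, not the abstract's). SAM minimizes the loss at an adversarially perturbed weight, with the per-batch perturbation obtained from a first-order approximation:

$$\min_{w}\; \max_{\|\epsilon\|_2 \le \rho} L_{\mathcal{B}}(w+\epsilon), \qquad \hat{\epsilon}_{\mathcal{B}} = \rho\,\frac{\nabla_w L_{\mathcal{B}}(w)}{\|\nabla_w L_{\mathcal{B}}(w)\|_2},$$

where $L_{\mathcal{B}}(w)=\frac{1}{|\mathcal{B}|}\sum_{i\in\mathcal{B}}\ell_i(w)$ is the average loss over batch $\mathcal{B}$. A per-instance adversary would instead compute a separate $\hat{\epsilon}_i$ from each example's own gradient $\nabla_w \ell_i(w)$, requiring $|\mathcal{B}|$ ascent steps per batch. The reweighting described above amounts to replacing $L_{\mathcal{B}}$ with a weighted loss $\sum_{i\in\mathcal{B}} s_i\,\ell_i(w)$ when computing the single shared perturbation; the weights $s_i$ here stand in for the paper's theoretically derived factors.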