Deep neural networks are often overparameterized and may not generalize well. Adversarial training has proven effective at improving generalization by regularizing the change in loss under adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm adopts adversarial weight perturbation, encouraging the model to converge to flat minima. Unfortunately, due to its computational cost, adversarial weight perturbation can only be efficiently approximated per batch rather than per instance, which degrades performance. In this paper, we propose that dynamically reweighting the perturbation within each batch, up-weighting unguarded instances, can serve as a better approximation to per-instance perturbation. We propose sharpness-aware minimization with dynamic reweighting ({\delta}-SAM), which realizes this idea through efficient guardedness estimation. Experiments on the GLUE benchmark demonstrate the effectiveness of {\delta}-SAM.
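To make the idea concrete, here is a minimal PyTorch sketch of a SAM-style training step in which per-instance losses are reweighted before computing the batch-level adversarial weight perturbation. The weighting rule shown (a softmax over detached per-instance losses) and the helper name `delta_sam_step` are illustrative assumptions standing in for the abstract's "unguarded instances are up-weighted"; they are not the guardedness estimator proposed in the paper.

```python
# Sketch of a SAM-style step with batch-level adversarial weight perturbation
# and an illustrative per-instance reweighting. The reweighting below is a
# hypothetical placeholder, not the {\delta}-SAM guardedness estimator.
import torch
import torch.nn.functional as F

def delta_sam_step(model, optimizer, inputs, labels, rho=0.05):
    # 1) Per-instance losses at the current (clean) weights.
    logits = model(inputs)
    losses = F.cross_entropy(logits, labels, reduction="none")

    # 2) Hypothetical reweighting: up-weight higher-loss instances,
    #    as a stand-in for "unguarded" instances.
    weights = torch.softmax(losses.detach(), dim=0)
    weighted_loss = (weights * losses).sum()

    # 3) Adversarial weight perturbation of norm rho from the weighted loss.
    weighted_loss.backward()
    grad_norm = torch.norm(
        torch.stack([p.grad.norm(p=2) for p in model.parameters() if p.grad is not None]),
        p=2,
    )
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)            # move to the perturbed weights
            eps.append(e)
    optimizer.zero_grad()

    # 4) Gradient at the perturbed weights, then restore and update.
    F.cross_entropy(model(inputs), labels).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)        # back to the original weights
    optimizer.step()
    optimizer.zero_grad()
```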