Energy-efficient deep neural network (DNN) accelerators are prone to non-idealities that degrade DNN performance at inference time. To mitigate such degradation, existing methods typically add perturbations to the DNN weights during training to simulate inference on noisy hardware. However, this often requires knowledge about the target hardware and leads to a trade-off between DNN performance and robustness, decreasing the former to increase the latter. In this work, we show that applying sharpness-aware training, by optimizing for both the loss value and loss sharpness, significantly improves robustness to noisy hardware at inference time without relying on any assumptions about the target hardware. In particular, we propose a new adaptive sharpness-aware method that conditions the worst-case perturbation of a given weight not only on its magnitude but also on the range of the weight distribution. This is achieved by performing sharpness-aware minimization scaled by outlier minimization (SAMSON). Our approach outperforms existing sharpness-aware training methods both in terms of model generalization performance in noiseless regimes and robustness in noisy settings, as measured on several architectures and datasets.
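To make the idea concrete, below is a minimal sketch of the perturbation step described above. The abstract only states that the worst-case perturbation of a weight is conditioned on its magnitude and on the range of the weight distribution; the function name `samson_perturbation` and the specific scaling `T = |w| / (max(w) - min(w))` are illustrative assumptions, not the paper's exact formulation. The sketch follows the general SAM/ASAM form, where the ascent step is `rho * T^2 g / ||T g||` and plain SAM is recovered when `T` is the identity.

```python
import torch

def samson_perturbation(params, rho=0.05, eps=1e-12):
    """Sketch of a sharpness-aware ascent perturbation (hypothetical).

    Assumes a per-weight scale T = |w| / range(w), so the perturbation
    depends on both the weight magnitude and the range of the weight
    distribution, as the abstract describes. Call after loss.backward().
    """
    scales, scaled_grads = [], []
    for w in params:
        # Assumed scaling operator T: magnitude normalized by the
        # span of this tensor's weight distribution.
        t = w.abs() / (w.max() - w.min() + eps)
        scales.append(t)
        scaled_grads.append(t * w.grad)  # T g, used for the global norm
    total_norm = torch.sqrt(sum((sg * sg).sum() for sg in scaled_grads))
    # epsilon = rho * T^2 g / ||T g||; with T = identity this is plain SAM.
    return [rho * t * sg / (total_norm + eps)
            for t, sg in zip(scales, scaled_grads)]
```

In a training loop, one would add the returned perturbations to the weights, recompute the loss and gradients at the perturbed point, restore the original weights, and then apply the optimizer step, as in standard two-step SAM training.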