With the recent demand for deploying neural network models on mobile and edge devices, it is desirable to improve a model's generalizability to unseen test data and to enhance its robustness under fixed-point quantization for efficient deployment. Minimizing the training loss, however, provides few guarantees on generalization or quantization performance. In this work, we improve generalization and quantization performance simultaneously by theoretically unifying them under the framework of improving the model's robustness against bounded weight perturbations and minimizing the eigenvalues of the Hessian matrix with respect to the model weights. We therefore propose HERO, a Hessian-enhanced robust optimization method that minimizes the Hessian eigenvalues through a gradient-based training process, improving generalization and quantization performance at the same time. HERO enables up to a 3.8% gain in test accuracy, up to 30% higher accuracy under 80% training-label perturbation, and the best post-training quantization accuracy across a wide range of precisions, including a >10% accuracy improvement over SGD-trained models, for common model architectures on various datasets.
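To make the bounded-weight-perturbation idea concrete, below is a minimal sketch of a sharpness-aware training step in PyTorch. This is not the paper's exact HERO algorithm: the function name `hero_style_step`, the perturbation radius `rho`, and the SAM-style two-pass update are illustrative assumptions. The step perturbs the weights along the normalized gradient direction (a bounded weight perturbation) and then descends using the gradient taken at the perturbed point, which implicitly penalizes directions in which the Hessian eigenvalues are large.

```python
import torch

def hero_style_step(model, loss_fn, x, y, optimizer, rho=0.05):
    """One hypothetical sharpness-aware training step (SAM-style sketch).

    Perturbs the weights inside an L2 ball of radius `rho` along the
    gradient direction, then updates with the gradient evaluated at the
    perturbed point. Costs two forward/backward passes per batch.
    """
    optimizer.zero_grad()

    # First pass: gradient at the current weights.
    loss = loss_fn(model(x), y)
    loss.backward()

    params = [p for p in model.parameters() if p.grad is not None]
    grads = [p.grad.detach().clone() for p in params]
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))

    # Ascent step: move to the worst-case point inside the rho-ball.
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(rho * g / (grad_norm + 1e-12))

    # Second pass: gradient at the perturbed weights.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # Undo the perturbation, then apply the update computed at the
    # perturbed point to the original weights.
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(rho * g / (grad_norm + 1e-12))
    optimizer.step()
    optimizer.zero_grad()

    # Loss from the first (unperturbed) pass, for logging.
    return loss.item()
```

A training loop would call `hero_style_step(model, loss_fn, x, y, optimizer)` once per batch in place of a plain SGD step, trading roughly 2x the per-step compute for the flatter minima that the abstract associates with better generalization and quantization robustness.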