Gradient regularization, as described in \citet{barrett2021implicit}, is a highly effective technique for promoting flat minima during gradient descent. Empirical evidence suggests that this form of regularization can significantly improve the robustness of deep learning models to noise perturbations while also reducing test error. In this paper, we study per-example gradient regularization (PEGR) and present a theoretical analysis demonstrating its effectiveness in reducing test error and improving robustness to noise perturbations. Specifically, we adopt a signal-noise data model from \citet{cao2022benign} and show that PEGR learns the signal effectively while suppressing the noise. In contrast, standard gradient descent struggles to distinguish the signal from the noise, leading to suboptimal generalization performance. Our analysis reveals that PEGR penalizes the variance of pattern learning, thereby effectively suppressing the memorization of noise in the training data. These findings underscore the importance of variance control in deep learning training and offer useful insights for developing more effective training approaches.
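For concreteness, a schematic comparison of standard gradient regularization (GR) and its per-example variant (PEGR) can be written as follows; the notation here is ours, and the exact coefficients and formulation may differ from those in \citet{barrett2021implicit} and from the objective analyzed in this paper:
\[
L_{\mathrm{GR}}(\theta) \;=\; L(\theta) + \lambda \,\bigl\|\nabla_\theta L(\theta)\bigr\|^2,
\qquad
L_{\mathrm{PEGR}}(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n}\Bigl[\ell_i(\theta) + \lambda \,\bigl\|\nabla_\theta \ell_i(\theta)\bigr\|^2\Bigr],
\]
where $L(\theta) = \frac{1}{n}\sum_{i=1}^{n}\ell_i(\theta)$ is the empirical loss over $n$ training examples and $\lambda > 0$ is the regularization strength. Since $\frac{1}{n}\sum_{i}\|\nabla_\theta \ell_i(\theta)\|^2 = \|\nabla_\theta L(\theta)\|^2 + \frac{1}{n}\sum_{i}\|\nabla_\theta \ell_i(\theta) - \nabla_\theta L(\theta)\|^2$, the PEGR penalty exceeds the GR penalty by exactly the empirical variance of the per-example gradients, which is consistent with the variance-control interpretation above.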