Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD


Optimizers · Generalization theory · Noise · Empirical risk · Matrix theory

October 26, 2021


Bohan Wang, Huishuai Zhang, Jieyu Zhang, Qi Meng, Wei Chen, Tie-Yan Liu

From arXiv; accepted at NeurIPS 2021

Recently, the information-theoretical framework has been shown to yield non-vacuous generalization bounds for large models trained by Stochastic Gradient Langevin Dynamics (SGLD) with isotropic noise. In this paper, we optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD. We prove that, under a constraint that guarantees low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance if both the prior and the posterior are jointly optimized. This validates that the optimal noise is quite close to the empirical gradient covariance. Technically, we develop a new information-theoretical bound that enables such an optimization analysis. We then apply matrix analysis to derive the form of the optimal noise covariance. The presented constraint and results are validated by empirical observations.
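To make the central claim concrete, the following is a minimal NumPy sketch of one SGLD update whose injected Gaussian noise has covariance proportional to the matrix square root of the gradient covariance. It is an illustration under stated assumptions, not the authors' implementation: the update form `w - lr * grad + noise`, the helper names (`psd_sqrt`, `gradient_covariance`, `sgld_step`), the `noise_scale` parameter, and the use of the empirical per-example gradient covariance in place of the expected one are all hypothetical choices made here. The paper's exact constraint and scaling are not reproduced.

```python
import numpy as np


def psd_sqrt(mat):
    """Symmetric square root of a positive semi-definite matrix.

    Uses an eigendecomposition; tiny negative eigenvalues caused by
    round-off are clipped to zero before taking the square root.
    """
    sym = 0.5 * (mat + mat.T)  # enforce symmetry numerically
    eigvals, eigvecs = np.linalg.eigh(sym)
    eigvals = np.clip(eigvals, 0.0, None)
    # V diag(sqrt(lambda)) V^T
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


def gradient_covariance(per_example_grads):
    """Empirical covariance of per-example gradients, (n, d) -> (d, d)."""
    mean = per_example_grads.mean(axis=0, keepdims=True)
    centered = per_example_grads - mean
    return centered.T @ centered / per_example_grads.shape[0]


def sgld_step(w, per_example_grads, lr, noise_scale, rng):
    """One SGLD update with anisotropic noise (illustrative assumption).

    The noise covariance is noise_scale * sqrt(C), where C is the
    empirical gradient covariance. Sampling N(0, Sigma) requires a
    factor F with F F^T = Sigma, so a second square root is taken.
    """
    grad = per_example_grads.mean(axis=0)
    cov = gradient_covariance(per_example_grads)
    noise_cov = noise_scale * psd_sqrt(cov)
    factor = psd_sqrt(noise_cov)
    noise = factor @ rng.standard_normal(w.shape[0])
    return w - lr * grad + noise
```

With per-example gradients stacked in an array `G` of shape `(n, d)`, a step would be taken as `sgld_step(w, G, lr=0.1, noise_scale=0.01, rng=np.random.default_rng(0))`. Setting `noise_scale=0.0` recovers a plain gradient step, and replacing `psd_sqrt(cov)` with an identity matrix would give the isotropic-noise baseline the abstract refers to.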

