We propose the particle dual averaging (PDA) method, which generalizes the dual averaging method in convex optimization to optimization over probability distributions with a quantitative runtime guarantee. The algorithm consists of an inner loop and an outer loop: the inner loop utilizes the Langevin algorithm to approximately solve for a stationary distribution, which is then optimized in the outer loop. The method can thus be interpreted as an extension of the Langevin algorithm that naturally handles nonlinear functionals on the probability space. An important application of the proposed method is the optimization of neural networks in the mean field regime, which is theoretically attractive due to the presence of nonlinear feature learning, but for which quantitative convergence rates can be challenging to obtain. By adapting finite-dimensional convex optimization theory to the space of distributions, we analyze PDA in regularized empirical / expected risk minimization, and establish quantitative global convergence in learning two-layer mean field neural networks under more general settings. Our theoretical results are supported by numerical simulations on neural networks of reasonable size.
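To make the inner/outer structure concrete, the following is a minimal sketch of PDA for a two-layer mean field network under squared loss. The tanh feature map, the outer weighting proportional to the iteration index, and all hyperparameters (`m`, `eta`, `lam`, `temp`, loop lengths) are illustrative assumptions rather than the paper's exact schedule: the outer loop folds the current residuals into a dual average, and the inner loop runs the Langevin algorithm toward the corresponding Gibbs distribution.

```python
import numpy as np

# Minimal sketch of particle dual averaging (PDA), assuming a mean field
# network f_q(x) = E_{theta~q}[tanh(<theta, x>)] approximated by m particles,
# squared loss, and L2 regularization. Hyperparameters are placeholders.
def pda(X, y, m=200, T_outer=50, T_inner=200,
        eta=0.05, lam=1e-2, temp=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = rng.standard_normal((m, d))   # particle approximation of q
    resid_avg = np.zeros(n)               # dual average of per-sample residuals
    weight_sum = 0.0

    for t in range(1, T_outer + 1):
        # Outer loop: evaluate residuals under the current particles and
        # fold them into the dual average with weight proportional to t.
        pred = np.tanh(X @ theta.T).mean(axis=1)
        resid_avg = (weight_sum * resid_avg + t * (pred - y)) / (weight_sum + t)
        weight_sum += t

        # Inner loop: Langevin algorithm targeting the Gibbs distribution
        # q(theta) ∝ exp(-(mean_x[resid_avg(x) tanh(<theta,x>)]
        #                  + lam*|theta|^2) / temp).
        for _ in range(T_inner):
            act = np.tanh(X @ theta.T)    # (n, m) activations per sample/particle
            # Gradient of the averaged potential, evaluated at each particle.
            grad = ((resid_avg[:, None] * (1.0 - act**2)).T @ X) / n \
                   + 2.0 * lam * theta
            theta = theta - eta * grad \
                    + np.sqrt(2.0 * eta * temp) * rng.standard_normal((m, d))
    return theta

# Usage on a toy regression problem (synthetic data, for illustration only).
rng = np.random.default_rng(1)
X = rng.standard_normal((128, 5))
y = np.tanh(X @ rng.standard_normal(5))
theta = pda(X, y)
print("train MSE:", np.mean((np.tanh(X @ theta.T).mean(axis=1) - y) ** 2))
```

Note the design point this sketch is meant to surface: the residuals are frozen at the start of each outer step and averaged across steps, so the inner Langevin loop faces a fixed (linear-in-q) potential rather than the nonlinear objective itself, which is what makes the stationary distribution well defined.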