Artificial neural networks (ANNs) are typically highly nonlinear systems which are finely tuned via the optimization of their associated, non-convex loss functions. In many cases, the gradient of any such loss function has superlinear growth, making the use of the widely accepted (stochastic) gradient descent methods, which are based on Euler numerical schemes, problematic. We offer a new learning algorithm based on an appropriately constructed variant of the popular stochastic gradient Langevin dynamics (SGLD), which is called the tamed unadjusted stochastic Langevin algorithm (TUSLA). We also provide a nonasymptotic analysis of the new algorithm's convergence properties in the context of non-convex learning problems with the use of ANNs. Thus, we provide finite-time guarantees for TUSLA to find approximate minimizers of both empirical and population risks. The roots of the TUSLA algorithm are based on the taming technology for diffusion processes with superlinear coefficients as developed in \citet{tamed-euler, SabanisAoAP} and for MCMC algorithms in \citet{tula}. Numerical experiments are presented which confirm the theoretical findings and illustrate the need for the use of the new algorithm in comparison to vanilla SGLD within the framework of ANNs.
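To make the idea concrete, the following is a minimal sketch of a tamed Langevin update in the spirit of TUSLA: the (stochastic) gradient is divided by a factor growing like $1 + \sqrt{\lambda}\,\|\theta\|^{2r}$ before the usual SGLD step with Gaussian noise is taken, so that superlinearly growing gradients cannot cause the iterates to explode. The exact taming factor, constants, and the toy loss $u(\theta) = \|\theta\|^4/4 - \|\theta\|^2/2$ below are illustrative assumptions, not the precise scheme of the cited papers.

```python
import math
import random

def tusla_step(theta, grad, step, beta=1e10, r=1, rng=random):
    """One tamed Langevin update (illustrative TUSLA-style step).

    The taming factor 1 + sqrt(step) * ||theta||^(2r) keeps the effective
    drift bounded even when the gradient grows superlinearly in theta;
    the Gaussian noise term is the usual SGLD exploration term with
    inverse temperature beta.
    """
    norm = math.sqrt(sum(t * t for t in theta))
    factor = 1.0 + math.sqrt(step) * norm ** (2 * r)
    sigma = math.sqrt(2.0 * step / beta)
    return [t - step * g / factor + sigma * rng.gauss(0.0, 1.0)
            for t, g in zip(theta, grad)]

# Toy non-convex loss u(theta) = ||theta||^4/4 - ||theta||^2/2,
# whose gradient theta * (||theta||^2 - 1) grows superlinearly:
def grad_u(theta):
    sq = sum(t * t for t in theta)
    return [t * (sq - 1.0) for t in theta]

random.seed(0)
theta = [10.0, -10.0]  # start far out, where an untamed Euler step would overshoot
for _ in range(2000):
    theta = tusla_step(theta, grad_u(theta), step=0.1)
# The iterates stay bounded and settle near the minimizing sphere ||theta|| = 1.
```

With a vanilla (untamed) Euler/SGLD step from the same starting point, the first update already moves each coordinate by roughly $0.1 \times 10 \times 199 \approx 199$, and the iterates diverge; the taming factor is precisely what prevents this.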