Taming neural networks with TUSLA: Non-convex learning via adaptive stochastic gradient Langevin algorithms

Artificial neural networks (ANNs) are typically highly nonlinear systems which are finely tuned via the optimization of their associated non-convex loss functions. In many cases, the gradient of such a loss function has superlinear growth, which makes the widely accepted (stochastic) gradient descent methods, based on Euler numerical schemes, problematic. We offer a new learning algorithm based on an appropriately constructed variant of the popular stochastic gradient Langevin dynamics (SGLD), called the tamed unadjusted stochastic Langevin algorithm (TUSLA). We also provide a nonasymptotic analysis of the new algorithm's convergence properties in the context of non-convex learning problems involving ANNs. Thus, we provide finite-time guarantees for TUSLA to find approximate minimizers of both empirical and population risks. TUSLA is rooted in the taming technology for diffusion processes with superlinear coefficients as developed in \citet{tamed-euler, SabanisAoAP} and for MCMC algorithms in \citet{tula}. Numerical experiments are presented which confirm the theoretical findings and illustrate the need for the new algorithm in comparison to vanilla SGLD within the framework of ANNs.
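The contrast between vanilla SGLD and a tamed update can be illustrated with a minimal sketch. The abstract does not state the update rule, so everything here is an assumption: the taming factor 1 + sqrt(lam) * |theta|^(2r), the toy loss u(theta) = |theta|^4 / 4 (whose gradient grows cubically), and the hypothetical helper names `grad`, `sgld_step`, `tusla_step`, and `run`. It also uses the exact gradient rather than a stochastic one, for brevity.

```python
import math
import random


def grad(theta):
    # Gradient of the toy loss u(theta) = |theta|^4 / 4 (assumed for illustration).
    sq_norm = sum(t * t for t in theta)
    return [sq_norm * t for t in theta]


def sgld_step(theta, lam, beta, rng):
    # Vanilla Langevin step: Euler discretization plus Gaussian noise.
    g = grad(theta)
    scale = math.sqrt(2.0 * lam / beta)
    return [t - lam * gi + scale * rng.gauss(0.0, 1.0) for t, gi in zip(theta, g)]


def tusla_step(theta, lam, beta, rng, r=1):
    # Tamed step: the gradient is divided by an assumed taming factor
    # 1 + sqrt(lam) * |theta|^(2r), which caps the effective drift for large |theta|.
    g = grad(theta)
    norm = math.sqrt(sum(t * t for t in theta))
    taming = 1.0 + math.sqrt(lam) * norm ** (2 * r)
    scale = math.sqrt(2.0 * lam / beta)
    return [t - lam * gi / taming + scale * rng.gauss(0.0, 1.0)
            for t, gi in zip(theta, g)]


def run(step, theta0, n_steps, lam, beta, seed=0, blowup=1e6):
    # Iterate `step`, stopping early once the iterate norm exceeds `blowup`.
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(n_steps):
        theta = step(theta, lam, beta, rng)
        if math.sqrt(sum(t * t for t in theta)) > blowup:
            break
    return theta


if __name__ == "__main__":
    start = [10.0, 10.0]
    print("vanilla SGLD:", run(sgld_step, start, 200, lam=0.1, beta=10.0))
    print("tamed step:  ", run(tusla_step, start, 200, lam=0.1, beta=10.0))
```

With step size 0.1 and a start far from the origin, the untamed iterate overshoots and its norm grows at every step, while the tamed iterate contracts toward the minimizer. This is only a toy illustration of the motivation, not a reproduction of the paper's experiments.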


Related content

Neural Networks


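Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). It provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. The journal welcomes high-quality submissions that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analysis, to engineering and technological applications of systems that make significant use of neural network concepts and techniques. This uniquely broad scope facilitates the exchange of ideas between biological and technological studies and helps foster an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the editorial board represents expertise in fields including psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: Cognitive Science, Neuroscience, Learning Systems, Mathematical and Computational Analysis, and Engineering and Applications. Official website: http://dblp.uni-trier.de/db/journals/nn/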

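Nanjing University course "Optimization Methods", recommended!

Zhuanzhi Member Service

80+ reads · April 3, 2022

Recent Advances in Efficient and Scalable Graph Neural Networks

Zhuanzhi Member Service

78+ reads · March 15, 2022

Optimization Algorithms on Deep Learning, 73-page slide deck

Zhuanzhi Member Service

135+ reads · June 16, 2021

Latest ten-lecture in-depth course on nonsmooth optimization, taught by Dr. Jingwei Liang of the University of Cambridge

Zhuanzhi Member Service

34+ reads · August 14, 2020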