Artificial neural networks (ANNs) are typically highly nonlinear systems that are finely tuned via the optimization of their associated, non-convex loss functions. In most cases, the gradient of any such loss function fails to be dissipative, making the use of widely accepted (stochastic) gradient descent methods problematic. We offer a new learning algorithm based on an appropriately constructed variant of the popular stochastic gradient Langevin dynamics (SGLD), called the tamed unadjusted stochastic Langevin algorithm (TUSLA). We also provide a nonasymptotic analysis of the new algorithm's convergence properties in the context of non-convex learning problems with the use of ANNs; thus, we provide finite-time guarantees for TUSLA to find approximate minimizers of both empirical and population risks. The TUSLA algorithm is rooted in the taming technology for diffusion processes with superlinear coefficients developed in \citet{tamed-euler, SabanisAoAP} and for MCMC algorithms in \citet{tula}. Numerical experiments are presented which confirm the theoretical findings and illustrate the need for the new algorithm, in comparison to vanilla SGLD, within the framework of ANNs.
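To make the taming idea concrete, the following is a minimal sketch of a single tamed Langevin update: the stochastic gradient is divided by a step-size-dependent factor that grows with the norm of the iterate, which keeps superlinearly growing gradients from destabilizing the scheme. The function names, the toy quartic loss, and the parameter choices (taming exponent `r`, inverse temperature `beta`) are illustrative assumptions, not the paper's exact formulation or tuned settings.

```python
import numpy as np

def tamed_langevin_step(theta, grad, step_size, beta, r, rng):
    """One tamed stochastic gradient Langevin update (illustrative sketch).

    The taming factor 1 + sqrt(step_size) * ||theta||^(2r) shrinks the
    effective gradient when the iterate is large, so a superlinearly
    growing gradient cannot blow up the recursion; the Gaussian term
    adds the usual Langevin exploration noise scaled by beta.
    """
    g = grad(theta)
    taming = 1.0 + np.sqrt(step_size) * np.linalg.norm(theta) ** (2 * r)
    noise = np.sqrt(2.0 * step_size / beta) * rng.standard_normal(theta.shape)
    return theta - step_size * g / taming + noise

# Toy non-dissipative-looking example: quartic loss u(theta) = ||theta||^4 / 4,
# whose gradient ||theta||^2 * theta grows superlinearly.
rng = np.random.default_rng(0)
theta = np.array([3.0, -3.0])
for _ in range(2000):
    theta = tamed_langevin_step(
        theta, lambda t: (t @ t) * t, step_size=0.01, beta=1e8, r=1, rng=rng
    )
print(np.linalg.norm(theta))  # iterates stay bounded and drift toward the minimizer at 0
```

With a plain (untamed) Euler step of the same size, the quartic gradient at the starting point would overshoot and the iterates would diverge; the taming factor caps the effective step so the recursion remains stable.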