Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent

Although the optimization objectives for learning neural networks are highly non-convex, gradient-based methods have been wildly successful at learning neural networks in practice. This juxtaposition has led to a number of recent studies on provable guarantees for neural networks trained by gradient descent. Unfortunately, the techniques in these works are often highly specific to the problem studied in each setting, relying on different assumptions on the distribution, optimization parameters, and network architectures, making it difficult to generalize across different settings. In this work, we propose a unified non-convex optimization framework for the analysis of neural network training. We introduce the notions of proxy convexity and proxy Polyak-Lojasiewicz (PL) inequalities, which are satisfied if the original objective function induces a proxy objective function that is implicitly minimized when using gradient methods. We show that stochastic gradient descent (SGD) on objectives satisfying proxy convexity or the proxy PL inequality leads to efficient guarantees for proxy objective functions. We further show that many existing guarantees for neural networks trained by gradient descent can be unified through proxy convexity and proxy PL inequalities.
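For orientation, the two conditions named above can be sketched in symbols. The notation is an assumption made for illustration and is not quoted from the paper: f is the original objective, g and h are proxy objectives, and μ, α, ξ are constants. The classical PL inequality is given first for comparison. The exact definitions in the paper may differ in constants or form.

```latex
% Classical Polyak-Lojasiewicz (PL) inequality for an objective f
% with minimum value f^* and constant \mu > 0:
\[
  \tfrac{1}{2}\,\lVert \nabla f(w) \rVert^{2} \;\ge\; \mu \bigl( f(w) - f^{*} \bigr)
  \quad \text{for all } w .
\]

% Sketch of a proxy PL inequality (assumed form): the gradient of f
% controls the suboptimality of a proxy objective g, not of f itself.
\[
  \lVert \nabla f(w) \rVert^{\alpha} \;\ge\; \tfrac{\mu}{2} \bigl( g(w) - \xi \bigr)
  \quad \text{for all } w .
\]

% Sketch of proxy convexity (assumed form) with proxy objectives g and h:
\[
  \langle \nabla f(w),\, w - v \rangle \;\ge\; g(w) - h(v)
  \quad \text{for all } w, v .
\]
```

Under this reading, taking g = h = f in the last inequality recovers the usual first-order characterization of convexity, which suggests why guarantees stated for g rather than f can still be obtained by running gradient methods on a non-convex f.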

