The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program

We study non-convex subgradient flows for training two-layer ReLU neural networks from a convex geometry and duality perspective. We characterize the implicit bias of unregularized non-convex gradient flow as convex regularization of an equivalent convex model. We then show that the limit points of non-convex subgradient flows can be identified via primal-dual correspondence in this convex optimization problem. Moreover, we derive a sufficient condition on the dual variables which ensures that the stationary points of the non-convex objective are the KKT points of the convex objective, thus proving convergence of non-convex gradient flows to the global optimum. For a class of regular training data distributions such as orthogonal separable data, we show that this sufficient condition holds. Therefore, non-convex gradient flows in fact converge to optimal solutions of a convex optimization problem. We present numerical results verifying the predictions of our theory for non-convex subgradient descent.
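
The setting described above can be illustrated with a small numerical sketch. The abstract does not specify the loss, initialization, step size, or data, so everything here is an assumption made for illustration: a logistic loss, a small random initialization, a fixed small step size standing in for the continuous-time flow, and a four-point toy dataset in which same-label points have positive inner products and opposite-label points have nonpositive ones (one common reading of "orthogonal separable"). The function names are hypothetical. The sketch only runs non-convex subgradient descent on a two-layer ReLU network; it does not construct or check the dual convex program.

```python
import numpy as np

def relu_net(X, W, alpha):
    """Two-layer ReLU network: f(x) = sum_j alpha_j * max(w_j^T x, 0)."""
    return np.maximum(X @ W, 0.0) @ alpha

def logistic_loss(X, y, W, alpha):
    # Mean of log(1 + exp(-y * f(x))), computed in a numerically stable way.
    return np.mean(np.logaddexp(0.0, -y * relu_net(X, W, alpha)))

def subgradient_step(X, y, W, alpha, lr):
    pre = X @ W                          # (n, m) pre-activations
    act = np.maximum(pre, 0.0)           # (n, m) hidden activations
    margins = y * (act @ alpha)
    # d(loss)/d(f(x_i)) = -y_i * sigmoid(-margin_i) / n
    g = -y * np.exp(-np.logaddexp(0.0, margins)) / len(y)
    # Subgradient choice (an assumption): treat the ReLU derivative at 0 as 0.
    mask = (pre > 0).astype(float)
    grad_W = X.T @ (g[:, None] * mask * alpha[None, :])
    grad_alpha = act.T @ g
    return W - lr * grad_W, alpha - lr * grad_alpha

def train(X, y, m=64, lr=0.05, steps=2000, scale=1e-2, seed=0):
    # Width, step size, step count, and init scale are illustrative choices.
    rng = np.random.default_rng(seed)
    W = scale * rng.standard_normal((X.shape[1], m))
    alpha = scale * rng.standard_normal(m)
    losses = [logistic_loss(X, y, W, alpha)]
    for _ in range(steps):
        W, alpha = subgradient_step(X, y, W, alpha, lr)
        losses.append(logistic_loss(X, y, W, alpha))
    return W, alpha, losses

# Toy data (assumed): each class lies along its own coordinate axis.
X = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

W, alpha, losses = train(X, y)
print("initial loss:", losses[0], "final loss:", losses[-1])
print("predictions:", np.sign(relu_net(X, W, alpha)))
```

A small step size is used so that the discrete iterates stay close to the subgradient flow. Comparing the trained hidden-layer directions with the solution of the convex program would need that program's explicit form, which the abstract does not give.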
