Depth separation results offer a possible theoretical explanation for the benefits of deep neural networks over shallower architectures, establishing that the former possess superior approximation capabilities. However, there are no known results in which the deeper architecture leverages this advantage into a provable optimization guarantee. We prove that when the data are generated by a distribution with radial symmetry satisfying some mild assumptions, gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations, in which the hidden layer is held fixed throughout training. By building on and refining existing techniques for approximation lower bounds of neural networks with a single layer of non-linearities, we show that there are $d$-dimensional radial distributions on the data such that ball indicators cannot be learned efficiently by any algorithm to accuracy better than $\Omega(d^{-4})$, nor by a standard gradient descent implementation to accuracy better than a constant. These results establish, to the best of our knowledge, the first optimization-based separations in which the approximation benefits of the stronger architecture provably manifest in practice. Our proof technique introduces new tools and ideas that may be of independent interest in the theoretical study of both the approximation and optimization of neural networks.
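To make the learning setup in the abstract concrete, the following is a minimal sketch, not the paper's actual construction or assumptions: a depth 2 network with two layers of sigmoidal activations whose hidden layer is frozen at initialization, trained by gradient descent on the outer layer to fit a ball indicator under a radially symmetric input distribution. The dimension, width, radius, radial law, learning rate, and step count below are all illustrative choices.

```python
# Illustrative sketch only: gradient descent on the outer layer of a
# depth-2 sigmoidal network with a frozen hidden layer, fitting the
# ball indicator 1[||x|| <= r] on radially symmetric data.
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 5, 200, 1.0          # input dim, hidden width, ball radius (assumed)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hidden layer held fixed throughout training (random init, never updated).
W = rng.normal(size=(k, d)) / np.sqrt(d)
b = rng.normal(size=k)

# Trainable outer-layer parameters.
v = np.zeros(k)
c = 0.0

def radial_sample(n):
    """Radially symmetric inputs: uniform direction, arbitrary radial law."""
    u = rng.normal(size=(n, d))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    radii = np.abs(rng.normal(size=(n, 1))) + 0.5   # one possible radial law
    return u * radii

lr, steps, n = 0.5, 2000, 512
for t in range(steps):
    X = radial_sample(n)
    y = (np.linalg.norm(X, axis=1) <= r).astype(float)  # ball indicator labels
    H = sigmoid(X @ W.T + b)       # first sigmoidal layer (frozen)
    p = sigmoid(H @ v + c)         # second sigmoidal layer (output)
    # Squared loss; gradient flows only through the outer parameters.
    g = 2.0 * (p - y) * p * (1.0 - p) / n
    v -= lr * (H.T @ g)
    c -= lr * g.sum()

# Evaluate squared error on fresh radial data.
X = radial_sample(4096)
y = (np.linalg.norm(X, axis=1) <= r).astype(float)
p = sigmoid(sigmoid(X @ W.T + b) @ v + c)
print("test MSE:", np.mean((p - y) ** 2))
```

Freezing the hidden layer mirrors the architectural restriction described above: only the outer weights are optimized, so the learning problem the sketch solves is convex-like in the trainable parameters while the composed network still has two layers of non-linearities.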