The Loss Surfaces of Neural Networks with General Activation Functions

The loss surfaces of deep neural networks have been the subject of several theoretical and experimental studies over the last few years. One strand of work considers the complexity, in the sense of the number of local optima, of high-dimensional random functions, with the aim of informing how local optimisation methods may perform in such complicated settings. Prior work of Choromanska et al. (2015) established a direct link between the training loss surfaces of deep multi-layer perceptron networks and spherical multi-spin glass models, under some very strong assumptions on the network and its data. In this work, we test the validity of this approach by removing the undesirable restriction to ReLU activation functions. In doing so, we chart a new path through the spin glass complexity calculations using supersymmetric methods in random matrix theory, which may prove useful in other contexts. Our results shed new light on both the strengths and the weaknesses of spin glass models in this context.

