Appearance of Random Matrix Theory in Deep Learning

We investigate the local spectral statistics of the loss surface Hessians of artificial neural networks and find excellent agreement with Gaussian Orthogonal Ensemble statistics across several network architectures and datasets. These results shed new light on the applicability of Random Matrix Theory to modelling neural networks and suggest a previously unrecognised role for it in the study of loss surfaces in deep learning. Inspired by these observations, we propose a novel model for the true loss surfaces of neural networks. The model allows for Hessian spectral densities with rank degeneracy and outliers, both extensively observed in practice, and predicts a growing independence of loss gradients as a function of distance in weight-space. We further investigate the importance of the true loss surface in neural networks and find, in contrast to previous work, that the exponential hardness of locating the global minimum has practical consequences for achieving state-of-the-art performance.
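
To make "local spectral statistics" concrete, the sketch below computes one commonly used local statistic, the mean ratio of consecutive eigenvalue spacings, for a sampled Gaussian Orthogonal Ensemble (GOE) matrix and for uncorrelated points. The abstract does not say which statistic the authors use, so this choice is an assumption made for illustration only, and the helper names `sample_goe` and `mean_spacing_ratio` are hypothetical. No neural network Hessian is computed here; a real experiment would substitute its eigenvalues for the sampled ones.

```python
import numpy as np


def sample_goe(n, rng):
    """Draw an n x n real symmetric matrix with GOE statistics (up to overall scale)."""
    a = rng.standard_normal((n, n))
    # Symmetrising an i.i.d. Gaussian matrix gives GOE-distributed entries.
    return (a + a.T) / np.sqrt(2.0 * n)


def mean_spacing_ratio(eigenvalues):
    """Mean of r_i = min(s_i, s_{i+1}) / max(s_i, s_{i+1}) over consecutive spacings s_i.

    The ratio is invariant to the local eigenvalue density, so no spectral
    unfolding step is needed before computing it.
    """
    ev = np.sort(np.asarray(eigenvalues, dtype=float))
    s = np.diff(ev)
    r = np.minimum(s[:-1], s[1:]) / np.maximum(s[:-1], s[1:])
    return float(r.mean())


rng = np.random.default_rng(0)

# Eigenvalues of a GOE sample: neighbouring levels repel each other.
goe_ratio = mean_spacing_ratio(np.linalg.eigvalsh(sample_goe(500, rng)))

# Independent uniform points: no level repulsion (Poisson statistics).
poisson_ratio = mean_spacing_ratio(rng.uniform(size=500))

print(f"GOE mean spacing ratio:     {goe_ratio:.3f}")
print(f"Poisson mean spacing ratio: {poisson_ratio:.3f}")
```

For large matrices the GOE mean ratio is known to be approximately 0.53, while independent levels give 2 ln 2 - 1, approximately 0.39. A single 500-point sample will fluctuate around these values. A spectrum whose ratio sits near the GOE value rather than the Poisson one shows the level repulsion that agreement with GOE statistics implies.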
