The training of neural networks is a complex, high-dimensional, non-convex and noisy optimization problem whose theoretical understanding is of interest both from an applied perspective and for fundamental reasons. A core challenge is to understand the geometry and topography of the landscape that guides the optimization. In this work, we employ standard statistical-mechanics methods, namely phase-space exploration using Langevin dynamics, to study this landscape for an over-parameterized fully connected network performing a classification task on random data. Analyzing the fluctuation statistics, in analogy with thermal dynamics at constant temperature, we infer a clear geometric description of the low-loss region. We find that it is a low-dimensional manifold whose dimension can be readily obtained from the fluctuations. Furthermore, this dimension is controlled by the number of data points that reside near the classification decision boundary. Importantly, we find that a quadratic approximation of the loss near the minimum is fundamentally inadequate, due to the exponential nature of the decision boundary and the flatness of the low-loss region. As a result, the dynamics sample regions of higher curvature at higher temperatures, while producing quadratic-like statistics at any given temperature. We explain this behavior with a simplified loss model that is analytically tractable and reproduces the observed fluctuation statistics.
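The following is a minimal sketch, not the authors' code, of the kind of constant-temperature Langevin exploration described above. The network size, random data, step size and temperature are illustrative assumptions, and the dimension estimate uses the quadratic-basin (equipartition) relation, which the work argues breaks down across temperatures but which illustrates how a dimension can be read off the fluctuation statistics at a fixed temperature.

```python
# Sketch (assumed setup): Langevin exploration of the loss landscape of an
# over-parameterized fully connected classifier on random data.
# Update rule (discretized Langevin dynamics at temperature T):
#   w <- w - lr * grad L(w) + sqrt(2 * lr * T) * xi,   xi ~ N(0, I)
# At stationarity in a quadratic basin, equipartition gives
#   <L> - L_min = (d_eff / 2) * T,
# so the mean loss excess at fixed T yields an effective-dimension estimate.

import torch

torch.manual_seed(0)

# Random binary classification data (illustrative sizes, not the paper's).
N, D, H = 64, 20, 256                      # samples, input dim, hidden width
X = torch.randn(N, D)
y = torch.randint(0, 2, (N,)).float()

model = torch.nn.Sequential(
    torch.nn.Linear(D, H), torch.nn.Tanh(), torch.nn.Linear(H, 1)
)
loss_fn = torch.nn.BCEWithLogitsLoss()

def loss():
    return loss_fn(model(X).squeeze(-1), y)

def langevin_step(lr, T):
    """One Euler-Maruyama Langevin step at temperature T; returns the loss."""
    l = loss()
    model.zero_grad()
    l.backward()
    with torch.no_grad():
        for p in model.parameters():
            noise = torch.randn_like(p) * (2 * lr * T) ** 0.5
            p.add_(-lr * p.grad + noise)
    return l.item()

# Relax into the low-loss region first (T = 0), then sample at fixed T.
for _ in range(2000):
    langevin_step(lr=1e-2, T=0.0)

T = 1e-4
samples = [langevin_step(lr=1e-3, T=T) for _ in range(5000)]

# For an over-parameterized (interpolating) network the minimal loss is near
# zero, so <L> itself approximates the excess above the minimum.
mean_excess = sum(samples[1000:]) / len(samples[1000:])   # discard burn-in
d_eff = 2 * mean_excess / T                               # equipartition estimate
print(f"<L> at T={T:g}: {mean_excess:.4g}, effective dimension ~ {d_eff:.1f}")
```

Repeating the constant-temperature sampling phase at several values of T would expose the temperature dependence of the sampled curvature noted above, i.e., the failure of a single quadratic description despite quadratic-like statistics at each fixed temperature.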