The layer-wise L1 Loss Landscape of Neural Nets is more complex around local minima - 专知论文


Local minima · Minimum · Minimizer · Loss · ReLU

May 6, 2021


From arXiv, 4 pages, 5 figures

For fixed training data and fixed network parameters in the other layers, the L1 loss of a ReLU neural network is a piecewise affine function of the first layer's parameters. We use the Deep ReLU Simplex algorithm to minimize the loss monotonically by iterating over adjacent vertices, and we analyze the trajectory of these vertex positions. We observe empirically that the iterations behave differently in a neighbourhood of a local minimum, so that conclusions about the loss level and the proximity of the local minimum can be drawn before it has been found. First, the loss seems to decay exponentially slowly along the iterated adjacent vertices, so the loss level at the local minimum can be estimated from the loss levels of successively iterated vertices. Second, we observe a strong increase in vertex density around local minima. This could have far-reaching consequences for the design of new gradient-descent algorithms that might improve the convergence rate by exploiting these facts.
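The abstract does not say how the loss level at the local minimum is estimated, so the following is only an illustrative sketch under one assumption: that near the minimum the loss at iteration k follows a geometric model, loss_k = L* + C * r**k with 0 < r < 1. Under that model, three consecutive loss values determine L* exactly through Aitken's delta-squared extrapolation. The function name `aitken_limit` and the synthetic numbers are hypothetical and are not taken from the paper.

```python
def aitken_limit(l0, l1, l2):
    """Extrapolate the limit of a sequence assumed to follow l_k = L + C * r**k.

    Uses Aitken's delta-squared formula on three consecutive values.
    This is an assumed estimator, not necessarily the one used in the paper.
    """
    second_diff = l2 - 2.0 * l1 + l0
    if second_diff == 0.0:
        raise ValueError("second difference is zero; the geometric model does not apply")
    return l2 - (l2 - l1) ** 2 / second_diff


# Synthetic loss values (not data from the paper): L* = 0.25, C = 2.0, r = 0.5.
L_STAR, C, R = 0.25, 2.0, 0.5
losses = [L_STAR + C * R**k for k in range(6)]

# Estimate the limiting loss from the first three iterates only.
estimate = aitken_limit(losses[0], losses[1], losses[2])
print(f"losses seen so far: {losses[:3]}")
print(f"extrapolated loss at the minimum: {estimate}")
```

On real trajectories the decay would only be approximately geometric, so an estimate like this would drift between iterations; how much it drifts could serve as a rough indicator of how well the assumed model fits.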

