Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs


2 June 2022


Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion

The training of neural networks by gradient descent methods is a cornerstone of the deep learning revolution. Yet, despite some recent progress, a complete theory explaining its success is still missing. This article presents, for orthogonal input vectors, a precise description of the gradient flow dynamics of training one-hidden-layer ReLU neural networks for the mean squared error at small initialisation. In this setting, despite non-convexity, we show that the gradient flow converges to zero loss and characterise its implicit bias towards minimum variation norm. Furthermore, some interesting phenomena are highlighted: a quantitative description of the initial alignment phenomenon and a proof that the process follows a specific saddle-to-saddle dynamics.
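The setting in the abstract can be explored numerically. The following is a minimal sketch, not the authors' code: it replaces the continuous gradient flow with small-step gradient descent (an Euler discretisation), takes the orthogonal inputs to be the standard basis vectors, and uses a balanced initialisation with random output signs. The function name `train_shallow_relu`, the labels, the width, the step size and the initialisation scale are all illustrative assumptions.

```python
import numpy as np


def train_shallow_relu(y, width=50, init_scale=1e-3, lr=0.1, steps=5000, seed=0):
    """Gradient descent on f(x) = sum_j a_j * relu(<w_j, x>) with square loss.

    Assumption: the inputs are the standard basis vectors e_1, ..., e_n, which
    are orthonormal, so the pre-activation of neuron j on sample i is W[j, i].
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    n = y.size

    # Small initialisation of the hidden weights.
    W = init_scale * rng.standard_normal((width, n))
    # Assumed balanced initialisation: |a_j| = ||w_j|| with random signs.
    a = rng.choice([-1.0, 1.0], size=width) * np.linalg.norm(W, axis=1)

    losses = []
    for _ in range(steps):
        H = np.maximum(W, 0.0)               # ReLU activations, shape (width, n)
        residual = a @ H - y                 # f(x_i) - y_i for each sample
        losses.append(0.5 * np.mean(residual ** 2))

        # Gradients of L = 1/(2n) * sum_i (f(x_i) - y_i)^2.
        grad_a = H @ residual / n
        grad_W = np.outer(a, residual) * (W > 0) / n

        a -= lr * grad_a
        W -= lr * grad_W

    # Record the loss after the last update as well.
    final_residual = a @ np.maximum(W, 0.0) - y
    losses.append(0.5 * np.mean(final_residual ** 2))
    return a, W, np.array(losses)


if __name__ == "__main__":
    labels = [1.0, 0.5, -0.8, -0.3]          # hypothetical labels, mixed signs
    a, W, losses = train_shallow_relu(labels)
    print("initial loss:", losses[0])
    print("final loss:  ", losses[-1])
```

Plotting `losses` on a logarithmic time axis is one way to look for the plateaus that saddle-to-saddle dynamics would suggest. How pronounced they appear depends on the chosen initialisation scale and step size.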

