Implicit deep learning has recently become popular in the machine learning community because implicit models can achieve performance competitive with state-of-the-art deep networks while using significantly less memory and computation. However, our theoretical understanding of when and how first-order methods such as gradient descent (GD) converge on \textit{nonlinear} implicit networks is limited. Although this type of problem has been studied for standard feed-forward networks, the implicit setting remains intriguing because implicit networks have \textit{infinitely} many layers: the corresponding equilibrium equation may admit no solution, or multiple solutions, during training. This paper studies the convergence of both gradient flow (GF) and gradient descent for nonlinear ReLU-activated implicit networks. To address the well-posedness issue, we introduce a fixed scalar that scales the weight matrix of the implicit layer and show that a sufficiently small scaling constant keeps the equilibrium equation well-posed throughout training. As a result, we prove that both GF and GD converge to a global minimum at a linear rate, provided the width $m$ of the implicit network is \textit{linear} in the sample size $N$, i.e., $m=\Omega(N)$.
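As a concrete illustration (a minimal sketch whose notation, $z$, $A$, $U$, $x$, $\gamma$, is assumed here rather than taken from the paper), a scaled ReLU equilibrium equation can be written as a fixed-point problem, and a standard contraction argument indicates why a sufficiently small scaling constant keeps it well-posed:
% Sketch only: symbols are illustrative and need not match the paper's notation.
\begin{equation*}
  z \;=\; \sigma\bigl(\gamma A z + U x\bigr), \qquad \sigma(\cdot) = \max(\cdot, 0).
\end{equation*}
% Since ReLU is 1-Lipschitz, the map z -> sigma(gamma A z + U x) is a contraction
% whenever gamma * ||A||_2 < 1, so Banach's fixed-point theorem yields a unique
% equilibrium z for every input x, i.e., the equation is well-posed.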