Expected Gradients of Maxout Networks and Consequences to Parameter Initialization - Zhuanzhi (专知) Papers


Maxout · Networking · Moments · Networks · Jacobian

January 17, 2023

Expected Gradients of Maxout Networks and Consequences to Parameter Initialization

Hanna Tseran, Guido Montúfar

From arXiv, 37 pages, 8 figures

We study the gradients of a maxout network with respect to inputs and parameters and obtain bounds for the moments depending on the architecture and the parameter distribution. We observe that the distribution of the input-output Jacobian depends on the input, which complicates a stable parameter initialization. Based on the moments of the gradients, we formulate parameter initialization strategies that avoid vanishing and exploding gradients in wide networks. Experiments with deep fully-connected and convolutional networks show that this strategy improves SGD and Adam training of deep maxout networks. In addition, we obtain refined bounds on the expected number of linear regions, results on the expected curve length distortion, and results on the NTK.
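To make the quantities in the abstract concrete, here is a minimal sketch of a maxout layer and of the product of the input-output Jacobian with a direction vector, evaluated on a freshly initialized network. It is not the authors' code and does not reproduce their initialization: the function names, the zero biases, the all-maxout architecture, and the free scale `c` (weights drawn from N(0, c / fan-in)) are assumptions made for illustration only.

```python
import math
import random

def init_maxout_layer(n_in, n_out, rank, c, rng):
    """Sample one maxout layer: n_out units, each with `rank` linear pieces.
    Weights ~ N(0, c / n_in), biases fixed to zero (both are assumptions)."""
    std = math.sqrt(c / n_in)
    return [[[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(rank)]
            for _ in range(n_out)]

def maxout_forward(layer, x, u):
    """Return (output, J u) for one layer.
    Each unit outputs the max over its pre-activations; its gradient with
    respect to the input is the weight row of the arg-max piece."""
    out, ju = [], []
    for unit in layer:
        pre = [sum(w * xi for w, xi in zip(row, x)) for row in unit]
        k = max(range(len(pre)), key=pre.__getitem__)  # index of the active piece
        out.append(pre[k])
        ju.append(sum(w * ui for w, ui in zip(unit[k], u)))
    return out, ju

def squared_jvp_norm(widths, rank, c, x, u, rng):
    """Push the input x and the direction u through a freshly sampled network
    with layer widths `widths` and return ||J u||^2 at x."""
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        layer = init_maxout_layer(n_in, n_out, rank, c, rng)
        x, u = maxout_forward(layer, x, u)
    return sum(v * v for v in u)

if __name__ == "__main__":
    rng = random.Random(0)
    widths = [16] * 6                     # hypothetical: 5 maxout layers of width 16
    x = [rng.gauss(0.0, 1.0) for _ in range(16)]
    u = [1.0] + [0.0] * 15                # unit direction in input space
    draws = [squared_jvp_norm(widths, 5, 1.0, x, u, rng) for _ in range(200)]
    print(sum(draws) / len(draws))        # Monte Carlo estimate of E||J u||^2
```

Averaging `squared_jvp_norm` over many parameter draws gives a Monte Carlo estimate of the second moment of the Jacobian-vector product. Repeating this for several values of `c`, depths, and inputs `x` is one simple way to see whether a given scale makes the gradients shrink or grow with depth, and whether the result depends on the input.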


Related content

Maxout

Related VIP content

Not to Be Missed: the "100 Lectures on Machine Learning" Course, Taught by Mark Schmidt (UBC)
Zhuanzhi (专知) Member Service · 75+ reads · June 28, 2022

Recent Advances in Efficient and Scalable Graph Neural Networks
Zhuanzhi (专知) Member Service · 78+ reads · March 15, 2022

Bag of Tricks for Image Classification (17-slide deck)
Zhuanzhi (专知) Member Service · 95+ reads · March 12, 2020

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Zhuanzhi (专知) Member Service · 49+ reads · October 17, 2019

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs
Zhuanzhi (专知) Member Service · 36+ reads · October 17, 2019

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation
Zhuanzhi (专知) Member Service · 59+ reads · October 17, 2019

[Survey] Scene Text Detection and Recognition with Deep Learning
Zhuanzhi (专知) Member Service · 78+ reads · October 10, 2019

Experience and Advice for Getting Started with Machine Learning
Zhuanzhi (专知) Member Service · 94+ reads · October 10, 2019

[AI in 2019: A Year in Review] Anti-AI
Zhuanzhi (专知) Member Service · 79+ reads · October 10, 2019

[SIGGRAPH 2019] TensorFlow 2.0 Deep Learning Applications in Computer Graphics
Zhuanzhi (专知) Member Service · 41+ reads · October 9, 2019

Related news

VCIP 2022 Call for Demos
CCF Multimedia Technical Committee · 1+ reads · June 6, 2022

VCIP 2022 Call for Special Session Proposals
CCF Multimedia Technical Committee · 1+ reads · April 1, 2022

ACM MM 2022 Call for Papers
CCF Multimedia Technical Committee · 5+ reads · March 29, 2022

ACM TOMM Call for Papers
CCF Multimedia Technical Committee · 2+ reads · March 23, 2022

AIART 2022 Call for Papers
CCF Multimedia Technical Committee · 1+ reads · February 13, 2022

Hierarchically Structured Meta-learning
CreateAMind · 27+ reads · May 22, 2019

Transferring Knowledge across Learning Processes
CreateAMind · 29+ reads · May 18, 2019

Unsupervised Meta-Learning for Reinforcement Learning
CreateAMind · 18+ reads · January 7, 2019

Unsupervised Learning via Meta-Learning
CreateAMind · 43+ reads · January 3, 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019
待字闺中 · 18+ reads · December 24, 2018

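Related funding

Radiation Effects in BTPs/Ionic-Liquid Extraction Systems for Lanthanide-Actinide Separation
National Natural Science Foundation of China · 0+ reads · December 31, 2014

Role and Molecular Mechanisms of ROS in the Disruption of Mitochondrial Homeostasis under Hypoxia
National Natural Science Foundation of China · 0+ reads · December 31, 2014

Blind Image Deconvolution Algorithms Based on the SURE/PURE Criteria
National Natural Science Foundation of China · 3+ reads · December 31, 2013

Group Communication Algorithms for Wireless Multi-Hop Networks Based on MIMO and Interference Cancellation
National Natural Science Foundation of China · 1+ reads · December 31, 2013

Kronheimer-Nakajima Quiver Moduli Spaces and Rational Surfaces
National Natural Science Foundation of China · 1+ reads · December 31, 2013

Software-Defined Networking (SDN) Based on ForCES
National Natural Science Foundation of China · 1+ reads · December 31, 2012

Several Problems on Random Walks and Percolation on Graphs
National Natural Science Foundation of China · 0+ reads · December 31, 2012

Early Succession of Riparian Plant Communities under Variable Moisture Habitats
National Natural Science Foundation of China · 0+ reads · December 31, 2011

Antitumor Active Constituents of Garcinia oligantha (单花山竹子)
National Natural Science Foundation of China · 0+ reads · December 31, 2010

AlGaN-Based PIN Solar-Blind Avalanche Photodetectors
National Natural Science Foundation of China · 0+ reads · December 31, 2008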

Related papers

Optimal Design of Validation Experiments for the Prediction of Quantities of Interest
arXiv · 0+ reads · March 10, 2023

Widely-Linear MMSE Estimation of Complex-Valued Graph Signals
arXiv · 0+ reads · March 10, 2023

SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models
arXiv · 0+ reads · March 10, 2023

Error bound analysis of the stochastic parareal algorithm
arXiv · 0+ reads · March 10, 2023

Data-dependent Generalization Bounds via Variable-Size Compressibility
arXiv · 0+ reads · March 9, 2023

The joint node degree distribution in the Erdős-Rényi network
arXiv · 0+ reads · March 9, 2023

Some New Results on the Maximum Growth Factor in Gaussian Elimination
arXiv · 0+ reads · March 8, 2023

Scaling Properties of Deep Residual Networks
arXiv · 13+ reads · May 25, 2021

Spectral Clustering with Graph Neural Networks for Graph Pooling
arXiv · 25+ reads · June 3, 2020

Learning Discrete Structures for Graph Neural Networks
arXiv · 17+ reads · March 28, 2019


