An Empirical Study of the Occurrence of Heavy-Tails in Training a ReLU Gate - 专知论文


ReLU · stationary distribution · model scenario · binary classification

April 26, 2022

An Empirical Study of the Occurrence of Heavy-Tails in Training a ReLU Gate


Sayar Karmakar, Anirbit Mukherjee

From arXiv. This short note demonstrates some further interesting properties of the key Algorithm 2 of https://doi.org/10.1016/j.neunet.2022.03.040 (arXiv:2005.04211).

A particular direction of recent advances in stochastic deep-learning algorithms has been uncovering a rather mysterious heavy-tailed nature of the stationary distribution of these algorithms, even when the data distribution is not heavy-tailed. Moreover, the heavy-tail index is known to show interesting dependence on the input dimension of the net, the mini-batch size and the step size of the algorithm. In this short note, we undertake an experimental study of this index for S.G.D. while training a ReLU gate (in the realizable and in the binary classification setup) and for a variant of S.G.D. that was proven to converge in Karmakar and Mukherjee (2022) for ReLU realizable data. From our experiments we conjecture that these two algorithms have similar heavy-tail behaviour on any data where the latter can be proven to converge. Secondly, we demonstrate that the heavy-tail index of the late-time iterates in this model scenario has strikingly different properties from either what has been proven for linear hypothesis classes or what has been previously demonstrated for large nets.
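To make the quantity in the abstract concrete, the following is a minimal sketch of how a heavy-tail index of late-time S.G.D. iterates on a ReLU gate might be estimated. It is not the authors' experimental code. The function names (`hill_estimator`, `sgd_relu_gate`, `late_time_tail_index`), the Gaussian inputs, the small additive label noise, the squared loss and all default hyperparameters are assumptions made for illustration, and the Hill estimator is used here only as one common tail-index estimator, which the abstract does not name.

```python
import numpy as np


def hill_estimator(samples, k):
    """Hill estimate of the tail index from the k largest values of |samples|."""
    x = np.sort(np.abs(np.asarray(samples, dtype=float)))[::-1]  # descending order
    if not 1 <= k < x.size:
        raise ValueError("k must satisfy 1 <= k < len(samples)")
    if x[k] <= 0.0:
        raise ValueError("the (k+1)-th largest value must be positive")
    # Reciprocal of the mean log-excess over the (k+1)-th largest value.
    return 1.0 / np.mean(np.log(x[:k] / x[k]))


def sgd_relu_gate(w_star, step, batch, n_steps, noise_std, rng):
    """Constant-step mini-batch S.G.D. on squared loss for y = max(0, <w*, x>) + noise.

    Gaussian inputs and additive label noise are assumptions of this sketch.
    """
    d = w_star.size
    w = rng.normal(size=d) / np.sqrt(d)  # random initial weight
    for _ in range(n_steps):
        x = rng.normal(size=(batch, d))
        y = np.maximum(x @ w_star, 0.0) + noise_std * rng.normal(size=batch)
        pre = x @ w
        resid = np.maximum(pre, 0.0) - y
        # Gradient of 0.5 * (relu(<w, x>) - y)^2, averaged over the mini-batch.
        grad = (x * (resid * (pre > 0))[:, None]).mean(axis=0)
        w = w - step * grad
    return w


def late_time_tail_index(n_runs=200, d=5, step=0.1, batch=4,
                         n_steps=300, k=20, noise_std=0.1, seed=0):
    """Estimate the tail index of ||w_T - w*|| across independent S.G.D. runs."""
    rng = np.random.default_rng(seed)
    w_star = np.ones(d) / np.sqrt(d)  # hypothetical true weight
    dists = np.array([
        np.linalg.norm(
            sgd_relu_gate(w_star, step, batch, n_steps, noise_std, rng) - w_star
        )
        for _ in range(n_runs)
    ])
    return hill_estimator(dists, k)


if __name__ == "__main__":
    # Vary the mini-batch size and print the estimated index for each setting.
    for b in (1, 4, 16):
        print("batch", b, "estimated index", late_time_tail_index(batch=b))
```

Sweeping `d`, `batch` and `step` in such a loop is one way to probe the dependence on input dimension, mini-batch size and step size that the abstract refers to. The estimate is sensitive to the choice of `k` and to the number of runs, so any single number from this sketch should be read with caution.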

