A common method in training neural networks is to initialize all the weights to be independent Gaussian vectors. We observe that by instead initializing the weights into independent pairs, where each pair consists of two identical Gaussian vectors, we can significantly improve the convergence analysis. While a similar technique has been studied for random inputs [Daniely, NeurIPS 2020], it has not been analyzed with arbitrary inputs. Using this technique, we show how to significantly reduce the number of neurons required for two-layer ReLU networks, both in the under-parameterized setting with logistic loss, from roughly $\gamma^{-8}$ [Ji and Telgarsky, ICLR 2020] to $\gamma^{-2}$, where $\gamma$ denotes the separation margin with a Neural Tangent Kernel, and in the over-parameterized setting with squared loss, from roughly $n^4$ [Song and Yang, 2019] to $n^2$, implicitly also improving the recent running-time bound of [Brand, Peng, Song and Weinstein, ITCS 2021]. For the under-parameterized setting we also prove new lower bounds that improve upon prior work and, under certain assumptions, are best possible.
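To make the paired initialization concrete, the following is a minimal NumPy sketch of a two-layer ReLU network whose first-layer weights are drawn as $m/2$ independent Gaussian vectors, each duplicated once, rather than $m$ fully independent vectors. The opposite-sign second-layer weights (which make the network output zero at initialization) are an illustrative assumption, not a detail stated in the abstract.

```python
import numpy as np

def paired_gaussian_init(m, d, seed=None):
    """Sketch of the paired initialization: draw m/2 independent Gaussian
    vectors and duplicate each, so the m first-layer weight vectors form
    m/2 identical pairs (standard practice draws all m independently)."""
    rng = np.random.default_rng(seed)
    assert m % 2 == 0, "the number of neurons m is assumed to be even"
    half = rng.standard_normal((m // 2, d))  # m/2 independent Gaussian vectors
    return np.repeat(half, 2, axis=0)        # each vector appears exactly twice

def two_layer_relu(x, W, a):
    """Two-layer ReLU network f(x) = sum_r a_r * relu(<w_r, x>)."""
    return a @ np.maximum(W @ x, 0.0)

# Usage sketch with hypothetical sizes: m = 4 neurons, input dimension d = 3.
W = paired_gaussian_init(m=4, d=3, seed=0)
a = np.tile([1.0, -1.0], 2) / np.sqrt(4)  # opposite signs within each pair (an
                                          # assumption) so that f(x) = 0 at init
x = np.ones(3)
print(two_layer_relu(x, W, a))            # prints 0.0 at initialization
```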