Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks - 专知论文


Regularization term · ReLU · Specialization · Global optimization · Optimizer

October 18, 2021


Tolga Ergen, Mert Pilanci

From arXiv. Accepted to NeurIPS 2021. arXiv admin note: text overlap with arXiv:2110.05518

Despite several attempts, the fundamental mechanisms behind the success of deep neural networks remain elusive. To this end, we introduce a novel analytic framework to unveil hidden convexity in training deep neural networks. We consider a parallel architecture with multiple ReLU sub-networks, which includes many standard deep architectures and ResNets as its special cases. We then show that the training problem with path regularization can be cast as a single convex optimization problem in a high-dimensional space. We further prove that the equivalent convex program is regularized via a group sparsity inducing norm. Thus, a path regularized parallel architecture with ReLU sub-networks can be viewed as a parsimonious feature selection method in high dimensions. More importantly, we show that the computational complexity required to globally optimize the equivalent convex problem is polynomial in the number of data samples and the feature dimension. Therefore, we prove exact polynomial-time trainability for path regularized deep ReLU networks with global optimality guarantees. We also provide several numerical experiments corroborating our theory.
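The abstract says the equivalent convex program is regularized by a group sparsity inducing norm but does not spell that norm out. As a rough illustration only, the sketch below assumes the standard group-lasso (l2,1) norm, the sum of Euclidean norms over groups of variables, and shows why such a penalty acts as feature selection: its proximal operator (block soft-thresholding) sets entire groups to zero. The function names, the example groups, and the threshold `tau` are hypothetical and are not taken from the paper.

```python
import math


def group_norm(groups):
    """Group-lasso (l2,1) norm: the sum of the Euclidean norms of the groups."""
    return sum(math.sqrt(sum(v * v for v in g)) for g in groups)


def block_soft_threshold(groups, tau):
    """Proximal operator of tau * group_norm.

    Each group's Euclidean norm is reduced by tau; a group whose norm is
    at most tau is set to zero as a whole, which is the source of the
    group-level sparsity.
    """
    shrunk = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        # Scale factor is 0 for small groups and 1 - tau / norm otherwise.
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        shrunk.append([scale * v for v in g])
    return shrunk


# Hypothetical groups of variables (assumption: one group per candidate feature).
groups = [[3.0, 4.0], [0.3, 0.4], [0.0, 2.0]]
shrunk = block_soft_threshold(groups, tau=1.0)

for before, after in zip(groups, shrunk):
    print(before, "->", after)
```

In this toy input the second group has a norm below `tau`, so it is removed entirely, while the larger groups are only shrunk. This group-level selection is the behaviour the abstract refers to as parsimonious feature selection.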


