We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with polynomial complexity with respect to the number of data samples, the number of neurons, and the data dimension. More specifically, we develop a convex analytic framework utilizing semi-infinite duality to obtain equivalent convex optimization problems for several two- and three-layer CNN architectures. We first prove that two-layer CNNs can be globally optimized via an $\ell_2$-norm regularized convex program. We then show that multi-layer circular CNN training problems with a single ReLU layer are equivalent to an $\ell_1$-regularized convex program that encourages sparsity in the spectral domain. We also extend these results to three-layer CNNs with two ReLU layers. Furthermore, we present extensions of our approach to different pooling methods, which elucidates the implicit architectural bias as convex regularizers.
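
For intuition, the following schematic (with a squared loss and notation introduced here purely for illustration, not the paper's exact statement) sketches the kind of equivalence meant: for a two-layer ReLU network of width $m$, weight-decay regularized training coincides with a finite convex program whose variables are indexed by the ReLU activation patterns of the data.
$$
\min_{\{u_j,\alpha_j\}_{j=1}^{m}} \frac{1}{2}\Big\|\sum_{j=1}^{m}\sigma(Xu_j)\,\alpha_j - y\Big\|_2^2 + \frac{\beta}{2}\sum_{j=1}^{m}\big(\|u_j\|_2^2+\alpha_j^2\big)
\;=\;
\min_{\{v_i,w_i\}_{i=1}^{P}} \frac{1}{2}\Big\|\sum_{i=1}^{P} D_i X\,(v_i-w_i) - y\Big\|_2^2 + \beta\sum_{i=1}^{P}\big(\|v_i\|_2+\|w_i\|_2\big)
$$
subject to $(2D_i - I_n)Xv_i \ge 0$ and $(2D_i - I_n)Xw_i \ge 0$, where $\sigma$ is the ReLU, each $D_i=\mathrm{diag}\big(\mathbb{1}[Xh_i\ge 0]\big)$ ranges over the $P$ activation patterns realizable on the data matrix $X\in\mathbb{R}^{n\times d}$ ($P$ is polynomial in $n$ for fixed rank), and the identity holds once $m$ exceeds a critical width. In the convolutional setting, $X$ is replaced by patch matrices and the group penalty specializes to the $\ell_2$-norm and spectral $\ell_1$ regularizers described above.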