In deep learning it is common to overparameterize neural networks, that is, to use more parameters than training samples. Quite surprisingly, training the neural network via (stochastic) gradient descent leads to models that generalize very well, while classical statistics would suggest overfitting. In order to gain understanding of this implicit bias phenomenon, we study the special case of sparse recovery (compressed sensing), which is of interest in its own right. More precisely, in order to reconstruct a vector from underdetermined linear measurements, we introduce a corresponding overparameterized square loss functional in which the vector to be reconstructed is deeply factorized into several vectors. We show that, if there exists an exact solution, vanilla gradient flow for the overparameterized loss functional converges to a good approximation of the solution of minimal $\ell_1$-norm. The latter is well known to promote sparse solutions. As a by-product, our results significantly improve the sample complexity for compressed sensing via gradient flow/descent on overparameterized models derived in previous works. The theory accurately predicts the recovery rate in numerical experiments. Our proof relies on analyzing a certain Bregman divergence of the flow. This bypasses the obstacles caused by non-convexity and should be of independent interest.
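For concreteness, the following display sketches one standard way such an overparameterized functional is set up in this line of work; the depth-$N$ Hadamard-product factorization shown here is an assumption made for illustration and need not match the paper's exact parameterization.
\begin{align*}
  &\text{measurements: } y = A x^\ast \in \mathbb{R}^m, \qquad A \in \mathbb{R}^{m \times n},\ m < n,\\
  &\text{overparameterized loss: } \mathcal{L}(w_1,\dots,w_N) = \tfrac{1}{2}\,\big\| A\,(w_1 \odot \cdots \odot w_N) - y \big\|_2^2,\\
  &\text{gradient flow: } \dot{w}_k(t) = -\nabla_{w_k}\,\mathcal{L}\big(w_1(t),\dots,w_N(t)\big), \qquad k = 1,\dots,N,\\
  &\text{claimed limit: } w_1(t)\odot\cdots\odot w_N(t)\ \longrightarrow\ \hat{x} \approx \operatorname*{arg\,min}_{x:\,Ax=y} \|x\|_1 .
\end{align*}
Here $\odot$ denotes the entrywise (Hadamard) product, so the product $w_1 \odot \cdots \odot w_N$ plays the role of the vector to be reconstructed.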