在RELU网络中实现最低程度最低程度的隐性规则化 (Implicit Regularization Towards Rank Minimization in ReLU Networks) - 专知论文

会员服务 ·

0

ReLU · 秩 · 正则化项 · Networking · 线性的 ·

2022 年 1 月 30 日

Implicit Regularization Towards Rank Minimization in ReLU Networks

翻译：在RELU网络中实现最低程度最低程度的隐性规则化

Nadav Timor,Gal Vardi,Ohad Shamir

We study the conjectured relationship between the implicit regularization in neural networks, trained with gradient-based methods, and rank minimization of their weight matrices. Previously, it was proved that for linear networks (of depth 2 and vector-valued outputs), gradient flow (GF) w.r.t. the square loss acts as a rank minimization heuristic. However, understanding to what extent this generalizes to nonlinear networks is an open problem. In this paper, we focus on nonlinear ReLU networks, providing several new positive and negative results. On the negative side, we prove (and demonstrate empirically) that, unlike the linear case, GF on ReLU networks may no longer tend to minimize ranks, in a rather strong sense (even approximately, for "most" datasets of size 2). On the positive side, we reveal that ReLU networks of sufficient depth are provably biased towards low-rank solutions in several reasonable settings.

翻译：我们研究了神经网络内隐含的正规化、受过梯度方法培训的神经网络内隐含的正规化和减低其重量矩阵的等级之间的推测关系。以前,我们曾证明,对于线性网络(深度2和矢量值产出),平方损失的梯度流(GF)相当于排位最低的超常性。然而,了解这种对非线性网络的概括程度是一个尚未解决的问题。在本文中,我们侧重于非线性ReLU网络,提供了若干新的正反结果。在负面方面,我们证明(并用经验证明),与线性案例不同,雷武网络上的GF可能不再倾向于以相当强烈的意义上(甚至大致上,“最”大小的数据组2 ) 尽量减少排位。在正面方面,我们发现,具有足够深度的RELU网络在若干合理环境下倾向于低层次的解决方案。

0

相关内容

ReLU

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

专知会员服务

34+阅读 · 2020年2月27日

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

专知会员服务

23+阅读 · 2019年11月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

肺炎链球菌疫苗SPY1的一种免疫保护机制：TGF-β信号通路介导Treg细胞参与保护性免疫

国家自然科学基金

0+阅读 · 2015年12月31日

高斯序列与过程的极值理论

国家自然科学基金

2+阅读 · 2015年12月31日

保险中两类随机最优控制问题及策略过程概率分布研究

国家自然科学基金

0+阅读 · 2014年12月31日

面向超多目标优化的分解进化算法

国家自然科学基金

0+阅读 · 2014年12月31日

连续时间马氏决策过程均值-方差优化问题的研究

国家自然科学基金

0+阅读 · 2012年12月31日

BAFF干扰的树突状细胞参与自身免疫性关节炎免疫耐受的作用和机制

国家自然科学基金

0+阅读 · 2012年12月31日

面向图像分割的自适应脉冲耦合神经网络理论及应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

类年龄结构与免疫-传染病耦合系统建模与研究

国家自然科学基金

1+阅读 · 2012年12月31日

Sobolev空间的多小波采样定理及相关问题研究

国家自然科学基金

0+阅读 · 2011年12月31日

ROS信号通路与AsA参与棉花纤维细胞发育的作用机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

Real-Time Segmentation Networks should be Latency Aware

Arxiv

0+阅读 · 2022年4月20日

A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization

Arxiv

0+阅读 · 2022年4月19日

Rank-Constrained Least-Squares: Prediction and Inference

Arxiv

0+阅读 · 2022年4月17日

Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

Arxiv

0+阅读 · 2022年4月16日

Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks

Arxiv

0+阅读 · 2022年4月14日

Alternating Mahalanobis Distance Minimization for Stable and Accurate CP Decomposition

Arxiv

0+阅读 · 2022年4月14日

Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime

Arxiv

0+阅读 · 2022年4月11日

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Arxiv

20+阅读 · 2021年5月10日

CNN+CNN: Convolutional Decoders for Image Captioning

Arxiv

21+阅读 · 2018年5月23日

Improved Image Segmentation via Cost Minimization of Multiple Hypotheses

Arxiv

14+阅读 · 2018年1月31日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

ICLR 2021杰出论文奖出炉，8篇论文上榜！

专知会员服务

26+阅读 · 2021年4月2日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

【ICLR2020】深度神经网络优化轨迹的平衡点，The Break-Even Point on Optimization Trajectories of Deep Neural Networks

专知会员服务

34+阅读 · 2020年2月27日

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

【北京智源大会2019】神经网络的优化Optimization for Overparametrized Deep Neural Networks，北京大学 | 王立威

专知会员服务

23+阅读 · 2019年11月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

【牛津博士论文】零样本强化学习综述

《美军条令：陆军指挥官与规划人员地理空间指南》60页

战术边缘指挥控制：防务面临的核心挑战

迈向开放世界检测：综述

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

ICLR2019最佳论文出炉

ICLR2019最佳论文出炉

专知

12+阅读 · 2019年5月6日

19篇ICML2019论文摘录选读！

19篇ICML2019论文摘录选读！

专知

28+阅读 · 2019年4月28日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

vae 相关论文表示学习 1

vae 相关论文表示学习 1

CreateAMind

12+阅读 · 2018年9月6日

【推荐】RNN/LSTM时序预测

【推荐】RNN/LSTM时序预测

机器学习研究会

25+阅读 · 2017年9月8日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Real-Time Segmentation Networks should be Latency Aware

Arxiv

0+阅读 · 2022年4月20日

A Novel Fast Exact Subproblem Solver for Stochastic Quasi-Newton Cubic Regularized Optimization

Arxiv

0+阅读 · 2022年4月19日

Rank-Constrained Least-Squares: Prediction and Inference

Arxiv

0+阅读 · 2022年4月17日

Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

Arxiv

0+阅读 · 2022年4月16日

Convergence and Implicit Regularization Properties of Gradient Descent for Deep Residual Networks

Arxiv

0+阅读 · 2022年4月14日

Alternating Mahalanobis Distance Minimization for Stable and Accurate CP Decomposition

Arxiv

0+阅读 · 2022年4月14日

Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime

Arxiv

0+阅读 · 2022年4月11日

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Arxiv

20+阅读 · 2021年5月10日

CNN+CNN: Convolutional Decoders for Image Captioning

Arxiv

21+阅读 · 2018年5月23日

Improved Image Segmentation via Cost Minimization of Multiple Hypotheses

Arxiv

14+阅读 · 2018年1月31日

相关基金

肺炎链球菌疫苗SPY1的一种免疫保护机制：TGF-β信号通路介导Treg细胞参与保护性免疫

国家自然科学基金

0+阅读 · 2015年12月31日

高斯序列与过程的极值理论

国家自然科学基金

2+阅读 · 2015年12月31日

保险中两类随机最优控制问题及策略过程概率分布研究

国家自然科学基金

0+阅读 · 2014年12月31日

面向超多目标优化的分解进化算法

国家自然科学基金

0+阅读 · 2014年12月31日

连续时间马氏决策过程均值-方差优化问题的研究

国家自然科学基金

0+阅读 · 2012年12月31日

BAFF干扰的树突状细胞参与自身免疫性关节炎免疫耐受的作用和机制

国家自然科学基金

0+阅读 · 2012年12月31日

面向图像分割的自适应脉冲耦合神经网络理论及应用研究

国家自然科学基金

0+阅读 · 2012年12月31日

类年龄结构与免疫-传染病耦合系统建模与研究

国家自然科学基金

1+阅读 · 2012年12月31日

Sobolev空间的多小波采样定理及相关问题研究

国家自然科学基金

0+阅读 · 2011年12月31日

ROS信号通路与AsA参与棉花纤维细胞发育的作用机制研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员