Unraveling the reasons behind the remarkable success and exceptional generalization capabilities of deep neural networks remains a formidable challenge. Recent insights from random matrix theory, specifically the spectral analysis of weight matrices in deep neural networks, offer valuable clues on this issue. A key finding is that the generalization performance of a neural network is associated with the degree of heavy tails in the spectrum of its weight matrices. To capitalize on this discovery, we introduce a novel regularization technique, termed Heavy-Tailed Regularization, which explicitly promotes a more heavy-tailed spectrum in the weight matrices through regularization. First, we employ the Weighted Alpha and the Stable Rank as penalty terms, both of which are differentiable, enabling direct calculation of their gradients. To avoid over-regularization, we introduce two variants of the penalty function. Then, adopting a Bayesian perspective and leveraging knowledge from random matrix theory, we develop two further heavy-tailed regularization methods, using a power-law distribution and a Fréchet distribution as priors for the global spectrum and the maximum eigenvalue, respectively. We empirically show that heavy-tailed regularization outperforms conventional regularization techniques in terms of generalization performance.
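To make the first idea concrete, the following is a minimal sketch (assuming a PyTorch training setup; the function names `stable_rank_penalty` and `heavy_tailed_loss` and the hyperparameter `lam` are illustrative, not taken from the paper) of how a differentiable stable-rank term can be added to the task loss so that minimizing it pushes weight spectra toward heavier tails.

```python
import torch
import torch.nn as nn

def stable_rank_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Differentiable stable rank ||W||_F^2 / ||W||_2^2 of a weight matrix.
    A smaller stable rank means a few dominant singular values, i.e. a
    heavier-tailed spectrum, so using it as a penalty promotes heavy tails."""
    w = weight.reshape(weight.shape[0], -1)             # flatten conv kernels to 2-D
    fro_sq = (w ** 2).sum()                             # squared Frobenius norm
    spec_sq = torch.linalg.matrix_norm(w, ord=2) ** 2   # squared spectral (largest singular) norm
    return fro_sq / spec_sq

def heavy_tailed_loss(model: nn.Module, base_loss: torch.Tensor,
                      lam: float = 1e-3) -> torch.Tensor:
    """Add the stable-rank penalty of every weight matrix to the task loss."""
    penalty = sum(stable_rank_penalty(p) for _, p in model.named_parameters()
                  if p.dim() >= 2)
    return base_loss + lam * penalty
```

In this sketch the penalty is simply scaled by a fixed coefficient `lam`; the variants of the penalty function mentioned above would modify how this term is weighted to avoid over-regularization.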