Gradient regularization (GR) is a method that penalizes the gradient norm of the training loss during training. Although some studies have reported that GR improves generalization performance in deep learning, little attention has been paid to GR from an algorithmic perspective, that is, to which algorithms realize GR efficiently and improve performance. In this study, we first reveal that a specific finite-difference computation, composed of both gradient ascent and descent steps, reduces the computational cost of GR. In addition, this computation empirically achieves better generalization performance. Next, we theoretically analyze a solvable model, a diagonal linear network, and clarify that GR has a desirable implicit bias in a certain problem. In particular, learning with finite-difference GR chooses better minima as the ascent step size becomes larger. Finally, we demonstrate that finite-difference GR is closely related to other algorithms based on iterative ascent and descent steps for exploring flat minima: sharpness-aware minimization and the flooding method. We reveal that flooding performs finite-difference GR implicitly. Thus, this work broadens our understanding of GR in both practice and theory.
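As a concrete illustration of the ascent-then-descent computation described above, the following is a minimal PyTorch sketch of one finite-difference GR update, assuming the GR objective L(θ) + (γ/2)||∇L(θ)||² and the standard forward-difference approximation of the Hessian-vector product. The function name `fd_gr_step`, the hyperparameter values, and the exact way the two gradients are mixed are illustrative assumptions, not the paper's prescribed implementation.

```python
import torch

def fd_gr_step(params, loss_fn, lr=0.1, gamma=0.01, eps=0.01):
    """One finite-difference GR update on `params` (a list of tensors with
    requires_grad=True). `loss_fn` is a closure that recomputes the training
    loss from the current parameter values."""
    # Gradient at the current point theta.
    grads = torch.autograd.grad(loss_fn(), params)

    # Gradient *ascent* perturbation: theta -> theta + eps * grad L(theta).
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(eps * g)

    # Gradient at the ascended point theta + eps * grad L(theta).
    grads_pert = torch.autograd.grad(loss_fn(), params)

    # Undo the perturbation, then take the *descent* step with the
    # finite-difference estimate
    #   grad[L + (gamma/2) * ||grad L||^2]
    #     ~ (1 - gamma/eps) * grad L(theta)
    #       + (gamma/eps) * grad L(theta + eps * grad L(theta)).
    with torch.no_grad():
        for p, g, gp in zip(params, grads, grads_pert):
            p.add_(-eps * g)  # back to theta
            p.add_(-lr * ((1 - gamma / eps) * g + (gamma / eps) * gp))
```

Note that when `gamma == eps`, the update reduces to descending along the gradient evaluated at the ascended point, i.e., a pure gradient-ascent-then-descent step, and `eps` plays the role of the ascent step size mentioned in the implicit-bias result.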