L2 regularization for the weights of neural networks is widely used as a standard training trick. However, L2 regularization for gamma, a trainable parameter of batch normalization, remains largely undiscussed and is applied differently depending on the library and the practitioner. In this paper, we study whether L2 regularization for gamma is valid. To explore this question, we consider two approaches: 1) variance control, which makes the residual network behave like an identity mapping, and 2) stable optimization through an improved effective learning rate. Through these two analyses, we specify for which gammas L2 regularization is desirable and for which it is undesirable, and we propose four guidelines for managing them. In several experiments, we observed increases and decreases in performance caused by applying L2 regularization to the four categories of gamma, consistent with our four guidelines. The proposed guidelines were validated across various tasks and architectures, including variants of residual networks and transformers.
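To make concrete the library-dependent conventions the abstract refers to, below is a minimal PyTorch sketch (not taken from the paper) of one common convention: excluding all batch-normalization gamma/beta parameters from weight decay via optimizer parameter groups. The helper name `split_decay_param_groups` is hypothetical; the paper's guidelines instead distinguish which gammas should and should not be decayed.

```python
import torch
import torch.nn as nn

def split_decay_param_groups(model, weight_decay=1e-4):
    """Split parameters into those that receive L2 regularization
    (weight decay) and those that do not. Here all BatchNorm
    parameters (gamma = `weight`, beta = `bias`) are excluded,
    which is one of the conventions seen in practice."""
    decay, no_decay = [], []
    for module in model.modules():
        for _, param in module.named_parameters(recurse=False):
            if isinstance(module, nn.BatchNorm2d):
                no_decay.append(param)  # gamma / beta: no weight decay
            else:
                decay.append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]

# Usage: the alternative convention simply decays every parameter,
# including gamma; the paper studies when each choice helps or hurts.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())
optimizer = torch.optim.SGD(split_decay_param_groups(model), lr=0.1, momentum=0.9)
```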