Batch Normalization (BatchNorm) is commonly used in Convolutional Neural Networks (CNNs) to improve training speed and stability. However, there is still limited consensus on why this technique is effective. This paper uses concepts from the traditional adaptive filtering domain to provide insight into the dynamics and inner workings of BatchNorm. First, we show that the convolution weight updates have natural modes whose stability and convergence speed are tied to the eigenvalues of the input autocorrelation matrices, which are controlled by BatchNorm through the convolution layers' channel-wise structure. Furthermore, our experiments demonstrate that the speed and stability benefits are distinct effects. At low learning rates, it is BatchNorm's amplification of the smallest eigenvalues that improves convergence speed, while at high learning rates, it is BatchNorm's suppression of the largest eigenvalues that ensures stability. Lastly, we prove that in the first training step, when normalization is needed most, BatchNorm satisfies the same optimization criterion as the Normalized Least Mean Square (NLMS) algorithm, and that it continues to approximate this condition in subsequent steps. The analyses provided in this paper lay the groundwork for gaining further insight into the operation of modern neural network structures using adaptive filter theory.
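The eigenvalue claim above can be illustrated with a minimal NumPy sketch (not taken from the paper): channel-wise standardization of the inputs, the normalization step BatchNorm performs, compresses the eigenvalue spread of the input autocorrelation matrix that governs LMS-style weight-update modes. The channel scales below are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "feature map" batch: N samples, C channels with very different scales.
N, C = 10_000, 8
scales = np.logspace(-2, 2, C)            # channel std-devs from 0.01 to 100
x = rng.standard_normal((N, C)) * scales  # unnormalized inputs

def autocorr_eigs(z):
    """Eigenvalues of the sample autocorrelation matrix R = E[z z^T]."""
    R = z.T @ z / z.shape[0]
    return np.linalg.eigvalsh(R)

# Channel-wise standardization, i.e. the normalization step of BatchNorm.
x_bn = (x - x.mean(axis=0)) / x.std(axis=0)

eig_raw, eig_bn = autocorr_eigs(x), autocorr_eigs(x_bn)
print(f"without BatchNorm: lambda_min={eig_raw.min():.1e}, lambda_max={eig_raw.max():.1e}")
print(f"with    BatchNorm: lambda_min={eig_bn.min():.1e}, lambda_max={eig_bn.max():.1e}")
# The smallest eigenvalues are lifted toward 1 and the largest are pulled down
# toward 1, matching the speed (low learning rate) and stability (high learning
# rate) effects described in the abstract.
```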