The training of deep residual neural networks (ResNets) with backpropagation has a memory cost that increases linearly with respect to the depth of the network. A simple way to circumvent this issue is to use reversible architectures. In this paper, we propose to change the forward rule of a ResNet by adding a momentum term. The resulting networks, momentum residual neural networks (MomentumNets), are invertible. Unlike previous invertible architectures, they can be used as a drop-in replacement for any existing ResNet block. We show that MomentumNets can be interpreted in the infinitesimal step size regime as second-order ordinary differential equations (ODEs) and exactly characterize how adding momentum progressively increases the representation capabilities of MomentumNets. Our analysis reveals that MomentumNets can learn any linear mapping up to a multiplicative factor, while ResNets cannot. In a learning to optimize setting, where convergence to a fixed point is required, we show theoretically and empirically that our method succeeds while existing invertible architectures fail. We show on CIFAR and ImageNet that MomentumNets have the same accuracy as ResNets, while having a much smaller memory footprint, and show that pre-trained MomentumNets are promising for fine-tuning models.
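The momentum forward rule described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `residual`, `gamma`, and the variable names are assumptions for the example, and the update `v' = gamma*v + (1-gamma)*f(x)`, `x' = x + v'` is one common form of the momentum step. The point is that the step is exactly invertible, so activations need not be stored for backpropagation.

```python
import numpy as np

GAMMA = 0.9  # momentum coefficient; example value, not from the paper


def residual(x):
    # Stand-in for a learnable residual block f(x); hypothetical choice.
    return np.tanh(x)


def momentum_forward(x, v):
    # Momentum forward rule: update the velocity, then the activation.
    v_next = GAMMA * v + (1 - GAMMA) * residual(x)
    x_next = x + v_next
    return x_next, v_next


def momentum_inverse(x_next, v_next):
    # Exact inverse: recover x first, then v, without stored activations.
    x = x_next - v_next
    v = (v_next - (1 - GAMMA) * residual(x)) / GAMMA
    return x, v


# Round trip: inverting the forward step recovers the inputs exactly
# (up to floating-point error), which is what makes the memory cost
# of training independent of depth.
x0 = np.array([0.5, -1.0, 2.0])
v0 = np.zeros_like(x0)
x1, v1 = momentum_forward(x0, v0)
x0_rec, v0_rec = momentum_inverse(x1, v1)
assert np.allclose(x0, x0_rec) and np.allclose(v0, v0_rec)
```

Note that unlike other reversible architectures, this step keeps the block `residual` unchanged, which is why a momentum step can serve as a drop-in replacement for an existing ResNet block.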