Training recurrent neural networks is known to be difficult when time dependencies become long. Consequently, training standard gated cells such as gated recurrent units and long short-term memory on benchmarks where long-term memory is required remains an arduous task. In this work, we propose a general way to initialize any recurrent network connectivity through a process called "warm-up" to improve its ability to learn arbitrarily long time dependencies. This initialization process is designed to maximize network reachable multi-stability, i.e., the number of attractors within the network that can be reached through relevant input trajectories. Warm-up is performed before training, using stochastic gradient descent on a specifically designed loss. We show that warming up greatly improves recurrent neural network performance on long-term memory benchmarks for multiple recurrent cell types, but can sometimes impede precision. We therefore introduce a parallel recurrent network structure with partial warm-up that is shown to greatly improve learning on long time series while maintaining high levels of precision. This approach provides a general framework for improving the learning ability of any recurrent cell type when long-term memory is required.
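To make the warm-up idea concrete, the sketch below illustrates the general pattern of pre-training a recurrent cell with gradient descent on an auxiliary objective before any task training. The `multistability_proxy_loss` used here is a hypothetical stand-in, not the loss designed in this work: it merely pushes the hidden states reached from different constant input trajectories apart, as a crude proxy for increasing the number of reachable attractors.

```python
import torch
import torch.nn as nn

# Minimal sketch of a "warm-up" phase before task training, assuming a GRU cell.
# The auxiliary loss below is a hypothetical proxy for reachable multi-stability,
# not the specifically designed loss described in the paper.

torch.manual_seed(0)

hidden_size, input_size, seq_len, n_probes = 32, 4, 50, 16
cell = nn.GRU(input_size, hidden_size, batch_first=True)
optimizer = torch.optim.Adam(cell.parameters(), lr=1e-3)

def multistability_proxy_loss(cell):
    # Drive the network with a batch of random constant inputs and look at the
    # final hidden states; reward pairwise separation between them so that
    # distinct inputs settle into distinct regions of state space.
    probes = torch.randn(n_probes, 1, input_size).repeat(1, seq_len, 1)
    _, h_final = cell(probes)                      # h_final: (1, n_probes, hidden)
    h_final = h_final.squeeze(0)
    dists = torch.cdist(h_final, h_final)          # pairwise distances
    off_diag = dists[~torch.eye(n_probes, dtype=torch.bool)]
    return -off_diag.mean()                        # minimize the negative -> maximize separation

# Warm-up: stochastic gradient descent on the auxiliary loss only,
# performed before the downstream task loss is ever used.
for step in range(200):
    optimizer.zero_grad()
    loss = multistability_proxy_loss(cell)
    loss.backward()
    optimizer.step()

# After warm-up, `cell` would be trained as usual on the long-term memory task.
```

The point of the sketch is the two-stage structure (auxiliary pre-training, then ordinary task training), which applies to any recurrent cell type; the actual multi-stability objective would replace the placeholder loss.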