Recurrent neural networks (RNNs) are widely used machine learning tools for modeling sequential and time series data. They are notoriously hard to train because their loss gradients backpropagated in time tend to saturate or diverge during training. This is known as the exploding and vanishing gradient problem. Previous solutions to this issue either built on rather complicated, purpose-engineered architectures with gated memory buffers, or, more recently, imposed constraints that ensure convergence to a fixed point or restrict (the eigenspectrum of) the recurrence matrix. Such constraints, however, impose severe limitations on the expressivity of the RNN. Essential intrinsic dynamics such as multistability or chaos are disabled. This is inherently at odds with the chaotic nature of many, if not most, time series encountered in nature and society. Here we offer a comprehensive theoretical treatment of this problem by relating the loss gradients during RNN training to the Lyapunov spectrum of RNN-generated orbits. We mathematically prove that RNNs producing stable equilibrium or cyclic behavior have bounded gradients, whereas the gradients of RNNs with chaotic dynamics always diverge. Based on these analyses and insights, we offer an effective yet simple training technique for chaotic data and guidance on how to choose relevant hyperparameters according to the Lyapunov spectrum.
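As a minimal numerical illustration of the relation stated above (not the training technique proposed in the paper), the following NumPy sketch simulates an autonomous tanh RNN, estimates its Lyapunov spectrum by repeated QR re-orthonormalization, and tracks the norm of the accumulated Jacobian product that governs gradients backpropagated in time. The hidden size d, weight gain g, and horizon T are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, g = 16, 500, 1.8   # hidden size, horizon, weight gain (illustrative choices)

W = rng.normal(0.0, g / np.sqrt(d), size=(d, d))
b = rng.normal(0.0, 0.1, size=d)

h = rng.normal(size=d)
Q = np.eye(d)            # orthonormal frame for the QR-based Lyapunov estimate
lyap_sums = np.zeros(d)
M = np.eye(d)            # running (renormalized) Jacobian product J_t ... J_1
log_prod_norm = 0.0      # log || J_t ... J_1 ||, proxy for BPTT gradient growth

for t in range(T):
    h = np.tanh(W @ h + b)            # vanilla RNN step: h_{t+1} = tanh(W h_t + b)
    J = (1.0 - h**2)[:, None] * W     # Jacobian dh_{t+1}/dh_t = diag(1 - h_{t+1}^2) W

    # Lyapunov spectrum: propagate and re-orthonormalize a frame of vectors
    Q, R = np.linalg.qr(J @ Q)
    lyap_sums += np.log(np.abs(np.diag(R)) + 1e-300)

    # Gradient growth: renormalize the Jacobian product to avoid overflow,
    # while accumulating its log spectral norm exactly
    M = J @ M
    c = np.linalg.norm(M, 2)
    log_prod_norm += np.log(c)
    M /= c

lyap = np.sort(lyap_sums / T)[::-1]
print("largest Lyapunov exponent :", lyap[0])
print("gradient growth rate      :", log_prod_norm / T)
```

With a gain well above 1 the largest Lyapunov exponent is typically positive and the Jacobian-product norm grows at essentially the same exponential rate, consistent with the statement that chaotic dynamics imply diverging gradients; reducing g so that the orbit settles into a stable fixed point or cycle makes the exponent negative and the accumulated gradient term decay instead.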