Recurrent neural networks are important tools for sequential data processing. However, they are notoriously difficult to train. Challenges include capturing complex relations between consecutive states as well as the stability and efficiency of training. In this paper, we introduce a recurrent neural architecture called Deep Memory Update (DMU). It is based on updating the previous memory state with a deep transformation of the lagged state and the network input. The architecture can learn to transform its internal state with an arbitrary nonlinear function. Its training is stable and fast because its learning rate is tied to the size of the module. Even though DMU is built from standard components, the experimental results presented here confirm that it can compete with, and often outperform, state-of-the-art architectures such as Long Short-Term Memory, Gated Recurrent Units, and Recurrent Highway Networks.
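To make the stated update rule concrete, below is a minimal PyTorch sketch of a DMU-style cell. It assumes the deep transformation is a small MLP over the concatenated lagged state and network input, and that the previous memory state is updated through a learned gate; the class name DMUCell, the gating form, and all layer sizes are illustrative assumptions rather than the paper's exact equations.

```python
# A minimal sketch of a DMU-style cell (assumptions, not the paper's exact
# formulation): the "deep transformation" is a small MLP applied to the
# concatenated lagged state and input, and the previous memory state is
# updated through a learned convex combination with the transformed state.
import torch
import torch.nn as nn


class DMUCell(nn.Module):
    def __init__(self, input_size: int, state_size: int, hidden_size: int):
        super().__init__()
        # Deep (here: two-layer) transformation of [previous state, input].
        self.transform = nn.Sequential(
            nn.Linear(state_size + input_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, state_size),
            nn.Tanh(),
        )
        # Update gate deciding how much of the new transformation replaces
        # the previous memory state (an assumption about the update rule).
        self.gate = nn.Sequential(
            nn.Linear(state_size + input_size, state_size),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, state: torch.Tensor) -> torch.Tensor:
        joint = torch.cat([state, x], dim=-1)
        candidate = self.transform(joint)
        u = self.gate(joint)
        # Update the previous memory state with the deep transformation.
        return (1.0 - u) * state + u * candidate


# Usage: unroll the cell over a sequence of shape (time, batch, input_size).
if __name__ == "__main__":
    cell = DMUCell(input_size=8, state_size=16, hidden_size=32)
    xs = torch.randn(20, 4, 8)
    s = torch.zeros(4, 16)
    for x_t in xs:
        s = cell(x_t, s)
    print(s.shape)  # torch.Size([4, 16])
```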