Recurrent neural networks are key tools for sequential data processing. However, they are notorious for being difficult to train. Challenges include capturing complex relations between consecutive states and achieving stable, efficient training. In this paper, we introduce a recurrent neural architecture called Deep Memory Update (DMU), named after its core operation: updating the previous memory state with a deep transformation of the lagged state and the network input. The architecture can learn an arbitrary nonlinear transformation of its internal state. Its training is stable and relatively fast because the speed of training is adapted to layer depth. Even though DMU is built from simple components, the experimental results presented here confirm that it can compete with, and often outperform, state-of-the-art architectures such as Long Short-Term Memory, Gated Recurrent Units, and Recurrent Highway Networks.
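To make the described update rule concrete, below is a minimal, hedged sketch of a memory-update recurrence of the kind outlined above: the previous memory state is combined with a deep (multi-layer) transformation of the lagged state and the current input. The gated additive form, the layer sizes, and all names (MemoryUpdateCell, depth, and so on) are illustrative assumptions, not the exact DMU cell defined in the paper.

```python
import torch
import torch.nn as nn

class MemoryUpdateCell(nn.Module):
    """Illustrative recurrent cell: the previous memory state is updated
    with a deep transformation of the lagged state and the network input.
    This is a sketch under assumed design choices, not the paper's DMU cell."""

    def __init__(self, input_size: int, state_size: int, depth: int = 3):
        super().__init__()
        layers = []
        width = input_size + state_size
        for _ in range(depth - 1):
            layers += [nn.Linear(width, state_size), nn.Tanh()]
            width = state_size
        # Final layer produces a proposed update and a gate (assumed form).
        layers.append(nn.Linear(width, 2 * state_size))
        self.net = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        z = self.net(torch.cat([x, h_prev], dim=-1))
        update, gate = z.chunk(2, dim=-1)
        gate = torch.sigmoid(gate)
        # Convex combination of the old state and the proposed update.
        return (1.0 - gate) * h_prev + gate * torch.tanh(update)

# Example usage: unroll the cell over a short random sequence.
cell = MemoryUpdateCell(input_size=8, state_size=16)
h = torch.zeros(4, 16)                      # batch of 4 sequences
for x_t in torch.randn(10, 4, 8):           # 10 time steps
    h = cell(x_t, h)
```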