The concepts of unitary evolution matrices and associative memory have boosted the field of Recurrent Neural Networks (RNN) to state-of-the-art performance in a variety of sequential tasks. However, RNNs still have a limited capacity to manipulate long-term memory. To bypass this weakness, the most successful applications of RNNs use external techniques such as attention mechanisms. In this paper we propose a novel RNN model that unifies the state-of-the-art approaches: Rotational Unit of Memory (RUM). The core of RUM is its rotational operation, which is, naturally, a unitary matrix, providing architectures with the power to learn long-term dependencies by overcoming the vanishing and exploding gradients problem. Moreover, the rotational unit also serves as associative memory. We evaluate our model on synthetic memorization, question answering and language modeling tasks. RUM learns the Copying Memory task completely and improves the state-of-the-art result in the Recall task. RUM's performance in the bAbI Question Answering task is comparable to that of models with attention mechanisms. We also improve the state-of-the-art result to 1.189 bits-per-character (BPC) loss in the Character Level Penn Treebank (PTB) task, demonstrating the applicability of RUM to real-world sequential data. The universality of our construction, at the core of RNNs, establishes RUM as a promising approach to language modeling, speech recognition and machine translation.
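To make the central claim concrete, the following is a minimal NumPy sketch (an illustration under our own assumptions, not the paper's implementation) of a rotation operation between two vectors applied to a hidden state. Because the resulting matrix is orthogonal, it preserves the norm of the hidden state exactly, which is why gradients propagated through such an operation neither vanish nor explode; the function name `rotation` and the dimensions are hypothetical.

```python
import numpy as np

def rotation(a, b):
    """Orthogonal matrix rotating a toward b in the plane span(a, b).

    Note: assumes a and b are nonzero and not parallel; a full
    implementation would handle those degenerate cases.
    """
    u = a / np.linalg.norm(a)
    v = b - (u @ b) * u                  # component of b orthogonal to a
    v = v / np.linalg.norm(v)
    cos_t = np.clip(u @ (b / np.linalg.norm(b)), -1.0, 1.0)
    c, s = cos_t, np.sin(np.arccos(cos_t))
    # Identity outside the (u, v) plane; a 2-D rotation by theta inside it.
    return (np.eye(len(a))
            + (c - 1.0) * (np.outer(u, u) + np.outer(v, v))
            + s * (np.outer(v, u) - np.outer(u, v)))

rng = np.random.default_rng(0)
h = rng.standard_normal(8)               # a toy hidden state
R = rotation(rng.standard_normal(8), rng.standard_normal(8))

print(np.allclose(R @ R.T, np.eye(8)))   # True: R is orthogonal (unitary)
print(np.linalg.norm(R @ h) - np.linalg.norm(h))  # ~0: norm preserved
```

Norm preservation under repeated application of such rotations is the mechanism, sketched here in isolation, by which a rotational hidden-state update can carry information across long time horizons.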