Recurrent neural networks have shown remarkable success in modeling sequences. However, low-resource situations still adversely affect the generalizability of these models. We introduce a new family of models, called Lattice Recurrent Units (LRU), to address the challenge of learning deep multi-layer recurrent models with limited resources. LRU models achieve this goal by creating distinct (but coupled) flows of information inside the units: a first flow along the time dimension and a second flow along the depth dimension. The architecture also offers symmetry in how information can flow horizontally and vertically. We analyze the effects of decoupling three different components of our LRU model: the Reset Gate, the Update Gate, and the Projected State. We evaluate this family of new LRU models on computational convergence rates and statistical efficiency. Our experiments are performed on four publicly available datasets, comparing with Grid-LSTM and Recurrent Highway Networks. Our results show that LRU achieves better empirical computational convergence rates and statistical efficiency values, along with learning more accurate language models.
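To make the idea of two coupled flows concrete, the sketch below shows a lattice-style recurrent cell with GRU-like gating (reset gate, update gate, projected state) that emits one state along time and one along depth. This is a minimal illustration under assumed equations and naming (e.g., `LatticeRecurrentCell`, shared weights across layers), not the authors' reference implementation; the paper's exact parameterization and coupling may differ.

```python
# Minimal, illustrative sketch (NOT the authors' reference code) of a
# lattice-style recurrent cell with two coupled information flows:
# one along the time dimension and one along the depth dimension.
# The GRU-style gating and the way the flows share computation are
# assumptions made for illustration.
import torch
import torch.nn as nn


class LatticeRecurrentCell(nn.Module):
    def __init__(self, size: int):
        super().__init__()
        # Both incoming states (previous time step, layer below) are
        # concatenated, so every gate sees information from both directions,
        # while each flow keeps its own half of the gate outputs (decoupled).
        self.reset = nn.Linear(2 * size, 2 * size)
        self.update = nn.Linear(2 * size, 2 * size)
        self.project = nn.Linear(2 * size, 2 * size)

    def forward(self, h_time: torch.Tensor, h_depth: torch.Tensor):
        """h_time: state from the previous time step;
        h_depth: state from the layer below.
        Returns the new (h_time, h_depth) pair, i.e. one output per flow."""
        joint = torch.cat([h_time, h_depth], dim=-1)
        r = torch.sigmoid(self.reset(joint))        # reset gates, one half per flow
        z = torch.sigmoid(self.update(joint))       # update gates, one half per flow
        cand = torch.tanh(self.project(r * joint))  # projected (candidate) states
        out = z * joint + (1.0 - z) * cand          # GRU-style interpolation
        new_time, new_depth = out.chunk(2, dim=-1)  # split back into the two flows
        return new_time, new_depth


# Toy usage: unroll the cell over a (depth x time) lattice of states.
if __name__ == "__main__":
    size, depth, steps = 8, 3, 5
    cell = LatticeRecurrentCell(size)
    h_time = [torch.zeros(1, size) for _ in range(depth)]  # one time-state per layer
    x = torch.randn(steps, 1, size)                        # toy input sequence
    for t in range(steps):
        h_depth = x[t]                                     # depth flow starts at the input
        for layer in range(depth):
            h_time[layer], h_depth = cell(h_time[layer], h_depth)
    print(h_depth.shape)  # top-of-stack output at the final time step
```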